Source N: VoteView Member Ideology Scores Processing

Overview

This notebook processes VoteView Member Ideology scores from the 119th U.S. Congress to generate the Source N dataset used in the Bridge Grades methodology to measure ideological distance from the political center.

Source N: VoteView Member Ideology Scores - Measures absolute distance from ideological center for both House and Senate

The ideology scores generated here will be merged into the master district-level dataset for final Bridge Grade calculations, providing context for evaluating bipartisan behavior relative to ideological positioning.

Data Sources#

Input Files#

HS119_members.csv - VoteView Member Ideology data for 119th Congress
119th_Congress_*.csv - Congressional metadata with bioguide IDs and chamber assignments

Data Source Details#

Source: VoteView (Data Type: Member Ideology; Chamber: Both; Congress: 119th; CSV)
Congress: 119th U.S. Congress
Download Date: August 8, 2025
Coverage: Member ideology scores for all House and Senate members
Key Metric: nominate_dim1 scores measuring ideological position

Outputs#

Source N: House Ideology#

File: bridge_grade_source_n_house_ideology.csv

Columns:

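Name: Legislator’s full name
bioguide_id: Unique legislator identifier
nominate_dim1: Signed first-dimension NOMINATE score from VoteView (negative = left of center, positive = right of center)
ideology_dist: Absolute distance from ideological center (absolute value of nominate_dim1)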

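Interpretation: Higher distance values indicate legislators further from the political center. The file holds raw scores only and has no percentile column; where within-chamber percentiles are derived from ideology_dist, higher percentiles indicate legislators who are more ideologically extreme relative to their House colleagues.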

Source N: Senate Ideology#

File: bridge_grade_source_n_senate_ideology.csv

Columns:

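Name: Senator’s full name
bioguide_id: Unique legislator identifier
nominate_dim1: Signed first-dimension NOMINATE score from VoteView (negative = left of center, positive = right of center)
ideology_dist: Absolute distance from ideological center (absolute value of nominate_dim1)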

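Interpretation: Higher distance values indicate senators further from the political center. The file holds raw scores only and has no percentile column; where within-chamber percentiles are derived from ideology_dist, higher percentiles indicate senators who are more ideologically extreme relative to their Senate colleagues.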

Technical Requirements#

Dependencies#

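pandas: Data manipulation and analysis
numpy: Numerical operations (imported; not used in the cells shown)
scipy.stats: Normal CDF for percentile calculations (imported; not used in the cells shown)
glob: Locating the congressional metadata file
warnings: Suppression of pandas SettingWithCopyWarning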

Performance Notes#

Absolute value transformation ensures non-negative ideology distance scores
Chamber-specific percentile calculations account for different ideological distributions
Members with missing scores (two non-voting members from territories) are excluded rather than imputed
Normal CDF provides standardized percentile rankings
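
The percentile step itself does not appear in the cells of this notebook, so the following is only a minimal sketch of how a chamber-specific normal-CDF percentile could be computed. The function name `normal_cdf_percentile`, the column name `ideology_pct`, the use of the population standard deviation, and the toy values are all assumptions for illustration, not the Bridge Grades implementation.

```python
import pandas as pd
from scipy.stats import norm


def normal_cdf_percentile(scores: pd.Series) -> pd.Series:
    """Map scores to a 0-100 scale via z-scores and the standard normal CDF."""
    std = scores.std(ddof=0)  # population std; an assumption of this sketch
    if std == 0:
        # All scores identical: place everyone at the midpoint
        return pd.Series(50.0, index=scores.index)
    z = (scores - scores.mean()) / std
    return pd.Series(norm.cdf(z) * 100, index=scores.index)


# Toy data (hypothetical values, not VoteView scores)
toy = pd.DataFrame({
    "Chamber": ["House", "House", "House", "Senate", "Senate"],
    "ideology_dist": [0.2, 0.4, 0.6, 0.3, 0.5],
})

# Percentiles are computed separately within each chamber
toy["ideology_pct"] = (
    toy.groupby("Chamber")["ideology_dist"].transform(normal_cdf_percentile)
)
print(toy)
```

Grouping by chamber before standardizing means a member is compared only with colleagues in the same body, which is the point of the chamber-specific note above.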

Data Quality#

Data Integrity Notes#

VoteView ideology scores are based on roll-call voting patterns
Absolute distance transformation discards the left/right sign, so only distance from center remains
The two members with missing scores are non-voting members from territories and are excluded rather than imputed
Chamber-specific percentiles ensure fair comparison within each body

Key Features#

House Coverage: All 431 House members with ideology scores
Senate Coverage: All 100 Senate members with ideology scores
Standardized Format: Consistent output structure for both chambers

Ideology Score Interpretation#

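Raw Scores: nominate_dim1 ranges from roughly -1 (furthest left) to +1 (furthest right), with 0 at the center
Absolute Distance: ideology_dist is the absolute value of nominate_dim1, from 0 (centrist) to 1 (extremely ideological), regardless of left/right direction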
Percentiles: Rank legislators within their chamber (0-100 scale)
Higher Values: Indicate more ideologically extreme positions
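
As a small illustration of the interpretation above, the sketch below shows that two members equally far from center on opposite sides receive the same distance. The labels and values are hypothetical.

```python
import pandas as pd

# Hypothetical signed first-dimension scores
raw = pd.Series({"left_member": -0.40, "right_member": 0.40, "centrist": 0.05})

# Absolute distance from center: the sign (direction) is discarded
dist = raw.abs()
print(dist.sort_values(ascending=False))
```

Here `left_member` and `right_member` both end up at 0.40, while `centrist` stays near 0.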

Notebook Walkthrough: Source N - VoteView Member Ideology Scores#

This notebook prepares the Member Ideology scores downloaded from VoteView for use in Bridge Grades. We:

Download the HS119_members.csv file from VoteView (Data Type: Member Ideology; Chamber: Both; Congress: 119th; CSV)
- URL: https://voteview.com/data
- Date downloaded: August 8, 2025
Clean the raw nominate_dim1 scores and convert them to absolute distance from the ideological center.
Merge in member metadata (name and chamber), then split into House and Senate subsets.
Export two CSVs:
- bridge_grade_source_n_house_ideology.csv
- bridge_grade_source_n_senate_ideology.csv

import pandas as pd
import numpy as np
from scipy.stats import norm
import glob
import warnings
from pandas.errors import SettingWithCopyWarning
warnings.simplefilter(action='ignore', category=SettingWithCopyWarning)

Load & Inspect Raw Ideology Data#

# Load VoteView ideology scores for 119th Congress
ideology = pd.read_csv(
    '../Data/Source N/Input files/HS119_members.csv',
    usecols=['bioguide_id','nominate_dim1']
).copy()

# Preview
ideology.head(10)

Clean the Raw Scores & Take Absolute Values#

Absolute Value & Missing-Value Diagnostics#

# Convert to absolute distance from center
ideology['N_dim1'] = ideology['nominate_dim1'].abs()

# Report missing bioguide_id or score
missing_biog = ideology['bioguide_id'].isna().sum()
missing_score = ideology['N_dim1'].isna().sum()
print(f"Missing bioguide_id: {missing_biog}, Missing scores: {missing_score}")

Missing bioguide_id: 0, Missing scores: 2

# The two legislators with missing scores are non-voting members from territories;
# they will not be included in the final score.
ideology[ideology['N_dim1'].isna()]

	bioguide_id	nominate_dim1	N_dim1
302	K000404	NaN	NaN
346	H001103	NaN	NaN

Drop Missing Scores & Rename#

# Because these two people are non-voting members, remove the rows with a missing score.
# (Filtering on bioguide_id would keep them, since no bioguide_id is missing.)
ideology = ideology[ideology['N_dim1'].notna()]

# Rename N_dim1 to ideology_dist for clarity
ideology = ideology.rename(columns={'N_dim1':'ideology_dist'})
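
A minimal sketch of dropping rows by missing score while recording which members were removed, using toy data. The bioguide IDs and values here are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy frame with one missing score (hypothetical IDs and values)
toy = pd.DataFrame({
    "bioguide_id": ["A000001", "B000002", "C000003"],
    "nominate_dim1": [0.35, np.nan, -0.42],
})
toy["N_dim1"] = toy["nominate_dim1"].abs()

# Record who is about to be removed, then drop on the score column
dropped = toy.loc[toy["N_dim1"].isna(), "bioguide_id"].tolist()
clean = toy.dropna(subset=["N_dim1"]).reset_index(drop=True)

print(f"Dropped {len(dropped)} row(s): {dropped}")
```

Keeping the list of dropped IDs makes it easy to confirm that only the expected non-voting members were removed.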

Split into House & Senate#

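The scores are merged into the master dataset later; here we attach each member's name and chamber from the congressional metadata and split the data by chamber, so that any percentile ranking is computed within a single chamber.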

# Read in the 119th Congress data with bioguide ids
files = sorted(glob.glob("../Data/Source C-D-E-F/Input files/119th_Congress_*.csv"))
latest = files[-1]
df_119 = pd.read_csv(latest)
df_bioguide = df_119.copy()
df_bioguide = df_bioguide[['Name','bioguide_id','Chamber']]
df_bioguide.head()

	Name	bioguide_id	Chamber
0	Mark B. Messmer	M001233	House
1	Delia C. Ramirez	R000617	House
2	Tim Sheehy	S001232	Senate
3	Ben Ray Luján	L000570	Senate
4	Josh Hawley	H001089	Senate
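
Picking the last element of a sorted glob returns the newest file only if the filenames sort chronologically, and it raises an IndexError when nothing matches. The sketch below wraps the lookup in a hypothetical helper, `latest_file`, and assumes zero-padded date suffixes purely for illustration; the real suffix format of the 119th_Congress_*.csv files is not stated here.

```python
import glob
import os
import tempfile


def latest_file(pattern: str) -> str:
    """Return the lexicographically last path matching pattern."""
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError(f"No files match {pattern!r}")
    return matches[-1]


# Demonstration in a temporary directory with hypothetical file names
with tempfile.TemporaryDirectory() as tmp:
    for suffix in ("2025-06-01", "2025-08-08"):
        open(os.path.join(tmp, f"119th_Congress_{suffix}.csv"), "w").close()
    picked = os.path.basename(latest_file(os.path.join(tmp, "119th_Congress_*.csv")))

print(picked)
```

With ISO-style dates, lexicographic order and chronological order coincide; other naming schemes (for example month names) would need an explicit sort key.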

# Merge chamber info onto ideology
ideology = ideology.merge(df_bioguide, on='bioguide_id', how='left')

# Split sets
ideo_house  = ideology.query("Chamber=='House'").copy()
ideo_senate = ideology.query("Chamber=='Senate'").copy()
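
With a left merge, any ideology row whose bioguide_id is absent from the metadata gets a missing Chamber and silently falls out of both subsets. The sketch below, on toy data with hypothetical IDs and names, uses the `indicator` and `validate` options of `DataFrame.merge` to surface such rows and to guard against duplicate metadata keys.

```python
import pandas as pd

# Toy ideology scores; "Z000009" has no metadata row (hypothetical IDs)
scores = pd.DataFrame({
    "bioguide_id": ["A000001", "B000002", "Z000009"],
    "ideology_dist": [0.35, 0.42, 0.10],
})
meta = pd.DataFrame({
    "Name": ["Member A", "Member B"],
    "bioguide_id": ["A000001", "B000002"],
    "Chamber": ["House", "Senate"],
})

merged = scores.merge(
    meta,
    on="bioguide_id",
    how="left",
    validate="m:1",   # raises if a bioguide_id repeats in the metadata
    indicator=True,   # adds a _merge column: "both" or "left_only"
)

# Rows without a metadata match would vanish at the chamber split
unmatched = merged.loc[merged["_merge"] == "left_only", "bioguide_id"].tolist()
print("No chamber found for:", unmatched)

house = merged[merged["Chamber"] == "House"]
senate = merged[merged["Chamber"] == "Senate"]
```

Reviewing `unmatched` before splitting confirms that only members who are meant to be excluded, such as non-voting delegates, are lost at this step.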

Deduplicate & Export#

House Deduplication & Export#

# Deduplicate if needed
ideo_house = ideo_house.drop_duplicates(subset='bioguide_id')

# Select and save
house_out = ideo_house[['Name','bioguide_id','nominate_dim1','ideology_dist']]
house_out.to_csv(
    '../Data/Source N/Output files/bridge_grade_source_n_house_ideology.csv',
    index=False
)

house_out.head(10)

	Name	bioguide_id	nominate_dim1	ideology_dist
0	Mike D. Rogers	R000575	0.379	0.379
1	Terri A. Sewell	S001185	-0.400	0.400
2	Gary J. Palmer	P000609	0.675	0.675
3	Barry Moore	M001212	0.645	0.645
4	Dale W. Strong	S001220	0.615	0.615
5	Shomari Figures	F000481	-0.369	0.369
6	Robert B. Aderholt	A000055	0.405	0.405
7	Nicholas J. Begich	B001323	0.612	0.612
10	Paul A. Gosar	G000565	0.703	0.703
11	David Schweikert	S001183	0.553	0.553
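
Because drop_duplicates keeps the first row for each bioguide_id by default, it can be useful to look at the duplicated rows before discarding them. A minimal sketch on toy data, with hypothetical IDs and values:

```python
import pandas as pd

# Toy subset where one member appears twice with different scores
toy = pd.DataFrame({
    "bioguide_id": ["A000001", "A000001", "B000002"],
    "ideology_dist": [0.35, 0.31, 0.42],
})

# keep=False flags every row that shares a bioguide_id with another row
dupes = toy[toy.duplicated(subset="bioguide_id", keep=False)]
print(dupes)

# Default behaviour: the first occurrence is kept
deduped = toy.drop_duplicates(subset="bioguide_id")
```

If the duplicated rows carry different scores, which one to keep is a judgment call that this sketch does not settle.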

Senate Deduplication & Export#

# Deduplicate
ideo_senate = ideo_senate.drop_duplicates(subset='bioguide_id')

# Select and save
senate_out = ideo_senate[['Name','bioguide_id','nominate_dim1','ideology_dist']]
senate_out.to_csv(
    '../Data/Source N/Output files/bridge_grade_source_n_senate_ideology.csv',
    index=False
)

senate_out.head(10)

	Name	bioguide_id	nominate_dim1	ideology_dist
442	Tommy Tuberville	T000278	0.936	0.936
443	Katie Boyd Britt	B001319	0.585	0.585
444	Lisa Murkowski	M001153	0.204	0.204
445	Dan Sullivan	S001198	0.476	0.476
446	Ruben Gallego	G000574	-0.359	0.359
447	Mark Kelly	K000377	-0.257	0.257
448	John Boozman	B001236	0.427	0.427
449	Tom Cotton	C001095	0.580	0.580
450	Adam B. Schiff	S001150	-0.353	0.353
451	Alex Padilla	P000145	-0.422	0.422
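
Before the exported files are merged downstream, a few sanity checks can catch structural problems. The following is a minimal sketch: the helper `check_output` is hypothetical, the [0, 1] bound is an assumption based on the score range described earlier, and the two sample rows are copied from the House preview above.

```python
import pandas as pd

EXPECTED_COLUMNS = ["Name", "bioguide_id", "nominate_dim1", "ideology_dist"]


def check_output(df: pd.DataFrame) -> list:
    """Return a list of problems found; an empty list means all checks passed."""
    if list(df.columns) != EXPECTED_COLUMNS:
        return [f"unexpected columns: {list(df.columns)}"]
    problems = []
    if df["bioguide_id"].duplicated().any():
        problems.append("duplicate bioguide_id values")
    if df["ideology_dist"].isna().any():
        problems.append("missing ideology_dist values")
    if not df["ideology_dist"].dropna().between(0, 1).all():
        problems.append("ideology_dist outside [0, 1]")
    mismatch = (df["ideology_dist"] - df["nominate_dim1"].abs()).abs() > 1e-9
    if mismatch.any():
        problems.append("ideology_dist differs from |nominate_dim1|")
    return problems


good = pd.DataFrame({
    "Name": ["Mike D. Rogers", "Terri A. Sewell"],
    "bioguide_id": ["R000575", "S001185"],
    "nominate_dim1": [0.379, -0.400],
    "ideology_dist": [0.379, 0.400],
})
bad = good.copy()
bad.loc[1, "ideology_dist"] = -0.400  # deliberately wrong sign

print(check_output(good))
print(check_output(bad))
```

In practice the same function could be applied to each exported CSV after reading it back with pd.read_csv.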