Source N: VoteView Member Ideology Scores Processing#
Overview
This notebook processes VoteView Member Ideology scores from the 119th U.S. Congress to generate the Source N dataset used in the Bridge Grades methodology to measure ideological distance from the political center.
Source N: VoteView Member Ideology Scores - Measures absolute distance from ideological center for both House and Senate
The ideology scores generated here will be merged into the master district-level dataset for final Bridge Grade calculations, providing context for evaluating bipartisan behavior relative to ideological positioning.
Data Sources#
Input Files#
HS119_members.csv: VoteView Member Ideology data for the 119th Congress
119th_Congress_*.csv: Congressional metadata with bioguide IDs and chamber assignments
Data Source Details#
Source: VoteView (Data Type: Member Ideology; Chamber: Both; Congress: 119th; CSV)
Congress: 119th U.S. Congress
Download Date: August 8, 2025
Coverage: Member ideology scores for all House and Senate members
Key Metric: nominate_dim1 scores measuring ideological position
Outputs#
Source N: House Ideology#
File: bridge_grade_source_n_house_ideology.csv
Columns:
Name: Legislator’s full name
bioguide_id: Unique legislator identifier
nominate_dim1: Signed first-dimension NOMINATE score from VoteView
ideology_dist: Absolute distance from ideological center (raw score)
Interpretation: Higher distance values indicate legislators further from the political center. When these distances are ranked within the chamber, higher percentiles indicate legislators who are more ideologically extreme relative to their House colleagues.
Source N: Senate Ideology#
File: bridge_grade_source_n_senate_ideology.csv
Columns:
Name: Senator’s full name
bioguide_id: Unique legislator identifier
nominate_dim1: Signed first-dimension NOMINATE score from VoteView
ideology_dist: Absolute distance from ideological center (raw score)
Interpretation: Higher distance values indicate senators further from the political center. When these distances are ranked within the chamber, higher percentiles indicate senators who are more ideologically extreme relative to their Senate colleagues.
Technical Requirements#
Dependencies#
pandas: Data manipulation and analysis
numpy: Numerical operations
scipy.stats: Statistical functions for normal CDF calculations
glob: File pattern matching to locate the latest congressional metadata file
warnings: Warning suppression for pandas operations
Performance Notes#
Absolute value transformation ensures non-negative ideology distance scores
Chamber-specific percentile calculations account for different ideological distributions
Missing value imputation preserves data completeness
Normal CDF provides standardized percentile rankings
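The notebook imports scipy.stats.norm, but the cells shown do not spell out a percentile formula. One plausible reading of the notes above is a z-score computed within each chamber and passed through the normal CDF. The sketch below assumes that reading; the helper name add_normal_percentile, the ideology_pct column, and the use of the population standard deviation are illustrative choices, not the notebook's confirmed method.

```python
import pandas as pd
from scipy.stats import norm


def add_normal_percentile(df, score_col="ideology_dist", out_col="ideology_pct"):
    """Return a copy of df with a 0-100 percentile from the normal CDF of the z-score.

    Hypothetical helper: the z-score uses the mean and population standard
    deviation of the rows passed in, so call it once per chamber.
    """
    out = df.copy()
    mean = out[score_col].mean()
    std = out[score_col].std(ddof=0)  # assumption: population standard deviation
    z = (out[score_col] - mean) / std
    out[out_col] = norm.cdf(z) * 100
    return out


# Toy chamber with made-up identifiers and distances
house = pd.DataFrame(
    {"bioguide_id": ["A", "B", "C"], "ideology_dist": [0.2, 0.4, 0.6]}
)
house_pct = add_normal_percentile(house)
print(house_pct)
```

Calling the helper separately on the House and Senate frames keeps each percentile relative to that chamber's own distribution, which is the point of the chamber-specific note above.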
Data Quality#
Data Integrity Notes#
VoteView ideology scores are based on roll-call voting patterns
Absolute distance transformation removes directional bias
Missing value imputation with global mean maintains data completeness
Chamber-specific percentiles ensure fair comparison within each body
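For reference, global-mean imputation as named in these notes could look like the sketch below. It is an illustration on toy data only: the identifiers and the ideology_dist_filled column are made up, and the cleaning cell in this notebook states that rows for non-voting members are removed rather than filled.

```python
import numpy as np
import pandas as pd

# Toy frame with one missing distance (identifiers are placeholders)
df = pd.DataFrame(
    {
        "bioguide_id": ["X1", "X2", "X3"],
        "ideology_dist": [0.2, np.nan, 0.6],
    }
)

# Series.mean() skips NaN by default, so this is the mean of the observed scores
global_mean = df["ideology_dist"].mean()

# Fill the gap without overwriting the original column
df["ideology_dist_filled"] = df["ideology_dist"].fillna(global_mean)
print(df)
```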
Key Features#
House Coverage: All 431 House members with ideology scores
Senate Coverage: All 100 Senate members with ideology scores
Standardized Format: Consistent output structure for both chambers
Ideology Score Interpretation#
Raw Scores: nominate_dim1 ranges from -1 (most liberal) to +1 (most conservative), with 0 at the center
Absolute Distance: Ranges from 0 (centrist) to 1 (extremely ideological) and measures how far from center regardless of left/right direction
Percentiles: Rank legislators within their chamber (0-100 scale)
Higher Values: Indicate more ideologically extreme positions
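A small worked example of the absolute-distance idea, using three nominate_dim1 values that appear in the output tables of this notebook. The frame name scores is only for illustration.

```python
import pandas as pd

# Signed scores taken from the House and Senate output tables in this notebook
scores = pd.DataFrame(
    {
        "Name": ["Terri A. Sewell", "Mike D. Rogers", "Tommy Tuberville"],
        "nominate_dim1": [-0.400, 0.379, 0.936],
    }
)

# Distance from center ignores the sign, so -0.400 and +0.400 are equally far out
scores["ideology_dist"] = scores["nominate_dim1"].abs()
print(scores)
```

A score of -0.400 therefore counts as slightly further from center than +0.379, even though the two legislators sit on opposite sides of the scale.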
Notebook Walkthrough: Source N - VoteView Member Ideology Scores#
This notebook prepares the Member Ideology scores downloaded from VoteView for use in Bridge Grades. We:
Download the HS119_members.csv file from VoteView (Data Type: Member Ideology; Chamber: Both; Congress: 119th; CSV)
Date downloaded: August 8, 2025
Clean and normalize the raw nominate_dim1 scores (absolute distance from ideological center).
Split into House and Senate subsets, merge into our district/member metadata, and compute percentile scores.
Export two CSVs:
bridge_grade_source_n_house_ideology.csv
bridge_grade_source_n_senate_ideology.csv
import pandas as pd
import numpy as np
from scipy.stats import norm
import glob
import warnings
from pandas.errors import SettingWithCopyWarning
warnings.simplefilter(action='ignore', category=SettingWithCopyWarning)
Load & Inspect Raw Ideology Data#
# Load VoteView ideology scores for 119th Congress
ideology = pd.read_csv(
'../Data/Source N/Input files/HS119_members.csv',
usecols=['bioguide_id','nominate_dim1']
).copy()
# Preview
ideology.head(10)
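Beyond previewing the first rows, a few quick checks can catch a bad download early. The sketch below runs them on an in-memory stand-in for the CSV; the extra column and the variable names are hypothetical, and the checks themselves are a suggestion rather than part of the original notebook.

```python
import io

import pandas as pd

# In-memory stand-in for HS119_members.csv; the "extra" column is made up
sample_csv = io.StringIO(
    "bioguide_id,nominate_dim1,extra\n"
    "R000575,0.379,x\n"
    "S001185,-0.400,y\n"
    "K000404,,z\n"
)
sample = pd.read_csv(sample_csv, usecols=["bioguide_id", "nominate_dim1"])

# Scores that are present should sit inside the [-1, 1] NOMINATE range
scored = sample["nominate_dim1"].dropna()
in_range = scored.between(-1, 1).all()

# Each legislator should appear once, and missing scores should be counted
n_duplicates = int(sample["bioguide_id"].duplicated().sum())
n_missing = int(sample["nominate_dim1"].isna().sum())

print(f"in range: {in_range}, duplicates: {n_duplicates}, missing: {n_missing}")
```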
Clean & Take the Absolute Value of the Raw Scores#
Absolute Value & Missing-Value Diagnostics#
# Convert to absolute distance from center
ideology['N_dim1'] = ideology['nominate_dim1'].abs()
# Report missing bioguide_id or score
missing_biog = ideology['bioguide_id'].isna().sum()
missing_score = ideology['N_dim1'].isna().sum()
print(f"Missing bioguide_id: {missing_biog}, Missing scores: {missing_score}")
Missing bioguide_id: 0, Missing scores: 2
# The two missing legislators are non-voting members from territories who will not be added to the final score.
ideology[ideology['N_dim1'].isna()]
index | bioguide_id | nominate_dim1 | N_dim1 |
---|---|---|---|
302 | K000404 | NaN | NaN |
346 | H001103 | NaN | NaN |
Drop Missing Scores & Rename#
# These two people are non-voting members, so drop the rows that have no ideology score
ideology = ideology[ideology['N_dim1'].notna()].copy()
# Rename N_dim1 to ideology_dist for clarity
ideology = ideology.rename(columns={'N_dim1': 'ideology_dist'})
Split into House & Senate#
We will merge into our master metadata later; for now we separate to compute percentile scores per chamber.
# Read in the 119th Congress data with bioguide ids
files = sorted(glob.glob("../Data/Source C-D-E-F/Input files/119th_Congress_*.csv"))
latest = files[-1]
df_119 = pd.read_csv(latest)
df_bioguide = df_119.copy()
df_bioguide = df_bioguide[['Name','bioguide_id','Chamber']]
df_bioguide.head()
index | Name | bioguide_id | Chamber |
---|---|---|---|
0 | Mark B. Messmer | M001233 | House |
1 | Delia C. Ramirez | R000617 | House |
2 | Tim Sheehy | S001232 | Senate |
3 | Ben Ray Luján | L000570 | Senate |
4 | Josh Hawley | H001089 | Senate |
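Picking files[-1] from a sorted glob fails with an IndexError when nothing matches, and it only returns the newest file if the filenames sort chronologically. A minimal sketch of a guarded version follows; the helper name latest_matching_file and the date-style suffixes in the demo are assumptions, since the source does not say what follows the underscore in the real filenames.

```python
import glob
import os
import tempfile


def latest_matching_file(pattern):
    """Return the lexicographically last path matching pattern.

    Hypothetical helper: raises a clear error instead of an IndexError
    when no file matches.
    """
    files = sorted(glob.glob(pattern))
    if not files:
        raise FileNotFoundError(f"No files match pattern: {pattern}")
    return files[-1]


# Demo in a temporary directory with made-up, date-style suffixes
with tempfile.TemporaryDirectory() as tmp:
    for suffix in ("2025-07-01", "2025-08-08"):
        open(os.path.join(tmp, f"119th_Congress_{suffix}.csv"), "w").close()
    latest_path = latest_matching_file(os.path.join(tmp, "119th_Congress_*.csv"))
    latest_name = os.path.basename(latest_path)

print(latest_name)
```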
# Merge chamber info onto ideology
ideology = ideology.merge(df_bioguide, on='bioguide_id', how='left')
# Split sets
ideo_house = ideology.query("Chamber=='House'").copy()
ideo_senate = ideology.query("Chamber=='Senate'").copy()
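With a left merge, any bioguide_id absent from the metadata gets a missing Chamber and silently falls out of both subsets. A sketch of one way to surface those rows uses the indicator and validate options of DataFrame.merge. The toy frames below reuse a few IDs from this notebook, but leaving K000404 out of the metadata is an assumption made for the demonstration, and the *_demo names are hypothetical.

```python
import pandas as pd

ideology_demo = pd.DataFrame(
    {
        "bioguide_id": ["R000575", "T000278", "K000404"],
        "ideology_dist": [0.379, 0.936, float("nan")],
    }
)
metadata_demo = pd.DataFrame(
    {
        "Name": ["Mike D. Rogers", "Tommy Tuberville"],
        "bioguide_id": ["R000575", "T000278"],
        "Chamber": ["House", "Senate"],
    }
)

# validate="m:1" raises if the metadata repeats a bioguide_id;
# indicator=True adds a _merge column that marks unmatched rows
merged = ideology_demo.merge(
    metadata_demo, on="bioguide_id", how="left", validate="m:1", indicator=True
)
unmatched = merged.loc[merged["_merge"] == "left_only", "bioguide_id"].tolist()
print("Unmatched IDs:", unmatched)

# Rows with a missing Chamber match neither query and are dropped here
house_demo = merged.query("Chamber == 'House'").drop(columns="_merge")
senate_demo = merged.query("Chamber == 'Senate'").drop(columns="_merge")
```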
Compute Percentile Scores & Export#
House Percentiles & Export#
# Deduplicate if needed
ideo_house = ideo_house.drop_duplicates(subset='bioguide_id')
# Select and save
house_out = ideo_house[['Name','bioguide_id','nominate_dim1','ideology_dist']]
house_out.to_csv(
'../Data/Source N/Output files/bridge_grade_source_n_house_ideology.csv',
index=False
)
house_out.head(10)
index | Name | bioguide_id | nominate_dim1 | ideology_dist |
---|---|---|---|---|
0 | Mike D. Rogers | R000575 | 0.379 | 0.379 |
1 | Terri A. Sewell | S001185 | -0.400 | 0.400 |
2 | Gary J. Palmer | P000609 | 0.675 | 0.675 |
3 | Barry Moore | M001212 | 0.645 | 0.645 |
4 | Dale W. Strong | S001220 | 0.615 | 0.615 |
5 | Shomari Figures | F000481 | -0.369 | 0.369 |
6 | Robert B. Aderholt | A000055 | 0.405 | 0.405 |
7 | Nicholas J. Begich | B001323 | 0.612 | 0.612 |
10 | Paul A. Gosar | G000565 | 0.703 | 0.703 |
11 | David Schweikert | S001183 | 0.553 | 0.553 |
Senate Percentiles & Export#
# Deduplicate
ideo_senate = ideo_senate.drop_duplicates(subset='bioguide_id')
# Select and save
senate_out = ideo_senate[['Name','bioguide_id','nominate_dim1','ideology_dist']]
senate_out.to_csv(
'../Data/Source N/Output files/bridge_grade_source_n_senate_ideology.csv',
index=False
)
senate_out.head(10)
index | Name | bioguide_id | nominate_dim1 | ideology_dist |
---|---|---|---|---|
442 | Tommy Tuberville | T000278 | 0.936 | 0.936 |
443 | Katie Boyd Britt | B001319 | 0.585 | 0.585 |
444 | Lisa Murkowski | M001153 | 0.204 | 0.204 |
445 | Dan Sullivan | S001198 | 0.476 | 0.476 |
446 | Ruben Gallego | G000574 | -0.359 | 0.359 |
447 | Mark Kelly | K000377 | -0.257 | 0.257 |
448 | John Boozman | B001236 | 0.427 | 0.427 |
449 | Tom Cotton | C001095 | 0.580 | 0.580 |
450 | Adam B. Schiff | S001150 | -0.353 | 0.353 |
451 | Alex Padilla | P000145 | -0.422 | 0.422 |
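The House and Senate export cells repeat the same three steps: deduplicate, select columns, write the CSV. One way to fold them into a single function is sketched below. The name export_chamber, the OUTPUT_COLUMNS constant, and the demo file name are hypothetical, and the demo writes to a temporary directory rather than the project's output folder.

```python
import os
import tempfile

import pandas as pd

OUTPUT_COLUMNS = ["Name", "bioguide_id", "nominate_dim1", "ideology_dist"]


def export_chamber(df, path):
    """Deduplicate on bioguide_id, keep the output columns, and write a CSV."""
    out = df.drop_duplicates(subset="bioguide_id")[OUTPUT_COLUMNS]
    out.to_csv(path, index=False)
    return out


# Demo rows reuse two senators from the table above, with one duplicate added
demo = pd.DataFrame(
    {
        "Name": ["Lisa Murkowski", "Lisa Murkowski", "Mark Kelly"],
        "bioguide_id": ["M001153", "M001153", "K000377"],
        "nominate_dim1": [0.204, 0.204, -0.257],
        "ideology_dist": [0.204, 0.204, 0.257],
        "Chamber": ["Senate", "Senate", "Senate"],
    }
)

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_senate_ideology.csv")
    written = export_chamber(demo, demo_path)
    reloaded = pd.read_csv(demo_path)

print(reloaded)
```

Reading the file back, as the demo does, is a cheap way to confirm that the written columns match the documented output schema.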