Source N: VoteView Member Ideology Scores Processing#

Overview

This notebook processes VoteView Member Ideology scores from the 119th U.S. Congress to generate the Source N dataset used in the Bridge Grades methodology to measure ideological distance from the political center.

  • Source N: VoteView Member Ideology Scores - Measures absolute distance from ideological center for both House and Senate

The ideology scores generated here will be merged into the master district-level dataset for final Bridge Grade calculations, providing context for evaluating bipartisan behavior relative to ideological positioning.

Data Sources#

Input Files#

  • HS119_members.csv - VoteView Member Ideology data for 119th Congress

  • 119th_Congress_*.csv - Congressional metadata with bioguide IDs and chamber assignments

Data Source Details#

  • Source: VoteView (Data Type: Member Ideology; Chamber: Both; Congress: 119th; CSV)

  • Congress: 119th U.S. Congress

  • Download Date: August 8, 2025

  • Coverage: Member ideology scores for all House and Senate members

  • Key Metric: nominate_dim1 scores measuring ideological position


Outputs#

Source N: House Ideology#

File: bridge_grade_source_n_house_ideology.csv

Columns:

  • Name: Legislator’s full name

  • bioguide_id: Unique legislator identifier

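  • nominate_dim1: Signed VoteView first-dimension score (negative = liberal, positive = conservative)

  • ideology_dist: Absolute distance from ideological center (raw score), i.e. the absolute value of nominate_dim1

Interpretation: Higher ideology_dist values indicate legislators further from the political center. When the distances are converted to percentiles within the House, higher percentiles indicate legislators who are more ideologically extreme relative to their House colleagues.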

Source N: Senate Ideology#

File: bridge_grade_source_n_senate_ideology.csv

Columns:

  • Name: Senator’s full name

  • bioguide_id: Unique legislator identifier

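  • nominate_dim1: Signed VoteView first-dimension score (negative = liberal, positive = conservative)

  • ideology_dist: Absolute distance from ideological center (raw score), i.e. the absolute value of nominate_dim1

Interpretation: Higher ideology_dist values indicate senators further from the political center. When the distances are converted to percentiles within the Senate, higher percentiles indicate senators who are more ideologically extreme relative to their Senate colleagues.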


Technical Requirements#

Dependencies#

  • pandas: Data manipulation and analysis

  • numpy: Numerical operations

  • scipy.stats: Statistical functions for normal CDF calculations

  • glob: Locating the most recent 119th_Congress_*.csv metadata file

  • warnings: Warning suppression for pandas operations

Performance Notes#

  • Absolute value transformation ensures non-negative ideology distance scores

  • Chamber-specific percentile calculations account for different ideological distributions

  • Rows with a missing score (two non-voting territorial members in this download) are dropped rather than imputed

  • Normal CDF provides standardized percentile rankings


Data Quality#

Data Integrity Notes#

  • VoteView ideology scores are based on roll-call voting patterns

  • Absolute distance transformation removes directional bias

  • Members without a nominate_dim1 score (two non-voting territorial members in this download) are excluded from the outputs rather than imputed

  • Chamber-specific percentiles ensure fair comparison within each body

Key Features#

  • House Coverage: All 431 House members with ideology scores

  • Senate Coverage: All 100 Senate members with ideology scores

  • Standardized Format: Consistent output structure for both chambers

Ideology Score Interpretation#

  • Raw Scores: ideology_dist ranges from 0 (centrist) to 1 (most ideologically extreme); the underlying nominate_dim1 ranges from -1 to 1

  • Absolute Distance: Measures how far from center regardless of left/right direction

  • Percentiles: Rank legislators within their chamber (0-100 scale)

  • Higher Values: Indicate more ideologically extreme positions
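
The walkthrough cells in this notebook export raw distances, and the percentile step itself is not shown. As a minimal sketch, assuming the percentile is the normal CDF of a within-chamber z-score (an assumption suggested by the scipy.stats dependency, not a confirmed detail of the Bridge Grades method), the calculation could look like the following. The helper name `chamber_percentile` and the toy values are hypothetical.

```python
import pandas as pd
from scipy.stats import norm


def chamber_percentile(dist: pd.Series) -> pd.Series:
    """Map distances to a 0-100 scale via the normal CDF of their z-scores.

    Hypothetical helper: assumes percentiles come from a normal
    approximation fitted to a single chamber's distances.
    """
    z = (dist - dist.mean()) / dist.std(ddof=0)
    return pd.Series(norm.cdf(z.to_numpy()) * 100, index=dist.index)


# Toy distances for one chamber (illustrative values only)
toy = pd.Series([0.20, 0.40, 0.60], index=["A", "B", "C"])
pct = chamber_percentile(toy)
print(pct.round(1))
```

Applying the helper to the House and Senate tables separately keeps each legislator's rank relative to their own chamber.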


Notebook Walkthrough: Source N - VoteView Member Ideology Scores#

This notebook prepares the Member Ideology scores downloaded from VoteView for use in Bridge Grades. We:

  • Download the HS119_members.csv file from VoteView (Data Type: Member Ideology; Chamber: Both; Congress: 119th; CSV)

  • Convert the raw nominate_dim1 scores to absolute distance from the ideological center and drop members without a score.

  • Split into House and Senate subsets, merge into our district/member metadata, and compute percentile scores.

  • Export two CSVs:

    • bridge_grade_source_n_house_ideology.csv

    • bridge_grade_source_n_senate_ideology.csv
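
The steps above can be sketched end to end on a toy table. This is an illustration only, not the notebook's code: the bioguide IDs, names, and scores are made up, and `build_source_n` is a hypothetical helper name.

```python
import pandas as pd


def build_source_n(scores: pd.DataFrame, members: pd.DataFrame) -> dict:
    """Return {'House': df, 'Senate': df} of absolute ideology distances."""
    # Drop members without a score, then take the absolute value
    out = scores.dropna(subset=["nominate_dim1"]).copy()
    out["ideology_dist"] = out["nominate_dim1"].abs()
    # Attach name and chamber; members absent from the metadata are dropped
    out = out.merge(members, on="bioguide_id", how="inner")
    cols = ["Name", "bioguide_id", "ideology_dist"]
    return {
        chamber: grp[cols].reset_index(drop=True)
        for chamber, grp in out.groupby("Chamber")
    }


# Made-up inputs standing in for HS119_members.csv and the metadata file
scores = pd.DataFrame({
    "bioguide_id": ["X000001", "X000002", "X000003", "X000004"],
    "nominate_dim1": [0.50, -0.30, None, -0.45],
})
members = pd.DataFrame({
    "Name": ["Rep One", "Rep Two", "Sen Four"],
    "bioguide_id": ["X000001", "X000002", "X000004"],
    "Chamber": ["House", "House", "Senate"],
})

tables = build_source_n(scores, members)
print(tables["House"])
print(tables["Senate"])
```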

import pandas as pd
import numpy as np
from scipy.stats import norm
import glob
import warnings
from pandas.errors import SettingWithCopyWarning
warnings.simplefilter(action='ignore', category=SettingWithCopyWarning)

Load & Inspect Raw Ideology Data#

# Load VoteView ideology scores for 119th Congress
ideology = pd.read_csv(
    '../Data/Source N/Input files/HS119_members.csv',
    usecols=['bioguide_id','nominate_dim1']
).copy()

# Preview
ideology.head(10)

Clean the Raw Scores & Take Absolute Values#

Absolute Value & Missing-Value Diagnostics#

# Convert to absolute distance from center
ideology['N_dim1'] = ideology['nominate_dim1'].abs()

# Report missing bioguide_id or score
missing_biog = ideology['bioguide_id'].isna().sum()
missing_score = ideology['N_dim1'].isna().sum()
print(f"Missing bioguide_id: {missing_biog}, Missing scores: {missing_score}")
Missing bioguide_id: 0, Missing scores: 2
# The two rows with missing scores are non-voting members from U.S. territories;
# they are excluded from the final scores.
ideology[ideology['N_dim1'].isna()]
bioguide_id nominate_dim1 N_dim1
302 K000404 NaN NaN
346 H001103 NaN NaN

Drop Missing Scores & Rename#

# The two members with missing scores are non-voting, so drop their rows.
# Filter on the score column: bioguide_id has no missing values, so filtering on it removes nothing.
ideology = ideology[ideology['N_dim1'].notna()].copy()

# Rename N_dim1 to ideology_dist for clarity
ideology = ideology.rename(columns={'N_dim1':'ideology_dist'})

Split into House & Senate#

We will merge into our master metadata later; for now we separate to compute percentile scores per chamber.

# Read in the 119th Congress data with bioguide ids
files = sorted(glob.glob("../Data/Source C-D-E-F/Input files/119th_Congress_*.csv"))
latest = files[-1]
df_119 = pd.read_csv(latest)
# Keep only the columns needed to attach names and chamber assignments
df_bioguide = df_119[['Name','bioguide_id','Chamber']].copy()
df_bioguide.head()
Name bioguide_id Chamber
0 Mark B. Messmer M001233 House
1 Delia C. Ramirez R000617 House
2 Tim Sheehy S001232 Senate
3 Ben Ray Luján L000570 Senate
4 Josh Hawley H001089 Senate
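
The `files[-1]` selection above depends on the sorted order of file names and raises an IndexError when nothing matches the pattern. A hedged sketch of a more defensive version follows. The helper name `latest_match` is hypothetical, and the date-stamped file names are an assumption made for the demo, since the actual suffixes are not shown in this notebook.

```python
import glob
import os
import tempfile


def latest_match(pattern: str) -> str:
    """Return the lexicographically last path matching pattern."""
    files = sorted(glob.glob(pattern))
    if not files:
        raise FileNotFoundError(f"No files match {pattern!r}")
    return files[-1]


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical ISO-date suffixes; these sort chronologically as plain strings
    for stamp in ("2025-07-01", "2025-08-08"):
        open(os.path.join(tmp, f"119th_Congress_{stamp}.csv"), "w").close()

    picked = os.path.basename(
        latest_match(os.path.join(tmp, "119th_Congress_*.csv"))
    )

    try:
        latest_match(os.path.join(tmp, "no_such_*.csv"))
        missing_raises = False
    except FileNotFoundError:
        missing_raises = True

print(picked)
```

Lexicographic order matches chronological order only when the suffix is zero-padded and ordered year, month, day; other naming schemes would need an explicit date parse.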
# Merge chamber info onto ideology
ideology = ideology.merge(df_bioguide, on='bioguide_id', how='left')

# Split sets
ideo_house  = ideology.query("Chamber=='House'").copy()
ideo_senate = ideology.query("Chamber=='Senate'").copy()
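
Because the merge above is a left join, any ideology row whose bioguide_id is absent from the metadata gets a missing Chamber and silently falls out of both subsets. A minimal sketch of how one might surface those rows, using the `indicator` and `validate` options of pandas `merge` on toy data (the IDs and names are invented):

```python
import pandas as pd

scores = pd.DataFrame({
    "bioguide_id": ["X000001", "X000002", "X000009"],
    "ideology_dist": [0.50, 0.30, 0.70],
})
members = pd.DataFrame({
    "Name": ["Rep One", "Sen Two"],
    "bioguide_id": ["X000001", "X000002"],
    "Chamber": ["House", "Senate"],
})

# indicator=True adds a _merge column ('both' or 'left_only' for a left join);
# validate="many_to_one" raises if the metadata repeats a bioguide_id.
merged = scores.merge(
    members, on="bioguide_id", how="left",
    indicator=True, validate="many_to_one",
)
unmatched = merged.loc[merged["_merge"] == "left_only", "bioguide_id"].tolist()
print(f"Unmatched bioguide IDs: {unmatched}")
```

Reviewing the unmatched list before splitting makes it clear whether the excluded rows are the expected non-voting members or a genuine metadata gap.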

Compute Percentile Scores & Export#

House Percentiles & Export#

# Deduplicate if needed
ideo_house = ideo_house.drop_duplicates(subset='bioguide_id')

# Select and save
house_out = ideo_house[['Name','bioguide_id','nominate_dim1','ideology_dist']]
house_out.to_csv(
    '../Data/Source N/Output files/bridge_grade_source_n_house_ideology.csv',
    index=False
)
house_out.head(10)
Name bioguide_id nominate_dim1 ideology_dist
0 Mike D. Rogers R000575 0.379 0.379
1 Terri A. Sewell S001185 -0.400 0.400
2 Gary J. Palmer P000609 0.675 0.675
3 Barry Moore M001212 0.645 0.645
4 Dale W. Strong S001220 0.615 0.615
5 Shomari Figures F000481 -0.369 0.369
6 Robert B. Aderholt A000055 0.405 0.405
7 Nicholas J. Begich B001323 0.612 0.612
10 Paul A. Gosar G000565 0.703 0.703
11 David Schweikert S001183 0.553 0.553

Senate Percentiles & Export#

# Deduplicate
ideo_senate = ideo_senate.drop_duplicates(subset='bioguide_id')

# Select and save
senate_out = ideo_senate[['Name','bioguide_id','nominate_dim1','ideology_dist']]
senate_out.to_csv(
    '../Data/Source N/Output files/bridge_grade_source_n_senate_ideology.csv',
    index=False
)
senate_out.head(10)
Name bioguide_id nominate_dim1 ideology_dist
442 Tommy Tuberville T000278 0.936 0.936
443 Katie Boyd Britt B001319 0.585 0.585
444 Lisa Murkowski M001153 0.204 0.204
445 Dan Sullivan S001198 0.476 0.476
446 Ruben Gallego G000574 -0.359 0.359
447 Mark Kelly K000377 -0.257 0.257
448 John Boozman B001236 0.427 0.427
449 Tom Cotton C001095 0.580 0.580
450 Adam B. Schiff S001150 -0.353 0.353
451 Alex Padilla P000145 -0.422 0.422
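
Before the exported tables are merged into the master dataset, a few invariants are cheap to verify. The sketch below reflects an assumption about which checks are worthwhile and is not part of the original notebook; `check_source_n` is a hypothetical helper, and the sample rows are copied from the Senate preview above.

```python
import numpy as np
import pandas as pd


def check_source_n(df: pd.DataFrame) -> None:
    """Raise AssertionError if a chamber table violates basic invariants."""
    assert df["bioguide_id"].is_unique, "duplicate bioguide_id"
    assert df["ideology_dist"].notna().all(), "missing ideology_dist"
    assert (df["ideology_dist"] >= 0).all(), "negative distance"
    assert np.allclose(df["ideology_dist"], df["nominate_dim1"].abs()), \
        "ideology_dist does not equal |nominate_dim1|"


sample = pd.DataFrame({
    "Name": ["Lisa Murkowski", "Ruben Gallego", "Mark Kelly"],
    "bioguide_id": ["M001153", "G000574", "K000377"],
    "nominate_dim1": [0.204, -0.359, -0.257],
    "ideology_dist": [0.204, 0.359, 0.257],
})

check_source_n(sample)  # passes silently when all invariants hold
checks_passed = True
```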