Projects

Every research dive, from data pulls to published findings.

published

Flood Map Reality Gap

Quantifying where FEMA flood maps have fallen behind reality — 79.4M daily precipitation observations, 1,099 county map dates, and a lag index that scores where outdated maps meet worsening rainfall.

climate fema noaa public-data flood-risk geospatial
in-progress

US Real Estate Deep Dive

Comprehensive market analysis across 26,297 ZIP codes using data from Zillow, Redfin, Census, FRED, and BLS. Thirteen research entries cover affordability, investment opportunities, price trends, and market dynamics.

real-estate public-data market-analysis investment affordability
published

US Banking Concentration

A data-driven look at the structure of the US banking system — 4,408 FDIC-insured banks, $25.5 trillion in assets, and the striking concentration of financial power in a handful of institutions.

banking finance concentration public-data fdic
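Concentration of the kind described in the US Banking Concentration entry is often summarized with a top-k asset share and a Herfindahl-Hirschman Index (HHI). The sketch below is illustrative only: the function names and the toy asset figures are assumptions, not the project's code or its FDIC data.

```python
def top_k_share(assets, k):
    """Fraction of total assets held by the k largest institutions."""
    ordered = sorted(assets, reverse=True)
    return sum(ordered[:k]) / sum(ordered)


def hhi(assets):
    """Herfindahl-Hirschman Index: sum of squared percentage shares (0 to 10,000)."""
    total = sum(assets)
    return sum((100.0 * a / total) ** 2 for a in assets)


# Hypothetical toy balance sheets (arbitrary units), not real bank data.
toy_assets = [50.0, 30.0, 10.0, 5.0, 5.0]

print(top_k_share(toy_assets, 2))
print(hhi(toy_assets))
```

A top-k share answers "how much sits in a handful of institutions", while the HHI also reflects how the remainder is distributed.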
scaffolded

SEC EDGAR Full-Text Filings Mining

Mining 10-K risk factors, 8-K filings, and proxy statements to track language contagion across the S&P 500 — when novel risk phrases first appear and how they spread industry by industry.

sec edgar nlp risk-factors public-data text-analysis
scaffolded

Form 4 Insider Trading Patterns

Parsing every SEC Form 4 filing to detect cluster buying and selling behavior, 10b5-1 plan timing anomalies, and director interlock contagion across company boards.

sec insider-trading form4 public-data investing
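One possible way to operationalize the "cluster buying" mentioned in the Form 4 entry is to flag a ticker when several distinct insiders buy within a short window. The sketch below assumes that definition; the 14-day window, the three-insider threshold, the record layout, and the toy trades are all hypothetical and not taken from the project.

```python
from datetime import date, timedelta


def cluster_buys(trades, window_days=14, min_insiders=3):
    """Return tickers where at least `min_insiders` distinct insiders
    bought within some `window_days`-day span.

    trades: iterable of (ticker, insider_name, trade_date) purchase records.
    """
    by_ticker = {}
    for ticker, insider, day in trades:
        by_ticker.setdefault(ticker, []).append((day, insider))

    flagged = set()
    for ticker, events in by_ticker.items():
        events.sort()  # chronological order
        for i, (start, _) in enumerate(events):
            end = start + timedelta(days=window_days)
            # Distinct insiders buying from this trade through the window end.
            insiders = {who for day, who in events[i:] if day <= end}
            if len(insiders) >= min_insiders:
                flagged.add(ticker)
                break
    return flagged


# Hypothetical toy purchases, not real filings.
toy_trades = [
    ("ACME", "A", date(2024, 1, 2)),
    ("ACME", "B", date(2024, 1, 5)),
    ("ACME", "C", date(2024, 1, 10)),
    ("BETA", "A", date(2024, 1, 2)),
    ("BETA", "B", date(2024, 3, 1)),
]

print(cluster_buys(toy_trades))
```

Counting distinct insiders rather than trades keeps one person's repeated purchases from registering as a cluster.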
scaffolded

Corporate Jet Tracker

Combining FAA aircraft registration data with ADS-B flight tracking to map which corporate jets visited which airports — a leading indicator for M&A, executive hiring, and PE deal sourcing.

faa adsb ma private-equity physical-signals
scaffolded

USPTO Patent Assignment Movements

Tracking patent ownership changes to reveal distressed IP sales before bankruptcy, NPE acquisition patterns, and which university research actually gets commercialized.

patents uspto ip bankruptcy public-data
scaffolded

GitHub Maintainer Reality Check

Using GH Archive and the GitHub API to determine whether 'open source' projects are actually corporate-staffed, and producing a maintainer reality index for the top 500 npm and PyPI packages.

github open-source metadata npm pypi
scaffolded

Wikipedia Edit Wars and Pageview Prophecy

Analyzing Wikipedia's full edit history and hourly pageview counts to detect edit war patterns and pageview anomalies that sometimes precede news events by hours.

wikipedia metadata nlp news public-data
scaffolded

PACER Federal Court Filings

Using the CourtListener/RECAP archive to analyze patent venue migration post-TC Heartland, MDL formation timing, and the rise of the Southern District of Texas as the new mega-bankruptcy venue.

pacer courts patents bankruptcy public-data
scaffolded

FCC Political Ad Spending

Extracting and structuring political ad order data from FCC Public Inspection Files — more granular than FEC data, showing which station, which time slot, and which dollar amount for every political buy.

fcc political-ads elections ocr public-data
in-progress

IRS Form 990 Nonprofit Network Mapping

Mapping grant networks, executive compensation, and board interlocks across US nonprofits using IRS Form 990 data — who funds whom, and which directors sit at the center of the donor class network.

irs nonprofits 990 grants public-data
scaffolded

USDA Food Environment + Health Outcomes

A rigorous causal study of how food environment changes affect health outcomes, using Dollar General county entry as a natural experiment with difference-in-differences estimation.

usda cdc food-access health causal-inference public-data
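The difference-in-differences estimation named in the USDA entry can be illustrated in its simplest two-group, two-period form: the change in the treated group minus the change in the control group. This is a minimal sketch under that textbook setup; the function name, the row layout, and the toy outcomes are assumptions, and a real study would add covariates, staggered entry dates, and standard errors.

```python
from statistics import mean


def did_estimate(rows):
    """Two-group, two-period difference-in-differences.

    rows: iterable of (treated: bool, post: bool, outcome: float).
    Returns (treated post - treated pre) - (control post - control pre).
    """
    cells = {}
    for treated, post, outcome in rows:
        cells.setdefault((treated, post), []).append(outcome)
    m = {key: mean(values) for key, values in cells.items()}

    treated_change = m[(True, True)] - m[(True, False)]
    control_change = m[(False, True)] - m[(False, False)]
    return treated_change - control_change


# Hypothetical county-level outcomes, not project results.
toy_rows = [
    (True, False, 30.0), (True, False, 32.0),    # treated counties, before entry
    (True, True, 35.0), (True, True, 37.0),      # treated counties, after entry
    (False, False, 28.0), (False, False, 30.0),  # control counties, before
    (False, True, 30.0), (False, True, 32.0),    # control counties, after
]

print(did_estimate(toy_rows))
```

The estimate is only interpretable as causal if treated and control counties would have followed parallel trends without the store entry.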
scaffolded

CMS Open Payments — Pharma Money to Doctors

Joining CMS Open Payments (every pharma payment to a physician over $10) with Medicare Part D prescribing data to quantify the dose-response relationship between payments received and drugs prescribed.

cms pharma medicare sunshine-act public-data health
scaffolded

BLS JOLTS + Job Postings for AI Displacement

Tracking skill demand decay curves for AI-displaced occupations and growth curves for AI-adjacent roles, using BLS data and job posting indices to produce a skill half-life chart for 20 occupations.

bls ai labor job-market public-data
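A "skill half-life" like the one in the BLS JOLTS entry could be estimated by assuming demand decays exponentially and fitting a line to the log of a demand index. That exponential model is an assumption made here for illustration, as are the function name and the toy series; the project may define half-life differently.

```python
import math


def half_life(times, values):
    """Fit values ~ a * exp(-k * t) by least squares on log(values)
    and return the half-life ln(2) / k, in the units of `times`."""
    logs = [math.log(v) for v in values]
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(logs) / n
    slope = (
        sum((t - t_bar) * (y - y_bar) for t, y in zip(times, logs))
        / sum((t - t_bar) ** 2 for t in times)
    )
    if slope >= 0:
        raise ValueError("series is not decaying, so no half-life exists")
    return math.log(2) / -slope


# Hypothetical posting index that halves every 2 years, not real BLS data.
years = [0, 1, 2, 3, 4]
index = [100.0 * 0.5 ** (t / 2) for t in years]

print(half_life(years, index))
```

Fitting on the log scale keeps the example dependency-free; noisy real series may call for a weighted or nonlinear fit instead.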
in-progress

Money Map — US Capital Concentration

A graph-first analysis of US capital: where money sits, where it flows, and which entities exert disproportionate control over money they don't technically own.

finance concentration fdic sec irs public-data network-analysis
scaffolded

ClinicalTrials.gov + FDA Adverse Events

Cross-referencing clinical trial registrations with FDA adverse event reports to find where public data diverges from press releases, and detecting safety signals before they appear on drug labels.

fda clinical-trials pharma biotech adverse-events public-data