Synthetic Polling · A Civly Product

We predicted 234 counties without making a single phone call.

Traditional polls are slow, expensive, and harder to field every year. So we built a different instrument: simulate every real voter from the voter file with AI, then correct the result against certified election outcomes. For candidate races, the correction starts from each county's own last certified result, the way any serious forecast does. Across three states it called the 2024 presidential margin in all 234 counties to within an average of 1.7 points, and six Florida ballot measures to within about 2 to 5 points. Then we took it past prediction: the same engine now tests which messages move your voters and turns every poll into a ranked list of the specific voters worth contacting.


234

counties predicted, out of sample

~1.7 pts

average error, presidential margin

~3.4 pts

average error, ballot measures

0

interviews fielded

One electorate, end to end

Every stage runs on the same scored voter file — so the poll, the message test, the turnout screen, and the outreach list are never four different models. Act, then re-poll to measure the lift.

↻ One run: the same scored voter file. Every stage, one model.

Poll

Score every voter

Score every voter, calibrated to certified results.

234 counties · 0 calls

Message-test

See what moves them

See which argument moves which voter.

ranked by lift

Target

Rank who to contact

The poll doubles as a ranked voter list.

292 toss-ups found NEW

Turnout

Who’ll actually vote

Of those, who’ll actually show up.

81% vs 61% gut rule NEW

Activate

Calls, mail & Meta ads

Contact them — calls, mail, doors, ads.

ranked & ready

↻ then re-poll to measure the lift — and run it again

The closed loop: five stages, one scored voter file.

Why polling keeps missing

Modern survey research rests on a chain of assumptions that each passing decade has made harder to satisfy. Three gaps motivate a different kind of instrument.

📞

Fewer people answer

Telephone response rates fell from about 36% in 1997 to roughly 6% by 2018, and keep sliding. When fewer than one in ten people respond, the estimate depends entirely on whether the people who answer resemble the people who don't, and they don't.

36% → 6% response rate, 1997–2018 Source: Pew Research Center

💰

Most of the ballot goes dark

A single quality poll costs thousands of dollars and several days. Presidential and marquee races are polled constantly, while county offices, judgeships, school boards, ballot measures, and primaries number in the thousands and are almost never polled at all.

Granular + broad is priced out of reach Source: Roper Center for Public Opinion Research

🎤

Saying isn't doing

Even a perfect survey measures what people say, which diverges from what they do. In corruption research a roughly 30-point penalty in surveys falls to near zero in real-world behavior. Stated answers carry a built-in validity ceiling.

~30 pts in surveys → ~0 in the field Source: Incerti (2020), American Political Science Review

How it works

The unit of simulation is a single real voter. We build a short profile from public records, ask an AI to answer the question in that voter's voice, add up thousands of them, then apply a standard correction so the numbers line up with real results.

Voter file

Start with real registered voters: registration, vote history, and donor records where they exist.

→

Build a persona

A short profile for each voter from public data: age, party, and other registration signals.

→

Simulate

Ask an AI to answer the question in that voter's voice, a few times each to reflect uncertainty.

→

Aggregate

Add the answers up across the whole population into one synthetic poll: a margin, or a yes share.

→

Calibrate

Apply our standard correction — for candidate races, anchored to the county's last certified result — for the final number, with an honest range.

Throughout, the load-bearing ground truth is certified election results on the actual population we deploy on.
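
As a rough illustration of the Simulate and Aggregate steps, the sketch below turns repeated simulated answers into a county margin (Republican % minus Democratic %). The data layout, the `county_margin` name, and the equal weighting of voters are assumptions made for the example, not a description of the production pipeline.

```python
from collections import Counter

def county_margin(answers_by_voter):
    """Return the margin in points (Republican % minus Democratic %).

    answers_by_voter: one list per simulated voter holding the answers
    from repeated asks, each "D", "R", or "O" (another candidate).
    Assumption: every voter carries equal weight.
    """
    totals = Counter()
    for asks in answers_by_voter:
        # Each ask contributes a fraction of one vote, so a voter who
        # splits across asks counts as uncertain rather than as a hard call.
        for answer in asks:
            totals[answer] += 1 / len(asks)
    return 100 * (totals["R"] - totals["D"]) / len(answers_by_voter)

# Hypothetical four-voter county, three asks per voter.
sample = [
    ["R", "R", "R"],
    ["R", "R", "D"],
    ["D", "D", "D"],
    ["R", "O", "D"],
]
margin = county_margin(sample)
print(round(margin, 1))
```

A yes share for a ballot measure would follow the same pattern, with "yes" and "no" answers in place of candidates.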

Where the correction starts: the county's own last result

Every serious election forecast starts from how a place actually voted last time — public information, on the shelf before any election, for every county in America. Our candidate-race correction does the same: it takes each county's certified margin from the previous presidential election as the starting point, and the AI engine measures the shift from there — who has moved, in which direction, and by how much. Nothing private, nothing that wouldn't be public before election day. And the test stays closed-book: every county below is predicted with that county held out of the fit.

Margin

The lead in a place, for example the Republican share minus the Democratic share.

Calibrated

Our final predicted number for a place. It's the figure we compare against the actual certified result.

Anchored

For candidate races, the correction starts from the county's last certified presidential result — public, pre-election information — and the engine measures the shift from there.

Leave-one-county-out

Every county is predicted by a correction that never saw it, so the test is closed-book.
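
The anchored, leave-one-county-out idea can be sketched in a few lines. This page does not spell out the form of the correction, so the straight-line fit below is an assumption chosen for illustration, and the function names and toy numbers are hypothetical. What the sketch does show is the closed-book discipline: the county being predicted is never part of the fit.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b

def loco_predictions(counties):
    """Predict each county with a correction fitted on all the others.

    counties: dicts with 'prior' (last certified margin), 'simulated'
    (raw synthetic margin), and 'actual' (certified margin), in points.
    """
    preds = []
    for i, held_out in enumerate(counties):
        train = counties[:i] + counties[i + 1:]  # held-out county excluded
        xs = [c["simulated"] - c["prior"] for c in train]  # simulated shift
        ys = [c["actual"] - c["prior"] for c in train]     # certified shift
        a, b = fit_line(xs, ys)
        raw_shift = held_out["simulated"] - held_out["prior"]
        # Anchor on the prior result, then add the corrected shift.
        preds.append(held_out["prior"] + a + b * raw_shift)
    return preds

def mean_abs_error(preds, counties):
    return sum(abs(p - c["actual"]) for p, c in zip(preds, counties)) / len(preds)

# Toy data in which the simulator overstates every shift by 2x.
toy = [
    {"prior": 10.0, "simulated": 14.0, "actual": 12.0},
    {"prior": -20.0, "simulated": -22.0, "actual": -21.0},
    {"prior": 5.0, "simulated": 11.0, "actual": 8.0},
    {"prior": 30.0, "simulated": 30.0, "actual": 30.0},
]
toy_preds = loco_predictions(toy)
toy_mae = mean_abs_error(toy_preds, toy)
```

In the toy data the overshoot is perfectly regular, so the held-out predictions land on the certified margins. Real counties would leave a residual error, which is what the average county error measures.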

Step 1 · Poll

We predicted this. Here's how close we got.

Each county is shaded by our prediction. Pick a question and a state, then hover or tap any county to compare it against the actual certified result and see the error.

The question, asked of every simulated voter

“In the 2024 election for President, did you vote for the Democratic candidate (Kamala Harris), the Republican candidate (Donald Trump), or another candidate?”

Checked against official certified county results from the N.C. State Board of Elections, the Florida Division of Elections, and the Pennsylvania Department of State.

Map legend: predicted margin (Republican % minus Democratic %), on one fixed scale across all three states. Blue = Democratic, red = Republican.

Scale: D +70 · D +35 · Even · R +35 · R +70

Counties: NC 100, FL 67, PA 67 (234 total). 1,000 voters sampled per county. Every county predicted out of sample (leave-one-county-out), anchored to its own previous certified presidential result.

What the numbers mean

Every figure here is measured out of sample, on counties held out of the fit. Lower is better: the average distance from the certified result, in points.

~1.7

Vote choice, 234 counties — anchored

Each county starts from its own last certified presidential result — public, pre-election information — and the engine measures the shift from there. Across all 234 counties, out of sample, the average miss on the 2024 presidential margin is 1.7 points; the statewide margin lands within 0.4 points in North Carolina, 0.7 in Florida, and 0.5 in Pennsylvania. These are 2024 presidential backtest figures, anchored; the full validation study is available on request.

Average county error: 1.7 pts

Worst county in the study (Miami-Dade, Fla.): 10.2 pts

Statewide margin, all three states: within 0.4–0.7 pts

Pennsylvania

1.3

North Carolina

1.7

Florida

2.2

~3.4

Policy, six Florida ballot measures

On the six 2024 Florida ballot measures, our predicted yes share lands about 2 to 5 points from the certified result. Tap a measure to read it on Ballotpedia.

Error per measure, in points: 2.2 · 2.9 · 2.9 · 3.4 · 3.5 · 5.3

Certified results: Florida Division of Elections. Measure text via Ballotpedia.
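
The two headline averages can be cross-checked from the figures printed above. The short sketch below assumes each state figure (1.3, 1.7, 2.2) is the average county error within that state, which the page implies but does not state outright. Variable names are illustrative.

```python
# County counts and per-state figures as printed on this page.
counties = {"NC": 100, "FL": 67, "PA": 67}
state_error = {"NC": 1.7, "FL": 2.2, "PA": 1.3}  # assumed: mean county error per state

total_counties = sum(counties.values())
# Weight each state by its number of counties to pool across all of them.
pooled_county_error = sum(counties[s] * state_error[s] for s in counties) / total_counties

# Errors on the six Florida ballot measures, in points.
measure_errors = [2.2, 2.9, 2.9, 3.4, 3.5, 5.3]
mean_measure_error = sum(measure_errors) / len(measure_errors)

print(total_counties, round(pooled_county_error, 1), round(mean_measure_error, 1))
```

Under that assumption, the pooled figures round to the ~1.7 and ~3.4 shown in the headline tiles.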

The honest range holds

The plus or minus we report comes from the method's own track record on held-out data, not a textbook sampling margin. Tested out of sample, the 95% range contained the truth about 96% of the time, narrower where the method is proven and wider where it's weaker.
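
One simple way to build a range from a track record, rather than from a sampling formula, is to take a quantile of past held-out errors and then count how often that range contains the truth on fresh cases. The sketch below is an assumption about how such a check could look, with made-up error values. It uses a single width for everything, whereas the range described above is narrower in some places and wider in others.

```python
import math

def half_width(abs_errors, level=0.95):
    """Nearest-rank quantile of past absolute held-out errors, in points."""
    ordered = sorted(abs_errors)
    # round() guards against float noise before taking the ceiling.
    rank = math.ceil(round(level * len(ordered), 9))
    return ordered[rank - 1]

def coverage(predictions, actuals, width):
    """Share of cases where the certified value falls inside prediction +/- width."""
    hits = sum(abs(p - a) <= width for p, a in zip(predictions, actuals))
    return hits / len(predictions)

# Hypothetical held-out absolute errors, in points.
past_errors = [0.2, 0.5, 0.8, 1.0, 1.1, 1.4, 1.9, 2.3, 3.0, 4.1,
               0.3, 0.6, 0.9, 1.2, 1.5, 1.7, 2.0, 2.6, 3.4, 10.2]
width = half_width(past_errors)

# Hypothetical fresh cases: predictions of 0 against four certified values.
fresh_coverage = coverage([0, 0, 0, 0], [1.0, 2.0, 5.0, 4.1], width)
```

A range is well calibrated when the measured coverage sits close to the nominal level, as with the 96% against 95% reported above.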

Where it's strong, where it strains

The method is strongest at relative comparisons (which county leans more, how places rank) and at absolute levels where local ground truth exists. North Carolina is the hard case — the widest spread in the study: deep-blue urban, the rural Black Belt, deep-red rural — and the anchor carries it: the statewide margin lands within half a point. The clearest follow-up is cross-race validation — calling a different office from the same anchor.

We also tested scandals and traits

Beyond vote choice and ballot measures, we tested character questions: “would you vote for a candidate who did X?” No certified result can grade these — people famously claim a scandal would change their vote, then don't — so we validate and sell the order (which attack hurts most), never the point swing.

It ranks scandals the way real voters do

Published experiments on real voters rank four scandals from most to least damaging: corruption, sexual harassment, financial impropriety, an affair. The simulator was never shown those studies — and produced the identical order.

Simulator

1 Corruption

2 Sexual harassment

3 Financial impropriety

4 Extramarital affair

Real experiments

1 Corruption

2 Sexual harassment

3 Financial impropriety

4 Extramarital affair

Four for four, most to least damaging. Spearman rank correlation +1.00, where 1.00 means the exact same order. Benchmark: Doherty, Dowling & Miller (2014).
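
For readers who want to see where the +1.00 comes from, here is a minimal sketch of the Spearman rank correlation for two rankings without ties, applied to the four scandals above. The function name is illustrative.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    n = len(rank_a)
    d_squared = sum((rank_a[item] - rank_b[item]) ** 2 for item in rank_a)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

simulator = {
    "corruption": 1,
    "sexual harassment": 2,
    "financial impropriety": 3,
    "extramarital affair": 4,
}
experiments = dict(simulator)  # the published order is identical

rho = spearman_rho(simulator, experiments)
print(rho)  # 1.0

# For contrast: swapping the last two items would lower the score.
swapped = dict(simulator, **{"financial impropriety": 4, "extramarital affair": 3})
rho_swapped = spearman_rho(simulator, swapped)
```

With only four items, a single adjacent swap already drops the correlation to 0.8, so an exact match is a meaningful but small-sample result.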

Checked against Gallup’s 12 candidate traits

Gallup asked Americans whether they’d vote for a qualified candidate who is over 80, an atheist, a socialist — 12 traits in all — and published the share saying yes for each. Twelve real numbers to hit. Our simulated voters answered the same 12 questions on a 0–10 scale: the trait order came out nearly perfect, and the average miss was 8.3 points.

Average miss vs Gallup’s numbers: 8.3 pts

Trait order match (1.00 = perfect): 0.95

Gallup’s 12 candidate traits, 500 simulated national voters. We ask every question 0–10 rather than yes/no — a flat yes/no forces hesitant voters to a hard no and roughly doubles the miss.

Why we use the ranking, not the raw number

Surveys say a corruption scandal costs a candidate about 41 points; in real elections it moves roughly zero. The simulator lives in the survey world, so its magnitudes run about 2 to 3 times too high. That makes it reliable for ranking which attacks and traits land hardest, and an upper bound — not a forecast — on the raw points.
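
Reading "about 2 to 3 times too high" literally, a printed movement can be bracketed by dividing out that overshoot. This is only a back-of-envelope sketch. The function name and default factors are assumptions, and the result is still best treated as a ceiling, since real-world movement can be smaller still.

```python
def deflated_range(printed_points, low_factor=2.0, high_factor=3.0):
    """Divide a survey-regime movement by the assumed 2x to 3x overshoot."""
    return printed_points / high_factor, printed_points / low_factor

# Example: a printed movement of 12.6 points.
low, high = deflated_range(12.6)
print(round(low, 1), round(high, 1))
```

The ordering of messages or scandals is unaffected by this scaling, which is why the ranking is the part to rely on.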

Step 2 · Message-test

Then we measured what moves them

A prediction tells you where the race stands. Campaigns get paid to change it. So we re-poll the same simulated electorate once per message and measure every voter’s movement against their own baseline — here on a live 2026 Kansas judicial amendment: 2,000 real registered Johnson County voters, five messages, and a no-message control.

The strongest message tested — “campaign cash in the courtroom,” an argument for No

“Electing judges in partisan races means campaign donors and party bosses in the courtroom. Judges would owe their seats to the special interests that fund their campaigns. Kansas courts should answer to the law, not to campaign cash.”

The quote above is one of five messages we tested, all written to sound like real campaign copy. Two argue for Yes, two argue for No, and the fifth makes no argument at all — it simply lists who has endorsed each side. Each of the 2,000 voters answered the same ballot question after reading each message, and once more with nothing shown — that last run is the “no message” control row in the table below.

How to read this table

One voter, one cell

Each voter answers the ballot question cold — 0–10, how likely are you to vote Yes? — then again after one message. Movement is the new answer against that same voter’s own baseline, in points. A cell is the group’s average.

The signs

Positive = moved toward Yes, negative = toward No. The top-left −12.6: after “campaign cash in the courtroom,” the average voter’s Yes-likelihood fell about 12.6 points — a voter at 57% likely-Yes lands near 44%.

Columns overlap

Every voter counts in “All 2,000,” one lean column, and one party column. Read across a row for who a message moves; read down a column for what moves a group.

The control row is zero

Re-asking with no message “moved” voters +0.2 — pure re-ask noise. Grey cells sit at that floor and mean didn’t move; the strongest cells run 4–19× above it.

Near-zero can be a ceiling

Republicans show +0.2 under both Yes arguments because they already back the measure — no room left to move up. Democrats’ −4.2 under the attacks is the same floor logic. A flat cell on your own base means save the money.

Buy the order, not the decimals

Which message wins and who moves most is the validated signal. The sizes are survey-regime upper bounds: since the simulator's magnitudes run about 2 to 3 times too high, expect real-world movement of roughly one-third to one-half of the printed number.

Message shown	All 2,000	Toss‑ups (292)	Lean Yes (974)	Lean No (734)	Rep. (818)	Dem. (621)	Unaff. (531)
“Campaign cash in the courtroom” · for No	−12.6	−25.1	−13.3	−6.6	−10.2	−4.2	−25.0
“60 years of merit selection” · for No	−11.8	−23.4	−12.3	−6.6	−9.8	−4.2	−23.8
Endorsement lineup · names only, no argument	−4.3	−12.3	−1.3	−5.1	+0.2	−4.2	−10.9
“Lawyers picking judges” · for Yes	+3.4	+5.7	+0.3	+6.5	+0.2	+5.8	+5.5
“Your right to elect judges” · for Yes	+2.2	+6.5	+0.3	+3.1	+0.2	+1.8	+6.0
No message · re-ask control	+0.2	+0.1	0.0	+0.5	0.0	+0.5	+0.1

Purple = toward No, green = toward Yes, grey = within re-ask noise. Toss-up and lean groups use each voter’s calibrated baseline; Libertarian and other small registrations (30 voters) are in the totals but not broken out. Full validation and caveats: in the complete study, available on request.
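
Because the three lean groups partition all 2,000 voters (292 + 974 + 734), the "All 2,000" column should equal the size-weighted average of the lean columns. The sketch below recomputes it from the printed table. Only the variable names are invented.

```python
sizes = {"toss_up": 292, "lean_yes": 974, "lean_no": 734}

# message: (printed "All 2,000" value, printed movement per lean group)
rows = {
    "campaign cash": (-12.6, {"toss_up": -25.1, "lean_yes": -13.3, "lean_no": -6.6}),
    "merit selection": (-11.8, {"toss_up": -23.4, "lean_yes": -12.3, "lean_no": -6.6}),
    "endorsement lineup": (-4.3, {"toss_up": -12.3, "lean_yes": -1.3, "lean_no": -5.1}),
    "lawyers picking judges": (3.4, {"toss_up": 5.7, "lean_yes": 0.3, "lean_no": 6.5}),
    "right to elect judges": (2.2, {"toss_up": 6.5, "lean_yes": 0.3, "lean_no": 3.1}),
    "no message": (0.2, {"toss_up": 0.1, "lean_yes": 0.0, "lean_no": 0.5}),
}

def pooled(groups):
    """Size-weighted average movement across the lean groups."""
    total = sum(sizes.values())
    return sum(sizes[g] * groups[g] for g in sizes) / total

recomputed = {name: round(pooled(groups), 1) for name, (_, groups) in rows.items()}
mismatches = [name for name, (printed, _) in rows.items() if recomputed[name] != printed]
print(mismatches)  # [] when every row reconciles
```

The party columns cannot be checked the same way from this table alone, because the 30 voters with small registrations are in the totals but not broken out.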

Opposition out-punched support, four to one

The two No arguments moved the electorate 12.6 and 11.8 points; the two Yes arguments managed 3.4 and 2.2. That asymmetry — attacks beat positives on ballot measures — is exactly what published experiments find, and the engine reproduced it blind in replication before we ever saw it live.

Undecided voters moved most

Toss-ups swung 25.1 points under the strongest attack — nearly twice the movement of Yes-leaners and four times that of No-leaners — and they moved first or a close second on all five messages. The persuasion opportunity concentrates exactly where the race is decided.

A bare endorsement list moved the unaffiliated

The fifth “message” made no argument at all — it just named each side’s backers. Republicans heard that their party leads the Yes side and didn’t move (+0.2). Unaffiliated voters heard the same list and broke 10.9 points toward No. Who stands behind a measure is a message.

The control says it’s signal, not noise

Re-asking with no message moved voters by about 0.07 on the 0–10 scale — under a point. Every real message produced 4 to 19 times that. When a segment doesn’t move in this table, that’s a finding too: it means don’t spend there.

Step 3 · Target

Every poll doubles as a voter list

The unit of simulation is an individual registered voter — so the same run that produces the topline also scores every voter in it: how close they sit to the fence, how unstable their answers are across repeated asks, whether they’re cross-pressured against their registration, and which message moved them most.
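
To make those per-voter scores concrete, here is a hedged sketch of how they could be derived from repeated asks on the 0 to 10 scale. The field names, the midpoint of 5, the use of standard deviation for instability, and the sort order are all assumptions for illustration, not the production scoring.

```python
from statistics import mean, pstdev

def score_voter(voter_id, asks, registration_expects_yes, best_message):
    """Summarize one simulated voter from repeated 0 to 10 answers."""
    baseline = mean(asks)
    return {
        "voter_id": voter_id,
        "baseline": baseline,
        "fence_distance": abs(baseline - 5.0),  # 0 means squarely on the fence
        "instability": pstdev(asks),            # spread across repeated asks
        # Cross-pressured: leaning against what the registration would suggest.
        "cross_pressured": (baseline > 5.0) != registration_expects_yes,
        "best_message": best_message,
    }

def rank_for_contact(scored):
    """Closest to the fence first, then most unstable first."""
    return sorted(scored, key=lambda v: (v["fence_distance"], -v["instability"]))

# Hypothetical voters and answers.
scored = [
    score_voter("A", [5, 5, 5], True, "campaign cash"),
    score_voter("B", [4, 6, 5], False, "right to elect judges"),
    score_voter("C", [9, 9, 9], True, "none"),
]
order = [v["voter_id"] for v in rank_for_contact(scored)]
```

Any real ranking would weigh these signals according to the campaign's goal. The point of the sketch is only that every score comes from the same run as the topline.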

292

Toss-up voters, identified from the voter file

Among Johnson County’s 2,000 scored voters, 292 sat genuinely on the fence. Every one of them moved toward No by double digits (upper bound) under their best message, and 152 were also movable toward Yes; across the full file, 265 voters could be pulled in both directions — the conflicted middle every campaign is guessing at. Each exported row carries a best-message tag: which argument moved that voter most.

Built to act on — and books that balance

Lists export ranked, by county and precinct, ready for calls, mail, and canvass. In this run a mailing address was on file for 100% of voters and a phone for 63% (the Kansas file carries no emails). And the per-voter scores must re-aggregate exactly to the published topline — 52.1% calibrated Yes here — or the run fails loudly. The list is never a different model from the poll.
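
The "books that balance" rule amounts to a hard check that the per-voter scores reproduce the published topline. A minimal sketch follows, assuming each voter carries a calibrated Yes probability between 0 and 1 and that the topline is their simple average. The exception name and the small float tolerance are illustrative choices.

```python
import math

class ToplineMismatch(RuntimeError):
    """Raised when per-voter scores do not re-aggregate to the topline."""

def check_books_balance(voter_scores, published_topline_pct, tol=1e-9):
    """Return the recomputed Yes share in percent, or fail loudly."""
    recomputed = 100 * math.fsum(voter_scores) / len(voter_scores)
    if abs(recomputed - published_topline_pct) > tol:
        raise ToplineMismatch(
            f"per-voter scores give {recomputed:.4f}%, "
            f"published topline is {published_topline_pct:.4f}%"
        )
    return recomputed

# Hypothetical four-voter run whose scores average to exactly 50%.
balanced = check_books_balance([0.5, 0.75, 0.25, 0.5], 50.0)

try:
    check_books_balance([0.5, 0.75, 0.25, 0.5], 52.1)
    failed_loudly = False
except ToplineMismatch:
    failed_loudly = True
```

Raising an exception, rather than logging a warning, is what keeps an export from shipping when it has drifted away from the poll.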

Five rows from the real export — identity withheld

Started at (0–10)	Registration	Strongest pull toward No	Strongest pull toward Yes	Contact on file
5.0 · toss-up	Libertarian, 30s	−37 pts · campaign cash	+3 pts · lawyers picking judges	address
4.7 · toss-up	Unaffiliated, 30s	−23 pts · campaign cash	+7 pts · right to elect judges	address
7.6 · leans Yes	Republican, 70s	−10 pts · campaign cash	0 · none landed	address · phone
2.2 · leans No	Democrat, 50s	−7 pts · campaign cash	+3 pts · right to elect judges	address
1.6 · leans No	Democrat, 30s	0 · none landed	+10 pts · right to elect judges	address

Real rows from the Johnson County export with identity withheld — the delivered file carries name, mailing address, phone where on file, precinct, and district on every row, none of which belongs on a public web page. Per-voter movement is an upper bound; the ranking is the product.

Privacy, compliance, and the secret ballot

Names and identities are never sent to the AI. Each stand-in is built from voter-file traits — age, party, registration signals — and identity is rejoined only inside our systems, at export time. Exports leave our hands with their permitted-use terms attached: voter-file use is restricted by state law (in Kansas, political use is permitted and commercial use is prohibited), and the limitations header travels with the file.

And because the ballot is secret, no individual ground truth exists — for anyone. Per-voter scores are validated at the aggregate level and rank-ordered at the individual level: a smarter ordering of the call sheet, not a claim to know any single person’s vote.

Step 4 · Turnout

Then: who will actually show up NEW

A ranked list only pays off if those voters actually vote. But the “likely voter” score every campaign leans on, the one already in your file, is really just one number: a voter’s participation rate, meaning the share of past general elections on their record that they actually turned out for. Score everyone that way, call the reliable ones “likely,” and target them. It’s a rear-view mirror: fine for a high-turnout presidential year, but in the low-turnout primaries and municipals that decide most offices, last cycle’s voters and this cycle’s are barely the same people.

So we lined it up against the real thing. For nine states we hold the actual voter file — every registered voter, and exactly who turned out in 2024. For each one, both the participation rate and the Civly Turnout Score make a single call — likely to vote, or not — and we check it against what the voter really did. How often is each right?

78%→84%

Across nine states, the 2024 general

Voter by voter, the call — will vote or won’t — is right 84% of the time with the Civly Score, versus 78% for the participation rate. More accurate in all nine.

67%→83%

And the gap widens where it matters

On a 2025 municipal — the off-cycle races that decide most offices — the old rate gets barely two-thirds of its calls right (67%). The Civly Score: 83%.

Where it costs you most: of the “likely voters” the participation rate flags on that low-turnout race, only 62% actually cast a ballot. Nearly four in ten of those contacts, and the budget behind them, go to people who stay home.
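
The two figures in this section are different measurements: how often the will-vote call is right across all voters, and what share of the voters flagged as likely actually turned out. The sketch below computes both for a participation-rate rule. The 0.5 cutoff and the toy records are assumptions, since the page does not say where the rule draws the line.

```python
def participation_rate(history):
    """Share of past general elections on record that the voter turned out for."""
    return sum(history) / len(history)

def likely_call(history, cutoff=0.5):
    """Hypothetical rule: call the voter 'likely' at or above the cutoff."""
    return participation_rate(history) >= cutoff

def call_accuracy(calls, voted):
    """Share of voters where the will-vote / won't-vote call was right."""
    return sum(c == v for c, v in zip(calls, voted)) / len(calls)

def flagged_turnout(calls, voted):
    """Of the voters flagged as likely, the share who actually cast a ballot."""
    flagged = [v for c, v in zip(calls, voted) if c]
    return sum(flagged) / len(flagged)

# Toy records: past-election history, then whether each voter turned out this time.
histories = [
    [True, True, True, True],
    [True, True, False, True],
    [True, False, True, False],
    [False, False, False, True],
    [False, False, False, False],
]
voted = [True, False, True, False, True]

calls = [likely_call(h) for h in histories]
acc = call_accuracy(calls, voted)
hit_rate = flagged_turnout(calls, voted)
```

Swapping in a different score only changes how `calls` is produced. The grading against certified turnout stays the same.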

Read the full turnout-score analysis, with the nine-state table

Step 5 · Activate

From message test to Meta audience

Every segment a message moves is also an ad audience. The same run that scores each voter and tags the message that moved them most will package any slice of the file as a Meta Custom Audience — matched, hashed, and formatted to Meta’s own upload spec, ready for Facebook and Instagram. Your poll is your ad list.

Pick a segment

Straight off the message test — say the 292 toss-ups, or the unaffiliated the endorsement list moved. A filter on a run you already have.

→

Rejoin identity

Name, phone, and address rejoin only inside our systems, at export — never sent to the AI.

→

Hash to spec

Every match key is SHA‑256 hashed the way Meta requires, before the file leaves our hands.

→

Meta Custom Audience

Deduplicated, in Meta’s exact column format. Upload to Facebook & Instagram, or seed a lookalike.

One export, cut from the scored run: the phone bank and the ad buy hit the same modeled voters — not two lists from two models.

Every segment on this page is already an audience

The run defined them, so exporting one is a filter, not a fresh build: the 292 toss-ups, the 531 unaffiliated who broke 10.9 points toward No on a bare endorsement list, the 974 Yes‑leaners worth shoring up. And because every row carries its best‑message tag, the creative you run is the argument the model already said would move that segment — message and audience ship together.

Reach is bounded by what’s on file — and we hand you the number first

Meta matches strongest on email and phone. This Kansas file carries a phone for 63% of voters and no emails, with name, full address, and birth year on 100% as secondary keys — and Meta matches only a subset of any list to real accounts. So we give you the matchable count before you spend, not a surprise after. An audience you can’t reach is not a number we hide.

The upload file, to Meta’s own schema — every value hashed

Column (Meta’s spec)	What Meta matches on	On file, this run	How it ships
email	Primary match key	0% · none in file	—
phone	Primary match key	63%	SHA‑256
first / last name	Secondary match key	100%	SHA‑256
city / state / zip	Secondary match key	100%	SHA‑256
birth year	Secondary match key	100%	SHA‑256

Meta’s Customer List schema, unchanged — the file we output is the file Meta ingests. No real values are shown: every identifier is SHA‑256 hashed before the export leaves our systems, so Meta receives scrambled keys, never a name, number, or address in the clear.
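
For a sense of what hashing to spec involves, the sketch below normalizes a few match keys and hashes them with SHA-256 from Python's standard library. The normalization rules shown (trim, lowercase, digits-only phone with a country code) and the dictionary keys are simplified assumptions. Meta's current Customer List documentation is the authority on exact formatting and column headers.

```python
import hashlib
import re

def sha256_hex(value):
    """SHA-256 of a UTF-8 string, as a 64-character hex digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

def normalize_phone(raw, country_code="1"):
    """Keep digits only; assume a 10-digit number is missing its US country code."""
    digits = re.sub(r"\D", "", raw)
    return country_code + digits if len(digits) == 10 else digits

def normalize_text(raw):
    """Trim and lowercase a name, city, or state value."""
    return raw.strip().lower()

def hash_row(row):
    """Hash every populated key; leave missing values empty rather than hashing ''."""
    cleaned = {
        "phone": normalize_phone(row.get("phone", "")),
        "first_name": normalize_text(row.get("first_name", "")),
        "last_name": normalize_text(row.get("last_name", "")),
        "zip": row.get("zip", "").strip()[:5],
        "birth_year": row.get("birth_year", "").strip(),
    }
    return {key: sha256_hex(val) if val else "" for key, val in cleaned.items()}

# Made-up record for illustration only.
hashed = hash_row({
    "phone": "(913) 555-0100",
    "first_name": " Pat ",
    "last_name": "Example",
    "zip": "66210-1234",
    "birth_year": "1985",
})
```

Normalizing before hashing matters because a hash of "Pat " and a hash of "pat" share nothing. Unnormalized values would quietly lower the match rate.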

How a campaign runs it — a handful of audiences, not one

The run already sorted the electorate, so the buy is mostly deciding what each group is for. Usually that’s one dominant message pointed at the people it moves — and the discipline to leave everyone else alone.

Audience	Who — straight from the run	Creative the test picked	Budget
Persuade the middle	292 toss-ups + 531 unaffiliated	“Campaign cash in the courtroom” · swung both ~25 pts	The bulk of it
Reinforce your side	734 already leaning your way	Mobilization creative — the attack is spent effort here	Light, ramps late
Expand past the file	A lookalike seeded from the movers	The same winning attack, new faces	Test, then scale
Exclude the immovable	Anyone the test parked at the noise floor	— a flat cell is a buy signal in reverse	$0

Why a handful, not dozens: Meta needs enough matched people and budget behind each audience to deliver — slice a county list too thin and every ad set starves, so the count is bounded by list size, match rate, and spend. And campaign ads run inside Meta’s Special Ad Category: your custom-list segments are allowed — that’s the whole export — but standard lookalikes give way to coarser Special Ad Audiences, so expansion is broader-brush than it is for commercial advertisers.

Privacy and permitted use, the digital edition

Same discipline as the call sheet. Identity is rejoined only inside our systems, and every match key is SHA‑256 hashed before the file leaves — Meta’s standard Customer List flow — so Facebook never receives a name, phone, or email in the clear. And permitted use travels with the file here too: voter-file use is restricted by state law (in Kansas, political use is permitted and commercial use is prohibited), and that limitation applies to a Meta upload exactly as it does to a phone bank.

The secret ballot still holds. An audience is a modeled segment — the voters the run scored as movable, and the message it scored as moving them — not a claim about how any one person voted. You’re buying a smarter ad, not a dossier.

What this is, and what it isn't

Synthetic polling is a complement to traditional polling, not a replacement. Where a high-quality field poll exists it remains a valuable anchor. We're deliberate about where the method is reliable and where it isn't.

●

Validated on three states, one topic area

Vote choice is checked across North Carolina, Florida, and Pennsylvania; policy only in Florida. Going further needs certified results we don't yet hold, which we treat as future work.

●

Calibration is local, not universal

The direction of the correction is universal, but its size is state- and topic-specific. Borrowing one state's correction for another can do more harm than no correction at all.

●

Behavioral questions are upper bounds

For "would you vote for someone who did X" questions the engine reproduces the real ordering of scandals and traits very well, but the stated effect sizes run larger than real behavior, so we treat them as ceilings.

●

Approve before you spend

In production, planning is free: the service scores the question and reports expected accuracy and projected effort, and nothing runs until the plan is explicitly approved.