Between 60% and 70% of the system's data came from InfoUSA, which sells data on voters' income, age, address, home value, telephone numbers, vehicles, bankruptcy filings, mail order purchases, marital status, and more--including such "lifestyle" information as whether they like auto racing or motivational speakers. The rest came from commercial and public databases.
BOMBARD AWAY
Leading up to previous elections, the DNC's data cleansing was unorganized and often manual. This time around, the Dems bought software from Business Objects unit Firstlogic that allows for the creation of rules to automatically scan and correct data, making sure addresses and phone numbers are formatted correctly or people's nicknames are recognized.
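To make the idea of rule-based cleansing concrete, here is a minimal Python sketch of the kind of rules described above: normalizing phone numbers, tidying street addresses, and mapping nicknames to formal names. It is an illustration only and does not reflect how the Firstlogic software works; the lookup tables, field names, and function names are all hypothetical.

```python
import re

# Hypothetical nickname table; a production list would be far larger.
NICKNAMES = {"bill": "William", "bob": "Robert", "peggy": "Margaret"}

# Hypothetical street-suffix abbreviations.
SUFFIXES = {"street": "St", "st": "St", "avenue": "Ave", "ave": "Ave",
            "road": "Rd", "rd": "Rd"}


def clean_phone(raw):
    """Format a 10-digit U.S. number as NNN-NNN-NNNN, or return None."""
    digits = re.sub(r"\D", "", raw)          # drop everything but digits
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]                  # strip a leading country code
    if len(digits) != 10:
        return None                          # flag as unusable
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def clean_first_name(raw):
    """Map a known nickname to its formal name; otherwise just tidy case."""
    name = raw.strip()
    return NICKNAMES.get(name.lower(), name.title())


def clean_address(raw):
    """Collapse whitespace, drop periods, and abbreviate a trailing suffix."""
    words = raw.replace(".", "").split()
    if not words:
        return ""
    out = [w.capitalize() for w in words]
    last = words[-1].lower()
    if last in SUFFIXES:
        out[-1] = SUFFIXES[last]
    return " ".join(out)


# One rule per field; fields without a rule pass through unchanged.
RULES = {
    "first_name": clean_first_name,
    "phone": clean_phone,
    "address": clean_address,
}


def clean_record(record):
    """Apply every matching rule to a single voter record (a dict)."""
    return {field: RULES[field](value) if field in RULES else value
            for field, value in record.items()}
```

Because each rule is a small function keyed by field name, a new check can be added to the table without touching the code that walks the records, which is the main advantage over correcting entries by hand.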
Meantime, Harold Ickes, former deputy chief of staff for President Clinton, set up a separate database, called Catalist, for America Votes, a coalition of Democratic groups that targeted elections in battleground states. One source says the rival DNC and Catalist data-gathering efforts will come together eventually.
The DNC's voter file contains 300 million records with up to 900 fields per record, everything from voting history to purchasing power to whether the voter has a hunting license. It can handle 30 to 40 queries at once, automatically cleans up dirty addresses, and crunches numbers up to 20 times faster than it did in the past. Lists will get rebuilt three times a year and could quadruple in size in two or three years as voters move and new data flows in.
Ken Strasma, president of Strategic Telemetry, a microtargeting company that works with the Democrats, says the technique may have tipped the U.S. Senate races in Virginia and Montana by identifying voters in bright red counties who might otherwise have been overlooked. However, the Democrats still lag the Republicans in volume of data and in experience. "It's good for us, just as it is in any industry, to go out and make our universe larger," consultant Bickford says. After all, the 2008 presidential race is just around the corner.
The Democratic National Committee spent $8 million this time around on a multiterabyte relational database from Netezza. Rather than requiring a separately assembled Oracle database, EMC storage, and IBM servers, Netezza's Performance Server stores, filters, and processes terabytes of data within a single Linux-based appliance that can be installed in hours rather than weeks and at lower cost, says Gus Bickford, a consultant who helped implement the DNC database.
Power to the data! (Photo by Brian McDermott/Reuters)
One scenario for microtargeting goes like this: Female cat owners tend to vote for Democrats, as do the majority of married women with children. So if you see a woman at the polls with a ring on her finger, a toddler in her arms, and cat dander on her jacket, you probably know for whom she's voting. Using the data it acquired, chances are the Democratic machine also knows who this lady is and has bombarded her with specially crafted phone calls, mail, and TV ads.
