Democrats Unleash "Demzilla" On The GOP

An open-source data mart and transactional database, tied to home-grown BI tools, are helping the Democrats battle their Republican counterparts in the election-year data wars.
Plus Three, a Washington-based firm with about 21 employees, built the system on an open-source software stack similar to eBay's or Google's -- the Linux operating system from Red Hat, the Apache Web server, the MySQL database, and Perl (Practical Extraction and Report Language) -- for reasons of both cost and "freedom," said David Brunton, one of Plus Three's founders. Open source made the most sense, he said, because the DNC wanted to do its own data mining and analytics. Once Plus Three completed the assembly, it could turn over the source code to the DNC's techies, get them up to speed, and let them have at it.

In this particular business, open source also has advantages over closed-format software, Brunton said, because changes in potential donor targeting often need to be made on the fly -- if people are for some reason unwilling, on a particular day, to give out their phone numbers, the DNC could write some code to deal with that contingency and implement it almost immediately. The software runs on hardware typical of open-source deployments: AMD-based servers from Penguin Computing.
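To make the phone-number contingency concrete, here is a minimal sketch in Python of the kind of rule a campaign's own staff might change in a few minutes. It is purely illustrative: the `Contact` record, the `choose_channel` function, and the `phone_required` switch are hypothetical names, not part of the system Plus Three built, which the article describes as Perl-based.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contact:
    """Hypothetical prospect record; the real schema is not described."""
    name: str
    email: Optional[str] = None
    phone: Optional[str] = None


def choose_channel(contact: Contact, phone_required: bool = False) -> str:
    """Pick how to follow up with a prospect.

    With phone_required=False (the assumed 'contingency' setting), a record
    without a phone number is kept and routed to e-mail instead of dropped.
    """
    if contact.phone:
        return "phone"
    if phone_required:
        return "reject"   # strict rule: no phone number, no record
    if contact.email:
        return "email"    # relaxed rule: fall back to e-mail
    return "reject"       # nothing to contact the person with


if __name__ == "__main__":
    prospect = Contact(name="Pat Example", email="pat@example.org")
    print(choose_channel(prospect))                       # relaxed rule
    print(choose_channel(prospect, phone_required=True))  # strict rule
```

The point of the sketch is only that the rule lives in code the organization controls, so relaxing it is a one-line change rather than a request to a vendor.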

As for the build-out, Brunton said a major challenge was integrating the database with its disparate data sources. Though open source made that problem easier to overcome than a closed-format system would have, he said, another obstacle arose: how to make the physical connections between systems fast enough, yet stable enough, to handle all that data flow -- voter information streaming into DataMart (and then into Demzilla, depending on the direct market success) from volunteers knocking on doors and entering survey responses into laptops, or from voters clicking through a DNC e-mail. Plus Three also needed to link DataMart to all the far-flung systems used by the state party organizations.

The answer lay in RSS, or "really simple syndication," a feed technology that first took off among bloggers a few years ago. Plus Three developed its own variant of RSS for the DNC, which allowed it to deliver an XML stream between multiple systems. Plus Three's benchmark for a data-transfer rate was 5,000 records per second when those records needed to be parsed (or decoded and transformed into actual data), and 15,000 per second when they did not. "Anything less than that is probably slower than acceptable," Brunton said, "and anything faster is probably too fragile."

Another important tool Plus Three used was Spread, a multicast messaging technology. Information gathered from online transactions might hit one of ten different servers, said Brunton. But a Spread machine allowed Plus Three to multicast all the logs from those disparate servers, collecting them in one place in real time rather than waiting for an end-of-the-day update. This timeliness is particularly valuable in the fundraising world, said Brunton. "With the ability to raise $5.5 million or $6.6 million in a day, it's important to know where you are in any given hour. It could affect ad buys, or a get-out-the-vote effort."
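As a rough illustration of what "parsed" records and a records-per-second benchmark mean in practice, the following Python sketch parses a small RSS-style XML feed and times the parsing. The feed layout, the field meanings, and the function names are assumptions made for this example; the article does not describe Plus Three's actual feed format or code.

```python
import time
import xml.etree.ElementTree as ET

# Hypothetical feed: standard RSS 2.0 elements reused to carry donation
# records. The real schema used for the DNC is not described in the article.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>donations</title>
    <item><guid>1</guid><title>donation</title><description>25.00</description></item>
    <item><guid>2</guid><title>donation</title><description>100.00</description></item>
    <item><guid>3</guid><title>donation</title><description>50.00</description></item>
  </channel>
</rss>"""


def parse_records(feed_xml: str) -> list:
    """Decode each <item> in the feed into a plain dictionary."""
    root = ET.fromstring(feed_xml)
    records = []
    for item in root.iter("item"):
        records.append({
            "id": item.findtext("guid"),
            "kind": item.findtext("title"),
            "amount": float(item.findtext("description")),
        })
    return records


def records_per_second(feed_xml: str, repeats: int = 100) -> float:
    """Parse the feed repeatedly and return the measured parse rate."""
    start = time.perf_counter()
    count = 0
    for _ in range(repeats):
        count += len(parse_records(feed_xml))
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    rate = records_per_second(SAMPLE_FEED)
    # 5,000 parsed records per second is the benchmark quoted in the article.
    print(f"{rate:,.0f} records/s; meets 5,000/s target: {rate >= 5000}")
```

The measured rate depends entirely on the machine and on record size, so the number this prints says nothing about the original system's performance; it only shows how such a benchmark could be checked.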

The DNC says that DataMart and Demzilla have enabled the party to increase its number of listed donors from 400,000 at the time of the 2002 elections to "well over a million now," though it won't be more specific. It has also let the DNC cover the costs of prospecting for donations. No longer does it need to pay third-party vendors for lists of target voters, nor must it outsource its various e-mail campaigns. The cost of a very large e-mail blast, in other words, amounts only to the tech staff's payroll.

As good as all this sounds, the viability of the system has been called into question before. About a year ago, an article in Roll Call, the Capitol Hill weekly, quoted an anonymous "consultant," who said, "The system architecture is overly cumbersome and the result is that the data is not easily retrieved . . . Worse, the quality of the data is far from a level that would make it immediately useful." Both the DNC and Plus Three vigorously denied this, of course. They say a different kind of politics was at work: sour grapes. The comment, they say, came from a Plus Three rival rejected by the DNC.