    How to chart campaign cash
    By Cindy Eberting

    Building a database to track campaign contributions is nothing new to computer-assisted reporting, but the best way to do it is in question. Campaign finance consultants are popping up and pitching newsrooms on services to design and maintain such databases. They say an in-house project wastes money and ties up reporter time with tedious work, such as coding contributors for occupation.

    Many newspapers, however, still opt to create their own database. There are advantages to both methods. And both tactics have resulted in more in-depth election coverage. Here are tips from consultants and reporters about how to approach tracking your local political cash flow.

    The biggest hurdle, says consultant Tony Raymond of Public Disclosure, Inc., can be convincing editors that the time, effort and money needed to build a good campaign finance database are worthwhile. The key is to point to stories other newspapers have produced with such databases and the state laws that changed once the stories ran. "The cost of it can be a deterrent," Raymond says. "But the benefit of a campaign finance database is that it can be used for more than campaign finance stories. Say someone in the metro area gets stopped for DUI. It's a good place to throw his name at the database and see if he is a player in the community."

    David Poole, a reporter-turned-consultant who helped build campaign finance databases for statewide newspaper consortiums in New York and Virginia, says both his projects have changed state campaign finance practices. "In Virginia, the newspapers have completely helped change the equation," Poole says. "There are more and more candidates that are beginning to file electronically. These projects are a way for the newspapers to have a positive effect on some positive change."

    Once editors are convinced, the careful work of planning the database begins. It's important to think ahead about every angle from which the database will be analyzed. "We try to suggest a database format," Raymond says. "We know that once the database happens and people see it inside the newsroom, then the next thing they say is: 'I'd like to see this or that.' If you don't set it up right the first time, you have to do the whole thing again."

    Last fall, The Washington Post hired Public Disclosure, the company owned by Raymond and Kent Cooper, who both worked for the Federal Election Commission and the Center for Responsive Politics. They built a database of contributions in Maryland races from 1994 to 1998.

    Last summer, Tom Brown at The Seattle Times combined federal and state campaign contributions into one database to determine Washington's Top 50 political contributors. The project, which the newspaper did in-house, took about two months to complete. In both the Post and Times projects, some of the data were already in electronic format, eliminating the time-consuming and often costly effort of data entry. Both databases also identified the occupations and employers of as many major donors as possible. The most tedious tasks were standardizing contributor names and coding contributors for occupation and industry. (Federal and state election commissions make no effort to address the problem of contributor names being listed several different ways, even though the money is coming from the same person.) To complete the coding in both projects, the reporter and the consultants had to call individual contributors to verify or determine their occupation.
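
    The name-standardization problem described above can be roughed out in code. The following is a minimal sketch, not the method either project used: it assumes each record is a (name, ZIP code) pair and groups names that share a last name, first initial and five-digit ZIP. The function names `name_key` and `group_candidates` are hypothetical, and every group it produces is only a candidate match that a reporter still has to confirm.

```python
import re
from collections import defaultdict

# Generational suffixes to ignore when matching (an illustrative, incomplete list).
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}

def name_key(raw_name):
    """Reduce a contributor name to a rough key: (last name, first initial)."""
    name = raw_name.lower().strip()
    # Turn "Last, First" into "First Last" so both styles compare equal.
    if "," in name:
        last, rest = name.split(",", 1)
        name = rest + " " + last
    # Replace punctuation and digits with spaces, then drop suffixes.
    tokens = re.sub(r"[^a-z\s]", " ", name).split()
    tokens = [t for t in tokens if t not in SUFFIXES]
    if not tokens:
        return ("", "")
    return (tokens[-1], tokens[0][0])

def group_candidates(records):
    """Group (name, zip) records that may be the same person.

    Returns only the groups with more than one spelling, for manual review.
    """
    groups = defaultdict(list)
    for name, zip_code in records:
        groups[name_key(name) + (zip_code[:5],)].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}

# Made-up sample records for illustration.
sample = [
    ("Smith, John A.", "98101"),
    ("John Smith Jr.", "98101-1234"),
    ("JOHN SMITH", "98101"),
    ("Jane Doe", "98052"),
]
for key, names in group_candidates(sample).items():
    print(key, names)
```

    A key this loose will also lump together different people, such as a father and son at the same address, which is one reason the phone calls described above remain necessary.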

    Project Proposals
    When Raymond and Cooper first start talking with a newspaper about a project, they set up a division of labor from the beginning to define who will do what. Basic questions they initially want answered include:

  • What level of donations do you want? Do you want donations to federal candidates as well as state legislators and county commissioners?
  • Is the information available electronically? What's the average number of transactions for each level: federal, state or local?
  • If the data are only available on paper, do you want to contract out with a data-entry firm to have it key-punched? Or does a state or local entity already key-punch the data into an in-house database, which you could request through the Freedom of Information Act?
  • Do you want to code the contributions for occupation and employer by giving them an industry or sector classification?

    Both of Poole's projects required data entry. Poole used Quality Data Systems, a nationwide data entry firm. Reporters can also find numerous data entry services near state capitols where state agencies require their work. When hiring a data entry firm, Poole suggests asking the firm's customers about the firm's accuracy rating. Contracts are usually priced per record. It's best to receive bids for the contract from two or three vendors, Poole says. Then ask for a price for a specific number of records and an expected turnaround time.

    When doing data entry, the contributor's name, donation amount and ZIP code should be typed in twice to help later with verification. Also add fields for the page number on which the contribution is listed and the schedule of the contribution form. These extra fields make it easier to spot poor data entry and to cross-verify summary contribution information. "I think it's overkill to fully verify (key in twice) every single field for cost and for time reasons," Poole says.

    The newspaper needs to come up with clear rules for the data entry workers to follow, particularly when entering addresses. The database coordinator also needs to work closely with reporters who have covered campaign finance issues using paper records. In Poole's Virginia project, the summary contribution pages didn't have standardized date ranges. A query of the itemized data told the actual date range. "On the summaries, it helps to know what they say and what they actually mean, which can often be two different things," Poole says.
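
    The double-keying check Poole describes can be sketched as follows. This is an illustration under assumed field names, not the layout of either consortium's database: each record is a dictionary in which the second keying of a field is stored under the same name plus `_2`, and `page` and `schedule` are the extra locator fields mentioned above. The function name `find_keying_mismatches` is hypothetical.

```python
def find_keying_mismatches(rows, fields=("name", "amount", "zip")):
    """Return (page, schedule, mismatched fields) for each record whose
    first and second keyings disagree."""
    problems = []
    for row in rows:
        bad = [
            f for f in fields
            # Compare ignoring case and stray spaces, which are not real errors.
            if str(row[f]).strip().upper() != str(row[f + "_2"]).strip().upper()
        ]
        if bad:
            problems.append((row["page"], row["schedule"], bad))
    return problems

# Made-up sample records for illustration.
keyed = [
    {"name": "SMITH, JOHN", "name_2": "Smith, John", "amount": "250",
     "amount_2": "250", "zip": "23219", "zip_2": "23219",
     "page": 3, "schedule": "A"},
    {"name": "DOE, JANE", "name_2": "DOE, JANE", "amount": "1000",
     "amount_2": "100", "zip": "23220", "zip_2": "23220",
     "page": 7, "schedule": "A"},
]
for page, schedule, bad in find_keying_mismatches(keyed):
    print(f"Check page {page}, schedule {schedule}: {', '.join(bad)}")
```

    Because each flagged record carries its page and schedule, someone can go straight to the paper filing to see which keying was right.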

    Occupation Coding
    Occupation coding can be the most complicated, tedious and subjective process of building a database, but it's also the most enlightening. "Some of the best stories that have been done are those where classification has been done," Raymond says. "You can see trends and patterns, such as how much gambling has influenced a particular area." The Washington Post's project tried to identify employer and occupation for every donor who gave $1,000 or more. To make the matches, they ran names against other databases, looked up names in newspaper archives, and phoned several people just to ask who they were and what they did for a living. Public Disclosure did most of the work, identifying 78 percent of major donors to the governor's race.

    But it's important to point out the caveats of occupation coding to editors and readers. Raymond uses the example of General Electric, which also owns a television network, makes commercial and defense jet engines and constructs refrigerator parts. "So what do you code GE as?" Raymond asks. "Is it a defense contractor, a home appliance company or jet crafter?" When dealing with large companies, reporters need to be aware of the large players involved. When Raymond's company works with newspapers to code contributions, many coding decisions are left to the newspaper and reporters themselves, who are more familiar with area industries.
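
    A figure like the share of major donors identified can be tracked with a few lines of code while the coding is under way. The sketch below is an assumption-laden illustration: it treats a "major donor" as anyone whose contributions total $1,000 or more, it takes contributions as (donor, amount) pairs, and the `coded` dictionary of donor-to-industry assignments is hypothetical. The function name `coding_coverage` is also made up.

```python
from collections import defaultdict

def coding_coverage(contributions, coded, threshold=1000):
    """Return (share of major donors coded, list of major donors still uncoded)."""
    totals = defaultdict(float)
    for donor, amount in contributions:
        totals[donor] += amount
    # A major donor is one whose combined giving meets the threshold.
    major = sorted(d for d, total in totals.items() if total >= threshold)
    uncoded = [d for d in major if d not in coded]
    share = (len(major) - len(uncoded)) / len(major) if major else 0.0
    return share, uncoded

# Made-up sample data for illustration.
gifts = [("A", 600), ("A", 500), ("B", 1000), ("C", 200), ("D", 2500)]
industry = {"A": "Gambling", "D": "Real estate"}
share, still_to_call = coding_coverage(gifts, industry)
print(f"{share:.0%} of major donors coded; still to identify: {still_to_call}")
```

    The uncoded list doubles as the call sheet for the phone work described above.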

    In the Seattle series, the reporters considered a couple as one entity when both were significant donors. They also included corporate contributions of companies known to be controlled by one individual. Brown warns that reporters looking at contributions over a long period must beware of contributors who have died. Four of their original Top 50 had died during the seven years of contributions the newspaper considered.
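
    The Seattle approach of treating a couple, or a person and the companies that person controls, as one entity amounts to mapping contributor names to a shared entity before totaling. Here is a minimal sketch, assuming contributions arrive as (contributor, amount) pairs and that reporters have built the mapping by hand. The names in the sample and the function `top_donors` are hypothetical.

```python
from collections import Counter

def top_donors(contributions, entity_of, n=50):
    """Total contributions by entity and return the n largest.

    Contributors missing from entity_of are treated as their own entity.
    """
    totals = Counter()
    for contributor, amount in contributions:
        totals[entity_of.get(contributor, contributor)] += amount
    return totals.most_common(n)

# Made-up sample data for illustration.
donations = [
    ("Ann Lee", 5000),
    ("Bob Lee", 4000),
    ("Lee Holdings Inc.", 2000),
    ("Carl Roe", 8000),
]
# Hand-built mapping: a couple and a company one of them controls.
entities = {
    "Ann Lee": "Lee household",
    "Bob Lee": "Lee household",
    "Lee Holdings Inc.": "Lee household",
}
for entity, total in top_donors(donations, entities, n=2):
    print(entity, total)
```

    Keeping the mapping in its own table, separate from the raw contributions, leaves the original filings untouched and makes each grouping decision easy to review or reverse.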

    Structuring the Database
    Raymond suggests having a master table that lists details on all contributors. This serves as a look-up table for all the names. Raymond builds databases using Microsoft Access. Each contributor name is given a unique identifier. Then each office or race, such as the gubernatorial or state legislative races, becomes its own separate table. "You're going to get information in the gubernatorial race that will be different than the legislature," he says. "But you want to keep all that information. The most flexible thing you have going is that master table. It's really the key to it." With this structure, every time a media outlet wants to add another level of campaign contributions, they don't have to change the database structure; they just add a related table for that office and add the contributor names to the master table.

    The database structures for the New York and Virginia projects differ. In the Virginia database, every contributor and candidate has a unique identifying number in a relational database with 10 tables that relate to the main contributions table. They also code contributors for occupations.
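
    The master-table layout Raymond describes can be illustrated with a small relational schema. The sketch below uses Python's built-in sqlite3 module rather than Microsoft Access so that it runs anywhere; all table names, column names and sample rows are assumptions made for illustration, not Public Disclosure's actual design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Master look-up table: one row per contributor, with a unique identifier.
CREATE TABLE contributor (
    contributor_id INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    employer       TEXT,
    occupation     TEXT,
    industry_code  TEXT
);
-- One table per office; each can carry fields the others lack.
CREATE TABLE governor_contrib (
    contributor_id INTEGER NOT NULL REFERENCES contributor(contributor_id),
    candidate      TEXT NOT NULL,
    amount         REAL NOT NULL,
    contrib_date   TEXT
);
CREATE TABLE legislature_contrib (
    contributor_id INTEGER NOT NULL REFERENCES contributor(contributor_id),
    candidate      TEXT NOT NULL,
    district       TEXT,
    amount         REAL NOT NULL,
    contrib_date   TEXT
);
""")

# Made-up sample rows for illustration.
conn.execute("INSERT INTO contributor VALUES "
             "(1, 'Jane Doe', 'Acme Gaming', 'Executive', 'GAMBLING')")
conn.execute("INSERT INTO governor_contrib VALUES "
             "(1, 'Candidate A', 1000, '1998-05-01')")
conn.execute("INSERT INTO legislature_contrib VALUES "
             "(1, 'Candidate B', '12', 250, '1998-06-15')")

# Total each contributor's giving across every office table.
rows = conn.execute("""
SELECT c.name, SUM(t.amount)
FROM contributor AS c
JOIN (
    SELECT contributor_id, amount FROM governor_contrib
    UNION ALL
    SELECT contributor_id, amount FROM legislature_contrib
) AS t ON t.contributor_id = c.contributor_id
GROUP BY c.contributor_id, c.name
""").fetchall()
print(rows)
```

    Adding county commission races later would mean creating one more office table and one more SELECT in the UNION ALL; the contributor table and the existing office tables stay as they are.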

    In New York, for competitive reasons within the consortium, the database only assigns a unique number to candidates. Each newspaper then decides whether it wants to standardize names or code for occupation. "The papers there have a much higher level of distrust," Poole says. "They were all happy to have the base data, and each of them just wanted to take it from there."

    Verifying the Data
    Raymond reminds reporters that the database is only as good as the data you obtain. The reporter has to be able to verify the database information. "The data you're going to get needs to be scrubbed up so it's garbage in and garbage out," Raymond says. In Seattle, reporters ran an explanatory note with the tables pointing out potential problems with the data. The note mentioned data-entry error by public agencies and that paper records were examined selectively. The note also stated that the contribution totals reported were most likely the minimum anyone actually gave.

    Making the Data Available
    After the database is built, it's important to get the information to key people in the newsroom and make sure they understand how to use it; otherwise, you render the system worthless. Raymond's company sets up the database on an intranet in which reporters use their Web browser to query the database and sort the data in various ways. An intranet also gives data access to a reporter working from home or on the road.

    Maintaining the Database
    The campaign finance database increases in value when it's maintained and contribution data are continuously collected as candidates file their reports. "Campaign finance databases are living and breathing things," Raymond says. "They don't end at the end of an election cycle. Right after the election you have a whole other set. What happens in a newsroom is that they look at these things like a project with a beginning and end date."

    Raymond's company offers to maintain databases for newspapers and suggests, though from an obviously biased point of view, contracting out that duty to an outside party. "It's almost inviting a waste of money if you try to do it in-house," Raymond says. "That's when they set it up as a project as a one-time shot. Then the next election comes and it's going to be: We've got to pitch it to somebody again. Dig it out and start from ground zero."

    Cindy Eberting can be reached by e-mail at cindy@nicar.org