A new Web-based effort promises to track the sources of congressional earmarks, compile databases of the Twitter posts of state lawmakers, and add sharper perspective to the Obama administration’s open-government efforts.
“Government puts out a ton of data that is really interesting about what it does, but people can’t understand it,” says Clay Johnson, director of Sunlight Labs, an arm of the open-government group Sunlight Foundation, based in Washington, DC.
The foundation has already tapped open-source developers to help process the often-fragmented and cryptic data released by the government. “We are doing anything we can to celebrate the opening of this data–and making it so it’s useful,” says Johnson.
Now the group is raising an army of Web-based volunteers to go through all the information contained in those releases. Congressional earmarks–funds for projects inserted anonymously as line items in various bills, without any hearings or reviews–are a big initial focus. In 2004, members of Congress wrote more than 14,000 earmarks costing more than $50 billion. Technically, it’s already possible to find the sources of earmarks, but this involves going through all 535 congressional websites and reading PDFs of the earmark requests posted.
The new Sunlight Labs transparency corps invites users to log in and join the effort to analyze this information collaboratively. Users are presented with the PDFs and prompted to read them carefully and then enter the pertinent information–the date and dollar amount of a request, name of the requester, description of the project, and so on–into fields on the screen. These then become part of a searchable database.
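As a rough illustration of what one of those database entries might look like, the sketch below models a single transcribed earmark request and a simple keyword search. The field names, the `EarmarkRequest` class, and the `search` helper are assumptions made for this example; the article does not describe Sunlight Labs' actual schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class EarmarkRequest:
    """One transcribed request (hypothetical field names)."""
    requester: str
    request_date: date
    amount_usd: int
    description: str


def search(records, keyword):
    """Return records whose requester or description mentions the keyword."""
    keyword = keyword.lower()
    return [
        r for r in records
        if keyword in r.description.lower() or keyword in r.requester.lower()
    ]


# Made-up sample data, for illustration only.
records = [
    EarmarkRequest("Rep. Example A", date(2009, 4, 3), 750_000, "Bridge repair study"),
    EarmarkRequest("Sen. Example B", date(2009, 4, 9), 1_200_000, "Rural broadband pilot"),
]
bridge_hits = search(records, "bridge")
```

A real system would presumably sit on a proper database with indexed search; the point here is only that structured fields make the PDFs queryable.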
Another corps project aims to track the Twitter statements of all state lawmakers. Volunteers who log on are asked to seek out and enter the Twitter addresses of state senators and representatives; Sunlight Labs will seek verification from three or four people that the address is correct, and then start recording the lawmakers’ tweets (the short messages they send through Twitter).
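The verification step can be pictured as a simple agreement check: accept an address only once enough volunteers have independently entered the same one. The sketch below assumes a threshold of three matching submissions and a `verified_handle` function invented for this example; how Sunlight Labs actually implements the check is not described.

```python
from collections import Counter


def verified_handle(submissions, min_agreeing=3):
    """Return the address at least `min_agreeing` volunteers agree on, else None."""
    # Normalize so "@SenSmith" and "sensmith " count as the same address.
    normalized = [s.strip().lstrip("@").lower() for s in submissions if s.strip()]
    if not normalized:
        return None
    handle, votes = Counter(normalized).most_common(1)[0]
    return handle if votes >= min_agreeing else None


# Hypothetical volunteer entries for one lawmaker.
entries = ["@SenSmith", "sensmith", "SenSmith ", "sen_smith"]
result = verified_handle(entries)
```

With these made-up entries, three of the four submissions normalize to the same address, so it clears the threshold; a lawmaker with only two matching entries would stay unverified until more volunteers weigh in.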
The project is in its infancy, but with this data in hand, it will later become possible to search the tweets for the most popular words that each lawmaker uses, and to search their statements by topic and date. It may even be possible to compare their tweets with statements made in other contexts, such as speeches recorded in the Federal Register.
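Finding a lawmaker's most-used words is, at its simplest, a counting exercise. The sketch below is one minimal way to do it, assuming the tweets are already collected as plain strings; the `top_words` function, the tiny stopword list, and the sample tweets are all invented for illustration.

```python
import re
from collections import Counter

# A deliberately small stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "to", "of", "in", "on", "for", "is", "at", "we"}


def top_words(tweets, n=3):
    """Count non-stopword tokens across tweets and return the n most common."""
    counts = Counter()
    for text in tweets:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts.most_common(n)


# Made-up tweets for one hypothetical lawmaker.
tweets = [
    "Voting on the budget today",
    "The budget vote is at noon",
    "Budget passed",
]
most_common = top_words(tweets, 1)
```

This simple tokenizer would split links and @mentions into fragments, so a fuller version would likely strip those first.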
The project’s launch roughly coincided with the launch earlier this month of a White House effort to chart the progress of information-technology (IT) projects in various federal agencies. The new IT Dashboard, an online tool accessible to the public, allows users to see which IT projects are under way, check their status, and provide feedback to the chief information officers at different federal agencies.
The tool revealed particularly startling delays at the Department of Veterans Affairs, so the White House halted 45 over-budget or behind-schedule projects for review. “We were able to catch these contracts, in part, thanks to our new tool,” Vivek Kundra, the White House chief information officer, wrote in his blog earlier this month.
“The dashboard may be just the tip of an iceberg that will herald a new-age transparency regarding federal spending,” says Andrew Rasiej, founder of the Personal Democracy Forum, a website that covers politics and technology. “Once people get used to this type of information being so readily accessible, they will demand to see [it] for all other federal spending too, and then the genie will be completely out of the bottle.”
But the IT Dashboard also shows the limitations of the government’s own open-government efforts, says Johnson. It helps users find the primary recipients of funding, but not subcontractors. Furthermore, it’s not easy to discern the origins of contracts or their geographic distribution, and it’s almost impossible to see how they are connected to elected officials. “The IT Dashboard is a tool for government to audit itself, but it isn’t a particularly good tool for citizens to look at,” Johnson says.
Johnson says the transparency corps could be mobilized to fill that gap. He notes that the dashboard is based on government forms that track the progress of government contracts and the milestones reached. These raw forms, which are available through the site, could be a gold mine for further work, he says.
For example, it’s possible to extract the names of all contractors and subcontractors from these forms and plot their locations geographically, to see if they happen to reside in a particular congressional district. It’s also possible to trace contractors’ contributions to lawmakers, by identifying the company board members from Securities and Exchange Commission filings, and then cross-referencing their names to Federal Election Commission records of campaign donors. Thanks to the dashboard’s own analyses, it may also be possible to highlight which low-performing companies are most closely tied to which politicians.
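The cross-referencing step amounts to joining two lists on a person's name. The sketch below shows one possible shape for that join, assuming the board-member names and donor records have already been pulled out of the filings; the `normalize` and `match_donations` functions, the data layout, and every name and amount are hypothetical.

```python
def normalize(name):
    """Make 'DOE, JANE' and 'jane  doe' compare equal as 'jane doe'."""
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    return " ".join(name.lower().split())


def match_donations(board_members, donations):
    """board_members: {company: [names]}; donations: [(donor, recipient, amount)]."""
    # Index donations by normalized donor name for quick lookup.
    by_donor = {}
    for donor, recipient, amount in donations:
        by_donor.setdefault(normalize(donor), []).append((recipient, amount))

    matches = {}
    for company, names in board_members.items():
        for name in names:
            for recipient, amount in by_donor.get(normalize(name), []):
                matches.setdefault(company, []).append((name, recipient, amount))
    return matches


# Made-up data, for illustration only.
board = {"Acme IT": ["Jane Doe", "Raj Patel"]}
donations = [("DOE, JANE", "Rep. A", 500), ("Sam Lee", "Rep. B", 250)]
links = match_donations(board, donations)
```

A name match is only a lead, not proof: two different people can share a name, so any hit from a join like this would still need checking against addresses or employers before anyone drew conclusions.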
“The IT Dashboard is just one way of looking at the data,” says Rasiej, “and shows the government is trying to partner with the public in a transparent way. It’s up to groups like the Sunlight Foundation and others to take the government’s lead and make even more sense out of the available information and data for the public good.”
Meanwhile, if you happen to hold public office, be careful what you ask for, and watch what you tweet.