Online Extra: Popp on protecting privacy in Total Information Awareness
TR: Privacy advocates and the press have expressed concerns about the kinds of data the Total Information Awareness (TIA) program might be looking at. What sorts of databases would TIA examine?
Popp: TIA, which now stands for Terrorism Information Awareness, is using either foreign intelligence or counter-intelligence data that was obtained legally and may be used by the federal government under existing laws, regulations and policies, or else wholly synthetic, artificial data that we’ve generated to resemble real-world patterns of behavior. So far our major focus has been on providing entities within the Department of Defense (DoD) and intelligence community with information technologies like collaboration software, analytical tools and decision support aids that they then use in experiments on the data and databases they currently have available to them in accordance with existing laws, regulations and policies. Under no circumstance are we in the TIA program providing any real data or any means to collect real data.
Examples of the types of data being used include foreign intelligence data such as imagery intelligence, signals intelligence, human intelligence; data that is in the public domain such as the World Wide Web and various news feeds such as AP, Reuters and Al Jazeera; and map and geospatial data that is available from a variety of commercial and government sources.
TR: Is there any oversight of the TIA projects, either from within the Department of Defense or from outside?
Popp: Yes, in fact, both. The Secretary of Defense has established an oversight framework that consists of both an internal oversight board and an external Federal Advisory Committee. The charter of the internal oversight board is to oversee and monitor the manner in which TIA tools are being developed and prepared for transition to real-world use, and to establish the supporting policies and procedures as necessary. The board, formed in February 2003, is composed of senior officials within the Defense Department and intelligence community and is chaired by the Undersecretary of Defense for Acquisition, Technology and Logistics. The external Federal Advisory Committee held its first meeting in May 2003, and its broad charter is to advise the Secretary of Defense on the legal and policy issues, particularly those related to privacy, that arise as advanced technologies, such as those being developed by TIA, are applied in the war on terrorism.
TR: Are there any privacy safeguards built into the TIA program?
Popp: Yes, many. In addition to the oversight boards, DARPA has been and continues to be fully committed to managing and overseeing the TIA program in full compliance with the laws and regulations that protect the privacy and civil liberties of all Americans.
As an integral part of our commitment to safeguarding privacy, we are also sponsoring research that aims to create and assess the merits of a variety of privacy protection technologies. These technologies will protect not only the privacy of U.S. citizens but also the identity of sensitive intelligence sources and their methods, which is an analogous problem in our view.
TR: What are those?
Popp: Before describing some of the specific technologies, let me first note that our overarching goal for the privacy protection work we’re sponsoring is to enhance both privacy and security. Oftentimes, the debate over privacy and security issues tends to be portrayed as a tradeoff; we don’t believe this to be the case. A second important point that is worth reemphasizing: TIA’s research and experimentation is using only legally obtained intelligence or counter-intelligence data, or synthetic data. We’re exploring a number of privacy protection approaches. Among the more promising techniques are Transformation Spaces, Selective Revelation, Self-Reporting Data, Anonymization, Immutable Audit, and Privacy Appliance. Let me explain these.
Transformation Spaces is an approach that uses well-known concepts of mathematical analysis and encoding techniques to transform data from a plain-text representation to a cipher or mathematical space that is unintelligible to a human. Once transformed, a plethora of data analysis functions or mathematical operations can be applied very efficiently while simultaneously protecting the privacy of the data. TIA hopes to demonstrate that very specific patterns can be developed that describe terrorist signatures. We anticipate these signatures may be buried in an enormous amount of data about everyday worldwide activity that has nothing to do with international terrorism whatsoever. We also think there is a wider range of intelligence data, both classified and open source, which analysts need to search in order to understand terrorist intent. To have any hope of making sense of this, we believe there must be a more structured and automated way of handling this problem. With Transformation Spaces, we’re exploring the merits of working within a transformed mathematical space to address this challenge. Because the data is represented in a space that is unintelligible to the human, privacy protection is inherent in this approach.
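Editor's illustration: Popp does not say which encoding Transformation Spaces would use. The Python sketch below swaps in a simple stand-in, keyed hashing with HMAC-SHA256, to show how a pattern can be matched against transformed records without anyone reading the plain text. The key, field names and records are hypothetical, and keyed hashing supports only exact-match comparison, far less than the "plethora" of operations described above.

```python
import hashlib
import hmac


def transform(value: str, key: bytes) -> str:
    """Map a plain-text value to an opaque token (assumed stand-in encoding)."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def transform_record(record: dict, key: bytes) -> dict:
    """Transform every field of a record into the opaque space."""
    return {field: transform(str(value), key) for field, value in record.items()}


def matches(transformed_record: dict, transformed_pattern: dict) -> bool:
    """Compare tokens only; no plain text is needed for the match."""
    return all(
        transformed_record.get(field) == token
        for field, token in transformed_pattern.items()
    )


KEY = b"hypothetical-demo-key"  # assumption: held by the data owner, not the analyst
records = [
    {"name": "Alice Example", "item": "fertilizer"},
    {"name": "Bob Example", "item": "garden hose"},
]
space = [transform_record(r, KEY) for r in records]

# The pattern is transformed with the same key, then matched in the opaque space.
pattern = {"item": transform("fertilizer", KEY)}
hits = [i for i, r in enumerate(space) if matches(r, pattern)]
```

Here `hits` holds only record positions; turning a position back into a person would require a separate, controlled step.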
Selective Revelation would allow incremental access to and analysis of increasingly privacy-sensitive data; an analyst would learn an individual’s identity only after the appropriate legal standards have been met. The approach proceeds incrementally by initially requiring a data owner to release only subsets or statistical interpretations of its privacy-sensitive data in response to an analyst’s query. If the results of the initial query turn out to be meaningful (say, the level of suspicion has been heightened), then through appropriate legal frameworks, such as probable cause, more privacy-sensitive data can be released as the analysis progresses. As an example, an analyst can initially issue a pattern-based query indicative of a terrorist attack across a distributed set of data sources to determine if there is any evidence of that pattern. The initial set of results may be so large that it needs to be anonymized or converted into statistics; no identity data is provided. Through iterations, the analyst can refine the pattern-based query so that it more strongly implies terrorist activity and only an acceptably small number of individuals match it. At that point, the analyst can use the query results as evidence to seek legal authorization from an appropriate authority to obtain the identities of the suspicious individuals.
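Editor's illustration: the staged release described here can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not TIA's design: the threshold `MAX_REVIEWABLE`, the `authorized` flag standing in for legal authorization, and the record fields are all hypothetical.

```python
from dataclasses import dataclass

MAX_REVIEWABLE = 2  # hypothetical cutoff for "an acceptably small number"


@dataclass
class QueryResult:
    stage: str          # "statistics", "pseudonymous" or "identified"
    match_count: int
    pseudonyms: list
    identities: list


def selective_query(records, predicate, authorized=False):
    """Release only as much as the current stage of the analysis allows."""
    matched = [r for r in records if predicate(r)]
    count = len(matched)
    if count > MAX_REVIEWABLE:
        # Too many matches: return statistics only, whatever the authorization.
        return QueryResult("statistics", count, [], [])
    pseudonyms = [f"subject-{r['id']}" for r in matched]
    if not authorized:
        # Few matches but no legal authorization yet: pseudonyms only.
        return QueryResult("pseudonymous", count, pseudonyms, [])
    # Few matches and authorization granted: identities are released.
    return QueryResult("identified", count, pseudonyms, [r["name"] for r in matched])


records = [
    {"id": 1, "name": "A. Example", "bought_chemicals": True, "rented_truck": True},
    {"id": 2, "name": "B. Example", "bought_chemicals": True, "rented_truck": False},
    {"id": 3, "name": "C. Example", "bought_chemicals": True, "rented_truck": False},
    {"id": 4, "name": "D. Example", "bought_chemicals": False, "rented_truck": True},
]


def refined(r):
    return r["bought_chemicals"] and r["rented_truck"]


broad = selective_query(records, lambda r: r["bought_chemicals"])
narrow = selective_query(records, refined)
revealed = selective_query(records, refined, authorized=True)
```

The broad query yields only a count, the refined query yields a pseudonym, and the identity appears only once `authorized` is set.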
Self-Reporting Data would continuously track and report the location of data and the person accessing it as it transits from one location to the next. We’re exploring the efficacy of digital watermarks and similar techniques for this concept.
Anonymization would allow a data owner to release a generalized or obfuscated version of its data to an analyst with a guarantee that the specifics of any privacy-sensitive data in the released data cannot be determined, yet the released data still remains useful from an analytical point of view. Examples of identifiers that typically would be anonymized include name, address and telephone number.
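Editor's illustration: one common way to release a "generalized or obfuscated version" of a record is to drop direct identifiers and coarsen the rest. The Python sketch below assumes hypothetical field names and generalization rules, and it offers no formal guarantee of the kind Popp describes; records generalized this way can sometimes still be re-identified by combining fields.

```python
def anonymize(record: dict) -> dict:
    """Return a generalized copy of a record (hypothetical rules)."""
    out = dict(record)
    # Direct identifiers are removed outright.
    out.pop("name", None)
    out.pop("address", None)
    # Quasi-identifiers are coarsened rather than removed, to keep analytic value.
    if "zip" in out:
        out["zip"] = out["zip"][:3] + "**"
    if "phone" in out:
        out["phone"] = out["phone"][:3] + "-***-****"
    if "age" in out:
        low = (out["age"] // 10) * 10
        out["age"] = f"{low}-{low + 9}"
    return out


original = {
    "name": "Alice Example",
    "address": "1 Hypothetical St",
    "zip": "02139",
    "phone": "617-555-0100",
    "age": 34,
    "purchase": "fertilizer",
}
released = anonymize(original)
```

The released record keeps the purchase and a coarse region and age band, which is what leaves it useful for analysis.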
Immutable Audit would automatically record all accesses to data immediately and permanently, with no possibility that audit records can be altered or tampered with. Moreover, to guard against potential abuses by malicious insiders or agents, the immutable audit will be designed to detect misdeeds with high probability, and to encrypt and transmit the audit records to an appropriate trusted third-party oversight authority. We will also develop tools to query and analyze audit data in online, real-time scenarios as well as in offline batch mode. The audit log may contain fields such as the identity of the government user; the authorizations being used; the date and time of the entry; the data requested; and the data returned.
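Editor's illustration: Popp does not describe the mechanism behind Immutable Audit. A hash chain is one well-known technique for making a log tamper-evident, and the Python sketch below uses it as an assumed stand-in. The entry fields follow the list above; the user, authorization and timestamp values are hypothetical. A hash chain detects alteration but does not prevent it, and the encryption and transmission steps are omitted.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(body: dict, prev_hash: str) -> str:
    """Hash an entry together with the hash of the entry before it."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append_entry(log, user, authorization, timestamp, requested, returned):
    """Append one access record, chained to the previous record."""
    body = {
        "user": user,
        "authorization": authorization,
        "timestamp": timestamp,
        "requested": requested,
        "returned": returned,
    }
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"body": body, "prev": prev_hash, "hash": entry_hash(body, prev_hash)})


def verify(log) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["body"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, "analyst-7", "order-123", "2003-06-01T09:00:00Z", "pattern query", "3 matches")
append_entry(log, "analyst-7", "order-123", "2003-06-01T09:05:00Z", "refined query", "1 match")
intact = verify(log)

# A malicious insider rewrites an earlier record; verification now fails.
log[0]["body"]["returned"] = "0 matches"
tampered_detected = not verify(log)
```

In practice the latest hash would also be sent to the outside oversight party, so that an insider could not simply rebuild the whole chain.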
Privacy Appliance is a novel concept that would employ a separate hardware device placed on top of a database, metaphorically, of course. It would serve as a trusted, guarded interface between the user and the database, analogous to a firewall, and would implement several privacy functions and mechanisms to control and enforce access rules and accounting policies. It would also explicitly publish the details of its operation; it would be tamper-resistant and cryptographically protected; it would enforce business rules established between the database owner and the user issuing a subject- or pattern-based query; it would verify the user’s access permissions and provide access only to authorized users; it would filter queries to permit only those that do not violate privacy; it would verify the credentials packaged with the query against the specific legal and policy authorities under which the query is being conducted; and it would initiate an immutable audit log capturing the user’s activity and transmitting it to an appropriate trusted third-party oversight authority to ensure abuses are detected.
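Editor's illustration: the appliance is described as hardware, but its gatekeeping logic can be sketched in software. The Python class below is a minimal sketch under stated assumptions; the class name, the rule set (permitted users, protected fields, a cited authority) and the sample data are all hypothetical, and the tamper resistance and cryptographic protection are not modeled.

```python
class PrivacyAppliance:
    """Guarded interface between an analyst and a database (hypothetical rules)."""

    def __init__(self, database, permitted_users, protected_fields):
        self._database = database
        self._permitted_users = set(permitted_users)
        self._protected_fields = set(protected_fields)
        self.audit_log = []

    def _decide(self, user, authority, fields):
        if user not in self._permitted_users:
            return "denied: unknown user"
        if not authority:
            return "denied: no legal authority cited"
        if self._protected_fields & set(fields):
            return "denied: query requests protected fields"
        return "allowed"

    def query(self, user, authority, fields, predicate):
        """Check the request, record it, then answer only if it was allowed."""
        decision = self._decide(user, authority, fields)
        # Every request is logged, including the ones that are refused.
        self.audit_log.append(
            {"user": user, "authority": authority,
             "fields": list(fields), "decision": decision}
        )
        if decision != "allowed":
            raise PermissionError(decision)
        return [{f: row[f] for f in fields} for row in self._database if predicate(row)]


database = [
    {"name": "Alice Example", "city": "Boston", "purchase": "fertilizer"},
    {"name": "Bob Example", "city": "Denver", "purchase": "garden hose"},
]
appliance = PrivacyAppliance(database, {"analyst-7"}, {"name"})

rows = appliance.query(
    "analyst-7", "hypothetical-authority-1", ["city", "purchase"],
    lambda r: r["purchase"] == "fertilizer",
)

try:
    appliance.query("analyst-7", "hypothetical-authority-1", ["name"], lambda r: True)
    blocked = False
except PermissionError:
    blocked = True
```

The analyst never touches the database directly: permitted queries return only the requested fields, and the refused request for names still leaves an audit entry.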
TR: In February, President Bush signed a spending bill that included an amendment introduced by Sen. Ron Wyden (D-Ore.). The Wyden amendment called for a report on TIA to Congress. Has the amendment limited the project in any way so far?
Popp: First, I’m happy to report that the congressionally mandated report on TIA was delivered to Congress on May 20, as the Wyden amendment had ordered. Now, with respect to the limitations on the TIA project as a result of the Wyden amendment, it does limit the deployment and implementation of TIA technology (and technology from related programs within the Information Awareness Office) to any department, agency, or element of the Federal Government that is not engaged in lawful military operations of the United States conducted outside the United States or lawful foreign intelligence activities conducted wholly against non-U.S. persons. We have complied and will continue to comply with these limitations.