Preemptive Intelligence

The Department of Homeland Security's concept of an information sharing and analysis center, designed to watch for crisis signals so that an organization can respond rapidly and intelligently if a crisis does unfold, has merit for private businesses. Here's why.

InformationWeek Staff, Contributor

February 19, 2004


This article was originally published in November 2003.


Information sharing and analysis centers (ISACs) enable and enhance the production, analysis, distribution, and coordination of timely and accurate intelligence. Intelligence and law enforcement agencies such as the Department of Homeland Security (DHS) are using ISACs to derive indicators and warnings from intelligence data. Intelligence analysis is a core component of homeland security, and the ISAC supports that mission by helping the government anticipate, preempt, and deter threats to the homeland.

The ISAC is viable in commercial industries as well. Companies can use the ISAC model to grow their market and become more efficient, cost effective, and profitable. Like the DHS, a commercial ISAC can watch for data indicating events that might hurt a company's goals. ISAC-produced information can feed mechanisms that detect important changes or signals in the political and social environments: for example, social unrest that may foreshadow a decline in the popularity of a firm's products.

A recent example is the launch of a new Turkish soda. Drinking it, a series of television ads claimed, would make you more Turkish. Here we have a small foreign bottler exploiting Turkey's ambivalent feelings toward the United States and its associated Pepsi and Coke products (never mind that the cola sold in Turkey is not made in the United States and that its production employs largely Turkish citizens).

An important role of an ISAC is to look for the precursor signals associated with different types of social, economic, and political events or crises. If the signals are collected and properly interpreted or measured, the information can be used to avert or diminish the impact of a crisis.

Just what is a crisis? A crisis is a set of events, usually of ambiguous origin, that has the potential to threaten an organization or one of its critical goals. The associated events are fairly improbable and infrequent, but when they occur they have a devastating impact on the organization. When a crisis occurs, businesses must be able to make decisions quickly. If an organization can detect the signals of such events and understand their repercussions, it can respond by collecting additional information and selecting business strategies that mitigate negative events or exploit positive ones.

But in order for companies to react, they must be aware of these signals and, more important, the presignals associated with the event or crisis. The company must know what types of signals to look for, be able to sift through massive amounts of data (such as legislative initiatives, newspapers, Internet articles, and so on), find the indicators of those signals, and have a strategy in place for how to react to each signal. The challenge, of course, is determining what signals to look for and then measuring the appropriate indicators that the events exist.

Requirements First

Successful ISACs are driven by requirements, which are traditionally derived from the organization's goals. For the DHS, these requirements center on threats to the homeland. For a commercial firm, they are the short- and long-term objectives relevant to the business enterprise. For example, these objectives could be accelerating product growth, expanding a family of products, growing profitability, improving investment application, increasing efficiency, or maximizing cost effectiveness.

Another set of goals or requirements is aimed at reducing the impact of negative economic, political, social, physical, and natural events. The types of signals that might be of concern include changes in economic markets, government, and the product business; financial fluctuations; cultural viewpoints on international events; weather conditions; the effectiveness of advertising, marketing, and promotional programs; raw material cost and availability; earnings and revenue forecasts; changes in laws and regulations; developing and emerging markets; patents, copyrights, and trade secrets; and a changing corporate image.

ISACs have five basic layers: data collection, data transformation, data warehouse, signals analysis, and signals application.

Given a set of signals of interest, the challenge is to determine actual data observables that are associated with the event. ISACs start collecting data that users believe can help predict these events of interest. The data may take many forms, including structured, unstructured, qualitative, and quantitative. Documents (or unstructured text) are one of the largest sources of data. These documents can be in the form of email, company documents, newspaper articles, or even articles found on the Internet. The potential of the ISAC is realized as users acquire more and more data that is potentially relevant to the signals that they are trying to collect.

Data Transformation

Until recently, distilling and analyzing the relevant facts from text (text mining) has been a challenging, time-consuming process. A newer natural language technology called entity extraction has revolutionized our ability to take unstructured text documents and automatically extract signals information in the form of people, places, things, and events: names, locations, currency fluctuations, organizations, phone numbers, riots, legislative initiatives, product introductions, and more. Entity extraction is exciting because it lets us reason about the facts embedded in text. Several entity extraction software products are on the market, including those made by SPSS, Attensity, ClearForest, SRA, and Inxight.

XML tags are an important component of entity extraction technology. Entity extraction software automatically tags data with appropriate XML tags specifying name, organization, location, and so on. Incoming data is also tagged with metadata concerning content, originator, level of classification, date, and so forth.
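The tagging step can be pictured with a small Python sketch. It is only an illustration under stated assumptions: commercial extraction engines rely on linguistic analysis rather than a lookup list, and the gazetteer, the tag names, the sample sentence, and the metadata values below are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical lookup list standing in for a real extraction engine.
GAZETTEER = {
    "organization": ["Pepsi", "Coke"],
    "location": ["Turkey", "United States"],
}

def tag_entities(text, metadata):
    """Return a <document> element carrying metadata attributes and one
    child element per entity found in the text."""
    doc = ET.Element("document", attrib=metadata)
    for entity_type, names in GAZETTEER.items():
        for name in names:
            # Whole-word match so "Coke" does not match inside another word.
            for match in re.finditer(r"\b" + re.escape(name) + r"\b", text):
                entity = ET.SubElement(
                    doc, entity_type, attrib={"offset": str(match.start())}
                )
                entity.text = name
    return doc

sample = "A small bottler in Turkey is positioning its soda against Pepsi and Coke."
doc = tag_entities(sample, {"originator": "newswire", "date": "2003-11-01"})
print(ET.tostring(doc, encoding="unicode"))
```

Once entities carry tags like these, downstream tools can query by entity type or by metadata instead of rereading the raw text.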

For structured data transformation, XML tags can contain metadata that can dramatically improve the interoperability of the data. The data transformation process also focuses on creating database rows that are relevant to the topics that we wish to mine. The temporal aspect of the data is an important component of the transformation. Temporal analysis lets us reason with the time between events.
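As a rough illustration of temporal analysis, the Python sketch below orders a few transformed event rows and computes the days between consecutive events. The event types, dates, and field names are hypothetical and chosen only to show the mechanics.

```python
from datetime import datetime

# Hypothetical event rows as they might look after transformation.
events = [
    {"type": "protest", "when": "2003-03-10"},
    {"type": "legislative_initiative", "when": "2003-03-02"},
    {"type": "sales_drop", "when": "2003-04-01"},
]

def parse_day(value):
    """Parse an ISO-style date string into a datetime."""
    return datetime.strptime(value, "%Y-%m-%d")

def gaps_in_days(rows):
    """Return (earlier_type, later_type, days_between) for consecutive events,
    after sorting the rows chronologically."""
    ordered = sorted(rows, key=lambda row: parse_day(row["when"]))
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        delta = parse_day(later["when"]) - parse_day(earlier["when"])
        gaps.append((earlier["type"], later["type"], delta.days))
    return gaps

gaps = gaps_in_days(events)
for earlier, later, days in gaps:
    print(f"{earlier} -> {later}: {days} days")
```

Gaps like these become candidate variables for mining: how long after one kind of signal does another kind of event tend to follow?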

The XML tag, whether it is derived or comes already associated with data or documents, helps improve our ability to exploit the data and use more sophisticated analysis tools. XML facilitates distributed queries, helps integrate the results from different entity extraction engines, and helps control access to specific data sources and data elements.

Data Warehouse

An important contributor to the success of an ISAC is the creation of a data warehouse that integrates data from structured sources with data derived from unstructured text. Database integration is one of the most challenging tasks, because each data source has its own database design and data dictionary. The warehouse structure must facilitate the analysis required by users. The data warehouse must also act as a repository for the business rules associated with the detection of a signal. The data integration can occur either virtually or within a centralized data warehouse.
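One way to picture the integration problem is a per-source data dictionary that maps each source's field names onto a shared warehouse schema. The Python sketch below assumes two hypothetical sources (a sales database and the output of text extraction); every field name and value in it is illustrative.

```python
# Hypothetical data dictionaries: local field name -> shared warehouse field.
FIELD_MAPS = {
    "sales_db": {"rgn": "region", "dt": "date", "units": "value"},
    "text_extraction": {"location": "region", "published": "date", "event": "value"},
}

def to_warehouse_row(source, record):
    """Translate one source record into the shared schema, keeping its origin."""
    row = {"source": source}
    for local_name, shared_name in FIELD_MAPS[source].items():
        row[shared_name] = record.get(local_name)
    return row

warehouse = [
    to_warehouse_row("sales_db", {"rgn": "Turkey", "dt": "2003-04-01", "units": 1200}),
    to_warehouse_row(
        "text_extraction",
        {"location": "Turkey", "published": "2003-03-10", "event": "protest"},
    ),
]

# With a shared schema, structured and text-derived rows can be analyzed together.
by_region = {}
for row in warehouse:
    by_region.setdefault(row["region"], []).append(row)

print(by_region["Turkey"])
```

The same mapping idea applies whether the integration is virtual (translated at query time) or physical (translated when data is loaded into a central store).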

As part of the requirements analysis, crises or interesting high-level events are analyzed to determine what types of data could possibly help predict the existence of events, a change of interest, or the likelihood of a crisis. The analysis function lets users generate hypotheses such as whether or not a specific set of signals can predict an event. If the signals can predict an event, what is the strength of the prediction? Which variables are the primary predictors? What are the values associated with the variables? How much does this value have to change before the variable is no longer a good predictor? This type of analysis is facilitated through data mining.

Data mining plays two important roles. The first role is that of historian: extracting important information from historical data to create usage patterns, models, scenarios, and temporal relationships. Data mining helps users understand whether there is a predictive relationship between a set of signals and an event, as well as how frequent and strong that relationship is. If a set of signals is found to predict an event, data mining can also be used to find similar sets of signals that could predict the same event. (Because we're reasoning about rare events, the traditional data models must be modified.)
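A minimal Python sketch of the historian role follows. It measures how often a signal appears, how often the event accompanies it (confidence), and how much that exceeds the event's base rate (lift). These are standard association measures swapped in for illustration, not the specific method any ISAC uses, and the history is invented. The sketch also makes no adjustment for rare events, which real models would need.

```python
def rule_strength(history, signal, event):
    """history is a list of sets, one per period, naming what was observed.
    Returns the frequency of the signal, the confidence of signal -> event,
    and the lift of that confidence over the event's base rate."""
    periods = len(history)
    with_signal = [observed for observed in history if signal in observed]
    both = sum(1 for observed in with_signal if event in observed)
    event_count = sum(1 for observed in history if event in observed)

    confidence = both / len(with_signal) if with_signal else 0.0
    base_rate = event_count / periods if periods else 0.0
    lift = confidence / base_rate if base_rate else 0.0
    return {"frequency": len(with_signal), "confidence": confidence, "lift": lift}

# Hypothetical history: eight periods of observed signals and events.
history = [
    {"social_unrest", "sales_drop"},
    {"social_unrest"},
    set(),
    {"social_unrest", "sales_drop"},
    {"price_increase"},
    set(),
    {"sales_drop"},
    set(),
]

strength = rule_strength(history, "social_unrest", "sales_drop")
print(strength)
```

A lift above 1 suggests the signal carries information about the event; an analyst would still judge whether the relationship is frequent and interesting enough to act on.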

The newly identified relationships, information, and scenarios are then available to analysts for evaluation. Some of the information will indicate probabilistic relationships that must be evaluated in terms of strength, frequency, and subjective interest. Relationships may strengthen over time as more data is collected. The analysts select the results that are interesting and use them to trigger either preplanned business strategies or additional investigation. Several data mining tools exist today, including those from SPSS and SAS.

The second role of data mining is to apply the derived signal information and create rules, models, scenarios, and patterns that indicate potentially threatening activities. Many of the data mining tools can automatically export the discovered relationships or patterns into SQL or C++. The rules, models, and patterns are reintroduced to the ISAC as profiles. Analysts are notified when profiles are matched and can choose whether or not to implement planned strategies.
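To make the idea of a profile concrete, the sketch below stores a hypothetical discovered rule as a SQL query and runs it against a small in-memory SQLite table using Python's standard sqlite3 module. The table layout, the rule, and the rows are assumptions made for illustration; a real mining tool would export its own query text.

```python
import sqlite3

# Hypothetical profile: a rule a mining tool might have exported as SQL.
profile = {
    "name": "unrest_then_negative_press",
    "sql": (
        "SELECT region, COUNT(*) AS hits FROM signals "
        "WHERE type IN ('protest', 'negative_press') "
        "GROUP BY region HAVING COUNT(*) >= 2"
    ),
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signals (region TEXT, type TEXT, observed TEXT)")
conn.executemany(
    "INSERT INTO signals VALUES (?, ?, ?)",
    [
        ("Turkey", "protest", "2003-03-10"),
        ("Turkey", "negative_press", "2003-03-12"),
        ("Elsewhere", "protest", "2003-03-15"),
    ],
)

def match_profile(connection, prof):
    """Run the profile's query and return one alert per matching region."""
    rows = connection.execute(prof["sql"]).fetchall()
    return [
        {"profile": prof["name"], "region": region, "hits": hits}
        for region, hits in rows
    ]

alerts = match_profile(conn, profile)
for alert in alerts:
    print("notify analyst:", alert)
```

The alert only notifies; as the text notes, the analyst decides whether to put a planned strategy into effect.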

In the intelligence context, we must search for the new, different, and unusual, or even perhaps the existence of a single link. This type of search mandates the analysis of the entire data space, not just a sampling. The data spaces are frequently in the terabytes. Note that the type of data mining, rather than the size of the data, dictates the data architecture and hardware. A centralized data repository facilitates traditional data mining analysis such as prediction analysis. Without the centralized data infrastructure, it becomes quite expensive to calculate the Bayesian predictive statistics, which are a component of relationship detection. Mining for relationships is fairly insensitive to the computer infrastructure environment and depends mostly on database query speed. The actual signal detection process can be run on either a centralized or a distributed computer infrastructure.
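The article does not say which Bayesian statistics it has in mind. One plausible candidate is the posterior probability of an event E given an observed signal S, worked below with purely hypothetical numbers: a 1 percent base rate for the event, a signal present before 80 percent of events, and the same signal present in 5 percent of periods with no event.

```latex
\[
P(E \mid S) = \frac{P(S \mid E)\,P(E)}{P(S \mid E)\,P(E) + P(S \mid \neg E)\,P(\neg E)}
\]
\[
P(E \mid S) = \frac{0.8 \times 0.01}{0.8 \times 0.01 + 0.05 \times 0.99}
            = \frac{0.008}{0.0575} \approx 0.139
\]
```

Even a fairly reliable signal leaves the posterior modest when the event is rare. Estimating each term also requires counts over the whole data space, which is consistent with the article's point that these statistics become expensive without a centralized repository.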

ISAC Applications

The rules, patterns, or profiles must be compared against all incoming text and related data to determine whether certain events or signals are occurring. One interesting application of an ISAC is to mine all legislative bills, at both the state and federal levels, to ensure that new legislation does not negatively impact products. As new patterns are matched, the system can cue the appropriate analyst. Once interesting signals are detected, the company must have a set of responses in place so that it can act quickly. The responses may include collecting more data, changing business tactics and business plans, modifying strategies, or even applying defensive tactics.
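A toy Python version of the legislation watch might look like the following. The watch terms, analyst names, planned responses, and bill texts are all hypothetical, and plain keyword matching stands in for the entity extraction and mined profiles a real ISAC would use.

```python
# Hypothetical watch profiles: every term must appear for a profile to match.
WATCH_PROFILES = [
    {
        "terms": ["soft drink", "tax"],
        "analyst": "regulatory_affairs",
        "response": "collect more data",
    },
    {
        "terms": ["container deposit"],
        "analyst": "packaging",
        "response": "review business plan",
    },
]

def cue_analysts(bills):
    """Return (bill_id, analyst, planned_response) for every profile match."""
    cues = []
    for bill_id, text in bills:
        lowered = text.lower()
        for profile in WATCH_PROFILES:
            if all(term in lowered for term in profile["terms"]):
                cues.append((bill_id, profile["analyst"], profile["response"]))
    return cues

# Hypothetical incoming bill summaries.
bills = [
    ("bill-1", "An act imposing an excise tax on soft drinks sold in the state."),
    ("bill-2", "An act to rename a state highway."),
    ("bill-3", "An act establishing a container deposit program."),
]

cues = cue_analysts(bills)
for bill_id, analyst, response in cues:
    print(f"{bill_id}: cue {analyst}, planned response: {response}")
```

Pairing each profile with a named analyst and a planned response reflects the point above: detection is useful only if a response is already in place.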

ISACs are a potentially revolutionary intelligence community concept that commercial companies can use to help identify social, political, and economic changes that may adversely or positively impact their business objectives. The underlying ISAC technology is so powerful that, for the first time, you can collect, analyze, and fuse the data derived from terabyte-sized collections of documents. The information derived from this massive collection of data can be used to help create a signals detection and prediction mechanism. When companies know about the potentially relevant events occurring all over the world, they are in a much stronger position to act before those events affect them. Information really is power.


Lisa Sokol [Lisa.Sokol@gd_ais.com] is the technical director of General Dynamics Advanced Information Systems' Knowledge Management Area. She received her doctorate in Industrial Engineering and Operations Research from the University of Massachusetts.
