Preemptive Intelligence

The Department of Homeland Security's concept of an information sharing and analysis center (ISAC), designed to watch for crisis signals so that it can respond rapidly and intelligently if a crisis does unfold, has merit for private businesses. Here's why.

As part of the requirements analysis, crises or interesting high-level events are analyzed to determine what types of data could help predict the occurrence of an event, a change of interest, or the likelihood of a crisis. The analysis function lets users test hypotheses, such as whether a specific set of signals can predict an event. If the signals can predict an event, what is the strength of the prediction? Which variables are the primary predictors? What are the values associated with the variables? How much does a value have to change before its variable is no longer a good predictor? This type of analysis is facilitated through data mining.
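One simple way to put a number on "the strength of the prediction" is to compare how often the event follows the signals with how often it occurs overall. The sketch below is an illustration only, not the method of any particular data mining tool; the `Observation` record, the `prediction_strength` function, and the sample history are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    signal_present: bool  # did the candidate signal set fire in this period?
    event_occurred: bool  # did the event of interest follow?


def prediction_strength(history):
    """Return (P(event | signal), base rate of the event, lift)."""
    with_signal = [o for o in history if o.signal_present]
    if not with_signal:
        raise ValueError("the signal never appears in the history")
    base_rate = sum(o.event_occurred for o in history) / len(history)
    if base_rate == 0:
        raise ValueError("the event never occurs in the history")
    precision = sum(o.event_occurred for o in with_signal) / len(with_signal)
    # Lift > 1 means the event is more likely than usual once the signal fires.
    return precision, base_rate, precision / base_rate


# Made-up history of 100 periods: the event is rare (5 occurrences),
# and 4 of those 5 were preceded by the signal.
history = (
    [Observation(True, True)] * 4
    + [Observation(True, False)] * 6
    + [Observation(False, True)] * 1
    + [Observation(False, False)] * 89
)
precision, base_rate, lift = prediction_strength(history)
print(f"P(event | signal) = {precision:.2f}, base rate = {base_rate:.2f}, lift = {lift:.1f}")
```

In this made-up history the event follows the signal 40 percent of the time against a 5 percent base rate, so the signal makes the event roughly eight times more likely than usual. Rerunning the calculation with different thresholds on the underlying variables is one way to explore how far a value can move before the variable stops being a useful predictor.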

Data mining plays two important roles. The first role is that of historian: extracting important information from historical data to create usage patterns, models, scenarios, and temporal relationships. Data mining helps users understand whether there is a predictive relationship between a set of signals and an event, as well as how frequent and strong that relationship is. If a set of signals is found to predict an event, data mining can also be used to find similar sets of signals that could predict the same event. (Because we're reasoning about rare events, the traditional data models must be modified.)

The newly identified relationships, information, and scenarios are then available to analysts for evaluation. Some of the information will indicate probabilistic relationships that must be evaluated in terms of strength, frequency, and subjective interest. Relationships may strengthen over time as more data is collected. The analysts select the results that are interesting and use them to trigger either preplanned business strategies or additional investigation. Several data mining tools exist today, including those from SPSS and SAS.

The second role of data mining is to apply the derived signal information and create rules, models, scenarios, and patterns that indicate potentially threatening activities. Many of the data mining tools can automatically export the discovered relationships or patterns into SQL or C++. The rules, models, and patterns are reintroduced to the ISAC as profiles. Analysts are notified when profiles are matched and can choose whether or not to implement planned strategies.
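To make the idea of a profile concrete, here is a minimal sketch of one possible representation: a named set of conditions that must all hold in an incoming record. The `Profile` class, the field names, and the thresholds are hypothetical; a real deployment would use whatever rule format its data mining tool exports.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class Profile:
    name: str
    # Maps a field name to a test that the field's value must pass.
    conditions: Mapping[str, Callable[[float], bool]]

    def matches(self, record):
        """True only if every condition's field is present and passes its test."""
        return all(
            field in record and test(record[field])
            for field, test in self.conditions.items()
        )


def matched_profiles(record, profiles):
    """Return the names of all profiles that the incoming record matches."""
    return [p.name for p in profiles if p.matches(record)]


# Hypothetical profile built from a mined relationship.
profiles = [
    Profile(
        name="supply_disruption",
        conditions={
            "port_delay_days": lambda days: days >= 5,
            "freight_price_change_pct": lambda pct: pct > 10,
        },
    )
]

incoming = {"port_delay_days": 7, "freight_price_change_pct": 14.5}
for name in matched_profiles(incoming, profiles):
    print(f"Notify analyst: profile '{name}' matched")
```

The match only produces a notification. Consistent with the process described above, the decision to implement a planned strategy stays with the analyst.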

In the intelligence context, we must search for the new, different, and unusual, or even perhaps the existence of a single link. This type of search mandates the analysis of the entire data space, not just a sampling. The data spaces are frequently in the terabytes. Note that the type of data mining, rather than the size of the data, dictates the data architecture and hardware. A centralized data repository facilitates traditional data mining analysis such as prediction analysis. Without the centralized data infrastructure, it becomes quite expensive to calculate the Bayesian predictive statistics, which are a component of relationship detection. Mining for relationships is fairly insensitive to the computer infrastructure environment and depends mostly on database query speed. The actual signal detection process can be run on either a centralized or a distributed computer infrastructure.
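The article does not say which Bayesian statistics are involved, so the following is only the simplest case: Bayes' rule applied to a single signal and a rare event. The function name and the three input rates are assumptions chosen for illustration.

```python
def posterior_event_probability(prior, hit_rate, false_alarm_rate):
    """P(event | signal) by Bayes' rule.

    prior            -- P(event), the base rate of the event
    hit_rate         -- P(signal | event)
    false_alarm_rate -- P(signal | no event)
    """
    for name, value in (
        ("prior", prior),
        ("hit_rate", hit_rate),
        ("false_alarm_rate", false_alarm_rate),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a probability between 0 and 1")
    # Total probability of seeing the signal at all.
    evidence = hit_rate * prior + false_alarm_rate * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("the signal can never occur with these inputs")
    return hit_rate * prior / evidence


# Illustrative numbers: a 1-in-100 event, a signal that precedes 90 percent
# of events but also fires in 5 percent of uneventful periods.
posterior = posterior_event_probability(prior=0.01, hit_rate=0.9, false_alarm_rate=0.05)
print(f"P(event | signal) = {posterior:.3f}")
```

With these illustrative numbers the posterior is only about 0.15, even though the signal precedes most events. The rates that feed the formula are counts over the whole data space, which is one reason a centralized repository makes such calculations cheaper.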

ISAC Applications

The rules, patterns, or profiles must be compared against all incoming new text information or related data to determine whether certain events or signals are occurring. One interesting application of an ISAC is to mine all of the congressional bills, at both the state and federal levels, to ensure that new legislation does not negatively impact the company's products. As new patterns are matched, the system can cue the appropriate analyst. The company must have a set of responses in place so that it can respond quickly to detected signals. The responses may include collecting more data, changing business tactics and business plans, modifying strategies, or even applying defensive tactics.
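As a rough sketch of the bill-monitoring example, the snippet below compares incoming text against keyword profiles and returns the analysts who should be cued. Real systems would use far richer text analysis; the `cue_analysts` function, the analyst names, and the keyword sets are hypothetical.

```python
import re


def cue_analysts(document_text, profiles):
    """Return the analysts whose profile terms all appear in the text.

    profiles maps an analyst name to a set of lowercase single-word terms;
    matching is on whole words and ignores case.
    """
    words = set(re.findall(r"[a-z0-9]+", document_text.lower()))
    return sorted(analyst for analyst, terms in profiles.items() if terms <= words)


# Hypothetical keyword profiles for a company that sells diesel engines.
profiles = {
    "regulatory_affairs": {"emissions", "diesel"},
    "tax_team": {"excise", "tax"},
}

bill = "A bill to tighten emissions limits for diesel engines sold after 2026."
cued = cue_analysts(bill, profiles)
print(cued)  # ['regulatory_affairs']
```

Requiring every term in a profile keeps false alarms down at the cost of missing bills that use different wording, a trade-off that analysts would tune over time.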

ISACs are a potentially revolutionary intelligence community concept that commercial companies can use to help identify social, political, and economic changes that may adversely or positively impact their business objectives. The underlying ISAC technology makes it possible to collect, analyze, and fuse data derived from terabyte-sized collections of documents. The information derived from this massive collection of data can be used to help create a signal detection and prediction mechanism. When companies know about potentially relevant events occurring all over the world, they are in a much stronger position to act proactively. Information really is power.

Lisa Sokol is the technical director of General Dynamics Advanced Information Systems' Knowledge Management Area. She received her doctorate in Industrial Engineering and Operations Research from the University of Massachusetts.
