Investigation: that would involve application of link analysis to discover relationships that go beyond participation in threaded e-mail exchanges. It would involve derivation of association rules that capture correlation between variables. It would involve information extraction — identification of named and pattern-described entities (e.g., personal, organizational, and geographic names in the first category and dates, addresses, phone numbers, and currency amounts in the second) and of facts (statements relating entities) — and record linkage to transactional data and descriptive data such as the contents of directories.
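To make the idea of pattern-described entities and record linkage concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the three regular expressions, the `extract_entities` and `link_to_directory` helpers, and the sample directory are all hypothetical, and a production system would need far broader patterns plus statistical or dictionary-based recognition for named entities.

```python
import re

# Hypothetical, deliberately narrow patterns for illustration only.
PATTERNS = {
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),          # e.g. 3/14/2006
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),             # e.g. 555-867-5309
    "currency": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?"),   # e.g. $12,500.00
}


def extract_entities(text):
    """Return (label, matched_text) pairs in order of appearance in text."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), label, match.group()))
    found.sort()  # order by character offset
    return [(label, value) for _, label, value in found]


def link_to_directory(entities, directory):
    """Naive record linkage: exact lookup of phone numbers in a directory."""
    return {
        value: directory[value]
        for label, value in entities
        if label == "phone" and value in directory
    }


# Made-up message and directory entry, assumed for the example.
message = "On 3/14/2006 Pat wired $12,500.00 and left a callback number, 555-867-5309."
directory = {"555-867-5309": "J. Doe, Accounts Payable"}

entities = extract_entities(message)
links = link_to_directory(entities, directory)
print(entities)
print(links)
```

Exact-match lookup is the simplest possible form of record linkage; real directories usually call for normalization and fuzzy matching before records can be joined reliably.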
Coins have two sides and an edge, but there's also the metal they're made from, their substance. In a corporate world, compliance, e-discovery, and auditing guide us in managing information. But it's investigation, the as-yet neglected fourth dimension (so far as process automation is concerned), that reveals substance, that determines the value of particular information to corporations, regulators, and litigators.
It shouldn't be much longer before organizations understand that the technology exists to go well beyond monitoring and managing, beyond archiving and filtering, even if they're still coming to grips not only with those operational mandates but also with the basic concepts behind them. Let's understand that e-discovery is not compliance monitoring is not auditing, and then let's understand that investigatory text analytics and data mining can enhance all three, adding knowledge discovery to what is currently an operationally focused mix.
E-discovery and auditing are flip sides of a single coin, the one concerned with retention of records and their production in litigation, the other with studying records to verify the correct execution of corporate business processes and accounting procedures. Compliance is the coin standing on edge: operational rules and monitoring designed to ensure that businesses stay out of legal and accounting trouble.