Led by RFID and enterprise metadata, new streams of information are in place to move businesses past the traditional world of analyzing only transaction data. Location analytics and metadata mining are key fields to watch.

InformationWeek Staff, Contributor

September 7, 2004

Analytics add business value by leveraging corporate data streams. Until recently, the focus of business analytics has been the transaction data stream. The bulk of business analytics today is wedded to corporate transaction systems managing financial, demand, and supply chain transactions. These transactions reflect what used to be paper documents — orders, invoices, payments, and so forth. And there's still a long way to go to optimize and extend the scope and reach of transaction analytics. But as new data streams are expanding the domain of analytics, they're driving new directions in analytic technology. Two such directions are location and metadata analytics.

Look at any business today and the traditional transaction data stream is just one of many. The use of the Web for e-commerce (online sales, sourcing, and marketing) has generated new clickstreams to mine for a better understanding of customer buying behavior and brand preferences. Digital voice traffic, email, and now instant messaging are creating equally voluminous message streams that capture much of the day-to-day conversation of doing business. Web site log files and call-center or email archive files are the basis for clickstream and message-stream data analysis. But what's interesting about the new directions in analytics is that the drivers are less the data streams themselves than new hardware and new ways of describing data.

Location Analytics

Location analytics is about creating business value from data derived from location awareness, the movement of people and items between locations, and location context. It is being driven by the proliferation of synergistic hardware, including the Global Positioning System (GPS), mobile/cell phone networks, and radio frequency identification (RFID) tags. Broadly speaking, people-centric location analytics will depend on mobile phone and GPS networks, while item-centric location analytics will depend on RFID tags.

Locating people is already relatively accurate. With services such as FollowUs, a user of a standard mobile phone can be located to within 100 meters in an inner-city area in the United Kingdom. The latest cell/mobile phones enabled with Assisted GPS (A-GPS) can bring that down to less than 40 meters. (For examples of the capabilities location-based services can deliver, see the Qualcomm SnapTrack link in Resources.)

But it's not just people that can be tracked. When a vehicle is fitted with a GPS-enabled device such as the AsItMoves locator, the location of the vehicle can be pinpointed to within tens of meters. And there will soon be literally billions of items (and people) that can be located once they're fitted with an RFID tag and come within range of an RFID scanner. An RFID tag stores data that the scanner reads (and, if required, writes back to) when the tag comes within range, which allows the scanner to record the location of the object along with other contextual information. And these aren't the only devices generating location data. Closed circuit TV cameras in stores, streets, and highway tollbooths are also collecting location data in the form of timestamped images of people or vehicle traffic.
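As a rough illustration of the read process described above, the sketch below models a scanner at a known location recording a timestamped read event when a tag comes into range. The class names, field names, and sample identifiers are all hypothetical, chosen only for illustration; real readers and middleware expose their own formats.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ReadEvent:
    """One observation of a tag by a scanner (hypothetical record layout)."""
    tag_id: str        # identifier stored on the tag
    scanner_id: str    # which scanner saw the tag
    location: str      # the scanner's known location, taken as the item's location
    seen_at: datetime  # when the read happened


@dataclass
class Scanner:
    """A fixed scanner; the tag itself does not know where it is."""
    scanner_id: str
    location: str

    def read(self, tag_id: str, seen_at: datetime) -> ReadEvent:
        # The location comes from the scanner, not from the tag.
        return ReadEvent(tag_id, self.scanner_id, self.location, seen_at)


dock_door = Scanner(scanner_id="scanner-01", location="warehouse dock door 3")
event = dock_door.read("tag-0001", datetime(2004, 9, 7, 8, 30, tzinfo=timezone.utc))
print(event.location, event.seen_at.isoformat())
```

The point of the sketch is that location is inferred from where the scanner sits, which is why item-centric location data only exists at the points where scanners are installed.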

The use of RFID data is particularly interesting because of the potential and scope of RFID use in a business context. RFID tags have already found their way not just into "things" (such as delivery trucks and the items, cartons, or pallets transported along a supply chain) but also into animals and even humans. RFID technology is used not only for asset location tracking but also for identification and for recording contextual data at a point in time. VeriChip is a supplier of RFID tags that have been implanted in humans to allow specific individuals entry into secure areas or to provide medical information to doctors. Read-only RFID tags carry data that RFID scanners can read: for example, a product identifier, a patient's medical status, or the origin of a cow in the food chain. Read/write RFID tags allow contextual data, including location identifiers, timestamps, and temperature, to be written to the tag as part of the scanning process. (The terms passive and active describe how a tag is powered: a passive tag draws its power from the scanner's signal, while an active tag carries its own battery.)

Currently, the business focus of RFID is in supply chain optimization. According to Jonathan Byrnes, a senior lecturer at MIT (see Resources): "Analytical applications improve supply chain coordination, ensuring that the right amounts of the right products are in the right places at the right times. An example of this is using RFID to get an early read on demand trends, and transmitting this information throughout the supply chain to align production and inventory levels." However, improving demand forecasting isn't the limit of RFID's analytic potential. RFID will become a key part of the "real-time enterprise" by providing new data streams about the location, movement, and context of both animate and inanimate objects within an organization.

But location analytics based on RFID faces a number of challenges. The volume of data collected could be enormous. With the potential for thousands of scanning devices operating in a large organization, scanning at rates much faster than humans can create and post "transactions," we could move toward RFID databases that might reach multiple terabytes, maybe even a petabyte, dwarfing the largest data warehouses of today. Also, standards are required to oil the flow of data, such as globally accepted Electronic Product Codes (EPCs) and data interchange metadata such as the XML-based Physical Markup Language (PML). And there are important privacy and security implications when RFID is embedded in humans that go way beyond concerns about who knows when and where you bought a candy bar.
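To get a feel for the volumes involved, here is a back-of-the-envelope estimate. Every number in it (scanner count, read rate, bytes per read) is an assumption picked for illustration, not a figure from any actual deployment.

```python
def yearly_volume_bytes(scanners: int, reads_per_second: float, bytes_per_read: int) -> float:
    """Raw bytes per year if every scanner reads continuously at the given rate."""
    seconds_per_year = 365 * 24 * 60 * 60
    return scanners * reads_per_second * bytes_per_read * seconds_per_year


# Assumed inputs: 5,000 scanners, 10 reads per second each, 100 bytes per read.
total = yearly_volume_bytes(scanners=5_000, reads_per_second=10, bytes_per_read=100)
terabytes = total / 1e12
print(f"{terabytes:.1f} TB per year")  # prints "157.7 TB per year"
```

Under these assumptions the raw stream runs to well over a hundred terabytes a year, so the petabyte scale mentioned above is reachable within several years or with a larger deployment.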

Yet this isn't stopping vendors such as Manthan Systems and Airgate Technologies, or analytic heavyweights such as SAS, from moving forward with location analytics based on RFID. Manthan is promoting its concept of "sequential behavior analysis," which is concerned with understanding the changes in perspective gained from monitoring an RFID-tagged item through its life cycle and combining this data with other contextual data gathered along the way. Airgate has announced the development of Matrix Analytics, a data mining and data visualization engine, which it says will be optimized for analyzing RFID data in a variety of vertical industries. And according to Jim Davis, senior vice president at SAS, "SAS Retail Intelligence Solutions, like all SAS solutions, are RFID-compliant and can immediately handle all RFID data, from the supply chain to point-of-sale transactions." Clearly, the potential for RFID analytics extends way beyond conventional supply chain boundaries.
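This article does not describe how any of these vendors implement their products, so the following is only a generic sketch of the idea of following a tagged item through its life cycle: group read events by tag, order them by time, and measure how long the item took between locations. The event tuples, tag IDs, and location names are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime

# (tag_id, location, timestamp) read events; sample data is invented.
events = [
    ("tag-A", "factory", datetime(2004, 9, 1, 8, 0)),
    ("tag-A", "distribution center", datetime(2004, 9, 2, 14, 0)),
    ("tag-A", "store backroom", datetime(2004, 9, 4, 9, 0)),
    ("tag-B", "factory", datetime(2004, 9, 1, 9, 0)),
    ("tag-B", "distribution center", datetime(2004, 9, 3, 9, 0)),
]


def transit_hours(read_events):
    """Return {tag_id: [(from_location, to_location, hours), ...]} in time order."""
    by_tag = defaultdict(list)
    for tag_id, location, seen_at in read_events:
        by_tag[tag_id].append((seen_at, location))

    result = {}
    for tag_id, reads in by_tag.items():
        reads.sort()  # order each item's reads by timestamp
        legs = []
        for (t0, loc0), (t1, loc1) in zip(reads, reads[1:]):
            legs.append((loc0, loc1, (t1 - t0).total_seconds() / 3600))
        result[tag_id] = legs
    return result


for tag_id, legs in transit_hours(events).items():
    for origin, destination, hours in legs:
        print(f"{tag_id}: {origin} -> {destination} took {hours:.0f} h")
```

Comparing the same leg across items (here, factory to distribution center took 30 hours for one item and 48 for the other) is the kind of question such a sequence view makes easy to ask.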

Metadata Mining

Just as hardware is driving the new data streams for location analytics, metadata based on Extensible Markup Language (XML) schemas is driving the new area of metadata mining. The impetus is coming from the many initiatives to create XML-based standards for describing industry- or information-specific data streams, which make it easier for systems to exchange and interrogate data without human intervention in the new era of interactive Web services. Metadata mining is about leveraging these accepted metadata "frameworks" as the way to navigate and mine the data streams the metadata describes. So what's important is less the data stream itself, which is often an old type of data packaged in a new way, than the ability to analyze the data for specific purposes through the "lens" of a particular metadata framework.

Metadata is already widely used as an intermediary layer between business analytic front-end applications and back-end data warehouses and data marts. For example, financial services analytics vendor Reveleus provides a so-called "unified metadata" layer to sit between its data mart and a variety of specialist analytic front ends used for customer, performance, and risk analytics. The difference is that these metadata layers are usually proprietary to the vendor. Metadata based on industry-supported XML schema standards opens up the possibility of analytic applications built specifically to mine these "known" data streams.

The eXtensible Business Reporting Language (XBRL) is one example of a metadata schema that can be mined and has the potential to add a whole new dimension to the rather moribund world of financial analytics. By publishing financial reports in XBRL format, that is, with the report context and numeric content "tagged" with XBRL metadata, metadata mining can drive new and improved financial analytic processes that will use Web services to find, download, compare, contrast, and consolidate financial data without the need for a business or financial analyst's intervention. Instead, metadata mining can rely on sets of predefined rules to analyze the data. XBRL metadata mining is likely to be mainly rule-based because individual data items within a specific data package (for example, a financial report) can be identified and contextualized through a known metadata tag defined in the XBRL schema applied to the report.
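A minimal sketch of what rule-based mining over tagged financial data could look like follows. The sample document is a simplified, made-up stand-in for an XBRL instance: the "ex" namespace, the element names, and the net-margin rule are all hypothetical, and a real XBRL report is considerably richer (contexts, units, and taxonomies are defined separately).

```python
import xml.etree.ElementTree as ET

# Simplified, made-up instance document in the spirit of XBRL. The "ex"
# namespace and element names are hypothetical, not a real taxonomy.
SAMPLE = """
<report xmlns:ex="http://example.com/hypothetical-taxonomy">
  <ex:Revenue contextRef="FY2003" unitRef="USD">1200000</ex:Revenue>
  <ex:NetIncome contextRef="FY2003" unitRef="USD">90000</ex:NetIncome>
  <ex:Revenue contextRef="FY2002" unitRef="USD">1000000</ex:Revenue>
  <ex:NetIncome contextRef="FY2002" unitRef="USD">110000</ex:NetIncome>
</report>
"""


def facts_by_context(xml_text):
    """Collect tagged numeric facts as {context: {element_name: value}}."""
    root = ET.fromstring(xml_text)
    facts = {}
    for element in root:
        name = element.tag.split("}")[-1]  # drop the "{namespace}" prefix
        context = element.get("contextRef")
        facts.setdefault(context, {})[name] = float(element.text)
    return facts


def net_margin_fell(facts, current, previous):
    """Predefined rule: flag the report if net margin dropped between periods."""
    def margin(ctx):
        return facts[ctx]["NetIncome"] / facts[ctx]["Revenue"]
    return margin(current) < margin(previous)


facts = facts_by_context(SAMPLE)
print(net_margin_fell(facts, current="FY2003", previous="FY2002"))  # prints True
```

Because each number arrives with a known tag and context, the rule can look up "NetIncome" and "Revenue" by name rather than guessing at a report's layout, which is what makes analysis without an analyst's intervention plausible.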

XBRL remains in the early adopter section of the technology utilization curve, but the income statements, balance sheets, and cash flow statements of more than 12,000 public companies are already available in XBRL format directly from EDGAR Online. Yet XBRL mining for analytic purposes is in its infancy because not all business applications output financial data in XBRL format. The number is steadily rising, however, as many top- and mid-tier ERP systems and popular financial reporting solutions from vendors such as Hyperion and FRx can provide XBRL output. But analytic solutions that can do something really compelling with the XBRL output are thin on the ground.

The Microsoft Office Tool for XBRL provides the ability to download, view, and perform some limited analysis of XBRL documents in Excel and Word. UBmatrix is one vendor moving beyond XBRL taxonomy management and toward XBRL analytics through alliances with service organizations such as Institutional Risk Analytics. And both Fujitsu and DecisionSoft provide tools that make XBRL analysis easier from within other applications (for example, custom-built or legacy apps) by providing extensible API access to XBRL-formatted data. XML-based metadata mining promises a raft of new vertical analytic applications focused on leveraging specific XML schemas. XBRL mining in particular has an immediate potential application in Sarbanes-Oxley compliance management scenarios, which are specifically concerned with the integrity and veracity of the financial reporting process.

Another area where metadata mining is surfacing is in Weblogs (blogs) and RSS feeds. The explosion of blogs and the automatic generation of XML-based RSS newsfeeds to quickly syndicate content from all kinds of Web sites have created a whole new data stream that can be used for analytic applications such as innovation tracking, competitor intelligence, and reputation management, among others. IMN approaches RSS analytics from a marketing perspective. It aims to make RSS feeds bidirectional, so that feed providers can learn what subscribers use the data for and modify the feed to better suit those needs. In contrast, BlogStreet is offering what it calls Blog Post Analytics to search for and analyze blog posts based on keywords from a set of indexed blogs.
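How the products named above work internally is not described here, so the following is only a generic sketch of keyword-based analysis over an RSS feed: parse the feed's XML and count how many item titles mention each keyword. The feed content is invented, and a real application would fetch live feeds and index far more than titles.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# A tiny, invented RSS 2.0 feed; a real application would download feeds instead.
FEED = """<rss version="2.0">
  <channel>
    <title>Example Industry Blog</title>
    <item><title>Retailer pilots RFID on pallets</title></item>
    <item><title>New XBRL filing tool released</title></item>
    <item><title>RFID privacy debate heats up</title></item>
  </channel>
</rss>"""


def keyword_counts(rss_text, keywords):
    """Count how many item titles mention each keyword (case-insensitive)."""
    root = ET.fromstring(rss_text)
    counts = Counter({keyword: 0 for keyword in keywords})
    for title in root.findall("./channel/item/title"):
        text = (title.text or "").lower()
        for keyword in keywords:
            if keyword.lower() in text:
                counts[keyword] += 1
    return counts


counts = keyword_counts(FEED, ["RFID", "XBRL", "GPS"])
print(counts["RFID"], counts["XBRL"], counts["GPS"])  # prints "2 1 0"
```

Tracking such counts across many feeds over time is one simple way to approach the innovation tracking or reputation management uses mentioned above.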

These new data streams, leveraging location data and metadata, have the potential to create many new analytic applications. Location analytics is set to change the way we buy and sell based on where consumers are rather than who they are, to inform decisions about where to locate items within a store or stores within a neighborhood, and to better predict supply and demand chain problems and opportunities. Metadata mining opens up the possibility of a whole new set of vertical analytics focused on specific schema-based data streams that will provide highly targeted mining of the data underneath the metadata. Let's face it, transaction analytics are now the "legacy system" in business performance management. Hasta la vista, baby!

Stewart McKie is an independent consultant and technology writer specializing in analytic, enterprise resource management, and Web services applications. Reach him via his Web site at www.cfoinfo.com.


Resources

  • Byrnes, Jonathan, "Are You Aiming Too Low With RFID?"

  • Airgate Technologies

  • AsItMoves

  • EPC

  • FollowUs

  • Manthan Systems

  • PML

  • SnapTrack

  • VeriChip

