In last month's Boston Marathon bombing, the key suspect's name, Tamerlan Tsarnaev, had multiple spellings on U.S. intelligence watch lists. How can we beat this problem?
Law enforcement and intelligence agencies have many tools at their disposal to fight terrorism. But the most basic of mistakes, often the result of human error, can have deadly consequences. A suspected terrorist's name, for instance, may be spelled differently on various watch lists, an error that can make it difficult to identify and track a potentially dangerous individual.
Carl Hoffman, founder and CEO of Basis Technology, an 18-year-old text analytics software company based in Cambridge, Mass., sees this as a serious problem that needs to be addressed right away.
"Watch lists play a very important role in national security. We have an obligation to implement them as accurately as we can, and to build information systems that minimize the chance of human error," Hoffman said in a phone interview with InformationWeek.
He singled out two major watch lists as being particularly problematic. One is issued by the U.S. Treasury Department's Office of Foreign Assets Control (OFAC); the other is the National Counterterrorism Center's Terrorist Identities Datamart Environment, also known as TIDE.
"Both of those watch lists have serious architectural problems -- problems with the way the lists are implemented," said Hoffman, who stressed that he was speaking strictly from a linguistic and technological perspective.
As an example, Hoffman used the case of Nigerian citizen Umar Farouk Abdulmutallab, popularly known as the "underwear bomber." Abdulmutallab was convicted of trying to detonate plastic explosives hidden in his underwear on a Northwest Airlines flight from Amsterdam to Detroit on Christmas Day 2009.
"His name was placed on the TIDE watch list, and his name was present on other lists as well," said Hoffman. "But later when he became a subject of scrutiny, people who were looking for him were unable to find his name."
Why? "He had a very complex name, and an Arabic name," Hoffman recalled. "And when you translate a name from Arabic, as in the case of Abdulmutallab, into our writing system, that's a great opportunity for errors to be introduced."
When investigators and analysts transliterate a name from another writing system into the Latin alphabet, several spellings are usually possible. "And attempting to keep track of all the different spellings of a foreign name can lead to failed queries, and to look-ups being missed ... as definitely happened in the Abdulmutallab case," said Hoffman.
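To make the failure concrete, here is a minimal sketch of why an exact-match look-up misses a variant spelling while a similarity-based look-up can still catch it. It is not how TIDE, OFAC or Basis Technology's software works. It uses Python's standard difflib module as a stand-in for real name-matching technology, and the function names, the 0.8 threshold and the variant spellings are all assumptions made for illustration.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, strip accents, and keep only the letters a-z."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed.lower() if ch.isalpha() and ch.isascii())


def similarity(a: str, b: str) -> float:
    """Return a score from 0.0 (nothing in common) to 1.0 (identical)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def fuzzy_search(query: str, watch_list: list, threshold: float = 0.8) -> list:
    """Return watch-list entries whose spelling is close to the query.

    The 0.8 threshold is an arbitrary illustrative value, not a
    recommendation; a real system would tune it against known data.
    """
    scored = [(similarity(query, entry), entry) for entry in watch_list]
    return [entry for score, entry in sorted(scored, reverse=True) if score >= threshold]


# Hypothetical list and hypothetical variant spelling, for illustration only.
watch_list = ["Umar Farouk Abdulmutallab", "John Smith"]
query = "Omar Faruk Abdul Mutalab"

exact_hit = query in watch_list            # exact matching misses the variant
fuzzy_hits = fuzzy_search(query, watch_list)  # similarity matching finds it
```

Character-level similarity is a crude measure. It knows nothing about how Arabic or Russian sounds map onto Latin letters, which is the kind of linguistic knowledge Hoffman is describing.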
A similar situation may have occurred in last month's Boston Marathon bombing, where the key suspect's name, Tamerlan Tsarnaev, had multiple spellings on U.S. intelligence watch lists.
In January 2012, Tsarnaev's flight reservation for a six-month trip to Dagestan and Chechnya triggered a security alert to U.S. customs authorities, according to an April 24 article in The New York Times.
But his trip didn't set off a similar alert on the TIDE watch list "because the spelling variants of his name and the birth dates entered into the system -- exactly how the Russian government had provided the data months earlier -- were different enough from the correct information to prevent an alert," the Times reported.
Hoffman declined to speculate on the Tsarnaev case, saying it's too early to take a position on an ongoing investigation. He did say, however, that there are smarter methods of placing foreign names on watch lists, such as entering them in their original language, as well as in English.
"If you look at the Treasury Department's OFAC list, the only way a name goes on that list is after it has been translated into our writing system, namely the letters A through Z," said Hoffman.
"You have an opportunity to precisely select the name that you're looking for, even if that name is in Chinese, Arabic, Persian or whatever language. And then your database entry can capture that name, both in its original spelling, and (in) ... English."
The solution, Hoffman believes, is better implementation of software technology that exists today.
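What might a database entry that captures a name "both in its original spelling, and (in) ... English" look like? The sketch below is one possible shape, not a description of any agency's system. The class names, field names and the sample entry, including its second romanization, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class WatchListEntry:
    """One person, stored in the original script plus known romanizations."""
    native_name: str                 # name as written in its own writing system
    script: str                      # e.g. "Cyrillic", "Arabic", "Han"
    romanizations: List[str] = field(default_factory=list)


class WatchList:
    """Index every known form of a name so any of them finds the entry."""

    def __init__(self) -> None:
        self._index: Dict[str, WatchListEntry] = {}

    def add(self, entry: WatchListEntry) -> None:
        # Register the native spelling and each romanization as a look-up key.
        for form in [entry.native_name, *entry.romanizations]:
            self._index[form.casefold()] = entry

    def find(self, name: str) -> Optional[WatchListEntry]:
        # Case-insensitive exact look-up across all registered forms.
        return self._index.get(name.casefold())


# Illustrative entry only; the second romanization is an assumed variant.
watch_list = WatchList()
watch_list.add(
    WatchListEntry(
        native_name="Тамерлан Царнаев",
        script="Cyrillic",
        romanizations=["Tamerlan Tsarnaev", "Tamerlan Tsarnayev"],
    )
)

by_native = watch_list.find("Тамерлан Царнаев")
by_variant = watch_list.find("TAMERLAN TSARNAYEV")
unknown = watch_list.find("Tamerlan Carnaev")  # a spelling nobody registered
```

Keeping the original-script name as the canonical record means a tip that arrives in Russian or Arabic can be matched directly. Romanized spellings become additional keys rather than the only version on file. Unregistered spellings still miss, so a system like this would normally be paired with some form of approximate matching.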
"You can't go blaming the analysts, investigators and cops," he said. "The burden lies with the information architects. And they need to know that it's possible to build watch lists and systems that index and catalog names, and to do it in a way that is multilingual."
He added: "Is the terrorist supposed to spell his name correctly when he's purchasing his plane ticket? Are investigators who've never heard the name of the guy, or who are hearing a tip over a telephone, supposed to know to spell a name like Abdulmutallab or Tsarnaev?"