Feldman is one of the great authorities of the field, a computer science professor, author, and co-founder of text-analytics vendor ClearForest. No one is more qualified to suggest text mining's research agenda than he. Indeed, the aim of Feldman and his 2006 SIGKDD co-authors in proposing Data Mining Grand Challenges goes far beyond research. It is to "get researchers, press, funding agencies, venture capitalists, and public interested, greatly stimulate research, and produce dramatic advances in science and technology." This is a worthy vision and goal.
Yet I feel that Feldman's proposed SAT-reading test is incomplete in a number of significant ways:
1) Feldman articulates an Entity Extraction subchallenge of reaching precision of 95+% and recall of 98+%. Yet a far lower F-score - the harmonic mean of precision and recall, used to assess accuracy - could allow a machine to pick the best of five answers in a multiple-choice reading-comprehension test like the ones cited. After all, the answers to questions in these tests are chosen to be findable in the provided reading passages. The tests are designed to be doable.
I wonder if a multiple-choice reading-comprehension test couldn't be passed with moderately sophisticated pattern-matching software - advanced text mining not required!
2) A GRE/GMAT/SAT reading-comprehension test would not test the ability to mine real-world materials such as call-center notes, survey responses, and e-mail, where the syntax may be fractured and ungrammatical, where the spelling is irregular and words are often abbreviated, and where a given source document may contain externalities (what a linguist would call exophora), that is, references that cannot be resolved by examining a single source document. While the last of those points is addressed by Feldman's Autonomous Text Analytics subchallenge, the GRE/GMAT/SAT wouldn't adequately test it.
3) The ability to do Autonomous Text Analytics, scouring the Web and coming up with "truly interesting findings," could actually lead to wrong test responses. That is, the correct answer in a reading-comprehension test is not necessarily the factually true answer, and as we all know, much information to be found on the Web is of dubious accuracy.
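To put rough numbers on point 1, here is a small Python sketch. It computes the balanced F-score (F1, the harmonic mean of precision and recall) for Feldman's stated extraction targets, then shows a deliberately naive word-overlap answer picker of the sort that "moderately sophisticated pattern-matching" might start from. The function names, the toy passage, and the answer choices are all invented for illustration; this is a sketch under those assumptions, not a description of any real system or of an actual test item.

```python
import re


def f1_score(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def tokens(text: str) -> set[str]:
    """Lowercased word set; a crude stand-in for real linguistic analysis."""
    return set(re.findall(r"[a-z']+", text.lower()))


def pick_answer(passage: str, choices: list[str]) -> int:
    """Return the index of the choice sharing the most words with the passage."""
    passage_words = tokens(passage)
    overlaps = [len(tokens(choice) & passage_words) for choice in choices]
    return overlaps.index(max(overlaps))


# Feldman's stated targets: precision 95%, recall 98%.
target_f1 = f1_score(0.95, 0.98)
print(f"F1 at the stated targets: {target_f1:.4f}")

# Hypothetical toy item, made up for this sketch.
passage = "The river flooded the valley after three days of heavy rain."
choices = [
    "A drought dried up the lake.",
    "Heavy rain caused the river to flood the valley.",
    "The town built a new bridge.",
]
best = pick_answer(passage, choices)
print(f"Chosen answer: {choices[best]}")
```

The picker knows nothing about entities, grammar, or meaning. It counts shared words, yet on an item whose answer is findable in the passage, that can be enough.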
The gap described in point 3 could be closed by adding a third, important subchallenge: to create a mechanism for assessing and weighting the correctness of identified responses in order to formulate a single, best response. Mined gold is worth far more after it is refined. Researchers working on question-answering aim to meet this challenge. It's an extension of text mining and merits Grand Challenge inclusion.
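One simple way to picture "assessing and weighting" is a weighted vote over candidate responses. The Python sketch below assumes each mined candidate arrives with two scores between 0 and 1, a source-reliability estimate and an extraction-confidence estimate; both scores, the function name, and the sample data are hypothetical, and real question-answering systems use far richer evidence than this.

```python
from collections import defaultdict


def best_response(candidates):
    """Pick a single best response by weighted voting.

    candidates: iterable of (answer, source_reliability, extraction_confidence)
    tuples. Each candidate's vote is the product of its two scores, and votes
    for the same answer are summed.
    """
    totals = defaultdict(float)
    for answer, reliability, confidence in candidates:
        totals[answer] += reliability * confidence
    # Return the (answer, total_weight) pair with the largest total.
    return max(totals.items(), key=lambda item: item[1])


# Hypothetical mined candidates: answer label, source reliability, confidence.
mined = [
    ("A", 0.9, 0.8),  # reliable source, confident extraction
    ("B", 0.3, 0.9),  # dubious source, confident extraction
    ("A", 0.6, 0.7),
    ("B", 0.2, 0.5),
]
answer, weight = best_response(mined)
print(answer, round(weight, 2))
```

Here answer "A" wins because its support comes from more reliable sources, even though "B" was extracted just as often.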
Feldman cites the Turing Test as, essentially, a generalization of his text-mining Grand Challenge. The Turing Test is a conversation: can a person tell that an interlocutor is a machine? Conversations take place over time. They flow and meander and are sometimes discontinuous. The language usage is fractured, and meaning may depend on context and externalities. And the Turing Test also involves synthesis of responses rather than choosing from among a small set of prepared responses.
Admittedly, response formulation does go beyond text mining into other areas. Nonetheless, I would suggest a fourth subchallenge: the synthesis of responses in accordance with a contextually determined model. Solutions are already part-way there: witness Figure 4 in Zaki's "Mining the Proteome" section of the panel report.
It shows a network of protein-protein interactions, which would be one of several usable presentations of information extracted from scientific literature via text mining. (It's also rather ugly.) I much prefer examples like this one:
"The Wnt signaling pathway describes a complex network of proteins most well known for their roles in embryogenesis and cancer, but also involved in normal physiological processes in adult animals."
Currently, the mode of results delivery is determined manually: the researcher uses software tools to design the presentation. Couldn't software automatically decide appropriate ways to deliver text-mining outputs?
The conclusion I draw is that passing the Turing Test is (still) the best Grand Challenge for text mining, with updates for current-day inputs and outputs of course.
P.S. I've been developing this material for a course, "Integrated Analytics: Text and Data," that I'll be presenting at TDWI's May conference in Boston, and I'm presenting some of the ideas in a white paper I'm writing, "What's Next for Text," for the Text Analytics Summit, also in Boston, June 12-13.
P.P.S. The reference for the Wnt signaling pathway image is D. C. Lie, S. A. Colamarino, H. J. Song, L. Desire, H. Mira, A. Consiglio, E. S. Lein, S. Jessberger, H. Lansford, A. R. Dearie and F. H. Gage (2005) "Wnt signalling regulates adult hippocampal neurogenesis" in Nature Volume 437, pages 1370-1375. Entrez PubMed 16251967.
Ronen Feldman last year posed a grand challenge for text mining to create "systems that will be able to pass standard reading comprehension tests such as SAT, GRE, GMAT etc." The aim is to "get researchers, press, funding agencies, venture capitalists and the public interested [and to] produce dramatic advances in science and technology." It's a worthy vision and goal, but it's incomplete.