March 12, 2001
Microsoft Research: Who Benefits?
Microsoft has superstars in its research lab, but the company's developers grumble behind their backs about the value of their contributions. Whom does Microsoft Research serve?
By Stuart J. Johnston
Today, the perception that Microsoft is incapable of innovation is slowly changing. But critics still decry the company's research efforts as unimpressive, particularly in light of those by Xerox Corp.'s fabled Palo Alto Research Center, the yardstick by which all other computer-science research labs have come to be measured. Instead, much of the attention given to Microsoft Research's efforts has focused on its flamboyant founder, Nathan Myhrvold, a genius who graduated from high school at 14, received multiple degrees--including a Ph.D. in theoretical physics--by 23, and did post-doctoral work with renowned cosmologist Stephen Hawking before co-founding a tiny Bay Area software company that was acquired by Microsoft in 1987.
Myhrvold left Microsoft almost two years ago to pursue other interests, including starting a technology investment firm with a former Microsoft colleague, Edward Jung, also a Ph.D. By the time he left, Myhrvold had brought into Microsoft Research some of the most brilliant minds in computer science. In its first two or three years, Microsoft Research had a hard time attracting top-tier talent, but after a few key hires, the research lab's roster began to grow dramatically. Now, there are more than 600 researchers in four labs around the world: Redmond, Wash.; San Francisco; Beijing; and Cambridge, England. The lab employs a veritable who's who of computer-science researchers, including a core group of the original Xerox PARC researchers: Gary Starkweather, who keeps one of the original patents for the laser printer on his office wall; Chuck Thacker, who engineered the Xerox Alto computer; and Butler Lampson, progenitor of bit-mapped fonts. It also employs Gordon Bell, the engineer who designed the Digital Equipment VAX architecture, and Jim Gray, who invented a key database technology known as two-phase commit.
Microsoft Research is overseen by Rick Rashid, a former professor from Carnegie Mellon University who's credited with designing the Mach operating system kernel--ironically, the underlying technology in Apple Computer's new OS X operating system. An old college buddy of Rashid's, Dan Ling, himself a well-recognized researcher who was recruited from IBM's Thomas J. Watson Research Center, is in charge of the day-to-day running of the labs themselves.
A critic might ask--and not without justification--why, if Microsoft Research is so blessed with talent, nothing as innovative as the PC has exploded from its labs after 10 years of work. Myhrvold, Rashid, Ling, and other Microsoft Research managers point out that a research organization serves a number of purposes besides inventing "the next big thing." Researchers track technological developments and help keep company executives up to speed on where technology is going. In addition, a number of key technologies developed by Microsoft Research have incrementally improved Microsoft products.
One of the most recent is a product code-named Tahoe. It's a search engine that will debut later this year as part of Microsoft's SharePoint Portal Server. The development of one key element of Tahoe exemplifies the internal politics, the application of arcane knowledge, and the interaction of personalities that constitute the workings of a cutting-edge computer-research organization.
Susan Dumais joined Microsoft Research in 1997, but she has spent 20 years trying to figure out how people search for information. After earning a Ph.D. in cognitive psychology, she worked for more than 17 years at what was originally AT&T's famed Bell Labs, later renamed Bellcore, and finally renamed Telcordia. She's also the chairwoman of the Association For Computing Machinery's Special Interest Group On Information Retrieval.
Dumais' involvement in Tahoe began three years ago, when she says she got an E-mail from one of Microsoft's product groups interested in finding out whether her work, and that of her colleagues, could be harnessed to improve text retrieval in the company's Site Server/Index Server product. The colleagues in question were David Heckerman and John Platt.
David Heckerman is a doctor--the medical kind--but readily admits he studied medicine simply to learn about the human brain. He and a friend, Eric Horvitz, earned their medical degrees as well as Ph.D.s in computer science with specialties in machine learning and expert systems. In the mid-1980s, Heckerman and Horvitz co-founded a company, Knowledge Industries, that developed expert systems. Myhrvold managed to entice them, as well as another key artificial-intelligence researcher at the company, to join Microsoft Research in the lab's early days. Heckerman now manages Microsoft's machine-learning and applied-statistics group and has an abiding interest in how our intelligence functions.
John Platt is a senior researcher in Microsoft Research's signal-processing group. Platt, who holds a Ph.D. in computer science with an emphasis in graphics and an area of artificial intelligence called neural networks, was also 14 when he graduated from high school. While a student at the California Institute of Technology taking a class from astronomer Gene Shoemaker, Platt discovered two asteroids, one of which he named for his father.
Together, Dumais, Heckerman, and Platt helped develop the Tahoe search engine. Tahoe also owes some of its genesis to a Microsoft product group located in Israel, as well as to a technology known as Okapi that was developed by a Microsoft researcher in the company's Cambridge lab. Tahoe combines work done by Dumais on defining the "context" of a user's text-search query with Heckerman's insights into how to imbue a computer with a certain amount of machine intelligence for figuring out, statistically speaking, which possible results a user would actually find helpful. "In the latter half of 1997, several groups, including the Tahoe [product] group, requested this text-classification feature," Heckerman says.
Both Dumais and Heckerman agree that a key algorithm worked out by Platt enabled them to dramatically speed up the processing as well as increase the accuracy of the search engine's results. Platt suggested that the engine use a particular type of evaluation system known as a "support-vector machine," which is known to be highly accurate but also extremely slow. So Platt developed an algorithm that sped up processing. Platt also wrote the core kernel code for the engine. "Within six months, we had something up and running," Dumais says. All in all, Dumais and Heckerman each devoted about three months' worth of work to the project; Platt worked on it a bit longer.
Tahoe can be "trained" to recognize and categorize relationships among words by churning through a batch of text documents. An administrator or a knowledge expert can then weight keywords and categories, from most to least important, for use in future searches. When the Microsoft researchers had finished their work, the Tahoe engine could digest and process as many as 30,000 text documents--which could represent, say, a Reuters news feed--in 120 categories in "under two CPU minutes," Dumais says.
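For readers curious what "training" a text classifier looks like in practice, here is a minimal sketch in Python. It is not Microsoft's code and not Platt's speed-up algorithm, neither of which is described here in enough detail to reproduce. It assumes a simple linear support-vector machine over word counts, trained with plain subgradient descent on a hinge loss, and the four sample documents, the two category names, and every function name are hypothetical.

```python
from collections import Counter

# Hypothetical toy training set. Labels: +1 = "finance", -1 = "sports".
TRAINING_DOCS = [
    ("stocks fell as markets closed lower", +1),
    ("bank shares rallied on earnings", +1),
    ("team wins match in final minute", -1),
    ("striker scores goal as team wins", -1),
]


def featurize(text):
    """Turn a document into bag-of-words term counts."""
    return Counter(text.lower().split())


def score(weights, text):
    """Linear decision value: sum of learned word weights times word counts."""
    return sum(weights.get(word, 0.0) * count
               for word, count in featurize(text).items())


def train_linear_svm(examples, epochs=20, learning_rate=0.1, reg=0.01):
    """Fit word weights by subgradient descent on the regularized hinge loss.

    This is an illustrative stand-in, not the faster algorithm the
    article attributes to Platt.
    """
    weights = {}
    for _ in range(epochs):
        for text, label in examples:
            margin = label * score(weights, text)
            # L2 regularization: shrink every weight slightly on each step.
            for word in weights:
                weights[word] *= 1.0 - learning_rate * reg
            # Hinge loss: adjust weights only when the margin is too small.
            if margin < 1.0:
                for word, count in featurize(text).items():
                    weights[word] = (weights.get(word, 0.0)
                                     + learning_rate * label * count)
    return weights


def classify(weights, text):
    """Assign one of the two hypothetical categories."""
    return "finance" if score(weights, text) > 0 else "sports"


weights = train_linear_svm(TRAINING_DOCS)
print(classify(weights, "markets and bank earnings"))
print(classify(weights, "team scores goal"))
```

A system that sorts documents into 120 categories could, under the same assumptions, train one such two-way classifier per category and pick the highest-scoring one; whether Tahoe does so is not stated here.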
Beta testers say the search engine is fast and accurate. "We initially put [the beta of SharePoint Portal Server] up for the developers I work with," says Peter Sprague, a software engineer at Knosys Inc., a Boise, Idaho, vendor of online analytical processing software. Sprague took on testing and administration of the portal in lieu of the company's IT department because the plan was to use it only within his research and development organization. But it wasn't long before other groups at the 100-person company wanted access.
Despite the portal server's beta status, groups all over Knosys have been using it for the past several weeks. The SharePoint Portal Server can access data that the research and development group uses to track bugs and product milestones; the data is stored in disparate SQL Server databases. "We can see a chart of live data showing how many bugs each developer has, and we can drill down into it, which is really handy," Sprague says. Also, marketing can get to the information to see whether the developers are on track to meet delivery schedules. About a quarter of the company's employees, working in R&D, marketing, quality assurance, and customer support, are using the system.
That positive response is echoed by an internal user at Microsoft's information services group, which is part of Microsoft's corporate library service. "We're indexing 2.8 million documents in company Web sites, file shares, Exchange public folders, and SQL Server databases," says Alex Wade, manager of the knowledge access group within information services. Wade works on Microsoft's intranet, known as MS Web. Internally, the company has been using "release candidate 1"--a nearly releasable version--of SharePoint Portal Server since the end of January. "What we're providing is one-stop shopping for information across the corporate intranet," Wade says, which includes news, market research, company events, and the library's catalog.
How important is the Tahoe engine's context-classification capability in the scheme of things? Not very, perhaps. But it advances the functionality of search engines significantly, and the fact is, most computer-science research never goes beyond the research stage. That may be due to the differences between the goals of researchers and the goals of software engineers.
On the product side of a company, there's a certain amount of distrust, doubt, and resentment of researchers. Generally, code written by researchers isn't bulletproof; it's meant to demonstrate whether or not a concept works. Engineers want code that's tested and reliable. Often, after a research group has successfully transferred version 1.0 of a technology to a product group, the development engineers will have to expend time and resources testing and correcting the code so it's rock solid.
Tahoe notwithstanding, Microsoft has leaned less on its researchers than some companies. That may be why some Microsoft product developers take a dim view of the idea that such world-class researchers aren't directly at their beck and call, even referring behind their backs to time spent at Microsoft Research as "rest and vest." Some developers have coined a term for Microsoft Research's lofty ambitions and lack of total focus on improving the company's products: "Where the rubber meets the sky."
But for a company such as Microsoft, research proves valuable in many ways. Microsoft Research has generated hundreds of patents, and those patents provide both a cache of intellectual property that can be used in future products and currency for future business deals. Not inconsequentially, patents can also serve as a backstop in court claims.
Microsoft can be faulted for many things, but not for thinking small. Historically, research tends to move an entire field of study forward. That point wasn't lost on Gates and Myhrvold when they had the opportunity to do something about it.
Much of the historical material in this article was gathered in partnership by Donald L. Barker and Stuart J. Johnston as part of a book project that currently is on hiatus.
Photos by Gregg Snodgrass