Microsoft officially entered the market for scientific computing Nov. 15 with a speech by chairman Bill Gates at the SC05 supercomputing conference in Seattle. InformationWeek editor-at-large Aaron Ricadela sat down with Gates after his speech to talk about new collaborations between scientists and Microsoft's researchers, the expanding market for supercomputing, and the possibility of recruiting a new, computer-savvy class of science graduate.
InformationWeek: Microsoft chief technical officer Craig Mundie published an article this month in which he talked about positioning Microsoft Research where it hasn't been historically: on broad, societal problems outside of computer science. How can Microsoft Research technology or the intellect of those people be applied to these broad problems in science, medicine, or engineering?
Gates: Microsoft Research has always had a pretty broad set of activities. We're growing Microsoft Research activities faster than the company as a whole because of the great results we've had. It's both growing the individual research centers we have, and then this year we added our fourth center, which is the one in India. Some of the people like [Eric] Horvitz and [David] Heckerman who came to Microsoft Research came--they're MDs, and they're machine learning experts. There's a technique, a Bayesian [statistical] technique, in which Heckerman and Horvitz are two of the leading people. When those guys came, we were always interested in applying machine learning to see what drugs work, and what lifestyles work, things like that. And they applied their things even to big data mining problems in business, where you say, "OK, which are my most profitable customers, or what promotion techniques are working well?" They've taken some of their techniques against clickstreams to figure out how you should design the Web, or how searches work. Search is an amazing example where we relied somewhat on an outside company, Inktomi, which Yahoo bought, then decided to build our own search effort essentially from scratch. Now, in a very short period of time, we will actually have more than matched the kind of relevance that Google can deliver. The role of Microsoft Research in that has been phenomenal.
InformationWeek: Are there areas outside of computer science where Microsoft Research intellect might be applied?
Gates: Yeah. But, OK, an important point about this--it's not so much about saying "Let's just work on some other problem." It's that software is needed. So all these genetic algorithms, like we're using for the AIDS vaccine [project], we invented those, those are software techniques. We're seeing fields of science that have so much data that without our ability to data mine and [manage] work flow and visualize, they can't make progress. The Sky Server example is sort of typical. In astronomy, historically, you wanted to be lucky enough to be gazing at the stars on a night when something interesting happened, and then you wrote a paper about quasars or something. Today, there are thousands of observation points around the world at different locations, at different wavelengths, different resolutions. There are a couple of satellites--lots of things up in the sky. And if you, as an astronomer, want to say, "Well, galaxies cluster like this, or these light sources work like this"--in order to test that hypothesis, there are thousands of databases in different formats that you have to pull data out of and look at and see if they're consistent with your hypothesis. What [Microsoft researcher] Jim [Gray] did is he got the astronomers together to see how you could use Web services to create essentially what we call Sky Server, one logical database. It doesn't mean all the data has to be copied into one place, but you can query it, and it goes out and pulls in the right information. That was a smashing success, but it was based on Jim's view that there's so much data in the sciences that without the kind of software management that we have, both in our products and in our research, that they won't be able to make the rapid advances that they should.
Nowhere is that more true than in biology, life sciences, where you're just gathering so much data. The ability to connect these data sources together using our very state-of-the-art Web service and visualization approaches is pretty exciting. So it's not like we woke up one day and said, "Oh, let's work on some non-software problems." It's like if you noticed that engineers were using math, and you said to the mathematicians, "When did you decide to help these poor engineers?" The mathematician would say, "No way we did." The engineers figured out that the only way to describe the ideas of material strength, and crystal fracturing, and all of these very complicated things, was [through] very deep mathematical techniques. Well, now, mathematics alone isn't enough. You need software that deals with vast amounts of data.
But, you're right that there has been a thing where our outreach to these scientists in these other fields to say, "Hey, here's what we're doing," is stronger. This is particularly true in Europe where there is a lot of good science. Some pure computer science, but, say, less than in the United States. Software is becoming important to them, to their productivity. And when you look at a lot of these poor scientists who are writing low-level code, and transferring data by hand, re-entering the data--I just painted the positive vision [of the] future [in my speech]. If I had more time and if somebody was willing, I would have shown an example of how poor it is today that scientists have all this data, but can't really bring it together and get insights into it. They spend a lot of their time, not thinking deep scientific thoughts, but rather re-entering data and writing code that they shouldn't have to.
When we say "science," think about people designing cars, think about people designing planes, think about people thinking through the design of a Web site. It's not just new medicines, although that alone would justify all this work. It's not just modeling the environment, although that alone is a supercritical thing that we absolutely need to do, and advanced computer software will play a key role there. It's sort of the digitization of the world applied to science and business and commerce.
InformationWeek: Is there anything more formal that you're doing in terms of engaging these researchers with outside domains? Jim Gray's work seems like it came largely through his own initiative. Is that generally the case, or is there some way you can formalize these collaborations?
Gates: Well, there are two ways to look at this. One thing that Microsoft Research has done a brilliant job at is we have super good relationships with the top computer science departments around the world. If you go to the computer science departments and say, "OK, list the companies that you work with in a collaborative way," they'll uniformly list Microsoft and talk about the breadth of things we do. When we do our faculty summits, we get some of those people to come in. Every one of those departments is doing interdisciplinary work. It's a super important thing.
At MIT, they're really working on life sciences, and some of the other groups are doing a lot of robotics work. It's hard to say if you consider that all within computer science or whether it's multidisciplinary. [Microsoft researcher] Butler Lampson I think four years ago said one of the goals of Microsoft Research should be that nobody ever dies in a car wreck ever again. That sounds like, "Wow what is that?" That's brilliant software--that's vision software, that's acquisition software, sensor data. It's a very good goal, because software should be able to achieve that. The sensors will be super, super cheap, and the benefits will be very dramatic. So we've often taken goals that are kind of wild like that. All these robotics challenges pull you into the vision, modeling, [machine] learning type of things. It's through these university relationships where we say to the university, "OK, what is your problem, how could software work with that?" Then we have individuals who themselves are multidisciplinary.
The guys in the Cambridge [U.K.] lab have probably done the most of it. But again, it's about software, and the solutions that come out of these software things are valuable. If you can, for example, let somebody submit a job onto the Internet and find the cheapest place to run it, that's not just interesting for scientific cluster computing, that's interesting for business computing. Say I just happen to have a big analysis that I rarely do, or say that I'm in a disaster-recovery situation, where I'm trying to submit work out to be done remotely that would usually be done internally. These techniques of describing these resources, and letting things be visualized--we'll get plenty of benefits from these advances in the business realm. When we hired Heckerman and Horvitz, we were saying machine learning is this great thing for all the things we do.
InformationWeek: When we met back in September, we were talking about the shortage of computer science graduates in the United States, and that if that trend holds up, what it might mean for Microsoft years down the road. This trend of computing becoming integral to scientific advances, and even, as you pointed out, to the curriculum, is producing college students or graduate students who are in these scientific disciplines but are also quite skilled programmers. Has this outreach that you're doing to the high-performance computing community helped you attract a different class of candidate to Microsoft?
Gates: The issue about the shortage really is not going to be about Microsoft, because we have the most interesting jobs in software, and we can pay people super well. The shortage is more about the field as a whole. I mean, our customers need people who understand computer science. That's 90% of the thing. If anything, you could actually say the fact that we need software understanding to advance the sciences means the shortage is all the more acute, because you need people sitting in these computer science classes that then go off and really focus on life sciences, and focus on environmental sciences. As software is becoming this key thing, in the way that math was historically, you say, "Wow, how are we going to deal with all these people signing up for these computer science courses?" Well, quite the opposite, computer science departments have the problem that they can't keep their researchers. Part of the economic equation was that they were the teaching assistants for the computer science classes. And as those have gone down in size, that hurt.
So the world at large needs more scientific understanding. The fact that these understandings allow us to make advances in medicine, and understanding economics, and the environment should make the field all the more attractive. If you're a kid, a young, supersmart kid who says, "Hey, global warming, I want to contribute to that," boy, you'd better learn about data modeling and software, if you want to make a contribution there. You're going to take a bunch of software courses, as well as atmospheric science type courses. So it just shows we need to do a better job of painting the picture of the kind of opportunity and impact you can have, so that instead of such a high percentage of people going to, say, hedge funds or something, they'll come and help global warming, and a new AIDS vaccine, and things like that.
InformationWeek: What would you say the percentage of hires today in the research and development part of Microsoft is from non-computer science or non-computer engineering disciplines? Do you see that proportion changing over the next few years?
Gates: Well, you're never going to be very statistical about that, because the very brightest people often study in multiple fields. I never took any of the computer science classes at Harvard that most people take, because I'd had exposure to computers before I got there. I never went near the people; I was like, "Hey, I already did that stuff years ago." So I was taking physiological psychology and economics. I never got a degree, but if you look at my course sign-up, you wouldn't think I was a software person at all. And often there are very great people who want to look at these different areas. Nathan Myhrvold, who founded our research group, was more of a physicist, but he's a brilliant software person. Now he's like a patent lawyer. But he's multifaceted.
We hire tons of people like Horvitz and Heckerman. In fact, we're always on the lookout for somebody who loves software but knows it so well they're seeing how it can be applied in different ways. In an interview process, it's one of the best things to ask somebody about some problem that they're working on that they're passionate about, to see their depth of understanding and how they go about it, rather than asking them some very specific questions. You take the area where they have let themselves put a lot into it, whether it's a big computer problem, or some problem in the sciences that they think software can apply to. More and more, because personal computing is available and they have these departments, the majority of our people who actually come in and write code have some type of computer science degree. If you go back historically, a lot of computer schools didn't even have a computer science degree. In fact, it's still kind of confusing, is it the department of engineering, or is it mathematics--where does it all fit? I'd have to ask Jim Gray, but did he ever take an astronomy course, or did he just start reading the books? I never thought of him as an astronomer, I thought of him as a database guru pretty much. But supercapable people are often like that.
InformationWeek: Maybe I can shift gears a little bit and talk about coming computing trends. To what extent is this high-performance computing development and outreach a play for the technical market, and how would you balance that in importance compared with this world in which multiple cores on a chip will be important to getting advances in mainstream computing power?
Gates: Those two things really go together. There's the ability to use lots of computers at different levels of granularity. Inside the very microprocessor itself, we're going to have more and more cores. We're at two to four right now, and seven years from now we'll be more at the 16 to 64 level of cores, often with many threads. So a lot of these techniques that come from supercomputing will be applied, certainly in the server, but even in a desktop-type machine. We got into technical computing because it's a big enough field for us, in terms of the software opportunity, to justify that alone. It's very exciting for us, because a lot of the issues around automatic management and developing software that can be more easily parallelized come along with it. But there are plenty of servers out there being used for compute clusters to make that a worthy business focus for us.
InformationWeek: What new kinds of applications or ways of working do you think multicore computing would enable, and how will Microsoft position itself for that coming world?
Gates: On the client, of course, a lot of it is about natural interface. It's about speech, it's about vision, it's about very advanced search, where things that you might be interested in are brought to you without your having to do a lot of work. It's about filtering, making sure that your time, which is a scarce resource, is used in the best way. It's about reliability, it's about security. A lot of the extra compute power we get maps into those issues. I could have listed multimedia, but we're quickly getting to the point where even doing great, high-definition rendering of multimedia things computationally is pretty straightforward. The tougher areas are things like getting speech accuracy up. A lot of these security algorithms are very compute intensive. Data mining to find deep insight. So we'll be able to use all the extra power. It would be a little simpler if that power was brought to us just with a higher clock speed, because then you don't have to parallelize the software as much. There's the famous Amdahl's Law, which says that if you parallelize, say, half of your work, you can only get a factor of two speed-up, even if you have hundreds of processors there. It's a very tough problem to parallelize more and more and more and more.
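[Editor's note: Amdahl's Law is usually written as speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can be parallelized and n is the number of processors. The short Python sketch below illustrates the half-parallel case Gates describes; the function and parameter names are chosen here for illustration and do not come from the interview.]

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Upper bound on speedup when only part of a job can run in parallel.

    parallel_fraction: share of the work (0.0 to 1.0) that can be split up.
    processors: number of processors working on the parallel share.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial share takes the same time no matter how many processors
    # are added; only the parallel share shrinks.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Half of the work parallelized, as in the example in the interview.
    for n in (2, 16, 64, 1000):
        print(f"{n:>5} processors: speedup {amdahl_speedup(0.5, n):.3f}")
```

[With p = 0.5 the serial half alone caps the result: the speedup approaches 2 as processors are added but never reaches it.]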
People like Craig Mundie actually came to us 12 years ago from the supercomputing industry, where they worked on these problems. In a certain sense you can say it's been a Holy Grail of computer science to say, "Can you make one machine with power N, take the software, run it on N machines with power 1 and do the same thing?" The answer is, in general, no you can't, but can you for an interesting class of problems? There are the ones that are called embarrassingly parallel and we've got a lot of the brilliant people who have thought about this for a long time working for us.
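[Editor's note: an "embarrassingly parallel" problem is one that splits into pieces needing no communication with one another, so the pieces can be handed to separate workers and the results simply combined. The Python sketch below is an illustration only: the prime-counting task and all function names are assumptions made for this note, not anything described in the interview. It uses a thread pool to keep the example short; for CPU-bound work in Python, a process pool or separate machines would be the more realistic choice.]

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(bounds):
    """Count primes in [lo, hi). Each range is independent of the others."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        # Trial division up to the square root of n.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def split_range(lo, hi, parts):
    """Split [lo, hi) into at most `parts` contiguous chunks covering it."""
    if hi <= lo or parts < 1:
        return []
    step = (hi - lo + parts - 1) // parts  # ceiling division, at least 1
    return [(start, min(start + step, hi)) for start in range(lo, hi, step)]


def parallel_count_primes(lo, hi, workers=4):
    """Hand one chunk to each worker, then add up the partial counts."""
    chunks = split_range(lo, hi, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_primes, chunks))


if __name__ == "__main__":
    print(parallel_count_primes(0, 100_000, workers=8))
```

[No chunk depends on another chunk's result, which is what makes the split trivial. Most software does not decompose this cleanly, which is the general difficulty Gates points to.]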
InformationWeek: At the risk of overgeneralizing, how would you characterize the kind of software problems that are part of the Live software effort that [Microsoft chief technical officer] Ray Ozzie is managing? Where do those fall on the spectrum?
Gates: It's not really some direct connection. Certainly, when you use Live there will be some services up in the Internet like storing your files on the Internet, being able to back those up over the Internet, being able to have documents translated for you with a service on the Internet. A lot of the Internet services we'll provide will use thousands of servers that we'll manage using these advanced techniques. Microsoft search, the search cluster, uses what we call our Dynamic Systems Initiative, software that's designed that way to manage thousands of machines. We have less than one operator per thousand servers on our search cluster. Because a lot of those Live back-ends are so gigantic in what they do, there's some of the technology that comes into play there. Live is a very user-centric thing. It's you as the user, as you move between your devices, your information showing up automatically. It's about neat new ways of communicating. So the user won't say, "Wow, parallel computing." They'll just see the benefits. But on the back end are some of the more advanced things we're using. Eventually, Live will use vision and speech, and ink. And so even down on your tablet device, you'll have a multicore thing that you won't have to think about, but it's splitting the problem into many threads.
InformationWeek: In what areas does Microsoft's investment in 10 new university "institutes for high-performance computing" try to draw on intellect outside of Microsoft?
Gates: The 10 institutes are really more evolutionary. We've always had the deep relationship with Cornell Theory Center. We've always had great relationships with the different universities. And, now, those people will literally have Windows computer clusters among the things that they offer, and they'll be telling us where those are working better, where they're not, what we should do with the product. It's to help us evolve the product very rapidly.
InformationWeek: Will Microsoft staffers with experience in high-performance computing, people like Craig Mundie, Jim Gray, Gordon Bell, take on new responsibilities as you're trying to get traction in the market?
Gates: We've always had some of the pioneers of computer hardware working at Microsoft. Gordon Bell's been with us over a decade. Butler Lampson has been with us over a decade. Dave Cutler has been with us over a decade. Craig of course came over a decade ago. The only thing that's fascinating is even though we're really just a software company, we influence the direction of hardware, and we need to understand the direction of hardware. So a lot of these people are our key liaisons with Intel. Craig Mundie sits down with [Intel senior VP] Pat Gelsinger and talks about how we need to work together with them, that they're doing the chip in the way we like, [whether] we're taking advantage of whatever they're doing in the chip. So that's a great two-way dialogue. The stronger [the] people we have, [the stronger] that relationship with the hardware industry. It can be as complex as a new chip design with Intel or as simple as a product from Dell where you just buy that personal supercomputer, and boom, you just plug into the Ethernet, plug in the screen and you're done, you're ready. They've preinstalled the software and done it exactly the right way.
So, Craig works across the company very effectively when we have an initiative like this. Microsoft Office is doing things for data visualization, SQL [Server] is doing things for data storage. Windows itself is making this super, super inexpensive cluster edition that academics will actually get for free. We have had to have product initiatives in many different groups to pull all these pieces together.
InformationWeek: I want to ask you about your own role in the computer industry, and your thinking about that. Do you ever see a day when you might still be very active in the computer industry but perhaps without a day-to-day executive role at Microsoft; maybe still doing things that would accrue benefit to the company, but spending more of your time elsewhere?
Gates: Well, my lifetime's work is Microsoft, and I'll always be involved in Microsoft. Today I have kind of this great job, called chief software architect. And that's worked out super well. This week, I came in from a think week, which is very rare for me. I don't usually interrupt think weeks. But since we were reaching out to include these people, it was just too good of an opportunity. I'll go back out to my think week, literally, as soon as we're done. I get to spend a super-high percentage of my time on product and product strategy, and I really love doing that. Someday, in the years ahead, somebody else will get essentially the hot seat, and I'll still be involved in Microsoft and get to make a contribution. Someday I may not work as long hours as I do today. It's an important job, and it's the company I'll always give my time to. I do spend time on my foundation, but it's full-time Microsoft, part-time foundation. Someday that will probably switch. But nothing imminent there.