Facebook Translator Available As Open Source Code

Thrift helps keep Facebook's many moving parts working together and keeps outside developers producing new applications in the language that is best for the task.

Charles Babcock, Editor at Large, Cloud

July 10, 2008

One of the secrets of Facebook's successful operation for millions of visitors a day is an in-house code translator that lets a system developed in one language talk to a system developed in another -- typically a major stumbling block, or at least performance degrader, on many sites.

This translator, dubbed Thrift, helps keep Facebook's many moving parts working together and keeps both inside and outside developers producing new Facebook applications in the language that is best for the task, another gain for the site.

A year ago, Facebook announced it was making its Thrift translator open source code and establishing a project around it. Continued work on Thrift by Facebook developers and developers at other firms prompted the Apache Software Foundation last month to take Thrift in as an incubator project, often a preliminary step to becoming a fully fledged Apache project.

Thrift was needed in part because Facebook has a culture of letting engineers choose the best tool for the job, rather than restricting developers to a handful of possibilities, said Aditya Agarwal, director of engineering and co-author of a paper on Thrift with technical leads Mark Slee and Marc Kwiatkowski.

Thrift doesn't translate everything from one language to another, the way a U.N. translator renders a Russian speech in English. Rather, it makes sure the data types of one language can be translated into those of another, and it provides a library of remote procedure calls so that one system may call for a service from another system.

"Internally, we use Thrift everywhere. It's an instant messaging service (between systems). We don't have to worry about how data will be exchanged when it's on different servers," said Agarwal.

In effect, Facebook has come up with a way to neutralize system differences, sidestepping the need to create an immense library of point-to-point connectors or a universal code translator -- the latter still a figment of the programmer's imagination.

Instead, Facebook has come up with "a language neutral" way to define the data an application will use. When called on to share data, the application turns to Thrift to cast it into a form that the receiving application will recognize. Facebook also found a neutral way to implement remote procedure calls, so that a call for assistance will reach a system written in a different language and located on a different server. The Thrift Interface Definition Language (IDL) is a neutral third party that lets developers label their data structures with a minimal amount of information. The Thrift code generator then reads the IDL annotations and generates the code to move the data to a different system, explained Agarwal.
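The idea of a language-neutral data definition can be illustrated with a small Python sketch. This is a toy under stated assumptions, not Thrift's actual IDL, generated code, or wire format: the schema layout, the `encode` and `decode` helpers, and the byte format are all hypothetical. It shows only the general principle that two programs sharing a schema of numbered, typed fields can exchange data without sharing a programming language.

```python
import struct

# Hypothetical schema: field id -> (field name, type tag).
# This is NOT Thrift's real IDL; it only mimics the idea of numbered, typed fields.
USER_SCHEMA = {
    1: ("uid", "i64"),
    2: ("name", "string"),
}


def encode(schema, record):
    """Serialize a dict to bytes using only the schema's field ids and type tags."""
    out = bytearray()
    for field_id, (name, type_tag) in sorted(schema.items()):
        value = record[name]
        if type_tag == "i64":
            payload = struct.pack(">q", value)  # 8-byte big-endian signed integer
        elif type_tag == "string":
            payload = value.encode("utf-8")
        else:
            raise ValueError(f"unsupported type tag: {type_tag}")
        # Header: 2-byte field id and 4-byte payload length, both big-endian.
        out += struct.pack(">HI", field_id, len(payload)) + payload
    return bytes(out)


def decode(schema, data):
    """Rebuild a dict from bytes; fields missing from the schema are skipped."""
    record = {}
    offset = 0
    while offset < len(data):
        field_id, length = struct.unpack_from(">HI", data, offset)
        offset += 6  # size of the header written by encode()
        payload = data[offset:offset + length]
        offset += length
        if field_id not in schema:
            continue  # a reader with an older schema ignores unknown fields
        name, type_tag = schema[field_id]
        if type_tag == "i64":
            record[name] = struct.unpack(">q", payload)[0]
        elif type_tag == "string":
            record[name] = payload.decode("utf-8")
        else:
            raise ValueError(f"unsupported type tag: {type_tag}")
    return record


if __name__ == "__main__":
    wire = encode(USER_SCHEMA, {"uid": 42, "name": "John Smith"})
    print(decode(USER_SCHEMA, wire))
```

Because the bytes depend only on the schema, a reader written in any other language could decode them with an equivalent routine. In Thrift, by the article's account, that routine is what the code generator produces from the IDL.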

"We allow interlanguage remote procedure calls, and the developer doesn't have to worry about all the coding requirements," he said in an interview.

Thrift can translate between applications written in C++, Java, PHP, Python, Perl, C#, Objective C, Smalltalk, Ruby, and several esoteric, specialized languages used in search and other missions, including Haskell, OCaml, and Erlang. Non-Facebook developers contributed the Thrift IDL and other translation work that brought C#, Perl, and Ruby into the fold. "A really good community has formed around Thrift. It's good to see the power of open source," said Agarwal.

Another Facebook technical lead, David Reiss, took that thought a step further in his June 12 blog on Thrift, citing contributions from developers at Powerset, imeem, Evernote, and Amie Street.

Mark Slee, Agarwal's co-author on the paper "Thrift: Scalable Cross-Language Services Implementation," said in launching Thrift as open source a year ago that Facebook's move was in contrast to other online giants. "Many large corporations are famous for keeping this type of code under lock and key," he wrote in a blog. That was an unsubtle swipe at Google, known to make use of large amounts of open source code while sharing only a few parts back with the communities that produce it.

Thrift is part of Facebook's effort to both appear and act more open. One of the cards that it plays as an online social networking site is the strength of its search engine versus more general-purpose ones, such as Yahoo's or Google's. Agarwal is an architect of Facebook's ability to search for someone you think you know and come up with the right answer.

While Google, Yahoo, and other search engines can retrieve and cache the results of the most popular searches, speeding their average response times, Facebook has no such option. At Facebook, every search is an individual one, trying to connect an individual to someone they know. To do so, Agarwal says, it weights "the social graph" indicators highly, such as whether the person being sought is a "friend" of the searcher, whether the person graduated from the same school as the searcher, and whether the person works at the same company or lives in the same city as the searcher. Finding the "John Smiths" who meet such criteria is more likely to yield the right one than simply finding all John Smiths in the index.
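The weighting described above can be sketched in Python as a simple scoring function. This is an illustration only, not Facebook's search implementation: the `Profile` fields, the numeric weights, and the `rank` helper are all assumptions made for the example. The article says only that such signals are weighted highly, not how.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    school: str
    employer: str
    city: str


# Hypothetical weights; the article does not give actual values.
WEIGHTS = {"friend": 5.0, "school": 2.0, "employer": 2.0, "city": 1.0}


def social_score(searcher, candidate, friends):
    """Sum the weights of the social-graph signals the two profiles share."""
    score = 0.0
    if candidate in friends:
        score += WEIGHTS["friend"]
    if candidate.school == searcher.school:
        score += WEIGHTS["school"]
    if candidate.employer == searcher.employer:
        score += WEIGHTS["employer"]
    if candidate.city == searcher.city:
        score += WEIGHTS["city"]
    return score


def rank(searcher, query_name, profiles, friends):
    """Return profiles matching the name, highest social score first."""
    matches = [p for p in profiles if p.name == query_name]
    return sorted(
        matches,
        key=lambda p: social_score(searcher, p, friends),
        reverse=True,
    )


if __name__ == "__main__":
    searcher = Profile("Ann Lee", "State U", "Acme", "Springfield")
    stranger = Profile("John Smith", "Tech Institute", "Initech", "Riverton")
    classmate = Profile("John Smith", "State U", "Acme", "Lakeside")
    friend = Profile("John Smith", "City College", "Globex", "Springfield")
    results = rank(searcher, "John Smith", [stranger, classmate, friend], {friend})
    for profile in results:
        print(profile, social_score(searcher, profile, {friend}))
```

Every result depends on who is searching, which is why, as the article notes, such results cannot simply be cached the way popular general-purpose queries can.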

That's because Facebook hosts 600 million searches a month of people looking for other people, many of whom share a first and last name with someone else included in Facebook's 28 million individual profiles. To make the search results relevant, Agarwal realized, Facebook would have to base them on "a social graph" of who the searcher was most likely to be looking for.

Putting tools like Thrift into the hands of outside developers makes it more likely that they will be able to use Facebook's search powers and other capabilities to generate applications that are useful to Facebook subscribers. As it is, Facebook is conducting 600 million searches a month, making it one of the top 20 search engines on the Web, Agarwal said.

About the Author(s)

Charles Babcock

Editor at Large, Cloud

Charles Babcock is an editor-at-large for InformationWeek and author of Management Strategies for the Cloud Revolution, a McGraw-Hill book. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive Week. He is a graduate of Syracuse University where he obtained a bachelor's degree in journalism. He joined the publication in 2003.
