How much personal information does a user trade for access to a "free" smartphone application? It depends on the application, but the type of data collected can seem "Orwellian," according to Tyler Shields, a senior researcher for application security testing firm Veracode.
"Your personal information is being transmitted to advertising agencies in mass quantities," he said, at least based on his teardown of online music provider Pandora's radio-streaming application for Android smartphones, which he detailed in a recent blog post.
Shields' assessment is relevant given the news, first reported by the Wall Street Journal Tuesday, that federal prosecutors in New Jersey are investigating whether mobile application vendors are illegally retaining or sharing personal information about their customers with third-party advertising groups.
Included in that investigation is Pandora, an online music service that also makes smartphone applications. As disclosed in a recent SEC filing from Pandora, "in early 2011, we were served with a subpoena to produce documents in connection with a federal grand jury, which we believe was convened to investigate the information sharing processes of certain popular applications that run on the Apple and Android mobile platforms."
According to Pandora's filing, "we were informed that we are not a specific target of the investigation, and we believe that similar subpoenas were issued on an industry-wide basis to the publishers of numerous other smartphone applications." The company also warned that it might face fines or lawsuits as a result.
Exactly what data is Pandora -- and, by extension, other smartphone applications -- collecting? To find out, Veracode's Shields first determined that Pandora's Android application integrates five mobile advertising libraries: AdMarvel, AdMob, ComScore (SecureStudies), Google.Ads, and Medialets. Next, he decompiled each of them.
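The first step described above, spotting which ad libraries are bundled into an application, can be sketched roughly as follows. This is a minimal illustration and not Shields' actual tooling: the function name, the library labels, and the package prefixes are all hypothetical placeholders, and the class list is made up to stand in for a decompiler's output.

```python
from collections import defaultdict

# Hypothetical package prefixes. Real ad SDKs use their own package names,
# which would have to be confirmed against the decompiled application.
AD_LIBRARY_PREFIXES = {
    "ExampleAdsA": "com.example.adsa.",
    "ExampleAdsB": "com.example.adsb.",
}

def find_ad_libraries(class_names, prefixes=AD_LIBRARY_PREFIXES):
    """Group class names by the ad library whose package prefix they match."""
    found = defaultdict(list)
    for name in class_names:
        for library, prefix in prefixes.items():
            if name.startswith(prefix):
                found[library].append(name)
    return dict(found)

# Made-up class list standing in for the output of a decompiler.
sample_classes = [
    "com.example.musicapp.PlayerActivity",
    "com.example.adsa.AdView",
    "com.example.adsa.LocationHelper",
    "com.example.adsb.Tracker",
]

libraries = find_ad_libraries(sample_classes)
for library, classes in sorted(libraries.items()):
    print(library, len(classes))
```

Classes that match no known prefix are simply ignored, so the application's own code drops out and only the bundled third-party libraries remain for closer inspection.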
Looking just at AdMob (the other libraries largely collected similar data), the advertising group receives a user's location (as GPS coordinates), the application package name, and the application version. In addition, said Shields, "there were variable references within the ad library that appear to transmit the user's birthday, gender, and postal code information." The application also shared the android_id, an identifier that developers can use to identify individual smartphones, although the legality of doing so is unclear.
In isolation, some of this data may appear uninteresting, but combined, it might allow an advertiser to determine the user's actual identity. "When all that is placed into a single basket, it's pretty easy to determine who someone is, what they do for a living, who they associate with, and any number of other traits about them," Shields said. "I don't know about you, but that feels a little Orwellian to me."
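To make the "single basket" point concrete, here is a small sketch of how each additional attribute shrinks the set of people a record could belong to. The records are entirely synthetic, and the field names and the helper function are assumptions made for this illustration. It does not reproduce any analysis by Shields or any advertiser's actual method.

```python
# Synthetic records: (postal_code, gender, birthday). None describe real people.
population = [
    ("07030", "F", "1980-03-14"),
    ("07030", "F", "1975-11-02"),
    ("07030", "M", "1980-03-14"),
    ("07302", "F", "1980-03-14"),
    ("07302", "M", "1990-06-21"),
    ("07030", "F", "1980-07-09"),
]

FIELDS = ("postal_code", "gender", "birthday")

def matching_records(records, **known):
    """Return the records consistent with every known attribute."""
    def matches(record):
        row = dict(zip(FIELDS, record))
        return all(row[field] == value for field, value in known.items())
    return [r for r in records if matches(r)]

# Each extra attribute narrows the pool of candidates.
by_postal = matching_records(population, postal_code="07030")
by_postal_gender = matching_records(population, postal_code="07030", gender="F")
by_all = matching_records(
    population, postal_code="07030", gender="F", birthday="1980-03-14"
)

print(len(by_postal), len(by_postal_gender), len(by_all))  # prints: 4 3 1
```

In this toy data set, the postal code alone leaves four candidates, adding gender leaves three, and adding the birthday leaves exactly one. None of the three fields names the person, yet together they single out one record.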
But Shields was careful to note that many smartphone application builders integrate prebuilt code snippets from advertising agencies, and thus might not be aware of the breadth of user information being collected or shared. "They may merely think they are getting $x per ad impression, not that the ad library is leaking significant information about the user," he said.