Leveraging Data Prep to Prevent Human Trafficking

How one organization is using a data prep tool to combat human trafficking and slavery.

Emily Johnson, Digital Content Editor, InformationWeek

November 15, 2017


Earlier this year, on a stifling 101-degree day in San Antonio, Texas, 10 people died after being held in a tractor-trailer in a Walmart parking lot. They had been packed into the truck with 20 others, all suffering from what CBS News described as "dire conditions." The driver was accused of transporting immigrants for "commercial advantage or private financial gain."

The International Labor Organization estimates that there are 20.9 million victims of human trafficking globally. Of those victims, 68% are trapped in forced labor, 26% are children, and 55% are women and girls.

Human trafficking is a major global problem, and Washington, D.C.-based nonprofit Polaris is using data to learn more about it and to prevent future instances of trafficking and enslavement.

Polaris recently went through a digital transformation. It did not involve buzzworthy technologies like AI and machine learning, just a simple upgrade to their data prep process so they could better analyze their data to fight human trafficking. Polaris was gifted a data prep tool from Paxata. Before they had this tool, most of their data sets were handled manually in Excel.

While human trafficking affects millions across the globe and generates an estimated $150 billion in profits, there’s surprisingly little additional data on the subject.

“Because [human trafficking] is a clandestine crime...there’s actually very little information about the problem,” says Sara Crowe, associate director of data systems for Polaris. She adds that victims are often afraid of coming forward due to fear of retaliation or shame, which makes it harder to collect data and identify trends.

Crowe also says that the human trafficking data that exists is difficult to combine and compare with other data sets. “There are actually entities collecting data, but that data is extremely siloed and you can’t easily combine it with other data sets.” 

“Polaris has been really focused on trying to make sure there is data and information about human trafficking out there and it’s accessible to researchers and academics to truly understand the scope of this issue,” she says. 

One way Polaris gathers data on this issue is through hotlines. Their hotlines, which receive reports of human trafficking cases in each of the 50 states and D.C., have logged nearly 14,000 calls this year.

In addition to the U.S. hotlines, Polaris gathers data from their international partners. 

“My job has been one to build up data collection about human trafficking globally,” says Crowe. Crowe says she also makes sure all of the data is collected for analysis, and Polaris helps other anti-trafficking organizations do the same.

Crowe says that with the new data prep tool they can now easily do things they struggled with before, such as comparing data sets that arrive in different formats and in different languages.

“We receive data sets from a Mexican hotline, all in Spanish, and data they collect is not exactly the same [as ours] and not formatted the same way,” says Crowe. “If we wanted to combine the data sets we would have to go through a manual process, using lots of different formulas.”

“[Our data prep tool] has allowed us to create those transformations in a much more user-friendly way,” says Crowe. Without writing macro scripts, they can save the way they’ve transformed the data and apply the same transformation automatically in the future.
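
The idea of saving a transformation once and reapplying it to every new delivery can be sketched in a few lines of Python. This is only an illustration, not a description of how Paxata works internally. The column names, the Spanish labels, and the functions `harmonize` and `combine` are all assumptions made up for the example.

```python
# Hypothetical mapping from a partner's Spanish column names to a shared schema.
COLUMN_MAP = {"fecha": "date", "estado": "state", "tipo_de_caso": "case_type"}

# Hypothetical mapping of category values to shared English labels.
VALUE_MAP = {
    "case_type": {
        "trabajo forzado": "forced labor",
        "explotación sexual": "sex trafficking",
    }
}


def harmonize(record):
    """Rename known columns and translate known category values."""
    out = {}
    for key, value in record.items():
        # Normalize the column name before looking it up; keep unknown columns.
        new_key = COLUMN_MAP.get(key.strip().lower(), key)
        if isinstance(value, str):
            value = value.strip()
            # Translate the value only if this column has a known label mapping.
            value = VALUE_MAP.get(new_key, {}).get(value.lower(), value)
        out[new_key] = value
    return out


def combine(own_records, partner_records):
    """Append harmonized partner records to records already in the shared schema."""
    return list(own_records) + [harmonize(r) for r in partner_records]
```

Because the mappings live in one place, the same `harmonize` step can be rerun on each new file instead of rebuilding spreadsheet formulas by hand.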

Another challenge they faced before receiving their tool was removing identifiable information from their reports before sharing their data with research firms and with other anti-trafficking organizations. 

“Right now, we are not sharing any identifying information about individuals across borders because that’s generally just not allowed. We try and take the approach of sharing de-identified information that shows trend-level information rather than intelligence analysis,” says Crowe, adding that working with sensitive data is a major concern and huge obstacle to their work. 

Crowe says that while one piece of a victim’s profile might not identify them, it’s the combination of the elements in their file, even with the name redacted, that can lead to a victim being identified. “One of the ways you can get around those issues [of not being able to share data] is by finding those unique situations and redacting them. It’s the combination of variables, that’s what’s [personally] identifying, and is extremely hard to account for. We figured out a way to identify those cases and automatically change the data so it says ‘redacted.’ ”
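
One simple way to find "unique situations" is to count how often each combination of potentially identifying fields occurs and to redact the combinations that appear too rarely. The sketch below shows that general idea only; it is not Polaris's actual method, and the field names, the `min_count` threshold, and the function `redact_rare_combinations` are hypothetical.

```python
from collections import Counter

# Hypothetical fields that could identify someone when taken together.
QUASI_IDENTIFIERS = ("age_group", "nationality", "state")


def redact_rare_combinations(records, fields=QUASI_IDENTIFIERS, min_count=2):
    """Return copies of records, redacting field combinations seen fewer than min_count times."""
    # Count how many records share each combination of the chosen fields.
    counts = Counter(tuple(r.get(f) for f in fields) for r in records)
    result = []
    for r in records:
        combo = tuple(r.get(f) for f in fields)
        if counts[combo] < min_count:
            # Rare combination: overwrite the identifying fields in a copy.
            result.append({**r, **{f: "redacted" for f in fields}})
        else:
            result.append(dict(r))
    return result
```

A record whose combination is shared by at least `min_count` records passes through unchanged, while a one-of-a-kind record keeps only its non-identifying fields.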

Crowe says that the data prep tool has allowed them to cut the time spent on data projects roughly twelvefold, a reduction of about 92%. (Put the other way, the old manual process took 1,100% longer than the new one.) “Six times a year we pull and edit case data to add to our website. Before Paxata, this process would take about three hours to do manually each time - 1080 minutes per year. With Paxata, we have been able to cut this time down to about 15 minutes each time - 90 minutes a year.”
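
The figures in the quote can be checked with a little arithmetic. Measured against the old total, the saving is about 92%. A figure of 1,100% only appears when the old total is compared with the new one. The variable names below are just for illustration.

```python
runs_per_year = 6

minutes_before = 3 * 60 * runs_per_year  # three hours per run: 1080 minutes a year
minutes_after = 15 * runs_per_year       # 15 minutes per run: 90 minutes a year

# Share of the original time that was saved.
reduction_pct = (minutes_before - minutes_after) / minutes_before * 100

# How many times faster the new process is.
speedup = minutes_before / minutes_after

# How much longer the old process took, relative to the new one.
extra_time_pct = (minutes_before - minutes_after) / minutes_after * 100

print(f"reduction: {reduction_pct:.1f}%")           # reduction: 91.7%
print(f"speedup: {speedup:.0f}x")                   # speedup: 12x
print(f"old vs. new: +{extra_time_pct:.0f}%")       # old vs. new: +1100%
```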

When it comes to projects for social good, it’s not uncommon to see nonprofits in the position that Polaris was in before being gifted their data prep tool. Jake Porway, founder and executive director of DataKind, says that many nonprofits are just getting started when it comes to building out their data systems.

“In our experience, nonprofits have a tough time getting the resources to collect and manage data the way industry would,” says Porway. “In this particular space, however, there’s also the extra rub that the data is extremely personal and exceedingly sensitive. For that reason, data in the human rights space has a stigma attached to it that gives folks pause before using it - anything collected can often be weaponized against you in these situations, and that makes using data a potentially fraught path.” 

While Porway and Crowe both lament the struggles of handling data that could be used to identify a victim of a human rights violation, Porway says that some organizations are taking steps to handle data safely.

“Thankfully lots of groups (The Engine Room, Tactical Tech Collective, Data & Society) are working to promote ethical and safe use of data, so we’re getting there,” says Porway.

About the Author(s)

Emily Johnson

Digital Content Editor, InformationWeek

Emily Johnson is the digital content editor for InformationWeek. Prior to this role, Emily worked within UBM America's technology group as an associate editor on their content marketing team. Emily started her career at UBM in 2011 and spent four and a half years in content and marketing roles supporting the UBM America's IT events portfolio. Emily earned her BA in English and a minor in music from the University of California, Berkeley. Follow her on Twitter @gold_em.
