1940 Census Data Swamps Servers - InformationWeek

National Archives releases data on 132 million individuals from the 1940 census, raising privacy concerns and drawing 22.5 million hits from 1.9 million users.

A trove of census data from 1940 released Monday includes the individual records of millions of U.S. citizens. Privacy watchdogs have warned that public release of the records could lead to identity theft or other misuse.

The Census Bureau launched a Web page that offered macro-trend data in charts, graphs, and videos showing national demographic trends comparing 1940 to 2010, how people moved about the country--such as the migration of rural African-Americans to northern industrial cities--and the make-up of the workforce.

The National Archives and Records Administration launched its own website with 1940 census data on 132 million individuals. Genealogists, marketers, and data enthusiasts had long awaited the release, and demand surged when the site went live. It is this information that provoked privacy concerns, because the National Archives site includes individual names, addresses, ages, and other personal information.

The census records contain each person's name, age, relationship to the person canvassed, and occupation, along with answers to questions about migration, education, and participation in Depression-era programs such as the Works Progress Administration (WPA). You can read a list of questions here. It is the first census release to raise large-scale concern among privacy advocates, such as the American Civil Liberties Union, which filed several suits to block release of some information. Even Census Bureau officials had voiced concern. Although some individuals and groups have always been wary of the unrestricted release of census data, the 1940 release is the first to come at a time when digitized records could be ready fodder for identity thieves. Those fears have been allayed, in part, because Social Security numbers and dates of birth were omitted. About 21 million people canvassed in the census are still alive.

Shortly after the site went live, Archives officials tweeted that they had received about 22.5 million hits from 1.9 million users. They said they were working with host Amazon Web Services to bring up additional servers, as many users complained of slow or non-responsive service. The site reflects a continuing move by government agencies to partner with the private sector: the National Archives website is hosted by Archives.com, a private website operated by Silicon Valley-based Inflection, a company that boasts of having more than 14 billion public records in its databases.

Because the 1940 census has not yet been indexed by family name, it must be searched by location or enumeration district. So far, private volunteers have tried to streamline the process with outside software and search engines, such as those from the U.S. Census Community Project and Steve Morse. Expect to see more slicing and dicing of the data from websites such as Ancestry.com, which is in the process of uploading all 3.8 million document images from the census. Users can also download their finds and share them via email or social media sites, including Facebook and Twitter, directly from the government's website.

Number 6,
User Rank: Moderator
4/3/2012 | 7:09:29 PM
re: 1940 Census Data Swamps Servers
Census data has always been released a little over 70 years after it was collected. It's a treasure-trove for genealogy research. I worry a little bit more about the SSDI, but there are far greater privacy threats posed today by other databases and practices than the Census and SSDI data.
Andrew Hornback,
User Rank: Apprentice
4/3/2012 | 3:36:37 AM
re: 1940 Census Data Swamps Servers
Couldn't the Census Bureau data, charts, etc. have been released without releasing all of the data backing them?

I think this sets a very bad precedent - the Federal Government is putting census information out for public use/display. How do they think that this is going to play out? In New York City, people were already wary of talking to the Census Bureau - but now that there's the possibility that the Federal Government could release this data at any time, without notifying anyone that their data is being released, I think even more people are going to avoid talking to the Census.

Sure, I believe in the idea that there are some uses for this data that might be innocuous, but there are other uses that sit at the other end of the spectrum.

And one last thing, in this day and age of budgetary issues at the Federal level, how much is this data release costing the American taxpayers? What are we getting in return?

Andrew Hornback
InformationWeek Contributor