Commentary by Joel Westphal
12/10/2013 10:01 AM

The War On Military Records

The era of big data has arrived on the battlefield and we need to find new ways to deal with it.

When I took over as chief of records management for US Central Command in May 2009, one of my primary responsibilities was overseeing the records of the Joint Task Force Headquarters, US Forces-Iraq, and US Forces-Afghanistan. For the next four years, my life was devoted to capturing, preserving, and then organizing what I believe is one of the most important document collections of the early twenty-first century: the Operation Iraqi Freedom Collection.

In the days following the end of combat operations in Iraq in September 2010, the media focused on a variety of important questions: What will Iraq look like after the war? How will we handle the troops coming home? What did the war cost, in both human and economic terms?

Little attention was paid, though, to the documents that will be crucial to future historians and others who will debate the conduct of the conflict through a historical lens rather than the media's present-day view of current events. Military analysts and historians require primary source material, from communiques (which are often emails), to reports drafted in the field, to the orders issued to units to accomplish a given task. Without these records, a true history of the Second Iraq War, a conflict that lasted longer than the Second World War, would not be possible.


Since the end of the first Gulf War in 1991, modern historians have struggled to write about that conflict. In fact, a review of historical texts available on Amazon turned up only a few first-hand accounts and other oral histories of the war. Why? The answer lies in what occurred shortly after the war ended.

In a rush to get home, our troops and their leadership orchestrated what was possibly the single largest destruction of records in our nation's history. Millions of records were either burned or simply left in the desert to literally be buried by the sands of time. This action (which is a matter of public record) is little known or spoken about outside of the National Archives and the Department of Defense (DoD).

It was never covered by the news media, and if it weren't for the Veterans Administration's post-war efforts to deal with Gulf War Syndrome, even more records might well have been destroyed. As tragic as this was, the lessons learned from this shortsightedness allowed us to press leaders in Iraq and Afghanistan into action to avoid making the same mistake.

The continuity and preservation of our modern democracy in large part relies on records, with the Constitution and Declaration of Independence serving as examples of the ultimate primary source documents. Without preserving records, the elements of our historical memory are lost.

The loss of a primary source record is especially alarming. A primary source record has no bias. It is not an opinion, but an artifact of what has taken place. The document that orders a unit to take a hill, a piece of the desert, or a town, or that spells out how to treat wounded prisoners, may prove more valuable than five or ten oral histories on the same topic, because verbal testimony can be construed as hearsay, while a record can be a smoking gun.

The Second Iraq War (or the "Second Gulf War" or "Bush's War" -- whichever name future historians choose) may also be known as the first Digital War in our nation's history. More than 99% of all the records created during the conflict were born digital and resided only in digital form; no paper copies were available.

This was a dramatic paradigm shift from our last major conflict, in which the vast majority of records were created on paper. The shift has made records far more vulnerable than they were in the past: destroying paper is far more difficult than simply hitting the delete key.

Comments
WKash,
User Rank: Author
1/2/2014 | 10:45:09 AM
e-Discovery
Interesting update to this discussion on archives. As we reported this past week: Federal legal professionals appear to be losing confidence in the ability of their agencies to deal effectively with the rising challenges of electronic discovery.

Only 38% of respondents believed that, if challenged by a court or opposing counsel, their agencies could demonstrate that their electronically stored information (ESI) is accurate, accessible, complete, and trustworthy, compared to 68% in 2012. Moreover, the percentage of "not at all confident" responses to this question nearly doubled, from 23% to 42%, according to Deloitte's latest benchmarking study of electronic discovery practices in the government. You can read more at:

http://www.informationweek.com/security/compliance/federal-agencies-lack-e-discovery-savvy---/d/d-id/1113271?

 
jries921,
User Rank: Ninja
12/12/2013 | 1:10:14 PM
The point is well taken
One of the ways in which we learn from past wars (and many other things) is by examining their records.  If the records of the Gulf War were lost through mere carelessness, then blame rests at the desk of then-Secretary of Defense Cheney (a former White House Chief of Staff), who surely had enough of a sense of history to know better.  Ironic that his handling of the war almost certainly got him the Republican VP nomination in 2000.

 
WKash,
User Rank: Author
12/11/2013 | 6:05:01 PM
Re: Preserving Public Records
Joel5171, thanks for your reply to my question regarding whether these file analysis tools might breed a false sense of security.  You noted you didn't have enough space to share more on this. Based on the number of comments and interest here, we'd be happy to have you elaborate on these points in another article.
WKash,
User Rank: Author
12/11/2013 | 6:00:51 PM
Re: Terabyte Limit Doesn't Compute
It appears there's a bit of a misunderstanding here: When Joel wrote that the National Archives deemed only 10 to 15 terabytes of the 54 TB generated by US Central Command to be of a permanent nature, he wasn't referring to a capacity problem but to the statutes agencies must follow in preserving certain types of documents for the nation's public (and historic) record. (You can fault the way government buys IT these days, but buying storage isn't the problem.)

 
Joel5171,
User Rank: Apprentice
12/11/2013 | 1:32:31 PM
Re: Terabyte Limit Doesn't Compute
The great thing about Active Nav was that it allowed us to go through and find those 15 copies or drafts and then ask how many we would like to keep. It goes further, too: it can find near-duplicate (not exact) documents, so you can get a grasp of those as well.

As someone who has worked with records both in and outside the federal government, I can tell you that document creep is a big problem. In my experience these File Analysis tools really can help attack that problem.

Only six years or so ago, I used to laugh when companies told me they could do auto-classification. Now, in combination with other products like an ERMA, I truly feel these tools are a requirement for agencies and organizations that have a lot of data.

The PII detection and de-duplication, along with finding all those 0-byte files and empty folders, is just icing on the cake!
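
To make the idea concrete: at its simplest, this kind of de-duplication and empty-item detection can be sketched in a few lines of Python. This is only an illustration of the general technique (hash file contents to group exact copies, walk the tree for 0-byte files and empty folders), not how Active Nav or any particular product works internally:

    import hashlib
    import os
    from collections import defaultdict

    def analyze_tree(root):
        # Group exact-duplicate files by a hash of their contents, and
        # collect 0-byte files and empty folders along the way.
        by_hash = defaultdict(list)
        zero_byte, empty_dirs = [], []
        for dirpath, dirnames, filenames in os.walk(root):
            if not dirnames and not filenames:
                empty_dirs.append(dirpath)  # nothing at all inside
            for name in filenames:
                path = os.path.join(dirpath, name)
                if os.path.getsize(path) == 0:
                    zero_byte.append(path)
                    continue
                digest = hashlib.sha256()
                with open(path, "rb") as f:
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        digest.update(chunk)
                by_hash[digest.hexdigest()].append(path)
        # Hashes that map to more than one path are exact-duplicate sets;
        # a records tool would then ask how many copies to keep.
        duplicates = {h: p for h, p in by_hash.items() if len(p) > 1}
        return duplicates, zero_byte, empty_dirs

Finding near-duplicates (drafts and near-identical copies) requires fuzzier techniques such as shingling or similarity hashing, which is where the commercial tools earn their keep.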
aidtofind
50%
50%
aidtofind,
User Rank: Apprentice
12/11/2013 | 10:57:36 AM
Re: Terabyte Limit Doesn't Compute
As an archivist who formerly worked for a state government, I'm not the slightest bit surprised that 75% of the total number of documents generated in the conflict turned out to be redundant and/or inconsequential.  Usually by the time documents reach an archives, they've been picked over by their creators and many duplicates and low-content items have already been discarded; even when that's the case it's quite common for the archivist to further "weed" multiple copies of memos, duplicates of reports filed elsewhere, etc.  How much more so when it sounds like this project caused the documents to more or less come straight to NARA without that filtering?  

Beyond the cost of storage, there's also the fact that a bloated, duplicate-ridden collection is more difficult to search and use than a streamlined one.  What Li Tan said is accurate:  "Keep it all" would be akin to an individual carefully filing away every single scrap of paper that ever came into her house, whether it was information-packed correspondence from distant family members or the fifteenth copy of the exact same Little Caesars flier.  It's not helpful to researchers, and it's not an effective use of funds.  I do believe that great care must be taken in determining which materials are truly redundant, and there needs to be transparency in terms of what's being kept versus what's being discarded, but it's extremely rare for "Keep it all" to be the appropriate response to the intake of a large collection.

Also, can I just say that the idea of software that identifies potential privacy issues in the documents warms my heart?  I can't tell you how much time I spent combing through materials we were going to provide to researchers to make sure we wouldn't be revealing social security numbers. 
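
The simplest version of that privacy scan is pattern matching. As a purely illustrative sketch (real redaction tools add validation rules, context scoring, and OCR for scanned pages), flagging candidate Social Security numbers in Python might look like this:

    import re

    # Matches the common XXX-XX-XXXX layout. The word boundaries keep it
    # from firing inside longer digit runs, but it will still produce
    # false positives, so every hit needs human review.
    SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

    def flag_possible_ssns(text, context=20):
        # Return (offset, snippet) pairs so a reviewer can check each
        # candidate in its surrounding context before redacting.
        hits = []
        for m in SSN_PATTERN.finditer(text):
            start = max(0, m.start() - context)
            hits.append((m.start(), text[start:m.end() + context]))
        return hits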
Li Tan,
User Rank: Ninja
12/10/2013 | 10:43:54 PM
Re: Terabyte Limit Doesn't Compute
Saving everything is just like keeping all the mail you have received from banks, companies, and government agencies. Those letters seem important when they first arrive, so you keep them in case you need them later. But normally most of them sit in a pile without ever being touched again. Furthermore, saving everything leaves you with a dilemma: you don't know what is really important, and when you really need something, you simply cannot find it. So it's far more important to identify the truly valuable material than to save everything.
AlvinP563,
User Rank: Apprentice
12/10/2013 | 6:31:59 PM
Take a closer look.
If there is indeed a war on military records, it would be the government giving the order for records to be destroyed.

The first Gulf War is recorded as the most toxic war in "Western military history" due to the vast array of toxic exposures troops were forced to encounter. These consisted mainly of toxic fallout plumes from the allied bombing of enemy chemical weapons sites, depleted uranium dust from our very own ordnance, and toxic smoke inhalation from multiple oil well fires.

In addition, troops were ORDERED to take an experimental cocktail of drugs.

It has been widely reported over the last 23 years that 1 in 4 Desert Storm veterans have come down with serious multiple illnesses; many have sadly died.

All this talk about costs and administrative problems for the retention and management of battlefield records is a mere deflection from the truth.

The real reason records go missing is twofold: 1. To hide embarrassment and negligence exposures. 2. To avoid any legal litigation resulting from such action. Pure and simple.
Thomas Claburn,
User Rank: Author
12/10/2013 | 4:39:59 PM
Re: Terabyte Limit Doesn't Compute
> Save everything! History won't forgive you for repeating the same mistakes you described after the first Gulf War.

If you save everything, you're inviting a maintenance cost that will grow over time. In my experience, data that isn't in motion, being maintained and updated, quickly becomes inaccessible. Selectivity might end up being a better use of tax dollars.
Joel5171,
User Rank: Apprentice
12/10/2013 | 3:16:40 PM
Re: Terabyte Limit Doesn't Compute
As far as I know there is no cloud for classified records, nor do I think there ever should be unless there is full assurance that those records are protected until such time as they are declassified.