Sunday, November 30, 2008

Technology and data loss

In the latest Fall/Winter (print) issue of the American Archivist, Robert Dorman has an interesting article titled "The Creation and Destruction of the 1890 Federal Census."

As Dorman points out, scholars and the public alike, historians and genealogists, are painfully aware that the only copy of the returns was lost to a fire in January 1921 and its aftermath. For every preceding census, and every later one, multiple copies of the returns were made. Dorman shows that the choice to create only a single record of returns for the 1890 census grew out of the "pervasive fiscal conservatism" of the government, an unprecedented bulkiness of returns due to a broader set of questions being asked than ever before, and "bureaucratic hubris over the first use of the Hollerith electrical tabulating machines."

As Dorman says, all the statistical data from the census was extracted and published before the 1921 fire. What was lost in the fire were the names of the people, and with them the ability to tie the data to individual households. The names were not transferred to the punch cards used to enumerate the census. Having transferred the data to punch cards, the bureaucrats argued that further copies, and even the binding of the returns as in previous censuses, were unnecessary. Both of these choices played a role in making the 1921 fire the disaster that it was. With the family names gone, an important source of information about the last great era of immigration has been lost, as have the possibilities of aggregating the data differently for different scholarly purposes.
