Normalization of File Formats

A large percentage of readers were interested in digital preservation, so we will begin covering a few digital topics each month while continuing to provide information on basic archives education. This should help those of you who are just starting out as well as those of you who are ready (or not) to begin tackling all the digital records on your campus.


CDs recently found in the storage room of the Lower School Library. Almost all the files are videos (in various formats), PowerPoint slideshows, or individual still photos. Six binders were found, and this particular one holds 74 CDs.

Like every archivist, you are taking in the paper records and artifacts that staff and alumni bring to you to enhance your school’s collection. What you may not be as comfortable with are all the digital records that staff members have, photographs taken by students and alumni, or disks that you randomly find in traditional file folders. It’s hard to know where to begin, especially if you are trying to do all of this part-time and you still have a backlog.

Plenty of background information on this topic is available online, and reading it will leave you better prepared when you speak and work with your IT department on this project. In an ideal situation, you will be able to meet regularly with the department and, working together, come up with a plan for routinely taking permanent records into your new digital repository. The majority of your records will be text-based documents (such as publications), still photographs, emails (correspondence deemed to be permanent), and moving images (videos).

One of the first decisions you will need to make is what format these records will be kept in. The resources listed below will help with that decision, but one major rule of thumb is to avoid proprietary formats (such as .doc for textual documents or .xls for spreadsheets).

There are many more issues and concerns, such as how to convert the records, where to store them, and what system you will use to access them, but those will be covered in their own blog posts.

Did you find a disk or CD in files coming in to the archives with an extension that you don’t recognize? Use PRONOM (from the UK National Archives) to help you learn more about it.
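Before looking anything up, it can help to know which extensions are actually on a disk and how many files carry each one. The following is only a minimal sketch in Python, not part of any tool mentioned in this post: the function names are made up for illustration, and the "flagged" set contains only the two proprietary examples named above (.doc and .xls), so you would extend it for your own collection. Keep in mind that an extension is only a hint about a file's real format, so an unfamiliar one is still worth checking in PRONOM.

```python
from collections import Counter
from pathlib import Path

# Assumption: only the two proprietary examples named in this post.
# Extend this set to match your own school's policy.
PROPRIETARY_EXAMPLES = {".doc", ".xls"}


def inventory_extensions(root):
    """Count files under `root` by lowercased extension ('' if there is none)."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower()] += 1
    return counts


def flag_for_review(counts, flagged=PROPRIETARY_EXAMPLES):
    """Return only the extensions that appear in the `flagged` set."""
    return {ext: n for ext, n in counts.items() if ext in flagged}


if __name__ == "__main__":
    import sys

    # Folder to scan: first command-line argument, or the current folder.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    counts = inventory_extensions(root)
    for ext, n in counts.most_common():
        print(f"{ext or '(no extension)'}\t{n}")
    print("Flagged for review:", flag_for_review(counts))
```

Run against a folder copied from a CD, this prints one line per extension with its file count, followed by the extensions you may want to discuss with your IT department first.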


  • Recommended Data Formats for Preservation Purposes in the Florida Digital Archive: This table is intended to help Florida university administrators develop guidelines for preparing and submitting files to the Florida Digital Archive. The chart lists the content types (text, visual, etc.) and shows which file formats have a high, medium, or low confidence level of remaining accessible for a long time.
  • Sustainability of Digital Formats: Planning for the Library of Congress: The Digital Formats Web site provides background information, sustainability factors, content categories, and format descriptions. 
  • What file formats should I use? The list of articles and reference sources on this Digital Curation Exchange page will help you determine which file formats your school can use for a quick access copy and which formats your school will use for a preservation copy, which is typically larger in size and in a format that should last a longer time or will be easily converted in the future.
  • How do I avoid file format obsolescence? Another resource from the Digital Curation Exchange. This wiki page will continue to be updated with the most recent sources of information on converting file formats while keeping the integrity of the data.

2 responses to “Normalization of File Formats”

  1. It’s worth being aware of the dangers of normalisation too. Converting that .doc file may well change aspects of the layout. Though it’s a proprietary format, its sheer ubiquity means it’s unlikely to be high risk.
