Q. How do I prepare for a digitizing initiative?
There is much to consider when undertaking digitization work of any size, but fortunately many successful projects during the past two decades have established best practices and clear workflows. Additionally, national and international organizations have published several data management standards relevant to digitization and the preservation and dissemination of the digital content you produce.
Examples of successful projects and long-term digitization programs range across libraries, archives, and museums; you may find the most compelling arguments for digitization or the most relevant models within your own type of repository, but you may also find them among use cases at other institutions. The fundamental tasks, workflows, and knowledge base essential to successful digitization and provision of this material to your audiences are constant across all settings. The format of analog materials, however, does require specific knowledge and techniques to accomplish the basic digitization processes.
It is important to note that digitization projects or long-term programs always build on established practices in libraries, archives, and museums. Digitization does not replace other institutional tasks; rather it adds to them and thus requires new funds or a redistribution of existing funds.
Please note that there is not a strict order to these actions. Many will take place in parallel. Decisions regarding materials selection, workflow, funding, and equipment are often made in an iterative fashion until the project planning document is finalized and vetted with administrators.
Attend conferences, workshops, and courses to learn about digital curation, digital preservation, and digitization.
- International Digital Curation Conference (IDCC). Various cities in Europe and the US; generally held in winter.
IDCC brings together those who create and manage data and information, those who use it and those who research and teach about curation processes. Our view of ‘data’ is a broad one – video games and virtual worlds are of just as much interest as data from laboratory instruments or field observation. Whether the information originates in the arts, humanities, social or experimental sciences the issues faced are cross-disciplinary.
- International Conference on Preservation of Digital Objects (i-PRES). Rotates through Europe, North America, and Asia; generally held in the fall.
i-PRES brings together a wide array of information professionals focused on preservation of digital content.
- IS&T Archiving Conference; various US and European cities; usually April through June.
Since the first meeting in 2004, Archiving has continued to offer the opportunity for imaging scientists and those working in the cultural heritage community (curators, archivists, librarians, etc.), as well as in government, industry, and academia, to come together to discuss the most pressing issues related to the digital preservation and stewardship of hardcopy, audio, and video.
- Society of American Archivists (SAA) Annual Meeting; various US cities; usually in August.
The annual meeting of the Society of American Archivists, held in late summer in different cities throughout the country, includes a wide array of informative education sessions, pre-conference workshops, networking opportunities, special events, exhibits, and tours of local repositories.
Workshops (This reflects just a small sampling of workshops, webinars, and other training events available)
- DigCCurr Professional Institute. School of Information and Library Science, University of North Carolina at Chapel Hill. Chapel Hill, NC. Annually in May with follow-on session in January.
This professional institute consists of one five-day session in May and a two-day follow-up session in January. Each day of the summer session will include lectures, discussion, and hands-on "lab" components. This institute is designed to foster skills, knowledge, and community-building among professionals responsible for the curation of digital materials. Participants in the May event will return to Chapel Hill in January to discuss their experiences in implementing what they have learned in their own work environments. Participants will compare experiences, lessons learned, and strategies for continuing progress.
- Digital Archives Specialist (DAS) Certificate. Society of American Archivists, Chicago. Workshops held online and in various locations in the U.S. throughout the year.
The DAS Curriculum, developed by experts in the field of digital archives, is structured in tiers of study that guide you to choose courses based on your specific knowledge, training, and needs. You can choose individual courses—or you can take your learning to the next level by earning a Digital Archives Specialist Certificate from SAA after completing required coursework and passing both course and comprehensive examinations.
- Digital Curation 101. Digital Curation Centre. Edinburgh, Scotland. Various locations throughout the UK.
Research Councils and funding bodies are increasingly requiring evidence of adequate and appropriate provisions for data management and curation in new grant funding applications. The DCC's free half-day workshops provide an introduction to research data management and curation, the range of activities and roles that should be considered when planning and implementing new projects, and an overview of tools that can assist with curation activities.
- Digital Preservation Management Workshop. Various locations; various dates each year.
The Digital Preservation Management Workshops, a series presented since 2003, incorporate community standards and exemplars of good practice to provide practical guidance for developing effective digital preservation programs. The workshop and Tutorial are now based at MIT under the direction of Nancy McGovern, Head, Curation and Preservation Services for MIT Libraries. The workshops, partially funded by grants from the NEH, were initially developed at Cornell University beginning in 2003 under the direction of Anne Kenney and Nancy McGovern and have been further developed under the direction of Nancy McGovern at ICPSR and MIT from 2008 - 2013.
- Digital Preservation Outreach and Education (DPOE) Program. Library of Congress. Online and face-to-face; various locations and dates.
The DPOE mission is to foster national outreach and education about digital preservation by building a collaborative network of instructors and partners to provide training to individuals and organizations seeking to preserve their digital content.
The DPOE team supports the growth of the DPOE National Trainer Network and builds relationships with organizations to make digital preservation training more widely available to working professionals.
Since the first Train-the-Trainer workshop in September 2011, DPOE Trainers have held over 20 training events (with 12 more upcoming) in States across the nation. As a result, more than 900 working professionals from a variety of institutions have received training in the fundamentals of digital preservation.
- Preservation 101. Preservation Basics for Analog and Digital Collections and Preservation Basics for Paper and Media Collections. Northeast Document Conservation Center (NEDCC).
Designed as an introductory course, Preservation 101 grounds participants in the theory and practice of preserving library and archival collections. Beginning with an orientation to the characteristics of paper-based materials, photographs, audiovisual media, and digital content, the course also examines the key causes of deterioration; strategies for slowing deterioration and preventing loss; prioritizing collections for preservation; conservation treatment options; and reformatting strategies for preservation and access. Preservation 101 is a hybrid course. A series of eleven live webinars builds on self-paced study through assigned readings. Putting theory into practice, participants will perform a preservation needs assessment of their own institution, and will develop recommendations for improvement and a long-range preservation plan. Preservation 101 is also offered as a self-paced course without an instructor.
- State Electronic Records Initiative (SERI). Council of State Archivists (CoSA).
The Program for Electronic Records Training, Tools, and Standards (PERTTS) began January 1, 2013, and will run for two years with funding from the National Historical Publications and Records Commission. This project will focus on two areas: (a) providing access to in-depth information about standards, best practices, and tools for the management and preservation of electronic records, and (b) delivering education and training to ensure that these standards, best practices, and tools are widely and effectively implemented.
Courses and Degrees
- Post Master's Certificate in Data Curation. School of Information and Library Science. University of North Carolina at Chapel Hill.
The Data Curation emphasis within the Post-Masters Certificate (PMC) program aims to educate information professionals in libraries, archives, government agencies and corporations or businesses who are responsible for managing and preserving the data assets of the organization. We expect graduates to serve as leaders in those organizations and to be active participants in defining data curation standards and practices. The curriculum will include advanced courses that prepare students to understand the statutory, economic, and technical issues related to data creation, dissemination, use and reuse and preservation in organizations.
The Data Curation program consists of two introductory courses taken on campus over a 2 week period, six courses taken online over subsequent semesters, and two project-oriented independent studies related to the student’s current or desired work environment.
Ask questions and establish the parameters of your project
Ask why you are considering digitization.
All projects need specific missions, goals, objectives, and audiences. Determining the "why" of digitization projects often involves a collection survey or a user needs/audience assessment. You need to establish how the digitization project will serve the repository, the larger institution, all the relevant stakeholders, and the materials themselves. For more detailed considerations see the "Why Should I Digitize" section of this guide.
Determine who the stakeholders are for this project.
Stakeholders include funders, content creators, content users, staff, administrators, and anyone else involved with the lifecycle of the digital content.
Determine what materials will be digitized.
This is covered in the "Select" section in this guide.
Determine the condition of the materials and associated information management tools.
An assessment of the materials will indicate any special handling that may be required and reveal how much associated metadata currently exists and in what formats such as finding aids, indexes, registration databases, and catalog records.
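One way to keep the results of such an assessment usable is to record them in a consistent structure. The following is only a minimal sketch in Python, not a prescribed tool: the record fields, the `AssessmentRecord` and `summarize` names, and the sample identifiers are all hypothetical, and the four metadata source labels simply mirror the examples named above.

```python
from collections import Counter
from dataclasses import dataclass, field

# Metadata sources named in this guide; the label strings are illustrative.
METADATA_SOURCES = {"finding_aid", "index", "registration_database", "catalog_record"}


@dataclass
class AssessmentRecord:
    """One surveyed item or group of items (hypothetical structure)."""
    identifier: str
    needs_special_handling: bool = False
    handling_note: str = ""
    metadata_sources: set = field(default_factory=set)

    def __post_init__(self):
        # Reject labels outside the agreed list so tallies stay consistent.
        unknown = self.metadata_sources - METADATA_SOURCES
        if unknown:
            raise ValueError(f"unknown metadata source(s): {sorted(unknown)}")


def summarize(records):
    """Count items needing special handling and tally existing metadata by source."""
    by_source = Counter()
    for record in records:
        by_source.update(record.metadata_sources)
    return {
        "total": len(records),
        "special_handling": sum(r.needs_special_handling for r in records),
        "no_metadata": sum(not r.metadata_sources for r in records),
        "by_source": dict(by_source),
    }


# Sample data invented for illustration only.
sample = [
    AssessmentRecord("box-01", metadata_sources={"finding_aid"}),
    AssessmentRecord("album-07", needs_special_handling=True, handling_note="brittle pages"),
]
print(summarize(sample))
```

A spreadsheet with the same columns would serve equally well; the point is that handling needs and existing metadata are captured per item so they can be totaled when estimating staff time and metadata work.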
Compile, read, and follow established best practices and standards from the outset. Build your selection, workflows, storage, and dissemination practices on the successes of others.
See the standards and best practices listed in this guide and organizations such as ISO, NISO, and the Library of Congress.
Whether you are preparing a grant application to fund your project or not, draft a project proposal that includes all the components funders require including:
Once reviewed by knowledgeable professionals experienced in the type of project you are proposing (e.g., colleagues at other institutions) and vetted by your administration this document will become your roadmap to the project. Make sure that all components are fully detailed. You should revisit this document often to check your progress against your plan.
Now that you have good reasons for this project and a feasible plan to go forward, solicit support from your institution's administrators and other potential supporters and resource allocators.
It is unlikely that you will be able to proceed with a project unless you have support from above. In cases where the mandate to digitize materials comes from the administration, it is important to make sure that all involved share the same goals and objectives and understand the extent of the project and its ramifications for issues such as budgeting, personnel allocation, staff training, and the public perception of the institution. All involved should thoroughly discuss the benefits and risks of the project or more extensive program.
Look for collaborators as appropriate.
Not all projects need collaborators, but many benefit from being in a larger context, especially when your partners are experienced with digitization projects and can provide knowledge, strategies, workflows, and evaluation procedures.
Determine if your repository holds the copyright or other intellectual property rights (IPR) to the materials you are considering for digitization.
See the "Selecting for Digitization" section of this guide.
Determine if you have or can garner adequate financial support for the project. Often this step involves writing a grant proposal to a state, federal, or private funder but small projects may be self-funded.
Experience shows that while small projects may consume limited funds, digitization projects all require financial and human resources. Projects "run on a shoestring" usually have little impact or simply fail.
Determine if you have adequate technical support and infrastructure.
Anyone can buy a scanner and make images of pages or photographs; not all institutions have the in-house knowledge and skills necessary to build databases of these materials, create adequate metadata, and effectively provide the content to the public. Many smaller institutions may not have the technical infrastructure in the form of servers and software, nor the IT staff to run these systems. In these cases, a small institution with significant materials may seek to partner with a larger institution or a digital consortium.
Is your institution committed to long-term preservation and curation of the digital resources it creates?
Digital materials need ongoing curation. Without a firm commitment to preservation, digitized content is unlikely to last or be viewed by many after 5-10 years. It is important to ask whether the project is worthwhile if there is no institutional commitment to preserving it for the long term.
- Ball, Matt. Preparing a Digital Project: Lessons from the Harvard Law School Library. June 6, 2011. http://www.youtube.com/watch?v=vvpCT9_btiM
Matt Ball discusses lessons learned from a Harvard Law School Library digitization project. Video from the 2004 CALI Conference for Law School Computing.
Review use cases
Digitization projects started in earnest in the mid- to late-1990s. In the intervening years there have been many successful projects that have helped to establish informal best practices and formal standards. Many of these projects and their websites remain useful and viable today. Visiting these sites and reading papers and reports from the project staff illustrate and explain what was done and how it was accomplished. Some of the sites listed below present specific tools or methodologies the project developed and used.
- IFLA. Digitisation Projects and Best Practices. Newspaper Digitisation Projects Worldwide. http://www.ifla.org/node/6777
Site lists several successful projects and their related digitization guidelines and best practices.
- Library of Congress. American Memory. Technical Information. Building Digital Collections. http://memory.loc.gov/ammem/about/techIn.html
This page provides information on the copyright, metadata, preservation, scanning and conversion, and text mark-up practices and policies that the Library of Congress follows for the American Memory site.
- Library of Congress. National Digital Newspaper Program. http://www.loc.gov/ndnp/
The National Digital Newspaper Program (NDNP), a partnership between the National Endowment for the Humanities (NEH) and the Library of Congress (LC), is a long-term effort to develop an Internet-based, searchable database of U.S. newspapers with descriptive information and select digitization of historic pages. Supported by NEH, this rich digital resource will be developed and permanently maintained at the Library of Congress. An NEH award program will fund the contribution of content from, eventually, all U.S. states and territories. This site provides extensive guidelines and best practices for newspaper digitization projects.
- Rieger, Oya Y. Preservation in the Age of Large-Scale Digitization: A White Paper. Washington, DC: CLIR, 2008.
From the CLIR abstract: This report examines large-scale digital initiatives (LSDIs) to identify issues that will influence the availability and usability, over time, of the digital books these projects create. The paper describes four large-scale projects—Google Book Search, Microsoft Live Search Books, Open Content Alliance, and the Million Book Project—and their digitization strategies. It then discusses a range of issues affecting the stewardship of the digital collections they create: selection, quality in content creation, technical infrastructure, and organizational infrastructure.
General Guides and Overviews:
- Caplan, Priscilla. "The Preservation of Digital Materials," Library Technology Reports 44(2), Feb./March 2008. http://www.alatechsource.org/ltr/the-preservation-of-digital-materials
Priscilla Caplan notes that this issue of Library Technology Reports "is intended to provide a relatively brief, relatively comprehensive introduction to digital preservation."
- Digital Preservation Management: Implementing Short-term Strategies for Long-term Problems. Tutorial. Cornell University Libraries, 2003-2007. Site maintained by ICPSR.
Tutorial that covers a range of topics around the digital curation lifecycle.
- Federal Agencies Digitization Guidelines Initiative (FADGI) Digitization Activities: Project Planning and Management Outline, November 2009.
This document outlines a generic workflow of high-level activities for planning and management purposes relating to the digitization of cultural materials.
- Heritage Preservation. A Public at Risk: The Heritage Health Index Report on the State of America’s Collections; a project of Heritage Preservation and the Institute of Museum and Library Services, 2005. http://www.heritagepreservation.org/HHI/HHIsummary.pdf
The Heritage Health Index is the first comprehensive survey ever conducted of the condition and preservation needs of our nation’s collections. Over 4.8 billion artifacts are held in public trust by more than 30,000 institutions. The project was conceived and implemented by the nonprofit organization Heritage Preservation in partnership with the federal Institute of Museum and Library Services.
- JISC. Digital Media Website. http://www.jiscdigitalmedia.ac.uk/
This site is designed to help the UK’s higher education, further education and skills communities embrace and maximise the use of digital media (still images, sound and video). Through their online resources, help desk and consultancy services, JISC Digital Media helps education providers to use digital media in innovative, practical and cost-effective ways.
- Kenney, Anne R., and Oya Y. Rieger, eds. Moving Theory into Practice: Digital Imaging for Libraries and Archives. Mountain View, Calif.: Research Libraries Group, 2000. http://www.library.cornell.edu/preservation/tutorial/
Seminal textbook for conducting digitization projects. Despite its age this volume still contains much useful information.
- The NINCH Guide to Good Practice in the Digital Representation and Management of Cultural Heritage Materials. Humanities Advanced Technology and Information Institute (HATII), University of Glasgow, and the National Initiative for a Networked Cultural Heritage (NINCH), 2002. http://www.nyu.edu/its/humanities/ninchguide/
Seminal guide to digitization practices and collection building.
- NISO Framework Advisory Group. A Framework of Guidance for Building Good Digital Collections. 3rd edition. Bethesda, Md.: National Information Standards Organization, 2007. http://framework.niso.org/
A Framework of Guidance for Building Good Digital Collections provides an overview of some of the major components and activities involved in the creation of good digital collections and provides a framework for identifying, organizing, and applying existing knowledge and resources to support the development of sound local practices for creating and managing good digital collections. It is intended for two audiences: cultural heritage organizations planning projects to create digital collections, and funding organizations that want to encourage the development of good digital collections.
- NEDCC. “Surveying Digital Preservation Readiness: Toolkit for Cultural Organizations.” http://www.nedcc.org/resources/digtools.php
From the website: "In May 2005, NEDCC conducted an online survey to gather data about the state of digital preservation readiness in cultural organizations. This initial survey showed that many cultural organizations are digitizing without policies in place to deal with long-term preservation of those digital resources. The article NEDCC Survey and Colloquium Explore Digital Preservation Policies and Practices outlines the findings from the online survey and explains the colloquium process and results. The experts at the colloquium determined that although self-evaluation is important, surveying by consultants will better serve small and medium-sized institutions."
- North Carolina Department of Cultural Resources. "Digital Preservation Education for NC State Employees." Last updated October 26, 2010. http://digitalpreservation.ncdcr.gov/newtodp.html
Website to help North Carolina state employees understand the fundamentals of digital preservation and how they can produce more durable digital products.
- OCLC WebJunction. Lori Bell and Joe Natale. Best Practices and Planning for Digitization Projects. 2012. http://www.webjunction.org/documents/webjunction/Best_Practices_and_Planning_for_Digitization_Projects.html
Recommendations and resources for small- to medium-sized libraries, archives, and museums planning a digital imaging project.
- University of Colorado Digital Library Digitization Best Practices. Version 1.0. https://www.cu.edu/digitallibrary/cudldigitizationbp.pdf
This document offers an introduction to digitization, provides links to resources containing more information, and describes the recommended digitization parameters for collections in the University of Colorado Digital Library.
Models and Standards:
- CCSDS 650.0-M-2: Reference Model for an Open Archival Information System (OAIS). Magenta Book. June 2012. [This Recommendation has been adopted as ISO 14721:2012] public.ccsds.org/publications/archive/650x0m2.pdf & www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=57284
From the ISO website: "ISO 14721:2012 defines the reference model for an open archival information system (OAIS). An OAIS is an archive, consisting of an organization, which may be part of a larger organization, of people and systems that has accepted the responsibility to preserve information and make it available for a designated community. It meets a set of such responsibilities as defined in this International Standard, and this allows an OAIS archive to be distinguished from other uses of the term "archive". The term "open" in OAIS is used to imply that ISO 14721:2012, as well as future related International Standards, are developed in open forums, and it does not imply that access to the archive is unrestricted."
- DCC Curation Lifecycle Model. http://www.dcc.ac.uk/docs/publications/DCCLifecycle.pdf.
Model to explain the interrelationships between digital curation activities and the life span of documents and data.
- Higgins, Sarah. “The DCC Curation Lifecycle Model.” International Journal of Digital Curation 3, No. 1 (2008): 134-140. http://www.ijdc.net/index.php/ijdc/article/viewFile/69/48
The DCC Curation Lifecycle Model has been developed as a generic, curation-specific, tool which can be used, in conjunction with relevant standards, to plan curation and preservation activities to different levels of granularity. The DCC will use the model: as a training tool for data creators, curators and data users; to organise and plan their resources; and to help organisations identify risks to digital assets and plan management strategies for their successful curation.
- Lee, Christopher. Matrix of Digital Curation Knowledge and Competencies. http://ils.unc.edu/digccurr/digccurr-matrix.html
Defines a 6-dimensional matrix for identifying and organizing the material to be covered in a digital curation curriculum: (1) mandates, values, and principles, (2) functions and skills, (3) professional, disciplinary, institutional, organizational, or cultural context, (4) type of resource, (5) prerequisite knowledge, and (6) transition point in information continuum.
- Lee, Christopher A., Helen R. Tibbo, and John C. Schaefer. “Defining What Digital Curators Do and What They Need to Know: The DigCCurr Project.” Paper presented at the ACM IEEE Joint Conference on Digital Libraries, Vancouver, British Columbia, June 2007. http://www.ils.unc.edu/digccurr/jcdl2007_paper.pdf
This paper summarizes an initial draft and guiding principles behind a matrix of digital curation knowledge and competencies, which are acting as the basis for curriculum design efforts.