Archiving Web Sites - Selection

Q. What do I need to consider for developing a selection policy?

After you have identified what web site content you have and what content you may be acquiring, you will have to decide what content you will continue to preserve and what content you should acquire. With digital storage costs dropping, it may seem that you can capture and preserve everything, but remember that while storage is cheap, managed storage is expensive. For every web site you wish to preserve, you must also collect and store the metadata necessary to understand and access it in the future.


Take action

  • Set selection policy goals to meet the institutional mission
  • Build the basis for planning, analysis, and coordination with other archives if needed
  • Set priorities
  • Consider elements that policy should address: content and scope; target audience; anticipated use; depth of coverage; exclusions; review and revision
  • Develop policies
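
One way to keep track of the policy elements listed above is to record them in a simple structure and check which ones are still unaddressed. The following is only a minimal sketch: the class name `SelectionPolicy`, its field names, and the `missing_elements` helper are hypothetical and chosen for illustration, not taken from any of the policies cited on this page.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical record of the elements a selection policy should address.
# Field names mirror the list above; they are illustrative assumptions only.
@dataclass
class SelectionPolicy:
    content_and_scope: str = ""
    target_audience: str = ""
    anticipated_use: str = ""
    depth_of_coverage: str = ""
    exclusions: List[str] = field(default_factory=list)
    review_and_revision: str = ""

    def missing_elements(self) -> List[str]:
        """Return the names of elements that have not been filled in yet."""
        names = [
            "content_and_scope",
            "target_audience",
            "anticipated_use",
            "depth_of_coverage",
            "exclusions",
            "review_and_revision",
        ]
        return [name for name in names if not getattr(self, name)]


# Example draft with only some elements decided so far.
draft = SelectionPolicy(
    content_and_scope="Web sites published by the institution",
    target_audience="Researchers and staff",
    exclusions=["Password-protected pages"],
)
print(draft.missing_elements())
```

A checklist like this does not replace a written policy; it only makes it easier to see which elements still need a decision before the policy is drafted.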

Review use cases

  • Library and Archives Canada, "Digital Collection Development Policy."  Last updated January 18, 2007.

    This policy indicates the directions Library and Archives Canada takes to ensure the collection of digital documentary heritage materials of enduring interest to the history and culture of Canada, and in collaboration with others, to enable the collection of other digital information resources of value to Canadians.

  • National Library of Australia, "Selection Guidelines."  Last updated April 27, 2011.

    Provides specific selection guidelines for each of PANDORA's participating agencies.

  • National Library of Australia, "Policy and Practice Statement."  Last updated October 5, 2011.

    PANDORA, Australia’s Web Archive, is a collection of Australian online publications and web sites which is being built by the National Library of Australia and ten other participants. This initiative was commenced in 1996 by the National Library in recognition of the fact that an increasing volume of Australia’s documentary heritage was being published in online formats only.  Given the mandate under the National Library Act, 1960 to build a comprehensive collection of Australian published materials, collecting online resources was seen as a necessary extension of the Library’s collecting responsibilities.


  • International Federation of Library Associations and Institutions, Section on Acquisition and Collection Development.  "Guidelines for a Collection Development Policy using the Conspectus Model."  March 2001. (subscription required to access this resource)

    This booklet is a brief guide on how to write a collection development policy, making use of the Conspectus methodology.  It is the result of the recognition by the IFLA Acquisition and Collection Development Section that its worldwide members lacked a handy introduction to this important subject.  The guide is intended to be of particular value to staff new to collection development and in areas where there is little written tradition of collection development.  We hope that it will be of practical use to librarians setting out on the sometimes daunting task of writing a collection development policy.

  • Library of Congress.  "Collection Policy Statements Supplementary Guideline."  November 2008.

    Focuses on Web Archiving, with sections on Scope, Research Strengths, Collecting Policy, Acquisition Source: Current and Future, and Collecting Levels.

  • Viégas, Fernanda B. “Bloggers’ Expectations of Privacy and Accountability: An Initial Survey.” Journal of Computer-Mediated Communication 10, no. 3 (2005).

    "This article presents an initial snapshot, based on an online survey of weblog authors, of bloggers' subjective sense of privacy, and of their perceptions of liability. The findings suggest that the social norms of bloggers are emergent and self-imposed. When confronted with questions of defamation and legal liability, respondents in the survey expressed contradictions between their actions and their knowledge of how the technology works. They generally believed that they were liable for what they published online, although they were not concerned about the persistence of their entries. In general, bloggers do not feel as if they know their audiences. For the most part, blog authors have no control over who accesses their entries, and this inability to define their audiences leads them to make a number of assumptions about who their readers are."

Last updated on 08/26/13, 9:54 pm by callee


