E-book, English, 234 pages
Masanès: Web Archiving
1st edition, 2007
ISBN: 978-3-540-46332-0
Publisher: Springer Berlin Heidelberg
Format: PDF
Copy protection: 1 - PDF Watermark
This book assembles contributions from computer scientists and librarians that together cover the complete range of tools, tasks and processes needed to successfully preserve the cultural heritage of the Web. It combines the librarian's application knowledge with the computer scientist's implementation knowledge, and serves as a standard introduction for everyone involved in keeping alive the immense amount of online information.
Julien Masanès is a cofounder and the Director of the European Archive, a non-profit foundation for Web preservation and digital cultural access. Before that, he directed the Web Archiving Project at the Bibliothèque nationale de France (BnF), starting in 2000. He was also actively involved in the creation of the International Internet Preservation Consortium (IIPC), which he coordinated during its first two years. He is a curator and received a degree in librarianship from ENSSIB (Lyon) in 1999. He was a digital preservation adviser at BnF and has participated in various national and international initiatives in this domain, such as the European Project NEDLIB, the Global Digital Format Registry and the OCLC/RLG Open Archive Information System Certification Group. He has published extensively in this field, and he launched and currently chairs the International Web Archiving Workshop (IWAW) series, the main international rendezvous in this domain.
Authors/Editors
Further Information & Material
Contents
1 Web Archiving: Issues and Methods
  1.1 Introduction
  1.2 Heritage, Society, and the Web
  1.3 Web Characterization in Relation to Preservation
  1.4 New Methods for a New Medium
  1.5 Current Initiatives Overview
  1.6 Conclusion
  References
2 Web Use and Web Studies
  2.1 Summary
  2.2 Content Analysis
  2.3 Surveys
  2.4 Rhetorical Analysis
  2.5 Discourse Analysis
  2.6 Visual Analysis
  2.7 Ethnography
  2.8 Network Analysis
  2.9 Ethical Considerations
  2.10 Conclusion
  References
3 Selection for Web Archives
  3.1 Introduction
  3.2 Defining a Selection Policy
  3.3 Issues and Concepts
  3.4 Selection Process
  3.5 Documentation
  3.6 Conclusion
  References
4 Copying Websites
  4.1 Introduction – The Art of Copying Websites
  4.2 The Parser
  4.3 Fetching Document
  4.4 Create an Autonomous, Navigable Copy
  4.5 Handling Updates
  4.6 Conclusion
  Reference
5 Archiving the Hidden Web
  5.1 Introduction
  5.2 Finding At Least One Path to Documents
  5.3 Characterizing the Hidden Web
  5.4 Client Side Hidden Web Archiving
  5.5 Crawler-Server Collaboration
  5.6 Archiving Documentary Gateways
  5.7 Conclusion
  References
6 Access and Finding Aids
  6.1 Introduction
  6.2 Registration
  6.3 Indexing and Search Engines
  6.4 Access Tools and User Interface
  6.5 Case Studies
  Acknowledgements
  References
7 Mining Web Collections
  7.1 Introduction
  7.2 Material for Web Archives
  7.3 Other Types of Information
  7.4 Use Cases
  7.5 Conclusion
8 The Long-Term Preservation of Web Content
  8.1 Introduction
  8.2 The Challenge of Long-Term Digital Preservation
  8.3 Developing Trusted Digital Repositories
  8.4 Digital Preservation Strategies
  8.5 Preservation Metadata
  8.6 Digital Preservation and the Web
  8.7 Conclusion
  Acknowledgements
  References
9 Year-by-Year: From an Archive of the Internet to an Archive on the Internet
  9.1 Introduction
  9.2 Background: Early Internet Publishing
  9.3 1996: Launch of the Internet Archive
  9.4 1997: Link Structure and Tape Robots
  9.5 1998: Getting Archive Data Onto (Almost) Every Desktop
  9.6 1999: From Tape to Disk, A New Crawler, and Moving Images
  9.7 2000: Building Thematic Web Collections
  9.8 2001: Public Access with the Wayback Machine: The 9/11 Archive
  9.9 2002: The Library of Alexandria, The Bookmobile, and Copyrights
  9.10 2003: Extending Our Reach via National Libraries and Educational Institutions
  9.11 2004: And the European Archive and the Petabox
  9.12 The Future
  References
10 Small Scale Academic Web Archiving: DACHS
  10.1 Why Small Scale Academic Archiving?
  10.2 Digital Archive for Chinese Studies
  10.3 Lessons Learned: Summing Up
  10.4 Useful Resources
List of Acronyms
Index




