Practice

Title Updated
Diigo

Diigo provides a browser add-on that can improve your research productivity. As you read on the web, instead of just bookmarking pages, you can highlight the portions that are of particular interest to you and attach sticky notes to specific parts of a page.

6 years 42 weeks ago
Heritrix

Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

6 years 42 weeks ago
HTTrack

HTTrack is a free and easy-to-use offline browser utility. It allows you to download a World Wide Web site from the Internet to a local directory, recursively building all directories and getting HTML, images, and other files from the server to your computer. HTTrack preserves the original site's relative link structure. Simply open a page of the "mirrored" website in your browser, and you can browse the site from link to link, as if you were viewing it online.
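To illustrate what preserving a relative link structure involves, here is a minimal sketch in Python. It is not HTTrack's actual code or directory layout. The function names `local_path` and `relative_link`, the `mirror` root directory, and the `index.html` default are all assumptions made for this example.

```python
import posixpath
from urllib.parse import urlsplit


def local_path(url, root="mirror"):
    """Map a URL to a file path under a local mirror directory (hypothetical layout)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Directory-style URLs need a file name on disk; index.html is an assumption.
    if path.endswith("/"):
        path += "index.html"
    return posixpath.join(root, parts.netloc, path.lstrip("/"))


def relative_link(from_url, to_url, root="mirror"):
    """Return the link a mirrored copy of from_url would use to reach to_url."""
    start_dir = posixpath.dirname(local_path(from_url, root))
    return posixpath.relpath(local_path(to_url, root), start_dir)
```

With this layout, a link from `http://example.org/docs/a.html` to `http://example.org/img/logo.png` would be rewritten as `../img/logo.png`, so the mirrored copy can be browsed without a network connection.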

6 years 42 weeks ago
Metaproducts

Metaproducts offers several commercial capture and off-line browsing tools.

6 years 42 weeks ago
mod_oai

The goal of the mod_oai project is to bring the efficiency of OAI-PMH to everyday web sites.
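For readers unfamiliar with OAI-PMH, the sketch below builds a `ListRecords` request URL, the kind of request a harvester sends to ask a repository only for records changed within a date range. The `verb`, `metadataPrefix`, `from`, and `until` parameters come from the OAI-PMH protocol. The helper name `list_records_url` and the base URL are hypothetical, and nothing here is specific to how mod_oai itself is configured.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, until_date=None):
    """Build an OAI-PMH ListRecords request URL (helper name is illustrative)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # Selective harvesting: restrict the request to records changed in a date range.
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # http://example.org/oai is a placeholder endpoint, not a real repository.
    print(list_records_url("http://example.org/oai", from_date="2024-01-01"))
```

The date-range parameters are what make incremental harvesting cheaper than recrawling a whole site: a client only needs to ask for what has changed since its last visit.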

6 years 42 weeks ago
The Nalanda iVia Focused Crawler

The Nalanda iVia Focused Crawler (NIFC) is a focused Web crawler. It was created by Dr. Soumen Chakrabarti (Indian Institute of Technology Bombay) and developed with the support of IIT Bombay, the iVia Team and the U.S. Institute of Museum and Library Services.

6 years 42 weeks ago
NetarchiveSuite

NetarchiveSuite is a complete web archiving software package developed within the netarchive.dk project from 2004 onwards. Its primary function is to plan, schedule and run web harvests of parts of the Internet. It scales to a wide range of tasks, from small, thematic harvests (e.g. related to special events or special domains) to harvesting and archiving the content of an entire national domain.

6 years 42 weeks ago
pageVault

pageVault supports the archiving of all unique responses generated by a web server. It allows you to know exactly what information you have published on your web site, whether static pages or dynamically generated content, regardless of format (HTML, XML, PDF, zip, Microsoft Office formats, images, sound) or rate of change.
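One common way to archive only unique responses is to key each response body by a content hash, so identical responses are stored once no matter how often they are served. The sketch below shows that general idea. It is an assumption for illustration, not a description of pageVault's internals, and the class name `UniqueResponseArchive` and its methods are hypothetical.

```python
import hashlib


class UniqueResponseArchive:
    """In-memory sketch: store each distinct response body once, keyed by SHA-256."""

    def __init__(self):
        self._bodies = {}   # digest -> response body
        self._by_url = {}   # url -> digests in the order first seen

    def record(self, url, body):
        """Record a response; return True if this body was not archived before."""
        digest = hashlib.sha256(body).hexdigest()
        is_new = digest not in self._bodies
        if is_new:
            self._bodies[digest] = body
        digests = self._by_url.setdefault(url, [])
        if digest not in digests:
            digests.append(digest)
        return is_new

    def versions(self, url):
        """Return every distinct body that has been served for a URL."""
        return [self._bodies[d] for d in self._by_url.get(url, [])]
```

Because storage grows only when the content changes, this approach handles frequently requested but rarely changing pages as cheaply as static ones.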

6 years 42 weeks ago
Spadix software

Spadix Software can download websites starting from a URL, search engine results, or web directories, and can follow external links. It also supports filtering and the crawling of password-protected sites.

6 years 42 weeks ago
Sparkleware

Sparkleware is a commercial off-line browser.

6 years 42 weeks ago