Offline web browsing

Here is a new version of a simple script I have written to create local copies of websites suitable for browsing offline. We have been using the program successfully at the university to schedule website downloads during off-peak hours of internet usage, for reading the following day.

The program uses the *nix wget utility to do its magic. My code is simply a wrapper around wget that sets the proper command-line arguments for wget to mirror a website. By default the script uses conservative fetch settings in order to be respectful to website owners and other users of the network. Once a site is downloaded, the program automatically packs the files into a tar.gz archive for you. You will need Python and wget installed in order to run it.
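
If you are curious how it works, here is a minimal sketch of the idea in Python. It is not the actual script; the particular wget flags and the wait/rate values are just my illustration of what "conservative settings" might look like:

    import subprocess
    import tarfile

    def mirror_site(url, wait_seconds=2, rate_limit="50k"):
        """Mirror a site politely with wget, then pack it into a tar.gz."""
        cmd = [
            "wget",
            "--mirror",           # recursive download with timestamping
            "--convert-links",    # rewrite links so pages browse offline
            "--page-requisites",  # also fetch images, CSS, etc.
            "--wait", str(wait_seconds),  # pause between requests
            "--limit-rate", rate_limit,   # cap bandwidth usage
            url,
        ]
        subprocess.check_call(cmd)

        # wget saves the mirror under a directory named after the host
        host = url.split("/")[2]
        with tarfile.open(host + ".tar.gz", "w:gz") as tar:
            tar.add(host)

Being polite with --wait and --limit-rate matters doubly when the downloads are scheduled on a shared university connection.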

Here are some basic examples of how it can be used:

To view the command's help:

    ./offline_browser --help

To create a browsable copy of this website, you can type:

    ./offline_browser http://www.saintsjd.com/malawi

To create a browsable copy of this website and clean up all downloaded files, keeping only the final tar.gz archive, use the -c option:

    ./offline_browser -c http://www.saintsjd.com/malawi
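
The clean-up step itself is nothing fancy. Here is a sketch of how it might be done, assuming wget has saved the mirror into a directory named after the host and the archive has already been written:

    import os
    import shutil

    def clean_up(mirror_dir, archive):
        """Delete the mirrored tree, keeping only the tar.gz archive."""
        if os.path.exists(archive):  # only clean up if the archive was written
            shutil.rmtree(mirror_dir)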

Some webmasters block requests from wget. To identify yourself as if you were browsing with Internet Explorer, type:

    ./offline_browser -U IE http://www.saintsjd.com/malawi

Or for Firefox identification, with clean-up:

    ./offline_browser -U FF -c http://www.saintsjd.com/malawi
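
Under the hood, the -U shorthand just selects a full user-agent string to pass to wget's --user-agent option. Here is a sketch of the mapping; the exact strings below are examples of period browser identifiers, not necessarily the ones the script sends:

    # Shorthand -> full user-agent string for wget's --user-agent option.
    USER_AGENTS = {
        "IE": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
        "FF": "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1) Gecko/20061010 Firefox/2.0",
    }

    def user_agent_args(shorthand):
        """Translate the -U shorthand into extra wget arguments."""
        if shorthand in USER_AGENTS:
            return ["--user-agent=" + USER_AGENTS[shorthand]]
        return []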

Have fun, and send me your bugs and improvements!


Attachment: offline_browser (4.54 KB)