Snarfing up the pieces
--#--
Since we're going to be scraping up a whole bunch of HTML,
and we really only want specific nuggets of information
embedded in that HTML, we need a tool that is very adept
at parsing and winnowing strings of data. Perl is the tool
of choice. (We'll see later how Java can be made to work
nearly as well.)
The fortunate thing about Perl is that it has so many good
people writing useful tools and sharing those tools with,
well, with everybody. Free.
An essential tool for Perl screen-scraping is LWP (libwww-perl)
by Gisle Aas and Martijn Koster. You can find it at
http://www.perl.com/CPAN/modules/by-module/LWP
along with dozens of other free Perl modules.
Get LWP installed (and Perl, too, if you haven't already) and
you're ready to start.
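
To give a feel for what the fetch-and-winnow job looks like, here is a
minimal sketch, not the finished scraper. It assumes LWP::Simple (part of
the LWP distribution) is what does the fetching. The script name snarf.pl,
the subroutine names extract_title and fetch_page, and the example.com
address are all made up for illustration. The regular expression is a quick
stand-in for real HTML parsing and will miss oddly formed pages.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pull the text of the <title> element out of a chunk of HTML.
# A regex is enough for a sketch; it is not a real HTML parser.
sub extract_title {
    my ($html) = @_;
    return unless defined $html;
    if ($html =~ m{<title[^>]*>\s*(.*?)\s*</title>}is) {
        return $1;
    }
    return;    # no title found
}

# Fetch a page with LWP::Simple's get(), which returns undef on failure.
# LWP is loaded at run time (an assumption made for this sketch) so the
# extraction code above can be tried on a plain string first.
sub fetch_page {
    my ($url) = @_;
    require LWP::Simple;
    return LWP::Simple::get($url);
}

# Hypothetical usage: perl snarf.pl http://www.example.com/
if (@ARGV) {
    my $url  = shift @ARGV;
    my $html = fetch_page($url);
    die "Could not fetch $url\n" unless defined $html;
    my $title = extract_title($html);
    print defined $title ? "$title\n" : "(no title found)\n";
}
```

Keeping the fetch and the winnowing in separate subroutines means the
pattern can be tuned against a saved copy of a page without hitting the
network every time.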