ParaTools 1.00 Documentation - How-To Guides
On Linux systems, running 'locate Utils.pm' should find the file; otherwise 'find / -name Utils.pm' should work. Alternatively, you can edit the Utils.pm in the DocParser/ directory of an unpacked distribution and install it once you have finished.
Add the converter to the list. If you are editing an already installed Utils.pm file, you will probably need to be root to do this. If you are editing the Utils.pm inside an unpacked distribution, you will have to reinstall the modules once you are finished (see the Installation section).
The %CONVERTERS hash maps each file extension to a converter command. Within the command, _IN_ is replaced by the input file and _OUT_ is replaced by the output (ASCII) file. For example:
html => "links --dump _IN_ > _OUT_"
This takes an input HTML file (say, in.html) and an output ASCII file (out.txt), and carries out 'links --dump in.html > out.txt'.
NB: Don't forget a comma after your converter.
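To make the placeholder substitution concrete, here is a minimal sketch of how a converter command could be built from such a hash. It is not the ParaTools implementation: the build_command helper is hypothetical, and the 'pdf' entry is an illustrative assumption added only to show where the comma goes between entries.

```perl
use strict;
use warnings;

# Converter table in the style described above. The 'html' entry comes from
# this guide; the 'pdf' entry is an assumed example of an added converter.
my %CONVERTERS = (
    html => "links --dump _IN_ > _OUT_",
    pdf  => "pdftotext _IN_ _OUT_",
);

# Hypothetical helper (not part of ParaTools): pick the converter by file
# extension and fill in the _IN_ and _OUT_ placeholders.
sub build_command {
    my ($infile, $outfile) = @_;

    # Take the text after the last dot as the extension.
    my ($ext) = $infile =~ /\.([^.\/]+)$/ or return;

    # No converter registered for this extension.
    my $template = $CONVERTERS{lc $ext} or return;

    (my $cmd = $template) =~ s/_IN_/$infile/g;
    $cmd =~ s/_OUT_/$outfile/g;
    return $cmd;
}

print build_command("in.html", "out.txt"), "\n";
# prints: links --dump in.html > out.txt
```

Note that this sketch does no shell quoting, so file names containing spaces or shell metacharacters would need extra care before the command is run.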
All new document parsers should be named Biblio::DocParser::SomeName, where SomeName is replaced with a unique name (ideally the author's surname). The parser should extend the Biblio::DocParser module like so:
package Biblio::DocParser::SomeName;
require Exporter;
@ISA = ("Exporter", "Biblio::DocParser");
our @EXPORT_OK = ( 'new', 'parse' );
You should then override the 'new' and 'parse' methods, e.g.:
sub new {
    my($class) = @_;
    my $self = {};
    return bless($self, $class);
}
sub parse {
    my($self, $lines, %options) = @_;
    # Do something with the lines
    my @lines = split("\n", $lines);
    my @references = get_refs(@lines);
    return @references;
}
This makes it easy for users to swap out one document parser for another.
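The following sketch illustrates what swapping parsers can look like in practice. Everything in it is an assumption made for illustration: the Biblio::DocParser package defined here is only a stand-in so the example runs on its own (with ParaTools installed you would load the real module instead), and the Bracketed and Numbered parsers are hypothetical.

```perl
use strict;
use warnings;

# Stand-in for the real base class, defined here only so that this sketch
# is self-contained. It is not the actual Biblio::DocParser code.
package Biblio::DocParser;
sub new   { my ($class) = @_; return bless {}, $class; }
sub parse { die "parse() must be overridden by a subclass\n"; }

# Hypothetical parser: treats each line starting with "[n]" as a reference.
package Biblio::DocParser::Bracketed;
our @ISA = ('Biblio::DocParser');
sub parse {
    my ($self, $lines, %options) = @_;
    return grep { /^\[\d+\]/ } split /\n/, $lines;
}

# Hypothetical parser: treats each line starting with "n." as a reference.
package Biblio::DocParser::Numbered;
our @ISA = ('Biblio::DocParser');
sub parse {
    my ($self, $lines, %options) = @_;
    return grep { /^\d+\./ } split /\n/, $lines;
}

package main;

my $text = "Intro\n[1] Smith 2001\n1. Jones 1999\n[2] Brown 2003\n";

# Because both classes share the same new/parse interface, the calling code
# only needs the class name to change.
for my $class ('Biblio::DocParser::Bracketed', 'Biblio::DocParser::Numbered') {
    my $parser = $class->new;
    my @refs   = $parser->parse($text);
    print "$class found ", scalar(@refs), " reference(s)\n";
}
```

The calling loop never mentions a specific parser's internals, which is the property the shared 'new' and 'parse' interface is meant to provide.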