==== NAME ====
html2dbk - convert XHTML to DocBook.
==== VERSION ====
This describes version 0.02 of html2dbk.
==== DESCRIPTION ====
This script (and module) converts an XHTML file into DocBook, using both
XSLT and heuristics (as XSLT alone can't do everything).
This script converts "*filename*.html" into "*filename*.xml".
This expects the input file to be correct XHTML. It does not correct files
for you; other programs, such as HTML Tidy (http://tidy.sourceforge.net/),
can do that. (Note that if you use HTML Tidy, don't forget to set
'enclose-block-text', or any unenclosed text will disappear.)
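As a rough sketch of the HTML Tidy setup mentioned above, a configuration
file along these lines should work. The file name "tidy.conf" is only an
illustration, and 'output-xhtml' is an assumption about how you would ask
Tidy for XHTML output; only 'enclose-block-text' is named in this
document.

```
# tidy.conf (hypothetical file name)
# Ask Tidy to write XHTML rather than HTML (assumed to be what you want,
# since html2dbk expects correct XHTML as input).
output-xhtml: yes
# Wrap loose text inside block elements in a paragraph so that it is not
# lost during conversion.
enclose-block-text: yes
```

Tidy can then be pointed at this file with its -config option before
html2dbk is run on the cleaned-up result.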
Note also that this is very simple; it doesn't deal with elements whose
meaning it has no way of guessing. It does not merge multiple XHTML files
into a single document, so each XHTML file is converted into a single
chapter, with each header becoming a section (sect1 to sect5). The first
header is used for both the chapter title and the first section title.
There are likely to be validity errors, depending on how good the original
HTML was. There may be broken links, elements that should have been
converted to a different DocBook element, and overuse of certain elements.
==== REQUIRES ====
Getopt::Long
Pod::Usage
Getopt::ArgvFile
HTML::ToDocBook
Cwd
File::Basename
File::Spec
XML::LibXML
XML::LibXSLT
HTML::SimpleParse
==== AUTHOR ====
Kathryn Andersen (RUBYKAT)
perlkat AT katspace dot com
http://www.katspace.org/tools
==== COPYRIGHT AND LICENCE ====
Copyright (c) 2006 by Kathryn Andersen
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.