=head1 NAME

XML::Records - Perlish record-oriented interface to XML

=head1 SYNOPSIS

  use XML::Records;
  my $p=XML::Records->new('data.lst');
  $p->set_records('credit','debit');
  my ($t,$r);
  # $t is the enclosing element's name, $r the hash of fields
  while ( (($t,$r)=$p->get_record()) && $t) {
    my $amt=$r->{Amount};
    if ($t eq 'debit') {
      ...
    }
  }

=head1 DESCRIPTION

XML::Records provides a simple interface for reading "record-structured" XML documents, that is, documents in which the immediate children of the root element form a sequence of identical and independent sub-elements such as log entries, transactions, etc., each of which consists of "field" child elements or attributes. XML::Records allows you to access each record as a simple Perl hash.

=head1 METHODS

=over 4

=item $reader=XML::Records->new(source, [options]);

Creates a new reader object. I<source> is either a reference to a string containing the XML, the name of a file containing the XML, or an open IO::Handle or filehandle glob reference from which the XML can be read.

The I<options> can be any options allowed by XML::Parser and XML::Parser::Expat, as well as two module-specific options:

=over 4

=item I<Latin>

If set to a true value, causes Unicode characters in the range 128-255 to be returned as ISO-Latin-1 characters rather than UTF-8 characters.

=item I<Catalog>

Specifies the URL of a catalog to use for resolving public identifiers and remapping system identifiers used in document type declarations or external entity references. This option requires XML::Catalog to be installed.

=back

=item $reader->set_records(name [,name]*);

Specifies which XML element-type names enclose records.

=item ($type,$record)=$reader->get_record([name [,name]*]);

Retrieves the next record from the input, skipping through the XML input until it encounters a start tag for one of the elements that enclose records. If arguments are given, they will temporarily replace the set of record-enclosing elements.

The method will return a list consisting of the name of the record's enclosing element and a reference to a hash whose keys are the names of the record's child elements ("fields") and whose values are the fields' contents. If called in scalar context, the return value will be the hash reference alone. Both elements of the list will be undef if no record can be found.

If a field's content is plain text, its value will be that text. If a field's content contains another element (e.g. a <customer> record contains an <address> field that in turn contains other fields), its value will be a reference to another hash containing the sub-record's fields. If a record includes repeated fields, the hash entry for that field's name will be a reference to an array of field values.

Attributes of record or sub-record elements are treated as if they were fields. Attributes of field elements are ignored. Mixed content (fields with both non-whitespace text and sub-elements) will lead to unpredictable results.

Records do not actually need to be immediately below the document root. If a <customers> document consists of a sequence of <customer> elements which in turn contain <address> elements that include further elements, then calling get_record with the record type set to "address" will return the contents of each <address> element.

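
The rules above mean that a field's value may be plain text, a hash reference, or an array reference, depending on the input. The following is only a sketch of one way to cope with that. C<field_values> is a hypothetical helper, not part of XML::Records, and the hash is a hand-built stand-in for what get_record might return for a <customer> element, not real parser output.

  use strict;
  use warnings;

  # Hypothetical helper (not part of XML::Records): return a field's
  # values as a flat list, whether the field was absent, occurred
  # once, or was repeated.
  sub field_values {
    my ($record, $name) = @_;
    return () unless exists $record->{$name};
    my $value = $record->{$name};
    return ref $value eq 'ARRAY' ? @$value : ($value);
  }

  # Hand-built stand-in for a record: a plain-text field, a
  # sub-record, and a repeated field.
  my $record = {
    name    => 'Alice',
    address => { city => 'Springfield', zip => '12345' },
    phone   => [ '555-0100', '555-0101' ],
  };

  # Repeated and single fields can now be handled by the same loop.
  print "phone: $_\n" for field_values($record, 'phone');
  print "name: $_\n"  for field_values($record, 'name');

  # A sub-record is an ordinary nested hash.
  print "city: $record->{address}{city}\n";

Normalizing to a list at the point of use avoids repeating the C<ref $value eq 'ARRAY'> check for every field that may or may not be repeated.
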
=back

=head1 EXAMPLE

Print a list of package names from a (rather out-of-date) list of XML modules:

  #!perl -w
  use strict;
  use XML::Records;
  my $p=XML::Records->new('modules.xml') or die "$!";
  $p->set_records('module');
  while (my $record=$p->get_record()) {
    my $pkg=$record->{package};
    if (ref $pkg eq 'ARRAY') {
      for my $subpkg (@$pkg) {
        print $subpkg->{name},"\n";
      }
    }
    else {
      print $pkg->{name},"\n";
    }
  }

=head1 RATIONALE

XML::RAX, which implements the proposed RAX standard for record-oriented XML access, does most of what XML::Records does, but its interface is not very Perlish (due to the fact that RAX is a language-independent interface) and it cannot cope with fields that have sub-structure (because RAX itself doesn't address the issue).

XML::Simple can do everything that XML::Records does, at the expense of reading the entire document into memory. XML::Records will read the entire document into a single hash if you set the root element as a record type, but you're really better off using XML::Simple in that case as it's optimized for such usage.

=head1 AUTHOR

Eric Bohlman (ebohlman@earthlink.net, ebohlman@omsdev.com)

=head1 COPYRIGHT

Copyright 2001 Eric Bohlman. All rights reserved.

This program is free software; you can use/modify/redistribute it under the same terms as Perl itself.

=head1 SEE ALSO

  XML::Parser
  XML::RAX
  XML::Simple
  XML::Catalog
  perl(1).

=cut