XML-Reader Version 0.49
Reading XML and providing path information based on a pull-parser.
SYNOPSIS
use XML::Reader;
my $text = q{n tm r};
my $rdr = XML::Reader->new(\$text);
while ($rdr->iterate) {
printf "Path: %-19s, Value: %s\n", $rdr->path, $rdr->value;
}
This program produces the following output:
Path: /init , Value: n t
Path: /init/page/@node , Value: 400
Path: /init/page , Value: m r
Path: /init , Value:
DESCRIPTION
XML::Reader provides a simple and easy to use interface for sequentially parsing XML
files (so called "pull-mode" parsing) and at the same time keeps track of the complete XML-path.
It was developped as a wrapper on top of XML::Parser (while, at the same time, some basic functions
have been copied from XML::TokeParser). Both XML::Parser and XML::TokeParser allow pull-mode
parsing, but do not keep track of the complete XML-Path. Also, the interfaces to XML::Parser and
XML::TokeParser require you to distinguish between start-tags, end-tags and text on seperate lines,
which, in my view, complicates the interface (although, XML::Reader allows option {filter => 4, mode => 'pyx'} which
emulates start-tags, end-tags and text on separate lines, if that's what you want).
There is also XML::TiePYX, which lets you pull-mode parse XML-Files (see
L for an introduction to PYX).
But still, with XML::TiePYX you need to account for start-tags, end-tags and text, and it does not
provide the full XML-path.
By contrast, XML::Reader translates start-tags, end-tags and text into XPath-like expressions. So
you don't need to worry about tags, you just get a path and a value, and that's it. (However, should
you wish to operate XML::Reader in a PYX compatible mode, there is option {filter => 4, mode => 'pyx'}, as mentioned
above, which allows you to parse XML in that way).
But going back to the normal mode of operation, here is an example XML in variable '$line1':
my $line1 =
q{
- abc
-
fgh
ooo ppp
};
This example can be parsed with XML::Reader using the methods C to iterate one-by-one through the
XML-data, C and C to extract the current XML-path and its value.
You can also keep track of the start- and end-tags: There is a method C, which returns 1 or
0, depending on whether the XML-file had a start tag at the current position. There is also the
equivalent method C.
There are also the methods C, C, C and C. C gives you the current tag-name,
C returns the attribute-name, C returns either 'T' for text or '@' for attributes and
C indicates the current nesting-level (a number >= 0).
Here is a sample program which parses the XML in '$line1' from above to demonstrate the principle...
use XML::Reader;
my $rdr = XML::Reader->new(\$line1);
my $i = 0;
while ($rdr->iterate) { $i++;
printf "%3d. pat=%-22s, val=%-9s, s=%-1s, e=%-1s, tag=%-6s, atr=%-6s, t=%-1s, lvl=%2d\n", $i,
$rdr->path, $rdr->value, $rdr->is_start, $rdr->is_end, $rdr->tag, $rdr->attr, $rdr->type, $rdr->level;
}
...and here is the output:
1. pat=/data , val= , s=1, e=0, tag=data , atr= , t=T, lvl= 1
2. pat=/data/item , val=abc , s=1, e=1, tag=item , atr= , t=T, lvl= 2
3. pat=/data , val= , s=0, e=0, tag=data , atr= , t=T, lvl= 1
4. pat=/data/item , val= , s=1, e=0, tag=item , atr= , t=T, lvl= 2
5. pat=/data/item/dummy , val= , s=1, e=1, tag=dummy , atr= , t=T, lvl= 3
6. pat=/data/item , val=fgh , s=0, e=0, tag=item , atr= , t=T, lvl= 2
7. pat=/data/item/inner/@id , val=fff , s=0, e=0, tag=@id , atr=id , t=@, lvl= 4
8. pat=/data/item/inner/@name, val=ttt , s=0, e=0, tag=@name , atr=name , t=@, lvl= 4
9. pat=/data/item/inner , val=ooo ppp , s=1, e=1, tag=inner , atr= , t=T, lvl= 3
10. pat=/data/item , val= , s=0, e=1, tag=item , atr= , t=T, lvl= 2
11. pat=/data , val= , s=0, e=1, tag=data , atr= , t=T, lvl= 1
INSTALLATION
To install this module, run the following commands:
perl Makefile.PL
make
make test
make install
SUPPORT AND DOCUMENTATION
After installing, you can find documentation for this module with the
perldoc command.
perldoc XML::Reader
You can also look for information at:
RT, CPAN's request tracker
http://rt.cpan.org/NoAuth/Bugs.html?Dist=XML-Reader
AnnoCPAN, Annotated CPAN documentation
http://annocpan.org/dist/XML-Reader
CPAN Ratings
http://cpanratings.perl.org/d/XML-Reader
Search CPAN
http://search.cpan.org/dist/XML-Reader/
COPYRIGHT AND LICENSE
Copyright (C) 2009-2011 by Klaus Eichner
All rights reserved. This program is free software; you can redistribute
it and/or modify it under the terms of the artistic license 2.0,
see http://www.opensource.org/licenses/artistic-license-2.0.php