Net::Z3950::AsyncZ - Perl extension for the Z3950 module
use Net::Z3950::AsyncZ;
use Net::Z3950::AsyncZ qw(:record :headers :errors);
use Net::Z3950::AsyncZ qw(asyncZOptions isZ_MARC isZ_GRS isZ_RAW isZ_DEFAULT noZ_Response isZ_Header isZ_ServerName Z_serverName);
my $asyncZ = Net::Z3950::AsyncZ->new(servers=>\@servers, query=>$query, cb=>\&output);
my $asyncZ = Net::Z3950::AsyncZ->new(
    servers=>\@servers, query=>$query, timeout=>$tm,
    num_to_fetch=>$num, cb=>\&output, options=>\@options,
    log=>$log, format=>\&format, timeout_min=>$min,
    interval=>$interval, maxpipes=>$max,
);
my @servers = (
    [ 'amicus.nlc-bnc.ca', 210, 'NL' ],
    [ 'bison.umanitoba.ca', 210, 'MARION' ],
    [ 'library.anu.edu.au', 210, 'INNOPAC' ]
);
my $query = ' @attr 1=1003 "Henry James" ';
my $asyncZ = Net::Z3950::AsyncZ->new(servers=>\@servers, query=>$query, cb=>\&output);
\&output is a reference to a callback function which outputs the records returned by the servers. Basically, the callback function gets the records in the form of an array, in which each element of the array is a line of the record. At the simplest level, you just loop through the array, printing each line and a newline.
my $asyncZ = Net::Z3950::AsyncZ->new(servers=>\@servers, query=>$query, cb=>\&output, log=>"errors.log", num_to_fetch=>10);
Same as Example 1, but requesting 10 records from each server instead of the default 5, and setting a log for debug error output.
my @servers = (
    [ 'amicus.nlc-bnc.ca', 210, 'NL' ],
    [ 'bison.umanitoba.ca', 210, 'MARION' ],
    [ 'library.anu.edu.au', 210, 'INNOPAC' ]
);
my $query = ' @attr 1=1003 "Henry James" ';
my @options = (
    asyncZOptions(num_to_fetch=>5, log=>"bison_errors.log"),                  # amicus
    asyncZOptions(num_to_fetch=>10, query=>' @attr 1=1003 "James Joyce" '),   # bison
    undef                                                                     # library.anu.edu.au
);
$options[0]->set_GRS1();
my $asyncZ = Net::Z3950::AsyncZ->new(servers=>\@servers, query=>$query, cb=>\&output, options=>\@options, log=>"errors_main.log");
Here we set options which apply to individual servers in the @options array. asyncZOptions returns a reference to a Net::Z3950::AsyncZ::Options::_params object; we can pass into it the options we want to set for individual servers. We have not defined a _params object for library.anu.edu.au, so a default _params object will be created for it.
As you can see, we can set different queries for different servers;
we can set separate logs, assuming we want to track errors separately--
we can even suppress error reporting on an individual basis.
In the case of 'amicus', we have asked that the preferredRecordSyntax be set to Net::Z3950::RecordSyntax::GRS1, since the National Library of Canada uses GRS-1 as its default output; we could also have done that in the call to asyncZOptions:
asyncZOptions(preferredRecordSyntax=>Net::Z3950::RecordSyntax::GRS1);
In addition to detailed logging of error messages, there's also error reporting aimed at the user, to inform users when records haven't been returned. See Errors below.
Net::Z3950::AsyncZ adds additional asynchronous support for the Z3950 module through the use of multiple forked processes.
Net::Z3950::AsyncZ adds an additional layer of asynchronous support for the Z3950 module through the use of multiple forked processes. Users may also find that it provides a convenient front end to Z3950.
My own experience with Z3950 async mode was that I could connect to servers and get back the number of records waiting to be fetched, but I was unable to retrieve the records themselves.
The Z3950 documentation talks about this situation:
when the connection is asynchronous, the errcode() may be zero, indicating simply that the record has not yet been fetched from the server. In this case, the calling code should try again later. (How much later? As a rule of thumb, after it's done ``something else'', such as request another record or issue another search.)
The documentation promises to provide user code for asynchronous access at a later date, and since synchronous access is apparently written on top of asynchronous code, the techniques for the async mode no doubt exist. But I searched the mailing list archive and couldn't find anything relevant. So, at the risk of carrying coals to Newcastle, I wrote AsyncZ.
AsyncZ forks off maxpipes processes at a time. After these processes have returned and reported their results, or after a timeout period, the next set of maxpipes processes is forked off, and so forth. An Event loop is set in motion that enables AsyncZ to wait for results--either records or error messages--to return from the Z39.50 servers. Records are passed through, in the order in which they arrive, to a callback function (cb), which you supply and which outputs the records.
Each of the forked processes, in turn, runs in its own Event loop while waiting for results to return from the server. The two-fold purpose of these loops, local to each forked process, is:
[1] to help ensure that a request to a server doesn't get swallowed up on the network and never return, causing a script or program to hang;
[2] to set a timeout on how long you are prepared to wait for a response.
The loop in the child process is not always enough in itself to prevent a script from hanging; for such cases you can set a monitor which will kill the main process after a timeout period. See the discussion of monitor in Options.pod (Options.html in the HTML documentation).
Various conditions may be responsible for the failure to receive records from a server. In some circumstances, such as timing out, it may be worth a second try. In such cases AsyncZ will try the server a second time. (I refer to these two tries as two cycles.)
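The batching and two-cycle behaviour described above can be sketched in plain Perl. This is only an illustration under stated assumptions, not the module's implementation: it uses bare fork and waitpid instead of an Event loop, it has no timeout, and the %outcome table, run_cycle and query_all are hypothetical names that stand in for real server queries.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical outcome table: the exit code each "server" produces in
# cycle 1 and cycle 2 (0 = success). It stands in for real network calls.
my %outcome = (
    alpha => [0, 0],
    beta  => [1, 0],   # fails once, succeeds on the second try
    gamma => [1, 1],   # fails in both cycles
    delta => [0, 0],
    eps   => [0, 0],
);

# Fork at most $max children at a time and collect their exit codes.
sub run_cycle {
    my ($names, $cycle, $max) = @_;
    my %status;
    my @queue = @$names;
    while (my @batch = splice(@queue, 0, $max)) {
        my %pid_to_name;
        for my $name (@batch) {
            my $pid = fork();
            die "fork failed: $!" unless defined $pid;
            if ($pid == 0) {
                # Child: pretend to query a server, then leave immediately.
                POSIX::_exit($outcome{$name}[$cycle]);
            }
            $pid_to_name{$pid} = $name;
        }
        # Parent: wait for the whole batch before starting the next one.
        for my $pid (keys %pid_to_name) {
            waitpid($pid, 0);
            $status{ $pid_to_name{$pid} } = $? >> 8;
        }
    }
    return \%status;
}

# Cycle 1 for every server, then cycle 2 for those that failed.
sub query_all {
    my ($names, $max) = @_;
    my $first  = run_cycle($names, 0, $max);
    my @retry  = grep { $first->{$_} != 0 } @$names;
    my $second = run_cycle(\@retry, 1, $max);
    my %final  = (%$first, %$second);
    return \%final;
}

my $result = query_all([ sort keys %outcome ], 2);
print "$_: ", ($result->{$_} ? "failed" : "ok"), "\n" for sort keys %$result;
```

Here the second argument to query_all plays the role of maxpipes. The sketch retries every failed server, whereas AsyncZ retries only in circumstances where a second try may be worthwhile.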
The constructor does not return a reference to Net::Z3950::AsyncZ until this two cycle process is completed. This reference gives you access to any errors which may have been reported, i.e. you can check to see why a server has not returned any records and provide error messages to the user as you see fit. In addition, you can keep an Error log with considerably more detailed error reporting; you can in fact keep a separate log for any one or combination of the servers you contact.
Everything essentially proceeds from the constructor. Once you provide the constructor with a list of servers and a query (or queries), and a callback function to output your records, you have nothing to do except wait for the reference which gives you access to the error messages. You can exercise a great deal of control by setting options for both the parent process and any or all of its children.
use Net::Z3950::AsyncZ qw(isZ_Error);
my @servers = (
    [ 'amicus.nlc-bnc.ca', 210, 'NL' ],
    [ 'bison.umanitoba.ca', 210, 'MARION' ],
    [ 'library.anu.edu.au', 210, 'INNOPAC' ],
    [ '130.17.3.75', 210, 'MAIN*BIBMAST' ],
    [ 'library.usc.edu', 2200, 'unicorn' ],
    [ 'z3950.loc.gov', 7090, 'Voyager' ],
    [ 'fc1n01e.fcla.edu', 210, 'FI' ],
    [ 'axp.aacpl.lib.md.us', 210, 'MARION' ],
    [ 'jasper.acadiau.ca', 2200, 'UNICORN' ]
);
my $query = ' @attr 1=1003 "Henry James" ';
my $asyncZ = Net::Z3950::AsyncZ->new(servers=>\@servers, query=>$query, cb=>\&output);
showErrors($asyncZ);
exit;
#------END MAIN------#
sub output {
    my($index, $array) = @_;
    foreach my $line (@$array) {
        print "$line\n" if $line;
    }
    print "\n--------\n\n";
}
sub showErrors {
    my $asyncZ = shift;
    print "The following servers have not responded to your query: \n";
    for (my $i = 0; $i < $asyncZ->getMaxErrors(); $i++) {
        my $err = $asyncZ->getErrors($i);
        next if !isZ_Error($err);
        print "$servers[$i]->[0]\n";
        print " $err->[0]->{msg}\n" if $err->[0]->{msg};
        print " $err->[1]->{msg}\n" if $err->[1]->{msg};
    }
}
You will notice that I have retained the @servers array used in Mike Taylor's sample scripts for the Net::Z3950 module, i.e. an array of references to 3-element arrays of servers, ports, and databases.
When you run this script at the terminal, you will find several types of headers and detailed error messages interspersed with the query results. For a ``clean'' output see basic_pretty.pl, which is included in the distribution.
my $asyncZ = Net::Z3950::AsyncZ->new(
    servers      => \@servers,  # array of references to servers in form: [ $host, $port, $database ]
    query        => $query,     # format depends on Z3950 querytype: defaults to 'prefix'
    timeout      => 25,         # total timeout in seconds for all processes
    timeout_min  => 5,          # minimum timeout in secs to exit event loop if all processes are finished
    interval     => 1,          # Event loop timer interval
    maxpipes     => 4,          # maximum number of forks to be executed at one time
    log          => undef,      # undef, name of log file to which extended error messages are written,
                                # or Net::Z3950::AsyncZ::Errors::suppressErrors()
    cb           => \&cb,       # callback function to which records will be sent as available
    format       => \&format,   # callback function to format individual lines of records
    num_to_fetch => $num,       # number of records to fetch from each server
    options      => \@options,  # array of references to Net::Z3950::AsyncZ::Options::_params objects
    monitor      => 0           # timeout in seconds for a monitoring child process:
                                # if 0, no monitor is created
);
AsyncZ::new() takes a set of named parameters. Some of them, like maxpipes and timeout, apply to the overall functioning of Net::Z3950::AsyncZ, i.e. to the parent process. Others, like num_to_fetch and format, can be set individually for each server in the servers array, i.e. for each child process. Settings for the child processes are made using the options parameter and the Net::Z3950::AsyncZ::Options::_params array. If a _params object does not exist for a child process, one is automatically created using default values. The indices of the _params array must be synchronized with the indices of the servers array.
Options are treated fully in the separate Options documentation.
For the HTML documentation see: Options.html
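The requirement that the options array line up index for index with the servers array can be pictured with the following sketch. It is a model only: plain hashes stand in for _params objects, resolve_options is a hypothetical helper, and the precedence shown (per-server value, then the value given to the constructor, then a built-in default) is an assumption rather than documented behaviour. Only the default of 5 for num_to_fetch is taken from the text above.

```perl
use strict;
use warnings;

# Hypothetical built-in defaults; only num_to_fetch => 5 comes from the text.
my %defaults = (num_to_fetch => 5);

# Build one settings hash per server, index-aligned with @$servers.
# An undef (or missing) entry in @$options falls back to the shared values.
sub resolve_options {
    my ($servers, $shared, $options) = @_;
    my @resolved;
    for my $i (0 .. $#{$servers}) {
        my $own = $options->[$i] || {};
        push @resolved, { %defaults, %$shared, %$own };
    }
    return \@resolved;
}

my @servers = (
    [ 'amicus.nlc-bnc.ca',  210, 'NL' ],
    [ 'bison.umanitoba.ca', 210, 'MARION' ],
    [ 'library.anu.edu.au', 210, 'INNOPAC' ],
);
my @options = (
    { num_to_fetch => 5, log => 'amicus_errors.log' },                 # index 0
    { num_to_fetch => 10, query => ' @attr 1=1003 "James Joyce" ' },   # index 1
    undef,                                                             # index 2
);
my $resolved = resolve_options(
    \@servers,
    { query => ' @attr 1=1003 "Henry James" ', log => 'errors_main.log' },
    \@options,
);
for my $i (0 .. $#servers) {
    printf "%-20s fetch=%-2d query=%s\n",
        $servers[$i][0], $resolved->[$i]{num_to_fetch}, $resolved->[$i]{query};
}
```

Removing or reordering an entry in one array without doing the same in the other would attach settings to the wrong server, which is why the indices have to stay synchronized.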
For every query sent to a server you must supply three required parameters: servers, query, and cb. That is, you must supply an array reference to the server's $host, $port, and $database; you must supply the query itself; and finally you must supply a callback function, which is responsible for outputting the data returned from the Z39.50 server. This is the minimal configuration, the one shown above in The Basic Script.
The optional parameters have either default values or default behaviors. Some of the optional parameters are exclusive to the functioning of the parent process, for instance timeout and interval. Others are for use only in the child processes, for instance format and num_to_fetch, while log is used in both the parent and its children.
There are three kinds of methods in AsyncZ:
$err_array_ref = $asyncZ->getErrors($index);
$index: index of the server for which the error inquiry is being made. (See the servers=>\@servers parameter of the Constructor.)
$err_array_ref: a reference to an array of two Net::Z3950::AsyncZ::ErrMsg objects, or undef if the server pointed to by this $index had no errors.
This array reference must be tested using isZ_Error() to determine whether it represents a valid error. The two ErrMsg objects are referred to as $err_array_ref->[0] and $err_array_ref->[1].
$err_array_ref->[0] references a cycle 1 error if it exists
$err_array_ref->[1] references a cycle 2 error if it exists
$error_number = $asyncZ->getMaxErrors();
$error_number: the maximum number of possible errors which have occurred for all servers during the current session. Because of the two-cycle process, some errors reported in the first cycle are nullified by successful outcomes during the second cycle; the class method isZ_Error() tests whether a cycle 1 error has been nullified by a successful second attempt. See Net::Z3950::AsyncZ::isZ_Error.
$asyncZ->_printError($err)
[error_number] error_message Type_of_Error is_Retry_able
[111] Connection refused NET
[225] An error occurred when accessing the library database. --Z3950 ERROR --RETRY
(This is an internal method I used for debugging but leave it here for its possible utility.) See Net::Z3950::AsyncZ::Errors for explanations of error types, etc.
$params_ref = asyncZOptions([option_1=>opt_1, option_2=>opt_2, . . .option_n=>opt_n]);
A _params object is created with a set of default values. Unless you plan to override the default values, it's not necessary to call asyncZOptions: AsyncZ.pm will create a default _params object for you.
There is a full range of accessor methods by which each option can be set and queried, in the form of $params_ref->set_option_1(value) and $value=$params_ref->get_option_1(). This makes it possible to set options dynamically.
Options are treated fully in the separate Options documentation.
$params_ref: reference to a Net::Z3950::AsyncZ::Options::_params object.
Net::Z3950::AsyncZ::Options::_params objects are used internally by AsyncZ and hence treated as private. Creating a _params object directly by calling its new method is not recommended.
See the Net::Z3950::AsyncZ::Options::_params manpage.
$bool = isZ_<TYPE>($line);
$line: current $line of record array
$bool: true if header $line designates that current record is of <TYPE>, otherwise false
These utilities test for the type of record which is currently being presented to the callback function. Each record is sent to the callback prefaced with headers that provide information about the record, including its type. If you are querying a variety of servers, some might send back MARC records, others GRS-1.
foreach my $line (@$array) {
    isZ_MARC($line) and do_something();
    isZ_GRS($line) and do_something_else();
    . . .
}
See also Net::Z3950::AsyncZ::isZ_Header, which tests whether a $line is a type-header, as opposed to whether it designates a particular type of record.
Records are sent to the callback function as an array of lines in which records are separated from one another by a set of headers; you can determine the number of the current record by extracting the record number from its type-header using getZ_RecNum. See Headers and getZ_RecNum.
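Because the callback receives one flat array of lines, it can help to regroup the lines into records first. The sketch below does this with regular expressions instead of the module's own isZ_Header and getZ_RecNum, so that it runs without the module installed. The header patterns are assumptions inferred from the sample output in this document, and split_records is a hypothetical helper.

```perl
use strict;
use warnings;

# Assumed header shapes, copied from the sample output in this document.
my $TYPE_HEADER = qr/^\[(\S+)\s+(\d+)\]$/;   # e.g. [MARC 4]
my $INFO_HEADER = qr/^<[!#]--.*-->$/;        # server-name or pid line

# Group a flat array of callback lines into one hash per record.
sub split_records {
    my ($lines) = @_;
    my @records;
    for my $line (@$lines) {
        next if !defined $line || $line eq '';
        next if $line =~ $INFO_HEADER;
        if ($line =~ $TYPE_HEADER) {
            # A type-header opens a new record.
            push @records, { type => $1, recnum => $2, fields => [] };
            next;
        }
        # Any other line belongs to the record opened most recently.
        push @{ $records[-1]{fields} }, $line if @records;
    }
    return \@records;
}

my @lines = (
    '<!--jasper.acadiau.ca-->', '<#--4498-->', '[MARC 4]',
    '020 ISBN: 0472110101 (cloth : alk. paper)',
    '050 LC call number: PS2123.A4 1999',
    '<!--130.17.3.75-->', '<#--4518-->', '[MARC 5]',
    '020 ISBN: 080066755',
);
my $records = split_records(\@lines);
for my $rec (@$records) {
    printf "%s record %d: %d field(s)\n",
        $rec->{type}, $rec->{recnum}, scalar @{ $rec->{fields} };
}
```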
$bool = isZ_Header($line);
This function tests whether $line is a type-header (i.e. a header stating whether this is a USMARC record, GRS-1, etc).
$line: current $line of record array
$bool: true if $line is a type-header, otherwise false
$recnum = getZ_RecNum($line)
$line: The current $line of the records array.
$recnum: The number of the current record in the Record Set, i.e. if there are 20 records matching the query, and you have asked for 5 at a time, the record number is not one of five, but one of 20. You must first test the line to make sure it is a header:
if(isZ_Header($line)) { print "Recnum = ", getZ_RecNum($line),"\n"; }
$recsize = getZ_RecSize($index);
$index: The $index of the server that has returned the records
$recsize: The number of records in the Record Set
$retv = isZ_Error($err_array_ref)
$err_array_ref: an array reference returned by Net::Z3950::AsyncZ::getErrors (the array holds two Net::Z3950::AsyncZ::ErrMsg objects).
Because of the two-cycle process, some errors reported in the first cycle are nullified by successful outcomes during the second cycle; this method tests whether a cycle 1 error has been nullified by a successful second attempt.
$retv: 0 if not an error; 1 if non-recoverable cycle 1 error; 2 if cycle 2 error.
In other words, it returns false if there has been no error and true if there has been. The type of true value it returns is used by Net::Z3950::AsyncZ::isZ_nonRetryable to determine whether this error was non-recoverable.
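The three return values can be read as a small decision table. The sketch below models that table only; it is not the module's code. classify_errors is a hypothetical stand-in, each cycle slot is assumed to be either undef (no error) or a hash with a made-up retry flag, and real ErrMsg objects are not modelled.

```perl
use strict;
use warnings;

# Stand-in for the documented return values of isZ_Error():
#   0 = no error, 1 = non-recoverable cycle 1 error, 2 = cycle 2 error.
# Each slot is undef (no error) or a hash with a hypothetical 'retry' flag.
sub classify_errors {
    my ($pair) = @_;
    return 0 unless $pair && $pair->[0];   # nothing went wrong in cycle 1
    return 1 unless $pair->[0]{retry};     # cycle 1 failure, no second try
    return $pair->[1] ? 2 : 0;             # second try failed / succeeded
}

my @cases = (
    [ 'no errors at all',          undef ],
    [ 'fatal error in cycle 1',    [ { retry => 0 }, undef ] ],
    [ 'cycle 1 error, retry ok',   [ { retry => 1 }, undef ] ],
    [ 'cycle 1 and cycle 2 error', [ { retry => 1 }, { retry => 0 } ] ],
);
printf "%-28s => %d\n", $_->[0], classify_errors($_->[1]) for @cases;
```

Under this reading, a return value of 1 is the case that isZ_nonRetryable reports as true.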
$retv = isZ_Error($err);
$bool = isZ_nonRetryable($retv);
$bool = isZ_nonRetryable(isZ_Error($err));
$retv: the return value from isZ_Error.
$bool: true if $err is non-recoverable, otherwise false
This is a convenience method in which the idiom isZ_nonRetryable(isZ_Error($err)) tests whether $err is a non-recoverable cycle 1 error. Since such errors often occur at the system level, this enables you to side-step outputting what might be gobbledygook (e.g. ``illegal seek'') to the user:
print "There has been an error in contacting this server\n" if isZ_nonRetryable(isZ_Error($err));
Since there are some non-recoverable cycle 1 errors which might be of interest to the user (e.g. ``connection refused'', which is identified as a network error), you might test whether it is also a system error:
print "There has been an error in contacting this server\n" if isZ_nonRetryable(isZ_Error($err)) && $err->[0]->isSystem();
$bool = isZ_Info($line);
$line: current $line of record array
$bool: true if header $line contains internal data, otherwise false
See Headers, Net::Z3950::AsyncZ::isZ_PID, and Net::Z3950::AsyncZ::noZ_Response.
$bool = isZ_PID($line);
$line: current $line of record array
$bool: true if header $line contains pid of child process, otherwise false
The preferred method for testing for the PID header is isZ_Info. Therefore, isZ_PID is not explicitly exported and requires the full package name: Net::Z3950::AsyncZ::isZ_PID.
$bool = noZ_Response($line);
$line: current $line of record array
$bool: true if header $line stipulates that there was no response from a server--i.e. that a child process returned without obtaining any records--otherwise false
$bool = isZ_ServerName($line);
$line: current $line of record array
$bool: true if $line is a header with server's name, otherwise false
$server = Z_serverName($line);
$line: current $line of record array
$server: server's name if this $line is a header with server's name; otherwise undef.
These functions are used as follows:
$line = delZ_header($line, $gmodifier, $subst);
$line: string or reference to a string: current $line of record data
$gmodifier: boolean--if true then the g modifier is applied to substitutions: s///g
$subst: the value to be substituted for the item being deleted
$line (return value): either a string or a reference to a string, depending on whether a reference or a string was initially passed in parameter $_[0].
These functions are the counterparts of isZ_Header, isZ_ServerName, and isZ_PID; instead of testing for these headers, they enable you to either delete them or substitute another string for them.
You might, for instance, find it useful to substitute the name of an institution for the name of a server:
$line = delZ_serverName($line, 0, "University of Manitoba Libraries");
prep_Raw and get_ZRawRec are used to retrieve raw record data, which is returned when raw is set to true and render set to false in the _params array.
$recs = prep_Raw($array);
$array: reference to array of raw records passed into the callback function when render=>0
$recs: reference to string representing all records in records array when raw is true and render is false.
This function ``preps'' an array of raw records for use with get_ZRawRec. To use this function and get_ZRawRec you must set render=>0 in the options array.
$rec = get_ZRawRec($recs)
$recs: reference to a string representing array of record data
$rec: string representing the next record in array or undef if no record is available.
get_ZRawRec behaves as a ``get-next'' function: with each access of get_ZRawRec, the next record is returned and deleted from the string of records created in prep_Raw.
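The ``get-next'' behaviour, in which each call returns one record and removes it from a shared string, can be sketched as follows. This is an illustration only: next_raw_record is a hypothetical stand-in for get_ZRawRec, and splitting on the ISO 2709 record terminator (0x1D) is an assumption about how raw records are delimited, not a description of what prep_Raw produces.

```perl
use strict;
use warnings;

my $RT = "\x1D";   # ISO 2709 record terminator; assumed delimiter

# Return the next record and remove it from the buffer (passed by reference).
# Returns undef once the buffer is empty.
sub next_raw_record {
    my ($buf_ref) = @_;
    return undef unless defined $$buf_ref && length $$buf_ref;
    if ($$buf_ref =~ s/\A(.*?)\Q$RT\E//s) {
        return $1;
    }
    my $rest = $$buf_ref;   # trailing data without a terminator
    $$buf_ref = '';
    return $rest;
}

my $recs = join '', map { $_ . $RT } ('record one', 'record two', 'record three');
while (defined(my $rec = next_raw_record(\$recs))) {
    print "got: $rec\n";
}
```

Because the buffer is consumed as it is read, a second pass over the same records would need a fresh copy of the string.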
asyncZOptions isZ_MARC isZ_GRS isZ_RAW isZ_Error isZ_nonRetryable isZ_Info isZ_DEFAULT noZ_Response isZ_Header isZ_ServerName Z_serverName getZ_RecNum getZ_RecSize delZ_header delZ_pid delZ_serverName prep_Raw get_ZRawRec
isZ_MARC isZ_GRS isZ_RAW isZ_DEFAULT getZ_RecNum
isZ_Error isZ_nonRetryable
isZ_ServerName Z_serverName noZ_Response isZ_Header isZ_Info delZ_header delZ_pid delZ_serverName isZ_Info
suppressErrors
isSystem isNetwork isUnspecified isZ3950
For the record: A callback is a function which you supply and which AsyncZ calls upon as required.
AsyncZ uses two callback functions. One handles the general output of records fetched from the servers queried. The second formats individual lines of the record to your specifications. The format callback is not required.
$index: index of the server to which the current records belong, i.e. the index of the server in the @servers array which you pass into the constructor: servers=>\@servers.
$array_ref: array of records which have been returned from the server
The output callback is called whenever records become available from one of the child processes. The most basic callback would be something like this:
sub output {
    my($index, $array_ref) = @_;
    foreach my $line (@$array_ref) {
        print "$line\n" if $line;
    }
    print "\n--------\n\n";
}
Note: Pay attention to the sequence in which the parameters are passed to the callback:
my($index, $array_ref) = @_;
The array which is referenced by $array_ref contains all of the records fetched from the current server. Each element of the array holds either one line of the record or one of the AsyncZ headers. The headers separate the records, while the format of the record and its lines depends upon two factors:
Here is typical output from the default Plain Text method:
<!--jasper.acadiau.ca-->
<#--4498-->
[MARC 4]
020 ISBN: 0472110101 (cloth : alk. paper)
050 LC call number: PS2123.A4 1999
100 author: James, Henry,1843-1916.Correspondence.Selections.
245 title: Dear munificent friends :Henry James's letters to four women /edited by Susan E. Gunter.
260 publication: Ann Arbor :University of Michigan Press,c1999.
300 description: xxiv, 288 p. ;24 cm.
650 subject: Authors, American19th centuryCorrespondence.
650 subject: Authors, American20th centuryCorrespondence.
700 auth, illus, ed: Gunter, Susan E.,1947-
<!--130.17.3.75-->
<#--4518-->
[MARC 5]
020 ISBN: 080066755
050 LC call number: G62.T7 1968
245 title: Trends in geography;an introductory survey.Edited by Ronald U. Cooke and James H. Johnson.
250 edition: [1st ed.]
260 publication: Oxford,New York,Pergamon Press[1969]
300 description: x, 287 p.illus.23 cm.
500 note: Collection of essays originally presented at a conference organized by the University of London Institute of Education and held at University College London in 1968.
500 note: Pergamon Oxford geographies.
650 subject: Geography
700 auth, illus, ed: Johnson, James Henry,1930-
700 auth, illus, ed: Cooke, Ronald U.
The first three lines of each record are headers, indicating that you have encountered a new record. The headers hold the following information:
Server name
pid of child process
type of record and record number
At the very least you would probably want to ignore the headers and add a newline to separate one record from another. The set of class methods provided by Net::Z3950::AsyncZ allows you to deal with the headers as you see fit: you can ignore them, you can identify the record type and extract the record number, and you can extract the server name.
If a server fails to return any records, the array will consist of one line of the following form:
{!-- library.anu.edu.au --}
This line does not tell us which server has failed, only that one of the child processes has not returned any records.
$index will enable you to track the servers you've queried. For instance, you might want to create an array with the names of the institutions at which servers are located, so that you can tell your users that the current record is a response from Acadia University in Wolfville, N.S., rather than from jasper.acadiau.ca. Knowing the index in the callback enables you to do this.
See Headers and basic_pretty.pl, included with the distribution, for some ways of testing for and handling headers.
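The institution lookup mentioned above might look like this. It is a minimal sketch: @institutions and source_label are hypothetical names, and the callback is called by hand here with made-up data instead of being driven by AsyncZ.

```perl
use strict;
use warnings;

my @servers = (
    [ 'jasper.acadiau.ca',  2200, 'UNICORN' ],
    [ 'bison.umanitoba.ca', 210,  'MARION'  ],
);
# Same order as @servers, so the callback's $index selects the right name.
my @institutions = (
    'Acadia University, Wolfville, N.S.',
    'University of Manitoba Libraries',
);

# Prefer the institution name; fall back to the host name if none is listed.
sub source_label {
    my ($index) = @_;
    return $institutions[$index] // $servers[$index][0];
}

sub output {
    my ($index, $array_ref) = @_;
    print "Response from: ", source_label($index), "\n";
    print "$_\n" for grep { defined($_) && length($_) } @$array_ref;
}

# Simulated call; in a real script AsyncZ would invoke the callback.
output(0, [ '245 title: Dear munificent friends' ]);
```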
$row: a reference to a 2 element array:
$row->[0]: a MARC tag or the null string if there is no tag
$row->[1]: the field's data string
Records are formatted one row at a time. There are two default behaviors-- plain text and HTML. The plain text is as illustrated in Output Callback:
050 LC call number: PS2123.A4 1999
100 author: James, Henry,1843-1916.Correspondence.Selections.
245 title: Dear munificent friends
The first column is a MARC tag, the second a string name for that tag, and the third is the field data. The HTML default would output the following:
<tr><td>ISBN<td>0472110101 (cloth : alk. paper)
<tr><td>LC call number<td>PS2123.A4 1999
<tr><td>author<td>James, Henry,1843-1916.Correspondence.Selections.
<tr><td>title<td>Dear munificent friends
In the HTML each field is placed within a <td>. It would then be up to you, in your output callback, to complete the HTML by adding the <TABLE>. . .</TABLE> tags and any attributes to those tags. You could also, for instance, format the table using CSS.
The functions which create this output are in Net::Z3950::AsyncZ::Report:
sub _defaultRecordRowHTML {
    my ($row) = @_;
    return "<tr><td>" . $MARC_FIELDS{$row->[0]} . "<td>" . $row->[1] . "\n";
}
sub _defaultRecordRow {
    my ($row) = @_;
    return $row->[0] . "\t" . $MARC_FIELDS{$row->[0]} . ":\t" . $row->[1] . "\n";
}
You can specify your own row formatter using the format parameter of AsyncZ's constructor. It will always be passed the reference to a two element array, but if there is no MARC tag, then $row->[0] will be set to the null string and $row->[1] will hold whatever data is available.
Tip: The default row formatter is _defaultRecordRow. To make _defaultRecordRowHTML your default, set the constructor's format parameter to Net::Z3950::AsyncZ::Report::_defaultRecordRowHTML:
format=>\&Net::Z3950::AsyncZ::Report::_defaultRecordRowHTML
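A custom row formatter could be written along these lines. This is a sketch under assumptions: my_format and the %labels table are hypothetical, and the two calls at the end simulate what the module would pass in. In a real script the function would be handed to the constructor as format=>\&my_format.

```perl
use strict;
use warnings;

# Hypothetical label table; the module keeps its own in %MARC_FIELDS.
my %labels = ('020' => 'ISBN', '100' => 'Author', '245' => 'Title');

# Receives a reference to a two-element array: [ $tag, $data ].
sub my_format {
    my ($row) = @_;
    my ($tag, $data) = @$row;
    # Without a MARC tag there is nothing to label, so pass the data through.
    return "$data\n" if !defined $tag || $tag eq '';
    my $label = exists $labels{$tag} ? $labels{$tag} : "Field $tag";
    return sprintf("%-8s %s\n", "$label:", $data);
}

print my_format([ '245', 'Dear munificent friends' ]);
print my_format([ '',    'untagged data' ]);
```

Handling the empty tag explicitly matters because a lookup keyed on the null string would otherwise produce an undefined label.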
As noted under Output Callback there are four types of headers:
[1] server name:
<!--library.anu.edu.au-->
[2] pid of the child function which accessed the server:
<#--13076-->
[3] type of record and its record number:
[MARC 2]
[4] failure of the child process to return any records:
{!-- library.anu.edu.au --}
The first three headers occur at the start of each new record:
<!--library.anu.edu.au-->
<#--13076-->
[MARC 2]
020 ISBN: 0060154497
100 author: Henry, James F.,1930-
245 title: The manager's guide to resolving legal disputes
250 edition: 1st ed.
260 publication: New York :Harper & Row,c1985.
300 description: v, 162 p. ;22 cm.
But the fourth header occurs as a single line by itself:
{!-- library.anu.edu.au --}
This fourth header tells us that one of the servers failed to return records--but not which one failed. library.anu.edu.au is not the server which failed to respond but the last server which did respond. (The reasons for this have to do with asynchronicity and shared memory.)
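The four header shapes listed above can be told apart with simple patterns. The sketch below is not the module's implementation of isZ_ServerName, isZ_PID, isZ_Header or noZ_Response; header_kind is a hypothetical helper, and its patterns are assumptions based solely on the examples shown in this section.

```perl
use strict;
use warnings;

# Classify one line of callback output by its (assumed) header shape.
# Returns an array reference: [ kind, captured values... ].
sub header_kind {
    my ($line) = @_;
    return [ 'server',      $1 ]     if $line =~ /^<!--\s*(.*?)\s*-->$/;
    return [ 'pid',         $1 ]     if $line =~ /^<#--\s*(\d+)\s*-->$/;
    return [ 'type',        $1, $2 ] if $line =~ /^\[(\S+)\s+(\d+)\]$/;
    return [ 'no_response', $1 ]     if $line =~ /^\{!--\s*(.*?)\s*--\}$/;
    return [ 'data' ];
}

for my $line ('<!--library.anu.edu.au-->', '<#--13076-->', '[MARC 2]',
              '{!-- library.anu.edu.au --}', '020 ISBN: 0060154497') {
    my ($kind, @values) = @{ header_kind($line) };
    print "$kind: @values\n";
}
```

As the text notes, the name captured from a no-response header should not be treated as the name of the server that failed.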
The following methods, detailed in Class Methods, are used for handling headers in the callback function:
Their use is demonstrated in the callback function from basic_pretty.pl:
sub output {
    my($index, $array) = @_;
    foreach my $line (@$array) {
        return if noZ_Response($line);
        next if isZ_Info($line);      # remove internal data
        next if isZ_Header($line);    # again remove internal data
        # you could first test for type of output:
        # isZ_MARC, etc. or extract the record number
        # extract server name from header
        (print "\nServer: ", Z_serverName($line), "\n"), next
            if isZ_ServerName($line);
        print "$line\n" if $line;
    }
    print "\n--------\n\n";
}
This produces the following result:
Server: bison.umanitoba.ca
050 LC call number: PS2124.H46
245 title: Henry James review. --
260 publication: [Louisville, KY :Dept. of English, University of Louisville/,1979-
300 description: v. ;25-28 cm.
650 subject: Ejournals -- UML
700 auth, illus, ed: Fogel, Daniel Mark,1948-
If you wanted to get the Record Number, you could replace
next if isZ_Header($line);
with
$recnum = getZ_RecNum($line) if isZ_Header($line);
This may be useful when you are requesting additional records for the same query. If you are getting 5 records at a time, in your second request to the server, the first of the records returned would be number 6.
If you wanted to get rid of the MARC tags and the following white space, you could put each line through this filter:
$line =~ s/\d+\s+//;
Incorporating both these modifications would give us the following:
sub output {
    my($index, $array) = @_;
    my $recnum = 1;
    foreach my $line (@$array) {
        return if noZ_Response($line);
        next if isZ_Info($line);      # remove internal data
        if (isZ_Header($line)) {
            print "Record: ", getZ_RecNum($line), "\n";
            next;
        }
        # extract server name from header
        (print "\nServer: ", Z_serverName($line), "\n"), next
            if isZ_ServerName($line);
        $line =~ s/\d+\s+//;
        print "$line\n" if $line;
    }
    print "\n--------\n\n";
}
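One point about the tag filter is worth illustrating: s/\d+\s+// is not anchored, so on a line that does not begin with a MARC tag it removes the first run of digits followed by white space wherever that occurs. The sketch below compares it with an anchored variant. The helper names, the anchored pattern and the untagged sample line are hypothetical additions, not part of the distribution.

```perl
use strict;
use warnings;

# The filter as given in the text: first digit run followed by white space.
sub strip_loose {
    my ($line) = @_;
    $line =~ s/\d+\s+//;
    return $line;
}

# Anchored variant (an assumption): only a leading three-digit tag is removed.
sub strip_tag {
    my ($line) = @_;
    $line =~ s/^\d{3}\s+//;
    return $line;
}

my @lines = (
    '245 title: Henry James review. --',
    'v. 1979- 12 issues',   # hypothetical line without a leading tag
);
for my $line (@lines) {
    print "loose : ", strip_loose($line), "\n";
    print "strict: ", strip_tag($line), "\n";
}
```

For tagged lines the two filters give the same result; they differ only when a line carries digits but no leading tag.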
There are two sets of error messages in AsyncZ:
[1] detailed messages for debugging and tracking: these are handled by the Net::Z3950::AsyncZ::Errors module
[2] informational messages for the user: these are handled by Net::Z3950::AsyncZ::ErrMsg
The detailed messages contain a number of different kinds of information:
1. a trace back 3 levels
2. server name and query string
3. Z3950 error messages where available
4. system error messages
Detailed errors are either sent to a file or to the terminal or are suppressed. How they are dealt with depends on the log options of Net::Z3950::AsyncZ::new and Net::Z3950::AsyncZ::Options::_params. This means that you can have different error reporting mechanisms for each of your servers as well as for the parent process.
The default behavior is to write all error messages to the terminal. To write them to a log file you set log to a filename:
log=>$filespec
NOTE: Do not open the file yourself. All files are automatically opened and closed by AsyncZ.
To suppress all errors you do the following:
log=>Net::Z3950::AsyncZ::Errors::suppressErrors()
Since suppressErrors() is exported, you can do this:
use Net::Z3950::AsyncZ::Errors qw(suppressErrors);
log=>suppressErrors()
System error messages and Perl library messages are routinely sent to STDERR; AsyncZ sends its error messages to STDOUT. This means that if you don't do something to redirect the AsyncZ messages and you are operating in a web browser, the AsyncZ messages will go to the browser.
AsyncZ keeps a record of which processes have returned records and which have not. It also keeps track of the exit codes of each process. For each process which has not returned records, it creates a Net::Z3950::AsyncZ::ErrMsg object, based on its exit code. There is a separate set of Net::Z3950::AsyncZ::ErrMsg objects for each of the two AsyncZ cycles (See The Basic Mechanisms of Net::Z3950::AsyncZ). A query which reported failure in the first cycle may have been successful in its second attempt. Net::Z3950::AsyncZ::isZ_Error returns true if a server has not returned any records, false if it has.
the error number
the error string
System, Network, Z3950, Success
doRetry
doAbort
Net::Z3950::AsyncZ supplies four methods, two Object Methods and two Class Methods.
$err = $asyncZ->getErrors($index);
this method returns a reference to an array of two ErrMsg objects:
[$errors[$index]->[0], $errors[$index]->[1]]
$index is the index of the server in the servers=>\@servers array.
$error_number = $asyncZ->getMaxErrors();
the maximum possible errors encountered: some of these may not in fact be errors and therefore will not test true in isZ_Error($err)
$retv = isZ_Error($err)
$bool = isZ_nonRetryable(isZ_Error($err))
Net::Z3950::AsyncZ::ErrMsg supplies eight object methods, which enable you to determine the general category under which an error falls and how serious it is. They all return true or false.
The basic syntax for all of these methods is:
$err->method();
Device or resource busy
Too many users
Permission denied
Software caused connection abort
Invalid argument
An ``Invalid argument'' will often come back when a query fails and a library routine attempts to do something which can't be done without the return value
Connection timed out
Network is down
Network is unreachable
Connection refused
This applies to two cases:
[1] EAGAIN: the system error which returns a ``try again'' message
[2] a process which has been created but never gets far enough to return an exit code, presumably because it has timed out.
An error which returns true to isSuccess is one for which the exit code is 0, i.e. one in which the process ended without an error but did not return any records.
true
to isZ_nonRetryable
.)
A very basic routine for handling errors is demonstrated in basic.pl:
sub showErrors {
    my $asyncZ = shift;                                        # [1]
    print "The following servers have not responded to your query: \n";
    for (my $i = 0; $i < $asyncZ->getMaxErrors(); $i++) {
        my $err = $asyncZ->getErrors($i);                      # [2]
        next if !isZ_Error($err);                              # [3]
        print "$servers[$i]->[0]\n";
        print " $err->[0]->{msg}\n" if $err->[0]->{msg};       # [4]
        print " $err->[1]->{msg}\n" if $err->[1]->{msg};       # [5]
    }
}
[1] Get reference to the Net::Z3950::AsyncZ object
[2] Get reference to array of ErrMsg Objects for index $i
[3] Check to see whether this array holds a valid error
[4] print the cycle 1 error if it exists (it should if you've gotten this far)
[5] print the cycle 2 error if it exists (it will not, if cycle 1 was non-retryable)
A more useful error routine is demonstrated in basic_pretty.pl:
sub showErrors {
    my $asyncZ = shift;
    # substitute some general statement for a system level error instead
    # of something puzzling to the user like: 'illegal seek'
    my $systemerr = "A system error occurred on the server\n";
    print "The following servers have not responded to your query: \n";
    for (my $i = 0; $i < $asyncZ->getMaxErrors(); $i++) {
        my $err = $asyncZ->getErrors($i);                          # [1]
        next if !isZ_Error($err);                                  # [2]
        print "$servers[$i]->[0]\n";                               # [3]
        if ($err->[0]->isSystem()) {
            print $systemerr;                                      # [4]
        }
        else {
            print " $err->[0]->{msg}\n" if $err->[0]->{msg};       # [5]
        }
        if ($err->[1] && $err->[1]->isSystem()) {
            print $systemerr;                                      # [6]
        }
        else {
            print " $err->[1]->{msg}\n"                            # [7]
                if $err->[1]->{msg} && $err->[1]->{msg} ne $err->[0]->{msg};
        }
    }
}
The first three steps follow the pattern of basic.pl:
[1] Get reference to array of ErrMsg Objects for index $i
[2] Check to see whether this array holds a valid error
[3] Print the name of the server
Cycle 1 Error:
[4] If this is a system-type error, print a non-specialist message
[5] Otherwise, print the error message for this error
Cycle 2 Error:
[6] If this is a system-type error, print a non-specialist message
[7] Otherwise, print the error message for this error, but only if the cycle 2 error message is not the same as the cycle 1 message
Myron Turner <turnermm@shaw.ca> or <mturner@ms.umanitoba.ca>
Copyright 2003 by Myron Turner
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.