NAME
    Parse::BBCode - Module to turn BBCode into HTML or plain text

SYNOPSIS
    To parse a BBCode string, set up a parser with the default HTML
    definitions of Parse::BBCode::HTML:

        use Parse::BBCode;
        my $p = Parse::BBCode->new();
        my $code = 'some [b]b code[/b]';
        my $parsed = $p->render($code);

    Or, if you want to define your own tags:

        my $p = Parse::BBCode->new({
                tags => {
                    # load the default tags
                    Parse::BBCode::HTML->defaults,

                    # add/override tags
                    url => 'url:%{parse}s',
                    i   => '%{parse}s',
                    b   => '%{parse}s',
                    noparse => '
%{html}s',
                    code => sub {
                        my ($parser, $attr, $content, $attribute_fallback) = @_;
                        if ($attr eq 'perl') {
                            # use some syntax highlighter
                            # ($content is passed as a scalar reference)
                            $content = highlight_perl($$content);
                        }
                        else {
                            $content = Parse::BBCode::escape_html($$content);
                        }
                        "$content"
                    },
                    test => 'this is klingon: %{klingon}s',
                },
                escapes => {
                    klingon => sub {
                        my ($parser, $tag, $text) = @_;
                        return translate_into_klingon($text);
                    },
                },
            }
        );
        my $code = 'some [b]b code[/b]';
        my $parsed = $p->render($code);

DESCRIPTION
    Note: This module is still experimental, and the syntax is subject to
    change. I'm open to any suggestions on how to improve the syntax. See
    "TODO" for what might change.

    I wrote this module because HTML::BBCode is not extendable (or I didn't
    see how), and BBCode::Parser seemed good at first glance but has some
    issues. For example, it says that the following bbcode

        [code] foo [b] [/code]

    is invalid, while I think you should be able to write unbalanced code in
    code tags. Also, BBCode::Parser dies if you have invalid code or
    not-permitted tags, but in a forum you'd rather show a partly parsed
    text than an error message.

    What I also wanted is an easy syntax to define your own tags, ideally -
    for simple tags - as plain text, so you can put it in a configuration
    file. This allows forum admins to add tags easily. Some forums might
    want a tag for linking to perlmonks.org, other forums need other tags.

    Another goal was to always output a result and never die. I might add an
    option which lets the parser die on unbalanced code.

METHODS
    new
        Constructor. Takes a hash reference with options as an argument.

            my $parser = Parse::BBCode->new({
                tags => {
                    url => ...,
                    i   => ...,
                },
                escapes => {
                    link => ...,
                },
                close_open_tags   => 1, # default 0
                strict_attributes => 0, # default 0
            });

            tags
                See "TAG DEFINITIONS"

            escapes
                See "ESCAPES"

            close_open_tags
                If set to true (1), it will close open tags at the end or
                before block tags.

            strict_attributes
                If set to true (1), tags with invalid attributes are left
                unparsed.
                If set to false (0), the attribute for these tags will be
                empty.

                An invalid attribute:

                    [foo=bar far boo]...[/foo]

                I might add an option to define your own attribute
                validation. Contact me if you'd like to have this.

            direct_attributes
                Default: true

                Normal tag syntax is:

                    [tag=val1 attr2=val2 ...]

                If set to 0, tag syntax is:

                    [tag attr2=val2 ...]

    render
        Input: The text to parse

        Returns: the rendered text

            my $parsed = $parser->render($bbcode);

    parse
        Input: The text to parse

        Returns: the parsed tree (a Parse::BBCode::Tag object)

            my $tree = $parser->parse($bbcode);

    render_tree
        Input: the parse tree

        Returns: the rendered text

            my $parsed = $parser->render_tree($tree);

    forbid
            $parser->forbid(qw/ img url /);

        Disables the given tags.

    permit
            $parser->permit(qw/ img url /);

        Enables the given tags if they are in the tag definitions.

    escape_html
        Utility to substitute <>&"' with their HTML entities.

            my $escaped = Parse::BBCode::escape_html($text);

    error
        If the given bbcode is invalid (unbalanced or wrongly nested
        classes), currently Parse::BBCode::render() will either leave the
        invalid tags unparsed or, if you set the option "close_open_tags",
        try to add closing tags. If this happens, "error()" will return the
        invalid tag(s), otherwise false. To get the corrected bbcode (if
        you set "close_open_tags"), you can get the tree and return the raw
        text from it:

            if ($parser->error) {
                my $tree = $parser->get_tree;
                my $corrected = $tree->raw_text;
            }

TAG DEFINITIONS
    Here is an example of all the current definition possibilities:

        my $p = Parse::BBCode->new({
                tags => {
                    '' => sub {
                        my $e = Parse::BBCode::escape_html($_[2]);
                        $e =~ s/\r?\n|\r/
%{html}s', quote => 'block:
%s',
                    code => {
                        code => sub {
                            my ($parser, $attr, $content, $attribute_fallback) = @_;
                            if ($attr eq 'perl') {
                                # use some syntax highlighter
                                $content = highlight_perl($$content);
                            }
                            else {
                                $content = Parse::BBCode::escape_html($$content);
                            }
                            "$content"
                        },
                        parse => 0,
                        class => 'block',
                    },
                    hr => {
                        class => 'block',
                        output => '
%{html}s'

            [noparse] [some]unbalanced[/foo] [/noparse]

        With this definition the output would be
            [some]unbalanced[/foo]

        So inside a noparse tag you can write (almost) any invalid bbcode.
        The only exception is the noparse tag itself:

            [noparse] [some]unbalanced[/foo] [/noparse] [b]really bold[/b] [/noparse]

        Output:

            [some]unbalanced[/foo] really bold [/noparse]

        This is because the noparse tag ends at the first closing tag, even
        if there is an additional opening noparse tag inside.

        The "%{html}s" specifies that the content should be HTML escaped.
        If you don't want any escaping, you can't write %s, because the
        default is 'parse'. In this case you have to write "%{noescape}".

    Block tags
            quote => 'block:
%s',

        To force valid HTML you can add classes to tags. The default class
        is 'inline'. To declare a tag as a block tag, add "block:" to the
        start of the string. Block tags inside of inline tags will either
        close the outer tag(s) or leave the outer tag(s) unparsed, depending
        on the option "close_open_tags".

    Define subroutine for tag
        All these definitions might not be enough if you want to define your
        own code, for example to add a syntax highlighter. Here's an example:

            code => {
                code => sub {
                    my ($parser, $attr, $content, $attribute_fallback) = @_;
                    if ($attr eq 'perl') {
                        # use some syntax highlighter
                        $content = highlight_perl($$content);
                    }
                    else {
                        $content = Parse::BBCode::escape_html($$content);
                    }
                    "$content"
                },
                parse => 0,
                class => 'block',
            },

        So instead of a string you define a hash reference with a 'code' key
        and a sub reference. The other key is "parse", which is 0 by
        default. If it is 0, the content in the tag won't be parsed, just as
        in the noparse tag above. If it is set to 1, you will get the
        rendered content as an argument to the subroutine.

        The first argument to the subroutine is the Parse::BBCode object
        itself. The second argument is the attribute, the third is the tag
        content as a scalar reference, and the fourth is the attribute
        fallback, which is set to the content if the attribute is empty. The
        fourth argument is just for convenience.

    Single-Tags
        Sometimes you might want single tags, for example for a horizontal
        line:

            hr => {
                class => 'block',
                output => '