NAME
Parse::BBCode - Module to turn BBCode into HTML or plain text
SYNOPSIS
To parse a bbcode string, set up a parser with the default HTML definitions
of Parse::BBCode::HTML:
    use Parse::BBCode;
    my $p = Parse::BBCode->new();
    my $code = 'some [b]b code[/b]';
    my $parsed = $p->render($code);
Or if you want to define your own tags:
    my $p = Parse::BBCode->new({
            tags => {
                # load the default tags
                Parse::BBCode::HTML->defaults,
                # add/override tags
                url => 'url:<a href="%{link}A">%{parse}s</a>',
                i   => '<i>%{parse}s</i>',
                b   => '<b>%{parse}s</b>',
                noparse => '<pre>%{html}s</pre>',
                code => sub {
                    my ($parser, $attr, $content, $attribute_fallback) = @_;
                    # $content is a scalar reference to the tag content
                    if ($attr eq 'perl') {
                        # use some syntax highlighter
                        $content = highlight_perl($$content);
                    }
                    else {
                        $content = Parse::BBCode::escape_html($$content);
                    }
                    "<tt>$content</tt>"
                },
                test => 'this is klingon: %{klingon}s',
            },
            escapes => {
                klingon => sub {
                    my ($parser, $tag, $text) = @_;
                    return translate_into_klingon($text);
                },
            },
        }
    );
    my $code = 'some [b]b code[/b]';
    my $parsed = $p->render($code);
DESCRIPTION
Note: This module is still experimental; the syntax is subject to
change. I'm open to any suggestions on how to improve the syntax. See
"TODO" for what might change.
I wrote this module because HTML::BBCode is not extensible (or I didn't
see how) and BBCode::Parser seemed good at first glance but has some
issues. For example, it says that the following bbcode
    [code] foo [b] [/code]
is invalid, while I think you should be able to write unbalanced code in
code tags. Also, BBCode::Parser dies if you have invalid code or
not-permitted tags, but in a forum you'd rather show a partly parsed
text than an error message.
What I also wanted is an easy syntax to define your own tags, ideally
(for simple tags) as plain text, so you can put it in a configuration
file. This allows forum admins to add tags easily. Some forums might want
a tag for linking to perlmonks.org, other forums need other tags.
Another goal was to always output a result and never die. I might add an
option which lets the parser die on unbalanced code.
METHODS
new
Constructor. Takes a hash reference with options as an argument.
    my $parser = Parse::BBCode->new({
        tags => {
            url => ...,
            i => ...,
        },
        escapes => {
            link => ...,
        },
        close_open_tags => 1, # default 0
        strict_attributes => 0, # default 0
    });
tags
See "TAG DEFINITIONS"
escapes
See "ESCAPES"
close_open_tags
If set to true (1), the parser will close open tags at the end of the
text or before block tags.
strict_attributes
If set to true (1), tags with invalid attributes are left
unparsed. If set to false (0), the attribute of such a tag will
be empty.
An invalid attribute:
[foo=bar far boo]...[/foo]
I might add an option to define your own attribute validation.
Contact me if you'd like to have this.
direct_attributes
Default: true
Normal tag syntax is:
[tag=val1 attr2=val2 ...]
If set to 0, tag syntax is
[tag attr2=val2 ...]
render
Input: The text to parse
Returns: the rendered text
my $parsed = $parser->render($bbcode);
parse
Input: The text to parse.
Returns: the parsed tree (a Parse::BBCode::Tag object)
my $tree = $parser->parse($bbcode);
render_tree
Input: the parse tree
Returns: The rendered text
my $parsed = $parser->render_tree($tree);
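The three methods above fit together: "render" does in one call what "parse" followed by "render_tree" does in two. The following is a minimal sketch of that relationship, assuming Parse::BBCode is installed and the default tag definitions are used; the sample input string is made up for illustration.

```perl
use strict;
use warnings;
use Parse::BBCode;

my $parser = Parse::BBCode->new();
my $bbcode = 'some [b]bold[/b] text';   # illustrative input

# One step: parse and render in a single call.
my $direct = $parser->render($bbcode);

# Two steps: build the tree first, then render it.
# Useful if you want to inspect the tree in between.
my $tree     = $parser->parse($bbcode);
my $two_step = $parser->render_tree($tree);

print $direct eq $two_step ? "same output\n" : "different output\n";
```

Keeping the tree around is mainly useful if you want to look at it before rendering, for example to get the raw text back from it.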
forbid
$parser->forbid(qw/ img url /);
Disables the given tags.
permit
$parser->permit(qw/ img url /);
Enables the given tags if they are in the tag definitions.
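As a rough sketch of how "forbid" and "permit" can be combined, the example below disables the b tag for one rendering and enables it again afterwards. It assumes Parse::BBCode is installed and that b is part of the default tag definitions; the input string is only an illustration.

```perl
use strict;
use warnings;
use Parse::BBCode;

my $parser = Parse::BBCode->new();
my $bbcode = '[b]bold[/b] and [i]italic[/i]';   # illustrative input

# Disable the b tag, e.g. for untrusted users.
$parser->forbid(qw/ b /);
my $restricted = $parser->render($bbcode);

# Enable it again; this only works because b is in the tag definitions.
$parser->permit(qw/ b /);
my $restored = $parser->render($bbcode);

print "restricted: $restricted\n";
print "restored:   $restored\n";
```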
escape_html
Utility to substitute
<>&"'
with their HTML entities.
my $escaped = Parse::BBCode::escape_html($text);
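Since "escape_html" is a plain function and not a method, it can also be used outside of tag definitions. Here is a small sketch, assuming Parse::BBCode is installed; the input string is invented for illustration.

```perl
use strict;
use warnings;
use Parse::BBCode;

# Untrusted text, e.g. a user name shown next to a post.
my $text = q{<script>alert("x")</script> & more};

# Called as a function, not as a method on a parser object.
my $escaped = Parse::BBCode::escape_html($text);

print "escaped: $escaped\n";

# After escaping, no raw angle brackets should be left.
print "raw angle bracket left over\n" if $escaped =~ /[<>]/;
```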
error
If the given bbcode is invalid (unbalanced or wrongly nested
classes), Parse::BBCode::render() will currently either leave the
invalid tags unparsed or, if you set the option "close_open_tags",
try to add closing tags. If this happened, "error()" will return the
invalid tag(s), otherwise false. To get the corrected bbcode (if you
set "close_open_tags"), you can get the tree and take the raw text
from it:
if ($parser->error) {
my $tree = $parser->get_tree;
my $corrected = $tree->raw_text;
}
TAG DEFINITIONS
Here is an example of all the current definition possibilities:
    my $p = Parse::BBCode->new({
            tags => {
                '' => sub {
                    my $e = Parse::BBCode::escape_html($_[2]);
                    $e =~ s/\r?\n|\r/<br>\n/g;
                    $e
                },
                noparse => '<pre>%{html}s</pre>',
                quote => 'block:<blockquote>%s</blockquote>',
                code => {
                    code => sub {
                        my ($parser, $attr, $content, $attribute_fallback) = @_;
                        if ($attr eq 'perl') {
                            # use some syntax highlighter
                            $content = highlight_perl($$content);
                        }
                        else {
                            $content = Parse::BBCode::escape_html($$content);
                        }
                        "<tt>$content</tt>"
                    },
                    parse => 0,
                    class => 'block',
                },
                hr => {
                    class => 'block',
                    output => '<hr>',
                    single => 1,
                },
            },
        }
    );
"%{html}s"
    noparse => '<pre>%{html}s</pre>'
    [noparse] [some]unbalanced[/foo] [/noparse]
With this definition the output would be
    <pre> [some]unbalanced[/foo] </pre>
So inside a noparse tag you can write (almost) any invalid bbcode. The
only exception is the noparse tag itself:
    [noparse] [some]unbalanced[/foo] [/noparse] [b]really bold[/b] [/noparse]
Output:
    <pre> [some]unbalanced[/foo] </pre> <b>really bold</b> [/noparse]
This is because the noparse tag ends at the first closing tag, even if
you have an additional opening noparse tag inside.
The "%{html}s" defines that the content should be HTML escaped. If you
don't want any escaping, you can't say %s because the default is 'parse'.
In this case you have to write "%{noescape}".
Block tags
    quote => 'block:<blockquote>%s</blockquote>',
To force valid html you can add classes to tags. The default class is
'inline'. To declare a tag as a block, add "block:" to the start of the
string. Block tags inside of inline tags will either close the outer
tag(s) or leave the outer tag(s) unparsed, depending on the option
"close_open_tags".
Define subroutine for tag
All these definitions might not be enough if you want to define your own
code, for example to add a syntax highlighter. Here's an example:
    code => {
        code => sub {
            my ($parser, $attr, $content, $attribute_fallback) = @_;
            if ($attr eq 'perl') {
                # use some syntax highlighter
                $content = highlight_perl($$content);
            }
            else {
                $content = Parse::BBCode::escape_html($$content);
            }
            "<tt>$content</tt>"
        },
        parse => 0,
        class => 'block',
    },
So instead of a string you define a hash reference with a 'code' key and
a sub reference. The other key is "parse", which is 0 by default. If it
is 0, the content in the tag won't be parsed, just as in the noparse tag
above. If it is set to 1, you will get the rendered content as an
argument to the subroutine.
The first argument to the subroutine is the Parse::BBCode object itself.
The second argument is the attribute, the third is the tag content as a
scalar reference, and the fourth is the attribute fallback, which is set
to the content if the attribute is empty. The fourth argument is just
for convenience.
Single-Tags
Sometimes you might want single tags, like for a horizontal line:
    hr => {
        class => 'block',
        output => '