HTML::TreeBuilder Examples

Create a new tree and parse a file or text

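   use WWW::Mechanize;
   use HTML::TreeBuilder;

   my $tree = HTML::TreeBuilder->new; # empty tree

   # Parse content fetched by a user agent; $ua is a WWW::Mechanize
   # object that has already fetched a page. parse_content() calls
   # parse() and then eof(), which completes the tree.
   $tree->parse_content($ua->content());

   # Or parse a file into a separate tree:
   my $filetree = HTML::TreeBuilder->new;
   $filetree->parse_file("htmltoparse.html");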

Find Specific Text Within Tags Using HTML::TreeBuilder

Sample HTML

<html>
   <body>
      <span class="styleofspan">Find This Text</span>
   </body>
</html>

Code to find specific text

   # Return a list of "span" elements whose text is exactly "Find This Text":
   my @nodescontainingtext = $tree->look_down("_tag", "span", sub { $_[0]->as_text eq "Find This Text" });
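
As a rough end-to-end sketch, the search above can be run against the sample HTML. It assumes the markup is held in a string and parsed with new_from_content; the variable names $html and @texts are illustrative only.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# The sample HTML from above, held in a string for this sketch.
my $html = <<'HTML';
<html>
   <body>
      <span class="styleofspan">Find This Text</span>
   </body>
</html>
HTML

# new_from_content() builds the tree and parses the string in one step.
my $tree = HTML::TreeBuilder->new_from_content($html);

# All criteria must hold: the tag is "span" AND the code ref returns true.
my @nodescontainingtext = $tree->look_down(
    "_tag", "span",
    sub { $_[0]->as_text eq "Find This Text" },
);

# Copy out the text before the tree is deleted.
my @texts = map { $_->as_text } @nodescontainingtext;
print "span: $_\n" for @texts;

# Free the tree once it is no longer needed.
$tree->delete;
```

The code ref receives each candidate element as $_[0], so any test on the element can be used in place of the string comparison.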

Find a specific tag or group of tags

   # Find all the "span" tags in an HTML doc:
   my @spantags = $tree->look_down("_tag", "span");

Find a tag with given attributes

   # Find all elements whose class attribute is exactly "styleofspan":
   my @tags = $tree->look_down("class", "styleofspan");
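
A string criterion is compared for equality with the whole attribute value, so an element with several classes would not match. The sketch below shows one possible workaround, passing a regex as the criterion. The markup and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical markup: one span carries two classes, one carries a single class.
my $html = '<div><span class="styleofspan big">A</span>'
         . '<span class="styleofspan">B</span><p class="other">C</p></div>';

my $tree = HTML::TreeBuilder->new_from_content($html);

# String criterion: the class attribute must equal "styleofspan" exactly.
my @exact = map { $_->as_text } $tree->look_down("class", "styleofspan");

# Regex criterion: the class attribute must contain "styleofspan" as a whole word.
my @loose = map { $_->as_text } $tree->look_down("class", qr/\bstyleofspan\b/);

print "exact: @exact\n";
print "loose: @loose\n";

$tree->delete;
```

Matches come back in document order, so the regex search lists the first span before the second.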

Notes

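Use look_down to search for HTML elements with given attributes. "_tag" is a virtual attribute containing the name of the tag, in lowercase. In list context look_down returns every matching element; in scalar context it returns only the first match, or undef if nothing matches.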
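
The result of look_down depends on the calling context. Here is a minimal sketch of that behaviour, using made-up markup and variable names:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical markup for this sketch.
my $tree = HTML::TreeBuilder->new_from_content(
    '<p><span id="a">one</span><span id="b">two</span></p>'
);

# List context: every matching element.
my @all = $tree->look_down("_tag", "span");

# Scalar context: only the first match, or undef if nothing matches.
my $first   = $tree->look_down("_tag", "span");
my $missing = $tree->look_down("_tag", "table");

my $count       = scalar @all;
my $first_id    = $first->attr("id");
my $found_table = defined $missing ? 1 : 0;

print "spans: $count, first id: $first_id, table found: $found_table\n";

$tree->delete;
```

Checking the scalar result with defined before calling methods on it avoids an error when the tag is absent.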