Tuesday, 6 August 2013

Printing table contents using HTML::TreeBuilder::XPath

I want to extract all the tables from an HTML file and print their
contents so that cells are separated by \t, rows by \n, and tables by
\n\n. My script is below. It prints every cell on its own line, and when
I changed it to call findvalues on tr instead, each whole tr came back as
a single element. How can I modify it to produce the structure described
above?
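use strict;
use warnings;
use HTML::TreeBuilder::XPath;

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_file("html.html") or die "Cannot read html.html: $!";

# Returns one flat list of cell texts, so row and table boundaries are lost.
my @values = $tree->findvalues(q{//table//tr//td});
print "$_\n" foreach @values;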
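
One possible way to get the \t / \n / \n\n layout is to walk the tree level by level with findnodes instead of collecting every cell in one findvalues call. The sketch below is not a definitive answer: the helper name tables_to_text is made up for illustration, and it assumes the tables are not nested and that only td cells matter (th cells are skipped, as in the script above).

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical helper: returns the text of every table in the tree, with
# cells joined by \t, rows by \n, and tables by \n\n.
sub tables_to_text {
    my ($tree) = @_;
    my @tables;
    for my $table ( $tree->findnodes('//table') ) {
        my @rows;
        # The leading dot makes the path relative to the current table.
        for my $tr ( $table->findnodes('.//tr') ) {
            # Only the td children of this row; th cells are ignored.
            my @cells = map { $_->as_trimmed_text } $tr->findnodes('./td');
            push @rows, join "\t", @cells;
        }
        push @tables, join "\n", @rows;
    }
    return join "\n\n", @tables;
}

# Runs only when a file name is passed on the command line.
if ( my $file = shift @ARGV ) {
    my $tree = HTML::TreeBuilder::XPath->new;
    $tree->parse_file($file) or die "Cannot read $file: $!";
    print tables_to_text($tree), "\n";
    $tree->delete;    # free the parse tree
}
```

Saved under a name of your choice, it would be run with the HTML file as its only argument. With nested tables the inner rows would appear twice (once under the outer table and once under the inner one), so the XPath expressions would need tightening for that case.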
