perl - Unable to get the web content using LWP::Simple but able to get content from LWP::UserAgent
I am trying to run the code below to parse the contents of the HTML page at the URL shown.

    #!/usr/bin/perl
    use LWP::Simple;
    use HTML::TreeBuilder;

    my $response = get("http://www.viki.com/");
    print $response;

Nothing gets printed. It works if I emulate a browser.
When I try to access http://www.viki.com using LWP::UserAgent, I get the following response:

    <html><body><h1>403 Forbidden</h1> Request forbidden by administrative rules. </body></html>
The get subroutine in LWP::Simple is implemented as follows (at least in version 6.13):

    sub get ($) {
        my $response = $ua->get(shift);
        return $response->decoded_content if $response->is_success;
        return undef;
    }
As you can see, get returns the content only if the response is a success; otherwise it returns undef.
The response LWP::UserAgent receives is a 403 error, in other words not a success. Therefore, LWP::Simple returns undef for the same URL, and print has nothing to output.
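To make that behaviour concrete, here is a minimal sketch that applies the same success check to response objects built by hand, so it needs no network access. The helper name content_or_undef and the two sample responses are made up for illustration; only HTTP::Response and its is_success, status_line and decoded_content methods come from the LWP distribution.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: the same decision LWP::Simple's get makes,
# applied to a response object we already have.
sub content_or_undef {
    my ($response) = @_;
    return $response->decoded_content if $response->is_success;
    return undef;
}

# Hand-built stand-ins for server replies (assumed data, no request is sent).
my $forbidden = HTTP::Response->new(403, 'Forbidden');

my $ok = HTTP::Response->new(200, 'OK');
$ok->header('Content-Type' => 'text/plain');
$ok->content("hello\n");

for my $response ($forbidden, $ok) {
    my $content = content_or_undef($response);
    printf "%s -> %s\n", $response->status_line,
        defined $content ? 'content returned' : 'undef';
}
```

The 403 response yields undef while the 200 response yields its body. If you need to know why a fetch failed, status_line on the LWP::UserAgent response tells you, which LWP::Simple's get cannot.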
It appears that the website (http://www.viki.com) is checking the user agent string and only returning content to "valid" user agents. LWP::Simple is hard-coded to use LWP::Simple/$VERSION as its user agent.
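If you want to check what your installed version actually sends, a short sketch like the following should do it. It assumes LWP::Simple lets you import $ua on request and only reads the agent string, so no request is made.

```perl
use strict;
use warnings;
use LWP::Simple qw/ $ua /;

# Print the agent string LWP::Simple would send by default.
# On the versions I know of it starts with "LWP::Simple/".
print $ua->agent, "\n";
```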
If you must use LWP::Simple, you can force the user agent like this:

    # Import get explicitly: an import list replaces the default exports.
    use LWP::Simple qw/ $ua get /;

    $ua->agent('Mozilla/5.0 (Windows NT 6.1; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0');
    print get('http://www.viki.com');
LWP::Simple exposes the LWP::UserAgent instance that it uses internally through the optionally exported $ua variable. It is still necessary to configure the user agent on that instance for this particular page to load.
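If you are not tied to LWP::Simple, working with LWP::UserAgent directly is arguably clearer, because you can see why a request failed. The following is only a sketch: the fetch helper, the 10-second timeout and taking the URL from the command line are my own choices, and the agent string is just the browser-like one used earlier.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Browser-like agent string; the timeout is an assumed value in seconds.
my $ua = LWP::UserAgent->new(
    agent   => 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0',
    timeout => 10,
);

# Hypothetical helper: returns the decoded body, or undef after warning why.
sub fetch {
    my ($url) = @_;
    my $response = $ua->get($url);
    return $response->decoded_content if $response->is_success;
    warn "GET $url failed: ", $response->status_line, "\n";
    return undef;
}

# Only touch the network when a URL is passed on the command line.
if (my $url = shift @ARGV) {
    my $content = fetch($url);
    print $content if defined $content;
}
```

Run it as, for example, perl fetch.pl http://www.viki.com/ (the script name is arbitrary). A refusal then shows up as a warning with the status line instead of silent empty output.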