perl - Unable to get the web content using LWP::Simple but able to get content from LWP::UserAgent -


I am trying to run the code below to parse the contents of the HTML page at the URL below:

    #!/usr/bin/perl
    use LWP::Simple;
    use HTML::TreeBuilder;

    $response = get("http://www.viki.com/");
    print $response;

Nothing gets printed. It does work if I emulate a browser.

When I try to access http://www.viki.com using LWP::UserAgent, I get the following response:

    <html><body><h1>403 Forbidden</h1>
    Request forbidden by administrative rules.
    </body></html>

The get subroutine in LWP::Simple is implemented as follows (at least in version 6.13):

    sub get ($) {
        my $response = $ua->get(shift);
        return $response->decoded_content if $response->is_success;
        return undef;
    }

As you can see, get returns the content if the response is a success; otherwise it returns undef.

The response you get from LWP::UserAgent is a 403 error, in other words not a success. Therefore, LWP::Simple returns undef for the same URL.
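To see that success check in isolation, here is a minimal offline sketch. The helper name content_or_undef and the two hand-built HTTP::Response objects are made up for illustration; they only mirror the logic quoted above and do not contact the site.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: same success check as the get sub quoted above.
sub content_or_undef {
    my ($response) = @_;
    return $response->is_success ? $response->decoded_content : undef;
}

# Hand-built responses standing in for a normal page and the 403 page.
my $ok = HTTP::Response->new(
    200, 'OK', [ 'Content-Type' => 'text/html' ], '<html>hello</html>'
);
my $forbidden = HTTP::Response->new(
    403, 'Forbidden', [ 'Content-Type' => 'text/html' ], '<h1>403 Forbidden</h1>'
);

for my $response ($ok, $forbidden) {
    my $content = content_or_undef($response);
    printf "%s -> %s\n",
        $response->status_line,
        defined $content ? 'content returned' : 'undef';
}
```

The 403 response still carries a body, but since is_success is false the body is discarded, which is why print shows nothing.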

It appears that the website (http://www.viki.com) is checking the user agent string and only returning content to "valid" user agents. LWP::Simple is hard-coded to use LWP::Simple/$VERSION as its user agent.
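If you want to check what your installed version actually sends, a short sketch like the following should print the default string. It assumes only that LWP::Simple is installed; the exact version numbers in the output will differ from system to system.

```perl
use strict;
use warnings;
use LWP::Simple qw($ua);   # $ua is exported only when asked for

# Print the User-Agent value LWP::Simple sends by default.
print $ua->agent, "\n";
```

The printed value should begin with LWP::Simple/ followed by a version number.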

If you must use LWP::Simple, you can force the user agent like this:

    use LWP::Simple qw/ $ua get /;

    # Replace the default agent string with a browser-like one.
    $ua->agent('Mozilla/5.0 (Windows NT 6.1; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0');

    print get('http://www.viki.com');

LWP::Simple exposes the LWP::UserAgent instance it uses internally through the optionally exported $ua variable. It is still necessary to configure the user agent on that instance for this particular page to load.
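If you are not tied to LWP::Simple, using LWP::UserAgent directly lets you report why a request failed instead of silently getting undef. This is only a sketch: the sub name fetch, the 10-second timeout, and taking the URL from the command line are choices made for the example, and whether the site accepts this agent string may change over time.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Browser-like agent string, same as in the LWP::Simple example.
my $ua = LWP::UserAgent->new(
    agent   => 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0',
    timeout => 10,    # assumed value, adjust as needed
);

# Hypothetical helper: return the decoded body or die with the status line.
sub fetch {
    my ($url) = @_;
    my $response = $ua->get($url);
    die "Request for $url failed: " . $response->status_line . "\n"
        unless $response->is_success;
    return $response->decoded_content;
}

# Only hit the network when a URL is given, e.g.:
#   perl fetch.pl http://www.viki.com/
print fetch($ARGV[0]) if @ARGV;
```

With this version a blocked request dies with a message such as the 403 status line, so the cause is visible right away.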

