ruby - Mechanize form submission
I have a website that I am attempting to scrape using Mechanize. When I submit the form with Mechanize, it is submitted to a URL of the following format: https://www.website.com/login/options?returnurl=some_form_options (if I enter this URL in a browser, it sends me to a nice error page saying the requested page does not exist).
Whereas, if I submit the form on the website itself, the returned URL has the following format: https://www.website.com/topic/country/list_of_form_options
The website has a login form, but it is not necessary to fill it in to be able to submit a search query.
Any idea why I get a different URL when submitting the same form with Mechanize, and how to counter that? I cannot process the URL I get after "mechanizing" the form.
Thanks!
You can find the exact form that you want to submit and submit it. If you are unable to find the form's path, you can add the form field yourself using Mechanize and then submit the form. Here is the code I have used in my project.
I had to create a rake task for this:
    require 'mechanize'

    namespace :test_namespace do
      task :mytask => [:environment] do
        site = "http://www.website.com/search/search.aspx?term=search term"

        # Prepare the user agent and fetch the first results page
        ua = Mechanize.new
        page = ua.get(site)

        loop do
          # Pull the title, link and summary text out of each result block
          page.search("//div[@class='resultsnobackground']").each do |res|
            puts res.at("table").at('tr').at('td').text
            link_text = res.at_css('strong').at('a').text
            link_href = res.at_css('strong').at('a')['href']
            link_href = "http://www.website.com" + link_href
            page_content = ''
            res.css('span').each do |ss|
              ss.css('strong').remove
              page_content = ss.text.gsub(/vi.*s\)/, '')
            end
            # puts "here summmer ......#{content_summery}"
          end

          # Follow the "next" link by simulating the ASP.NET postback.
          # The control id/name below are written as they appear in this post;
          # they must match the page's markup exactly, including letter case.
          if page.search("#ctl00_contentplaceholder1_ctrlresults_gvresults_ctl01_lbnext").count > 0
            form = page.forms.first
            form.add_field! "__EVENTTARGET", "ctl00$contentplaceholder1$ctrlresults$gvresults$ctl01$lbnext"
            form.add_field! "__EVENTARGUMENT", ""
            page = form.submit
          else
            break
          end
        end
      end
    end
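To see what adding those two fields actually changes, it can help to look at the request body itself. On ASP.NET WebForms pages, a "next" link normally calls `__doPostBack`, which fills in `__EVENTTARGET` and `__EVENTARGUMENT` and submits the page's single form. The sketch below uses only Ruby's standard library to build that URL-encoded body. The helper name `postback_body` and the `__VIEWSTATE` value are made up for illustration, and a real page will usually carry more hidden fields than shown here.

```ruby
require 'uri'

# Build the URL-encoded body of an ASP.NET postback (hypothetical helper).
# hidden_fields: hidden inputs already present in the form (e.g. __VIEWSTATE)
# event_target:  the control name the page passes to __doPostBack
def postback_body(hidden_fields, event_target, event_argument = '')
  fields = hidden_fields.merge(
    '__EVENTTARGET'   => event_target,
    '__EVENTARGUMENT' => event_argument
  )
  URI.encode_www_form(fields)
end

# Illustrative values only; the __VIEWSTATE string is a placeholder.
hidden = { '__VIEWSTATE' => 'abc123' }
target = 'ctl00$contentplaceholder1$ctrlresults$gvresults$ctl01$lbnext'
body = postback_body(hidden, target)
puts body
```

Comparing a body like this with the request the browser sends (visible in its network inspector) is one way to check that Mechanize is submitting the same form with the same fields.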