Scraping yell.com - ScrapeBox Forum (https://www.scrapeboxforum.com), General ScrapeBox Talk
Scraping yell.com - Markz - 08-09-2015

Hi,

Has anyone had success scraping this yellow pages site for the external website links of the listed companies using the link extractor add-on? I'm getting a 405 error.

Taking this link as an example:
https://www.yell.com/ucs/UcsSearchAction.do?find=Y&filter=2&contentFilter=true&keywords=accountants&location=uk
and following the page numbering, it gets the external links on the first page and then errors.

I've tried many different settings, proxies, etc. Any help appreciated.

Thanks.

RE: Scraping yell.com - loopline - 08-14-2015

Are you using the custom data grabber, the harvester, or something else? What information are you trying to scrape?

RE: Scraping yell.com - Markz - 08-16-2015

Only the external links for the websites, using the add-on.

RE: Scraping yell.com - loopline - 08-17-2015

I tried
https://www.yell.com/ucs/UcsSearchAction.do?keywords=accountants&location=uk&contentFilter=true&filter=2&find=Y&pageNum=2
which is page 2 for your link, and it scraped 27 external links just fine.

Are you using proxies? Are you using more than 1 connection? Maybe they are just banning your IP for going too fast.

RE: Scraping yell.com - Markz - 08-19-2015

Yes, I'm using proxies with 1 connection. The first page seems to work fine, but it dies quickly on the following pages.

RE: Scraping yell.com - loopline - 08-20-2015

Try slowing it down; my guess is they are banning your proxies. Try 1 connection and see what happens.

RE: Scraping yell.com - Markz - 08-29-2015

Will do, thanks for the input.

RE: Scraping yell.com - loopline - 08-31-2015

You're welcome.
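
For anyone who wants to try the "1 connection, slow it down" advice outside ScrapeBox, here is a rough Python sketch of the idea. It is not how the link extractor add-on works internally. The `pageNum` parameter comes from loopline's URL above; everything else is an assumption made for illustration: the function names `page_url` and `crawl_pages`, the caller-supplied `fetch` function, the delay values, and the choice to treat any non-200 response (such as the 405) as a possible block.

```python
import time
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def page_url(search_url, page):
    """Return search_url with its pageNum query parameter set to `page`."""
    parts = urlsplit(search_url)
    # Drop any existing pageNum so the parameter is never duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "pageNum"]
    query.append(("pageNum", str(page)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def crawl_pages(search_url, pages, fetch, delay=10.0, max_retries=3,
                sleep=time.sleep):
    """Fetch result pages strictly one at a time, pausing between requests.

    `fetch(url)` is a hypothetical caller-supplied function that returns
    (status_code, body). Any non-200 status is assumed to mean the site
    is refusing us, so the pause is doubled before retrying.
    """
    results = {}
    for page in pages:
        url = page_url(search_url, page)
        wait = delay
        for _ in range(max_retries):
            status, body = fetch(url)
            if status == 200:
                results[page] = body
                break
            wait *= 2      # back off: double the pause after each refusal
            sleep(wait)
        else:
            break          # still refused after all retries: stop crawling
        sleep(delay)       # normal pause before the next page
    return results
```

`fetch` can wrap whatever HTTP client and proxy setup you already use; passing it in keeps the throttling logic separate and easy to test with a fake. If pages still fail with a long delay and a single connection, the proxies themselves are probably already banned, as loopline guessed.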