ScrapeBox Forum
Scraping yell.com




Scraping yell.com - Markz - 08-09-2015

Hi,

Has anyone had success scraping this yellow pages site for the external website links of the listed companies? I'm using the Link Extractor addon and getting a 405 error.

Take this link for example: https://www.yell.com/ucs/UcsSearchAction.do?find=Y&filter=2&contentFilter=true&keywords=accountants&location=uk. Following the page numbering, it gets the external links on the first page and then errors. I've tried many different settings, proxies, etc.

Any help appreciated.

Thanks.


RE: Scraping yell.com - loopline - 08-14-2015

Are you using the Custom Data Grabber, the harvester, or something else?

What information are you trying to scrape?


RE: Scraping yell.com - Markz - 08-16-2015

Only the external website links, using the Link Extractor addon.
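
For readers who want to see what "external links only" means in practice, here is a minimal Python sketch. It is not how the Link Extractor addon works internally (the thread does not say); the names `ExternalLinkParser` and `external_links` are made up for illustration, and "external" is assumed to mean any http(s) link whose host differs from the host of the page being scraped.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class ExternalLinkParser(HTMLParser):
    """Collects http(s) links that point to a different host than the page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.page_host = urlsplit(page_url).hostname or ""
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL before comparing hosts.
        absolute = urljoin(self.page_url, href)
        parts = urlsplit(absolute)
        if parts.scheme not in ("http", "https") or not parts.hostname:
            return
        if parts.hostname != self.page_host and absolute not in self.links:
            self.links.append(absolute)


def external_links(html, page_url):
    """Return the external links found in html, in order, without duplicates."""
    parser = ExternalLinkParser(page_url)
    parser.feed(html)
    return parser.links
```

Comparing hosts exactly is a simplification: subdomains of the same site would count as external here, so the filter may need loosening for a real run.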


RE: Scraping yell.com - loopline - 08-17-2015

I tried this, for example:

https://www.yell.com/ucs/UcsSearchAction.do?keywords=accountants&location=uk&contentFilter=true&filter=2&find=Y&pageNum=2

which is page 2 for your link, and it scraped 27 external links just fine. Are you using proxies?

Are you using more than 1 connection? Maybe they are just banning your IP for going too fast.
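
The page 2 link above differs from the first-page link only by the `pageNum` query parameter. A small Python sketch of building the list of page URLs that way, assuming the same pattern holds for later pages (only page 2 was tried in this thread). The helper names `page_url` and `page_urls` are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Search URL from the post above, without the pageNum parameter.
BASE_URL = (
    "https://www.yell.com/ucs/UcsSearchAction.do"
    "?keywords=accountants&location=uk&contentFilter=true&filter=2&find=Y"
)


def page_url(base_url, page):
    """Return base_url with its pageNum query parameter set to page."""
    parts = urlsplit(base_url)
    # Drop any existing pageNum so it is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "pageNum"
    ]
    query.append(("pageNum", str(page)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def page_urls(base_url, first, last):
    """Return the URLs for pages first..last inclusive."""
    return [page_url(base_url, number) for number in range(first, last + 1)]
```

The resulting list can be pasted into whatever tool does the link extraction, one URL per line.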


RE: Scraping yell.com - Markz - 08-19-2015

Yes, using proxies with 1 connection. The first page seems to work fine, but it dies on the following pages quickly.


RE: Scraping yell.com - loopline - 08-20-2015

Try slowing it down; my guess is they are banning your proxies. Try 1 connection and see what happens.
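
Outside ScrapeBox, "slowing it down" could look roughly like the Python sketch below: one request at a time, a fixed pause between pages, and a longer wait before retrying when the site answers with something other than 200 (such as the 405 reported above). The function name `crawl_slowly`, the delay values, and the idea that backing off helps with this particular site are all assumptions. `fetch` is a placeholder for whatever does the actual HTTP request and must return a `(status_code, body)` pair.

```python
import time


def crawl_slowly(urls, fetch, delay=5.0, max_retries=3, sleep=time.sleep):
    """Fetch urls one at a time, pausing between requests.

    fetch(url) must return (status_code, body). Pages that never return
    200 within max_retries retries are left out of the result.
    """
    results = {}
    for url in urls:
        for attempt in range(max_retries + 1):
            status, body = fetch(url)
            if status == 200:
                results[url] = body
                break
            if attempt < max_retries:
                # Likely blocked or rate limited: wait longer each retry.
                sleep(delay * 2 ** (attempt + 1))
        # Fixed pause before moving on to the next page.
        sleep(delay)
    return results
```

Passing `sleep` as a parameter is only there so the timing can be checked without actually waiting; in normal use the default `time.sleep` applies.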


RE: Scraping yell.com - Markz - 08-29-2015

Will do, thanks for the input.


RE: Scraping yell.com - loopline - 08-31-2015

You're welcome.