I can't extract urls from google - danisuite - 11-10-2021

Hello,

I enter keywords so that ScrapeBox extracts the URLs from the Google SERPs for each keyword. I have 35 proxies that passed the Google test, and I have added ScrapeBox to the antivirus whitelist, but I still can't get it to harvest properly. Most of the time it returns no results, and the few times it does, there are very few for the number of keywords I put in.

What could I be doing wrong?

Thank you very much

RE: I can't extract urls from google - loopline - 11-11-2021

I'm guessing the proxies are blocked. If you go to help >> show error log >> harvester, what are the errors? 503, 429 and 403 are all IP bans.

RE: I can't extract urls from google - danisuite - 11-12-2021

(11-11-2021, 06:06 PM)loopline Wrote: I'm guessing the proxies are blocked. If you go to help >> show error log >> harvester

These are the errors I get. What could be the problem? Thank you very much

10/11/2021 21:29:32: HTTP: -1 Connect timed out., URL: https://www.google.it/search?complete=0&hl=it&q=la+storia+successo+del+trader+paul+baccaglini&num=100&start=0&filter=0&pws=0
10/11/2021 21:29:52: HTTP: -1 Read timed out., URL: https://www.google.it/search?complete=0&hl=it&q=come+avviare+un+nuovo+business&num=100&start=0&filter=0&pws=0
10/11/2021 21:29:59: HTTP: -1 Connect timed out., URL: https://www.google.it/search?complete=0&hl=it&q=fare+soldi+con+il+poker+online+e+possibile&num=100&start=0&filter=0&pws=0
10/11/2021 21:30:13: HTTP: -1 Read timed out., URL: https://www.google.it/search?complete=0&hl=it&q=guadagnare+con+le+web+serie&num=100&start=0&filter=0&pws=0

RE: I can't extract urls from google - loopline - 11-13-2021

Google will never time out, so these are proxy errors if you're using proxies, especially public proxies. If they are private proxies, it could still be the proxies, but it could also be security software. So make sure you add an exception in all security software for the entire ScrapeBox folder.
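For anyone sorting through a long harvester error log, the distinction made above (503, 429 and 403 meaning IP bans, versus -1 timeouts pointing at the proxy or local security software) can be tallied with a short script. This is only a rough sketch, not part of ScrapeBox: the regular expression is inferred from the log lines quoted in this thread, the optional trailing "Proxy:" field is how the detailed harvester log appears to format it, and the names `classify` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this thread (an assumption);
# other ScrapeBox versions may format the error log differently.
LOG_LINE = re.compile(
    r"^(?P<when>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}): "
    r"HTTP: (?P<code>-?\d+) (?P<message>.*?), "
    r"URL: (?P<url>\S+)(?: Proxy: (?P<proxy>\S+))?$"
)

# Status codes described in the thread as IP bans.
BAN_CODES = {403, 429, 503}


def classify(line):
    """Return 'ip_ban', 'timeout', 'other' or 'unparsed' for one log line."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return "unparsed"
    code = int(match.group("code"))
    if code in BAN_CODES:
        return "ip_ban"
    if code == -1:
        # Connect/read timeouts: per the thread, a proxy or security
        # software problem rather than a response from Google.
        return "timeout"
    return "other"


def summarize(lines):
    """Count how many non-empty log lines fall into each category."""
    return Counter(classify(line) for line in lines if line.strip())
```

Feeding it the lines of a saved error log, for example `summarize(open("harvester_errors.txt"))` with a hypothetical file name, gives a quick picture of whether the run is dominated by bans or by timeouts.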
RE: I can't extract urls from google - zboo - 03-01-2022

Hello,

I have the same problem here: I use Stormproxies backconnect rotating proxies and have had no problems with this method since 2014, but since 2021 it appears to no longer work.

Here is an extract from my detailed harvester log:

24/02/2022 14:09:34: HTTP: 429 HTTP/1.1 429 Too Many Requests, URL: https://www.google.com/search?complete=0&hl=en&q=site%3Ainstagram%2Ecom%20%22john%20durand%22%20photo&num=100&start=0&filter=0&pws=0 Proxy: 37.48.118.90:13042

It's the 429 error that loopline identifies as an IP ban... Is there any new method or proxy provider I should know about? ScrapeBox is useless to me right now.

Thanks a lot.

RE: I can't extract urls from google - loopline - 03-01-2022

Usually the detailed harvester will keep retrying forever. Does it stop for you?

Google proxies are hard to find, because Google bans faster than ever and doesn't share the reasons why it bans or any data about it. So there are still good proxies in the back connect pool, but you have to do more retries. You should be able to run the detailed harvester, let it run indefinitely, and get results, although perhaps slowly because of the back connect proxies.
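To illustrate the "do more retries" idea in general terms, here is a minimal sketch of a retry loop. It is not how ScrapeBox's detailed harvester is implemented; `fetch_with_retries`, its parameters, and the `fetch` callable are all hypothetical, and it simply assumes each call goes out through a different IP from the back connect pool.

```python
import time

# Status codes described in the thread as IP bans.
BAN_CODES = {403, 429, 503}


def fetch_with_retries(fetch, max_attempts=20, delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is reached.

    fetch is a hypothetical callable returning (status_code, body). With a
    back connect proxy, each call is assumed to use a different exit IP,
    so a banned IP on one attempt may be followed by a clean one.
    Returns (body, attempts_used), with body None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            status, body = fetch()
        except (TimeoutError, ConnectionError):
            # Treat like the "-1 ... timed out" entries in the error log.
            status, body = -1, None
        if status == 200:
            return body, attempt
        if status not in BAN_CODES and status != -1:
            # Anything else is not a ban or timeout, so retrying blindly
            # would hide a different problem.
            raise RuntimeError(f"unexpected status {status}")
        if attempt < max_attempts:
            sleep(delay)  # brief pause before trying the next proxy
    return None, max_attempts
```

The trade-off is the one described above: with a heavily banned pool, results still arrive, but each keyword may need many attempts, so the run is slow.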