ScrapeBox Forum
Grab Emails By Crawling Sites Issue - Printable Version




Grab Emails By Crawling Sites Issue - historyonfire - 08-29-2019

Hello,

I have scraped a few hundred thousand URLs for emails in the past 2 weeks. All the URLs are Instagram accounts. I've gotten over 30k emails using "Grab/Check -> Check for emails by crawling sites."

Things have been working great until yesterday.

I am not getting any errors, but no emails are being collected. Every URL checked is shown as "complete" rather than displaying an error message. I manually checked some of the links and there were emails on the pages.

I'm not using any proxies. My delay is 0-2 seconds between actions. I am running on 1 thread. My depth is set to 1 level. I have tried using level 2 depth to see if it would fix my issue but it did not.

I haven't changed any of my settings since I started scraping 2 weeks ago. 

One of my associates (located in a different state) is having the exact same issue as I am.

Any ideas?

Thanks!


RE: Grab Emails By Crawling Sites Issue - loopline - 08-30-2019

Are you using proxies?

I suspect that Instagram is now tossing up some sort of splash page. If you're using EU proxies, try some from the USA, as it could be related to GDPR.

Or it could be something else.

Also, you could use a program like HTTP Debugger Pro, which has a free trial. Run it alongside ScrapeBox and you can see the exact response that Instagram is giving ScrapeBox.
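
If you'd rather not install a debugger, a rough equivalent is to fetch one of the URLs yourself and look at what actually comes back. The snippet below is only a sketch of that idea in Python, not anything ScrapeBox does internally: the function names, the User-Agent string, and the email regex are all my own assumptions, and the regex is deliberately simple.

```python
import re
import urllib.request

# Simplified email pattern (an assumption; real-world addresses can be messier).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def summarize(status, final_url, body):
    """Condense one HTTP response into the fields worth eyeballing."""
    emails = sorted(set(EMAIL_RE.findall(body)))
    return {
        "status": status,          # HTTP status code of the final response
        "final_url": final_url,    # where we ended up after any redirects
        "body_length": len(body),  # a tiny body often means a splash page
        "emails": emails,          # unique addresses found in the HTML
    }

def fetch(url, timeout=15):
    """Fetch url (urllib follows redirects) and return its summary.

    Note: urlopen raises urllib.error.HTTPError for 4xx/5xx responses,
    so a result from this function means the server answered successfully.
    """
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
        return summarize(resp.status, resp.geturl(), body)
```

Calling fetch() on a profile URL that you know has an email on it, then comparing final_url and emails with what you see in a browser, should show whether the page being served to a script is the same one you are looking at.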


RE: Grab Emails By Crawling Sites Issue - historyonfire - 08-30-2019

Thanks Loopline, I am going to try out both of your suggestions.

What's odd to me is that usually I could tell if I was scraping too fast because I would be getting errors like 404s. It's strange that all the processes are marked "Complete" yet clearly are not.

edit: I am not using proxies atm. I have some private USA ones so I will try to use those.


RE: Grab Emails By Crawling Sites Issue - loopline - 08-31-2019

Oh, then they probably just cracked down on your IP. They are probably redirecting it to some page with a CAPTCHA or one saying there is too much traffic from your network, etc. That is why no emails are coming in and it's just giving you "completed".
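
To illustrate why a blocked request can still show as completed: a redirect to a CAPTCHA or "too much traffic" page is a perfectly valid response, just not the one you wanted. Here is a small Python sketch of how you might flag that case. The marker phrases and the function name are hypothetical, picked for illustration; I don't know what text Instagram actually serves, so adjust them to whatever you see in the real response.

```python
from urllib.parse import urlparse

# Hypothetical phrases that might appear on a block page (assumptions, not
# taken from any real Instagram response).
BLOCK_MARKERS = ("captcha", "too many requests", "unusual traffic", "please wait")

def looks_blocked(requested_url, final_url, body):
    """Return a list of reasons the response looks like a block page.

    An empty list means nothing suspicious was spotted.
    """
    reasons = []
    req, fin = urlparse(requested_url), urlparse(final_url)
    # Compare host and path, ignoring case in the host and a trailing slash.
    if (req.netloc.lower(), req.path.rstrip("/")) != (fin.netloc.lower(), fin.path.rstrip("/")):
        reasons.append("redirected to " + final_url)
    lowered = body.lower()
    for marker in BLOCK_MARKERS:
        if marker in lowered:
            reasons.append("body mentions '" + marker + "'")
    return reasons
```

If this returns any reasons for a URL that loads fine in your browser, the block is most likely tied to the IP or to how the request is being made, and switching to a proxy would be the next thing to test.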