ScrapeBox Forum
Scrapebox stuck? - Printable Version




Scrapebox stuck? - rogerke - 10-21-2014

I'm still a ScrapeBox newbie, so I have no idea what's going on. Whenever I try to scrape expired web 2.0s, it stops harvesting at around 5k results. No matter how long I wait, it's just stuck and won't harvest any more URLs.

Here's a screenshot of it:

[Image: 21410k4.png]

I'm using 10 semi-dedicated proxies from buyproxies.org and they don't seem to be banned, because they test out fine. I'm also using a huge Dutch keyword list (thanks for the tip, loop!), but it seems to happen with smaller keyword lists as well.

Any idea what it could be?


RE: Scrapebox stuck? - rogerke - 10-21-2014

Little update:

When I use site:tumblr.com instead of site:wordpress.com, it runs flawlessly. All other settings are the same, so it looks like the problem is related to the site:wordpress.com query. Very strange.


RE: Scrapebox stuck? - rogerke - 10-22-2014

Looks like I'm still running into the same problem. After some time, harvesting slows down tremendously. I bought another 30 proxies (40 in total now) and the problem is still there. When I uncheck the multi-threaded harvester, it shows a 302 error for some proxies, even though they all pass when I run the proxy test. Really strange.


RE: Scrapebox stuck? - loopline - 10-23-2014

The short answer is you are going too fast.

You're using an advanced operator and those get proxies banned faster, so for 40 proxies I would try 2 connections, maybe 1. But you will have to wait 24-48 hours for them to get unbanned.

That said, you should also be aware that Google has different kinds of IP bans, so it can literally ban an IP for one query and not the next. The proxy tester only tests against basic keyword harvesting, but you're using advanced operators. I have a video that shows it here:

https://www.youtube.com/watch?v=P9CbGhfc1aY

That's why you get a 302 in the single-threaded harvester: the proxies are actually blocked for your query. So you can build a custom test in the proxy tester that uses site:wordpress in the query URL to see whether they pass Google for that kind of query.
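As a rough illustration of what such a custom test checks, here is a minimal Python sketch. It only builds the query URL and interprets a status code; it is not ScrapeBox's own tester. The function names, the example keyword, and the choice to treat anything other than 200 or 302 as "unknown" are assumptions made for this example.

```python
from urllib.parse import urlencode

# Standard Google search endpoint; the query goes in the "q" parameter.
GOOGLE_SEARCH = "https://www.google.com/search"


def build_test_url(keyword, operator="site:wordpress.com"):
    """Build a search URL that includes the advanced operator.

    Testing with the same operator you harvest with matters, because a
    proxy can pass a plain keyword query and still be blocked for this one.
    """
    query = f"{operator} {keyword}".strip()
    return f"{GOOGLE_SEARCH}?{urlencode({'q': query})}"


def classify_status(status_code):
    """Map an HTTP status to a test result (hypothetical labels)."""
    if status_code == 200:
        return "passed"
    if status_code == 302:
        # The thread reports 302 for proxies blocked on the query.
        return "blocked"
    return "unknown"


if __name__ == "__main__":
    print(build_test_url("fiets"))
    print(classify_status(302))
```

The resulting URL is the sort of thing you would paste into a custom proxy test, so each proxy is checked against the operator query rather than a basic keyword search.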

Glad you found the keywords helpful. :)


RE: Scrapebox stuck? - rogerke - 10-23-2014

Thank you very much for your detailed response. I adjusted the maximum connections to 2 and will wait for the proxies to get unbanned now. Let's see how this plays out in 24-48 hours.


RE: Scrapebox stuck? - loopline - 10-24-2014

Sounds good. Good luck.


RE: Scrapebox stuck? - johnny666 - 10-25-2014

I am getting the exact same problem.
I am using 12 private proxies and 1 connection, and it still pauses after a few minutes of running. It's like clockwork, and I can't for the life of me work it out.

The search operators I am using are:
inurl:"keyword" "keyword 2"

I don't get any errors while it's scraping. It scrapes normally and then just stops. Meanwhile, when I test these proxies, they pass fine against Google.

Please help.


RE: Scrapebox stuck? - loopline - 10-25-2014

It's exactly what I said above: you're going too fast. 1 connection and 12 private proxies with that footprint is too fast. You need 20-30+ private proxies and 1 connection to use advanced operators.

Meaning that if you go to settings, uncheck "use multi-threaded harvester" and "use custom harvester", and then harvest, you will see "302 IP blocked" under status. As mentioned above, the Google test is for basic keyword searches, not advanced operators. Anyway, I explained all this already, so just scroll up and read my posts.
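To put rough numbers on the "too fast" point, here is a back-of-the-envelope Python sketch. It assumes queries are spread evenly across proxies and that each connection finishes one query every couple of seconds. Both are assumptions made for illustration, not documented ScrapeBox behaviour, and the 2-second figure is hypothetical.

```python
def queries_per_proxy_per_hour(proxies, connections, seconds_per_query):
    """Estimate how often each proxy hits Google per hour.

    Assumes an even spread of queries across all proxies (an assumption
    for this sketch, not a statement about how ScrapeBox rotates them).
    """
    if proxies <= 0 or connections <= 0 or seconds_per_query <= 0:
        raise ValueError("all arguments must be positive")
    total_per_hour = connections * 3600 / seconds_per_query
    return total_per_hour / proxies


if __name__ == "__main__":
    # Hypothetical pace: one query every 2 seconds on a single connection.
    for proxy_count in (12, 30):
        rate = queries_per_proxy_per_hour(proxy_count, 1, 2.0)
        print(f"{proxy_count} proxies: about {rate:.0f} queries per proxy per hour")
```

Under those assumptions, going from 12 to 30 proxies at 1 connection cuts the load on each proxy from about 150 to about 60 queries per hour, which is the direction the advice points in. The actual threshold at which Google blocks an advanced-operator query is not known from this thread.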