Issues/features missing in V2 (in my opinion)
#1
After experimenting with a lot of the features in SB, I noticed one or two small things missing that would complete it for me. Firstly, the Automator plugin seems to have no way of deleting duplicate proxies when you scrape them for the first time. This is a bit of a concern when you scrape ~250K proxies and running a duplicate filter brings the list down to 16K, but there is no way to have that filtering happen automatically.

Secondly, the rank tracker plugin seems to stop working after a large scrape. When I run a scrape of around 1K keywords and then run a second one, it does not pull any information. When I check the state of the proxies they still seem to be working, so I was rather confused by this, but I found you have to wait 6-8 hours before it is able to scrape again. P.S. This only seems to happen with Google searches; Bing and Yahoo work great.

I'd love to get feedback on these points - John
Reply
#2
It should auto filter duplicate proxies. If you watch it on a test, does it not? It seemed to for me.

Yes, the proxies may be working, but they are being blocked, and when you wait they get unbanned. It's a matter of not enough proxies or too many connections. Overall you are just going too fast, so you can try slowing it all down, and at some point you will be going slow enough that they won't get blocked.
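
To put rough numbers on "not enough proxies or too many connections", here is a minimal back-of-the-envelope sketch in Python. It is not anything ScrapeBox does internally. The function name and the 60-second per-proxy interval are made up for illustration; the interval at which a search engine actually starts blocking is not known here and has to be found by testing.

```python
def min_delay_seconds(connections, proxies, per_proxy_interval):
    """Delay each connection should wait between requests so that, on average,
    no proxy is used more than once per `per_proxy_interval` seconds.

    Assumes requests are spread evenly across all proxies (an assumption).
    """
    if connections <= 0 or proxies <= 0:
        raise ValueError("connections and proxies must be positive")
    # Allowed total rate is proxies / per_proxy_interval requests per second.
    # Actual total rate is connections / delay, so solve for delay.
    return connections * per_proxy_interval / proxies


# Illustrative numbers only: 10 connections, 50 working proxies,
# and a guessed limit of one query per proxy per 60 seconds.
delay = min_delay_seconds(10, 50, 60)
print(f"wait about {delay:.0f}s between requests on each connection")
```

The same formula shows the two levers mentioned above: adding proxies or lowering the connection count both reduce how often each proxy gets hit.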
Reply
#3
Nope, it does not remove the duplicate proxies at all, unfortunately.

Here is the raw file pulled from the Automator plugin after scraping all the sources:
[Image: Ucpsql2.png]

Here is the count of the proxies again after I loaded them into the proxy grid:
[Image: pakPMMk.png]

And after I filter duplicates:
[Image: qUbw8ie.png]

So unfortunately it's not filtering dupes out - John
Reply
#4
I mailed support about it, and I think they are going to add it in a future version, but I thought of a way. You can go to settings and there is an option to only allow http/https urls to paste into the harvester url grid. Just make sure that is off and you can import all the proxies into the url grid - like import urls, but choose the proxy file, remove duplicates, save them off again and then load them in the proxy manager and test them.
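
If you would rather dedupe the proxy file outside of ScrapeBox, the same step can be sketched in a few lines of Python. This assumes the exported file has one proxy per line (for example `ip:port`). The function names and file names are hypothetical, not part of ScrapeBox.

```python
def dedupe_proxies(lines):
    """Return unique proxies in first-seen order, skipping blank lines."""
    seen = set()
    unique = []
    for line in lines:
        proxy = line.strip()  # ignore surrounding whitespace and newlines
        if not proxy or proxy in seen:
            continue
        seen.add(proxy)
        unique.append(proxy)
    return unique


def dedupe_file(src_path, dst_path):
    """Read proxies from src_path, write the unique ones to dst_path,
    and return how many were kept."""
    with open(src_path, encoding="utf-8", errors="ignore") as src:
        unique = dedupe_proxies(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        for proxy in unique:
            dst.write(proxy + "\n")
    return len(unique)


# Small in-memory example with made-up addresses.
sample = ["1.2.3.4:80", "1.2.3.4:80 ", "", "5.6.7.8:8080"]
print(dedupe_proxies(sample))
```

Calling `dedupe_file("scraped.txt", "unique.txt")` (hypothetical file names) would produce a file you can then load into the proxy manager and test as usual.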
Reply
#5
(10-11-2015, 05:11 PM)loopline Wrote: I mailed support about it, and I think they are going to add it in a future version, but I thought of a way. You can go to settings and there is an option to only allow http/https urls to paste into the harvester url grid. Just make sure that is off and you can import all the proxies into the url grid - like import urls, but choose the proxy file, remove duplicates, save them off again and then load them in the proxy manager and test them.

Thanks for this. I must be a pain xD, but it's just these little things stopping us from being able to use ScrapeBox to its full potential.
Reply
#6
All good. It makes sense. :)
Reply



