05-16-2018, 02:04 PM
Hi
I recently purchased the Expired Domain Finder plugin and ran it on a couple of sites. I set it to crawl 6 levels deep, added 100 private proxies, and ran it with about 7 threads. I also disabled all the metrics so it only shows the expired domains.
It ran all night on my VPS, but the seed list was no good and it only returned a few results, so I decided to stop it and test it on a larger site.
I then got a better seed list (2 URLs) and loaded them into the Expired Domain Finder. I set the crawl depth to 25 (the max) and the threads to 25 (using the same 100 proxies), with all metrics disabled (Google, Alexa, Plus 1 etc.), and let it run on my VPS. When I logged back into the VPS about 3 hours later, the Expired Domain Finder tool was gone (it had closed), although ScrapeBox itself was still open. (I also had one more instance of ScrapeBox running that was scraping with the harvester, but the harvester uses different reverse proxies to the ones I was using in the Expired Domain Finder.)
So I started the tool again and opened the save location, where I saw the files from the previous scrapes. I didn't know what was wrong. I added https://github.com/ and https://www.theguardian.com as the seed URLs with a depth of 25. I thought it might be the number of threads,
so I started the tool again, this time with 12 threads, added the same seed URLs and let it run. Today I logged into my VPS and the Expired Domain Finder was gone again; it had closed once more. So I looked for the logs and found some error .txt files inside ScrapeBox Application Folder > Plugins > Expired Domain Finder.
In that folder I looked at the bugreport.txt file and noticed it had NO next to "use proxies for crawler". I quickly realized that I had to enable the proxies for the crawler after loading them, so I have now done this.
I then looked at the errors crawler.txt file and saw a number of errors that referred to making too many requests (I guess this is due to not using proxies while running multiple threads).
I've now started the Expired Domain Finder up again, and this time I have enabled proxies for the crawler. But when I open the bugreport.txt file (it lets me open this one, but not the crawler error file, because that is in use), it still shows "use proxies for crawler: NO", and the user interface doesn't really tell me whether proxies are actually being used (it just shows the number of proxies I've imported at the top).
My main question is this: after the Expired Domain Finder has completed, does it close automatically, or should it remain open? If it should remain open, then I'm guessing it's not completing the scrapes. Are there any optimum settings someone could recommend (number of threads, depth, proxies etc.)? I'm happy to keep this running for days on the VPS (just like I do with the harvester).
If I need to send anything to support (such as files etc.), please let me know. I have, however, started a new scrape, so I'm guessing the previous scrape errors etc. will no longer be there.
thanks