12-22-2019, 08:39 PM
Probably THE biggest annoyance about Scrapebox for me has been situations where a job refuses to end (even when you press stop) due to open connections. This monster rears its head in several places, but most often when running Check Links on a bunch of domains.
I've tried everything...
- reducing the number of connections to a crawl
- waiting for hours (and even a full day)
- hitting stop and waiting
- Shutting down Scrapebox and trying it again (and again and again)
- Writing the vendor
...and more
Nothing seems to help.
Right now, for example, I have a list of about 100,000 URLs that I want to link check. The first pass made it through just fine and found about 7,000 successful links. I've found that I often need to run several more passes to check all the URLs, so I ran it a second time (with 150 threads)... it choked, leaving me with 113 open threads when I returned a few hours later. I tried it again... same result. I tried it again with 90 threads... same result. I'm in the middle of some other gymnastics at the moment.
I wrote the creator a few months ago and his answer really didn't seem satisfying. It could be summed up as "Yeah, there's no way to close down threads that remain open on Windows". First and foremost, that seems almost inconceivable. Surely there is some software way to simply terminate threads (especially after a period of time or after hitting stop). I can't imagine that Windows forces threads to remain open indefinitely.
But the second issue is this: even if the above were true and there's no way to force threads to close, I should at least be able to regain control of Scrapebox so I can save the data that just took hours to collect. I mean, when harvesting I'm able to save the URLs on a periodic basis (like every 10,000, for example), and there are always the files in the /Harvester_Sessions directory. With Check Links, though, it seems like I cannot get any such files. If the Active Threads count stalls (as it often does), I'm just out of luck. I cannot get a listing of my successful/unsuccessful links. I simply have to start over... and over... and over, sometimes finally taking the time to split up my large lists and process them in groups of 10,000 instead of 100,000+. This is very time consuming.
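For what it's worth, the list splitting at least doesn't have to be done by hand. Here is a minimal sketch in Python, assuming the URLs sit in a plain text file with one URL per line. The function name and the output file names are made up for illustration and have nothing to do with Scrapebox itself; it only prepares smaller input files and does nothing about the stuck threads.

```python
from pathlib import Path


def split_url_list(source, out_dir, chunk_size=10_000):
    """Write the URLs in `source` to numbered files of at most `chunk_size` lines.

    Hypothetical helper: returns the list of files it wrote, in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Read one URL per line, skipping blank lines.
    with open(source, encoding="utf-8") as fh:
        urls = [line.strip() for line in fh if line.strip()]

    paths = []
    for start in range(0, len(urls), chunk_size):
        chunk = urls[start:start + chunk_size]
        # Illustrative naming scheme: urls_part_001.txt, urls_part_002.txt, ...
        path = out_dir / f"urls_part_{start // chunk_size + 1:03d}.txt"
        path.write_text("\n".join(chunk) + "\n", encoding="utf-8")
        paths.append(path)
    return paths
```

Calling something like split_url_list("all_urls.txt", "chunks") would turn a 100,000-line list into ten files of 10,000, so a stalled run only costs one chunk instead of the whole list.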
Surely there is some reasonable, better way? Maybe I'm still not getting something fundamental?
Again, it's inconceivable to me that simply hitting stop doesn't... uhmmm... stop. It's inconceivable that Windows forces the threads to remain open with no option of forcibly closing them, and even more inconceivable that I cannot save my data when this happens (and simply have to shut down the Scrapebox task).
So that's my rant today, as I'm now experimenting with the forty-leventh method that I'm hoping might skirt this issue.
Any thoughts, ideas?