It keeps crashing...
#1
So I'm becoming increasingly comfy with Scrapebox (thanks to this site and your YouTube vids). I've set myself up on a pretty robust VPS. I have a growing number of good proxies. I'm moving along and finding all kinds of awesome uses for the software... but

I've run into a situation where it keeps crashing on me.

I have a list of about 1 million keywords that I wanted to scrape for URLs. Clearly I can't just let it run until it finishes or I'll have a zillion-gig URL list. At first, I was scraping manually and turning it off at about 4 million URLs, then I'd de-dupe and do other URL cleaning. All was well. I wanted to automate the process though. So I tried doing it in Automator and realized it wouldn't work because the software won't chunk it. Rather, it won't move to the next step until all 1 million keywords have been run through. That's cool...

I figured I'd use the option for it to auto-save about every 2 million URLs, creating files of 2 million URLs each. Easy enough.

However, when I hit just over 4 million total (two lists of 2 million URLs), Scrapebox starts going nutty on me, grinds to a halt, and eventually crashes. I suspect it's keeping all of the URLs in some kind of memory and cannot make room for more. I can't find an option to adjust this though. It'd be nice if it could sort of reset itself every time it saves a list of 2 million.

Any idea what's happening and how to fix it?  Maybe there's another way to harvest urls from such a large list of keywords?

I'm running about 800 threads and using the custom harvester to do it.  I'm scraping only from about 5 or 6 search engines.  Am I hitting my machine limitations maybe?  I'm paid up for this month but am going to be moving to a bigger, better dedicated server next month.

Thanks!!
Reply
#2
Scrapebox is super memory efficient, so it's not keeping all that in memory.

Does it give you some sort of error box or what exactly happens?

Or does the window all appear normal and you can move it around but the count does not change?

The only real issue I see is 800 threads. Windows won't like that, and over time that could cause an issue.

You're better off spinning up multiple instances of Scrapebox on the same machine (you can run unlimited instances on one machine, assuming it's Windows and not Mac) and setting each one to around 150 or 200 threads. Windows works better this way. I know it's the same total thread count, but about 200 threads per program is roughly the cap before Windows starts to have issues.

Also, this way if one crashes the rest don't, and something is always working.

Glad you like the videos and site and such. :) Here is another video for you regarding the above:
https://www.youtube.com/watch?v=aZzdE6ybu38
Reply
#3
I can't find the other thread you were mentioning in the other thread (that sounds weird, lol). You mentioned in another thread that you're having issues with Scrapebox crashing, and showed a pic of it not responding? Can you link the thread?

Sorry, I'm traveling and on my little 13 in. screen, and the internet kicked me off a few minutes ago, and yada yada yada. Anyway, I did see a picture of it showing "not responding".

So when something goes "not responding" it can mean a few things, at least with Scrapebox.
It could mean that something locked the primary thread, it's not going to complete, and you're going to have to force close it.

More often, however, it simply means that Windows is too busy to paint the GUI. That is, it's still working in the background, and if you leave it alone it will eventually finish and come back.

So if this is a small run and it should be done, then it's toast. But if it's a large run it's probably just busy. You can get some clues by looking in Task Manager: if resources for the Scrapebox instance are still bouncing around, it's probably still working. Also, if it's a function that has an auto-save file, like the harvester, go into the harvester_sessions folder and look at the file it's auto-saving. Is the file size changing over a period of a minute or so? If so, it's still working.

If there is no activity in Task Manager and/or the file size isn't changing, then it's probably toast.
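The file-size check described above can be scripted rather than eyeballed. This is only a minimal sketch: the function name `file_is_growing` and the `sleep` parameter are made up for illustration, and you would pass in the path of whatever file Scrapebox is currently auto-saving in harvester_sessions.

```python
import os
import time


def file_is_growing(path, wait_seconds=60.0, sleep=time.sleep):
    """Return True if the file at `path` got bigger during the wait.

    `sleep` is injectable only so the check can be tested without
    actually waiting; by default it is time.sleep.
    """
    size_before = os.path.getsize(path)
    sleep(wait_seconds)  # give the harvester time to auto-save again
    size_after = os.path.getsize(path)
    return size_after > size_before
```

A `False` result after a minute or so is only a hint that the run is stuck, so it is worth checking Task Manager as well before force closing anything.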

If it keeps happening you can experiment with fewer threads and thus lower resource use, or you can work in smaller chunks and build some automation. Note this is even what Google does: it works with lots of small machines. In fact it's what I do: custom scripts, the Automator, and super small chunks. I have a developer if you need one, or if you can get a basic understanding of Python, that works pretty well. I even use bat files.
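As a rough illustration of the "smaller chunks" idea, here is a minimal Python sketch that splits one big keyword list into numbered files. The function name, the `keywords_00001.txt` naming pattern, and the default chunk size are all assumptions for the example, not anything Scrapebox or the Automator requires.

```python
from pathlib import Path


def split_keywords(source, out_dir, lines_per_chunk=1000):
    """Split a one-keyword-per-line file into numbered chunk files.

    Reads the source line by line, so the full list is never held in
    memory at once. Blank lines are skipped. Returns the written paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    chunk = []

    def flush():
        # Hypothetical naming scheme: keywords_00001.txt, keywords_00002.txt, ...
        target = out_dir / f"keywords_{len(written) + 1:05d}.txt"
        target.write_text("\n".join(chunk) + "\n", encoding="utf-8")
        written.append(target)
        chunk.clear()

    with open(source, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            keyword = line.strip()
            if not keyword:
                continue
            chunk.append(keyword)
            if len(chunk) >= lines_per_chunk:
                flush()
    if chunk:
        flush()  # write the final partial chunk
    return written
```

Each chunk file can then be fed to a separate harvest run, so a crash costs one small chunk instead of the whole job.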

Lastly, of course, whitelist Scrapebox in your security software. You can also test by closing down things like Skype and uTorrent. Really anything that connects to the internet can lock a thread, so it's worth a test running only Scrapebox. But link me the thread and post back how it's going.
Reply
#4
Sorry I was slow to respond. I've been out of it for nearly a week because my terrible, terrible Scrapeboxsenukevps.com (worst customer service ever... seriously) account has been down twice (for almost 90 hours). I also have a license for home, but fewer proxies there. :) Thanks for replying.

As for the issue, I've recently been running 500 threads with a few hundred proxies. I've come to learn that I have to chunk everything down into smaller files. Scrapebox seems to choke on me any time I'm using files much larger than 500-900 MB. I've been using lists averaging about 100 MB now. Apart from having to tend to the jobs more frequently, it seems to be working. Maybe my VPS just needs more RAM (7 GB at the moment). I'm about to dump Scrapeboxsenukevps when my first month expires soon and I'm looking for server alternatives. I don't know if I'm going to go with an actual Scrapebox server (I've been favoring SmartSEOVPS while researching options because their prices seem reasonable, they have lots of add-on options, and their staff have been amazing at answering pre-sales questions).

I'm also thinking of just getting a regular ol' Windows dedicated VPS with the speeds and RAM I desire. This is my first time playing with a VPS, so it's been fun to learn.
Do you have any recommendations on VPS services (especially dedicated) which are semi-affordable and powerful? Do you think the ones from Google, Amazon, and Microsoft would have a problem with me running Scrapebox? I'm just now looking at my options.

I just know I need something more powerful and more reliable than my first VPS attempt with Scrapeboxsenukevps.com. Working with them has been a royal pain. They were friendly until the moment I purchased... then they ghosted me for an entire week when I asked them very simple new-user questions. Then they went down for 27 hours... and down again for about 60 hours. I would report the issue and they'd reply with 2-3 word answers about every 24-30 hours, saying stuff like "it's working" and little more, lol. I would create screenshots and videos showing it *not* working and they'd reply again in 24 hours with another 2-3 word phrase, then ghost me again. This seems common for the company based on the reviews I read. I really should've looked at reviews before I ordered, lol. I immediately saw the VPS would not be powerful enough for me so I requested the promised refund within 24 hours, and that's when they ghosted me for a week. Every time since then that I mentioned the refund, they simply ignored my words. Clearly, they're not going to stand by their refund guarantee, haha... no worries. I have them for another 10 days and then I'll be done with them for good ;)

But yeah, I'd love VPS recommendations. I'm researching them, but since I don't yet know what I don't know, I'm still flying a bit blind... but learning happens through mistakes :)

Thanks!!!
Reply
#5
Yes experience keeps a dear school, you do indeed have to learn by mistakes.

So Windows doesn't like dealing with a single app that's running more than 200 threads. I actually tend to stay near 100 to 125 per instance. But with Windows you can run an unlimited number of Scrapebox instances, so you could run 3 instances, put them all at around 175, and still be at more than 500 threads, and it would run smoother.
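The arithmetic behind splitting one big thread count across instances is simple enough to sketch. The function name `plan_instances` is made up for this example, and the default cap of 200 is only the rule of thumb from this thread, not a documented Windows limit.

```python
import math


def plan_instances(total_threads, per_instance_cap=200):
    """Spread `total_threads` as evenly as possible over the fewest
    instances that keep each one at or below `per_instance_cap`."""
    if total_threads <= 0 or per_instance_cap <= 0:
        raise ValueError("thread counts must be positive")
    instances = math.ceil(total_threads / per_instance_cap)
    base, extra = divmod(total_threads, instances)
    # The first `extra` instances take one additional thread each.
    return [base + 1] * extra + [base] * (instances - extra)


print(plan_instances(800))       # [200, 200, 200, 200]
print(plan_instances(500, 175))  # [167, 167, 166]
```

So the original 800 threads would become four instances of 200, and 500 threads under a 175 cap would become three instances of roughly 167 each.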

Smaller files are better. I work in ultra small files, like less than 1 MB, and then I have automation in place. That's not practical without heavy automation, but in time the Automator plugin could be valuable for you; off the top of my head it's only like $20. Then you might want to build or hire out some scripts. For $100 you could probably have the Automator plus scripts that fill in the gaps and have what you're doing entirely hands off, or 99% hands off. Anyway, that's for down the track.

Scrapebox can handle massive files, like 100 million+ lines, but Windows and server resources won't really like that. So more memory is always better, generally speaking, but smaller files are always better/more efficient. Even Google works in small files and small servers.

This is who I use and recommend
https://scrapeboxfaq.com/solidseo.php

I have over a dozen servers with them and have been with them for several years. Customer service is always great and prompt as well. It's just a regular Windows VPS. I have actually never used an "SEO" company's VPS, always just regular Windows. It's pretty straightforward and I have a video on it as well.

To be honest I like dedicated servers, but that's just me. I "feel" that even a less expensive dedicated server outperforms a more expensive VPS, plus you typically get more hard drive space on dedicated, which helps me. I always wind up storing a lot of files over time, meaning historic data that I can use later.

It's my 2 cents, but a little bit of it is trial and error, trying out some stuff to see what meets your needs. Being down for 90 hours is SUPER annoying though. I might have raised an "item not as described" ticket with my payment processor to get their attention. I don't like to do that, but if you can't get a response I sometimes do. It gets their attention anyway.

But none the less, I think you are on a good track.
Reply
#6
(08-12-2019, 07:10 PM)loopline Wrote: Yes experience keeps a dear school, you do indeed have to learn by mistakes.  

Thanks... unfortunately I paid via bitcoin so I couldn't do much about it :) I made a mistake and learned.

I do have Automator and I love it. My current situation was simply a big job and an experiment (trying to quickly build my own dofollow lists). Now that I'm back up and running, having this as an ongoing Automator process is at the top of my list (and a goal for this week). I'm not so sure I'm ready for a dozen servers (wow), but I was already daydreaming of a second one, though I'm not quite there yet.

Initially I was using my OneDrive account for cloud storage on the VPS, but suddenly I began to not trust the owners and elected instead to buy a new dedicated OpenDrive account. Storage is going to be a bit less of an issue, but I can imagine that it could still get crowded.

Regardless, I'm having a blast learning :)

I elected to mostly use SEO hosting only because I don't have to buy additional Scrapebox copies (I can keep one at home on my PC and use their license for theirs). Also, I like the benefit of their other tools I can use to learn, some of which I'd not even heard of before. When it's all said and done, I'll probably choose a host like yours for mine and just bring over the tools/software that I'm actually gonna use. I did just check out SolidSEOVPS and they look great.

As a side note, I did not know about the 200 vs 500 threads and Windows. As I'm mainly experimenting/testing/learning, I've been maxing out features just to see what it could take. I've noticed a number of unexpected things.

For example, when posting at 500 threads, my home PC is faster and gets more successful results than the VPS. However, with link checking, my VPS is several times more accurate and faster than my home PC. Honestly, I figured the VPS would be faster at everything *if* the machine itself were faster. This doesn't seem to be the case and may be the result of some kind of limitation they've placed on the server. I'm on pretty fast broadband at home so I don't think it's a bandwidth or system resource issue. It may also be a proxy issue (different proxies from different companies).

thanks again!
Reply
#7
Ahh yes, it's great to be on the receiving end of bitcoin if you're a business, as you don't have to worry about fraud, but it does make it easy for them to totally ignore a guarantee with zero repercussions (except reviews, which are very important of course). Yeah, lesson learned. :)

The server "should" be faster at everything, all things being equal. It's always the little things that get me, so maybe there was some little thing that didn't get accounted for (just saying I'm human too), but otherwise I'd say it could be a server issue.

But there are a lot of things at play, like if the server has crappy DNS and you're using Google DNS at home, or if you did not use proxies on link checking and you did use them on posting (although it still should have been faster). Or let's say the proxy server was overloaded when you posted from the server and not when you posted from home. I don't know how many tests you did, but given your experience with the server it could well have been that.
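One way to test the DNS theory is to time a handful of lookups on both machines and compare. This is a minimal sketch using Python's standard `socket.getaddrinfo`; the function name `time_dns_lookups` and the `resolver` parameter are made up for illustration, and you would pick your own hostnames.

```python
import socket
import time


def time_dns_lookups(hostnames, resolver=socket.getaddrinfo):
    """Return {hostname: seconds taken, or None if the lookup failed}.

    `resolver` defaults to socket.getaddrinfo and is injectable only
    so the function can be tested without network access.
    """
    timings = {}
    for host in hostnames:
        start = time.perf_counter()
        try:
            resolver(host, 80)
        except OSError:  # socket.gaierror is a subclass of OSError
            timings[host] = None
            continue
        timings[host] = time.perf_counter() - start
    return timings
```

Keep in mind that the operating system caches DNS answers, so only the first lookup of each hostname says much about the resolver itself.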

I can say, though, that in 10 years I've never once noticed my home machine being faster when all things were equal. Now if I have a monster home machine and a cheap VPS, and resources are the issue on the VPS, then yes, of course. But then all things are not equal.

Sometimes it's hard to say. I get your point about using an SEO VPS; it makes sense. I have always just bought the tools and kept them, but a few times I might actually have saved some cash by testing on an SEO VPS first and then buying. Hard to say, but it's a great idea to test things out and see what you need without buying a bunch of stuff, for sure.

Hopefully whatever server you get next works well for you. :)

A dozen servers is great, although it's a bit to manage of course. I started on a home machine and grew from there, so it's all just a matter of time.
[-] The following 1 user says Thank You to loopline for this post:
  • DigitalMu
Reply



