  Google Scrape Keeps Timing Out
Posted by: TacoLoco - 01-28-2020, 10:10 PM - Forum: General ScrapeBox Talk - No Replies

Hey guys,
I've got a problem with my scrape and would really appreciate any help anyone can give me.
Whenever I try to scrape Google, I get through about 30 searches, then Scrapebox starts returning zero results for everything else.

I'm using:

  • 50 semi-dedicated proxies from buyproxies.org
  • Detailed harvester
  • Scraping Google
  • 30 second delay
  • ~3,000 keywords to search

Why can't I scrape using my entire list?

Is it an issue with my proxies?
Can anyone recommend a reliable source of proxies for scraping?

Thanks!
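
For context, here's a back-of-envelope rate check, assuming the detailed harvester's 30-second delay applies between queries and the proxies rotate evenly (a sketch, not ScrapeBox's actual scheduling):

Code:
proxies = 50
delay_s = 30  # detailed harvester delay between queries

queries_per_hour = 3600 / delay_s                # 120 queries/hour overall
per_proxy_interval_min = proxies * delay_s / 60  # minutes between hits per proxy

print(f"{queries_per_hour:.0f} queries/hour; "
      f"each proxy hits Google every {per_proxy_interval_min:.0f} min")

At one hit per proxy every ~25 minutes, the pacing itself looks conservative, which is why shared (semi-dedicated) proxies that other customers may also be using against Google are the usual suspect.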

  Help scraping phone numbers from a website
Posted by: issa abou emira - 01-21-2020, 11:43 AM - Forum: General ScrapeBox Talk - Replies (1)

Hello everybody,

Can you help me? I would like to scrape the telephone numbers from a classified ads site.

The problem is that each advertisement links to a detail page, and it is that detail page which contains the telephone number.

So how do I get ScrapeBox to extract all these numbers from the site?

I am a beginner. Thank you.
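
As a point of reference, the two-step pattern (grab the ad links from the listing page, then fetch each ad page for the number) looks like this outside ScrapeBox; a minimal Python sketch where the site URL and link/number patterns are hypothetical placeholders:

Code:
import re
import requests

LISTING_URL = "https://example-classifieds.com/ads?page=1"  # hypothetical site

# Step 1: collect links to the individual ad pages from the listing page.
html = requests.get(LISTING_URL, timeout=15).text
ad_links = re.findall(r'href="(/ad/\d+)"', html)  # assumed link pattern

# Step 2: fetch each ad page and pull out anything that looks like a phone number.
phone_re = re.compile(r"\+?\d[\d\s().-]{7,}\d")  # loose, generic pattern
for path in ad_links:
    page = requests.get("https://example-classifieds.com" + path, timeout=15).text
    for number in phone_re.findall(page):
        print(path, number.strip())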

  Yellow page error for a specific keyword
Posted by: rivermannv - 01-18-2020, 09:36 PM - Forum: General ScrapeBox Talk - Replies (3)

I have used this keyword in YP a few times successfully; however, just lately it will run about 96 rows and then consistently send back 20 errors, so it ends early. I've tried resuming from previous jobs and it's the same thing. The error is:

404 HTTP/1.1 404 Not Found accessing https://www.yellowpages.com/search?searc...&s=default

There are 20 of them, all specific to one location.

If I change the keyword it works and I'll get 500K results, but many of the records aren't related. Once again, I've used this single keyword in the past and it worked; now it errors out.

Has anyone seen this happen?

Thanks!
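
One quick way to tell whether those 404s are permanent or transient is to replay the failing URLs directly; a minimal sketch, assuming the full failing URLs are saved one per line in a file (the filename is hypothetical):

Code:
import time
import requests

with open("failed_urls.txt") as f:  # hypothetical list of the failing URLs
    urls = [line.strip() for line in f if line.strip()]

for url in urls:
    resp = requests.get(url, timeout=15)
    print(resp.status_code, url)
    time.sleep(2)  # pause between requests to avoid tripping rate limits

If the same URLs return 404 in a browser too, the listings were likely removed or the search URL format changed for that location.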

  Custom data grabber with regex issue
Posted by: Splendens - 01-15-2020, 08:56 PM - Forum: General ScrapeBox Talk - Replies (2)

Hello,

I'm looking to use Scrapebox to scrape all domain name mentions from a list just shy of 4,000 web page URLs.

The domain names are formatted on the pages like so:

Scrapeboxforum.com
Scrapeboxinfo.net
Scrapeboxhub.org

The domain names are plain text. They are not hyperlinks.

If it helps, they are also always between <td> and </td> tags.

I already have my list of almost 4,000 URLs I want to scan.

I am using 5 private proxies that have been tested and saved.
I think they're being applied when using the Custom Data Grabber, but honestly I struggle with Scrapebox.

I created inbound and outbound rules for Scrapebox in Windows Firewall.
Other things I do in Scrapebox work fine, like grabbing internal links on the domain I'm getting the URLs from.

I created a Custom Data Grabber Module and under that a Module Mask:

https://imgur.com/a/TpER4Q3


I tried several regex examples and found this one:

Code:
^(?=.{1,253}\.?$)(?:(?!-|[^.]+_)[A-Za-z0-9-_]{1,63}(?<!-)(?:\.|$)){2,}$


Source: https://stackoverflow.com/a/41193739/5048548


I tested it using the tool on https://regex101.com/ and 3 sample URLs come up as matches (as far as I can tell):

https://imgur.com/iVR422q


However, when I run my Module all I get is this:

https://imgur.com/dGgD3Ft


The Module data folder contains a CSV for every time I run the Module, each containing two odd characters in the first cell:

https://imgur.com/OS3uupX


I ran several of the URLs through browseo.net and the domain names on those URLs are readable according to that tool.

Does anyone know where I'm going wrong here?
Or is there a better way to scrape domain name MENTIONS from a list of URLs?

Thank you in advance!
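
One thing worth checking: that Stack Overflow pattern is anchored with ^ and $, so it validates an entire string as a hostname; it will not find domains embedded in a page's HTML, which may explain the empty output. A minimal sketch of extracting mentions from <td> cells with an unanchored pattern; this is plain Python offered as an assumption about the markup, not the Custom Data Grabber's own syntax:

Code:
import re

# Unanchored pattern: finds domain mentions inside surrounding text.
domain_re = re.compile(r"\b(?:[A-Za-z0-9-]{1,63}\.)+[A-Za-z]{2,}\b")
td_re = re.compile(r"<td[^>]*>(.*?)</td>", re.S | re.I)

html = "<td>Scrapeboxforum.com</td><td>Scrapeboxinfo.net</td>"  # sample cell data
for cell in td_re.findall(html):
    for domain in domain_re.findall(cell):
        print(domain)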

  Harvester Problem
Posted by: Nosh - 01-08-2020, 08:33 PM - Forum: General ScrapeBox Talk - Replies (18)

Hi,
with my harvester configuration for Google, which has worked fine all along, I now get only this result:

https://accounts.google.com/ServiceLogin...er=0&pws=0

For every keyword. What could be the problem?
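
That ServiceLogin URL is Google redirecting the request to its sign-in page instead of serving results, which usually points to flagged or blocked proxies rather than a harvester setting. As a stopgap, redirect URLs like that can be filtered out of a harvested list before further processing; a minimal sketch with hypothetical filenames:

Code:
# Drop Google sign-in redirects from a harvested URL list.
with open("harvested_urls.txt") as f:  # hypothetical input file
    urls = [line.strip() for line in f if line.strip()]

clean = [u for u in urls if "accounts.google.com" not in u]

with open("harvested_urls_clean.txt", "w") as f:
    f.write("\n".join(clean))

print(f"kept {len(clean)} of {len(urls)} URLs")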

  *Real* site crawling with Scrapebox?
Posted by: DigitalMu - 01-05-2020, 09:26 PM - Forum: General ScrapeBox Talk - Replies (1)

So I see SB has the site crawler, where you can crawl one domain at a time (a deathly slow process), or you can use the link extractor (over and over and over) in an inefficient way to grab most links, or you can use the "search google" method to see what pops up. But...

Isn't there just a regular ol' web crawler? You give it a list of URLs, it spiders through all the links, pulls out the links on a domain, and ding, the turkey's done.

Am I missing something obvious? It seems like such an obvious tool that should be included. Instead, I find myself poking around the web to find some alternative.

Any ideas?
Thanks!!
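
For reference, the kind of crawler described here fits in a few lines of standard-library Python; a breadth-first sketch with an assumed seed URL and page cap, not a drop-in replacement for ScrapeBox's crawler:

Code:
import re
from collections import deque
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

def crawl(seed, max_pages=50):
    """Breadth-first crawl restricted to the seed URL's domain."""
    domain = urlparse(seed).netloc
    seen, queue = {seed}, deque([seed])
    while queue and len(seen) < max_pages:
        url = queue.popleft()
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", "ignore")
        except Exception:
            continue  # skip unreachable or non-text pages
        for href in re.findall(r'href="([^"#]+)"', html):
            link = urljoin(url, href)
            if urlparse(link).netloc == domain and link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(seen)

for page in crawl("https://example.com"):  # hypothetical seed
    print(page)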

  Is it possible to get a list of an Instagram account's followers using ScrapeBox?
Posted by: jamesmel - 01-01-2020, 09:51 PM - Forum: General ScrapeBox Talk - Replies (2)

Is it possible to use ScrapeBox to get a full list of all followers of a third-party Instagram user (i.e. not an account I control) through the Instagram website?


I know I can click Followers, but it only loads around 100 followers and then you have to keep scrolling to load the rest; for a large account this could take ages.

I note that if you inspect the followers count at the top of the page, it shows a URL like this:

Code:
https://www.instagram.com/USERNAME/followers/

but clicking that just loads the modal window with the 100+ followers you have to scroll through; if you try to navigate directly to the URL, it redirects to

Code:
https://www.instagram.com/USERNAME/

  Experiments Always Fail
Posted by: sayuti - 12-29-2019, 07:41 AM - Forum: General ScrapeBox Talk - Replies (1)

I started using Scrapebox 2 weeks ago. Nothing I have done has worked except the keyword scraper.

I was trying to find an available domain through the domain lookup, and an error message indicates a connection timeout error. Can you help? What should I do?
https://prnt.sc/qh27vx
https://prnt.sc/qh284u
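
As a sanity check outside ScrapeBox, .com availability can be probed directly over the WHOIS protocol (RFC 3912); a minimal sketch against Verisign's public WHOIS server:

Code:
import socket

def whois_com(domain, timeout=10):
    """Query Verisign's WHOIS server for a .com domain (RFC 3912)."""
    with socket.create_connection(("whois.verisign-grs.com", 43), timeout=timeout) as s:
        s.sendall((domain + "\r\n").encode())
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode(errors="ignore")

reply = whois_com("example.com")
print("available" if "No match for" in reply else "registered (or lookup blocked)")

If even this times out, the problem is more likely the network or proxies than ScrapeBox itself.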

  "Crawl loaded list" (Email Scraper Premium) does not work anymore
Posted by: Nosh - 12-27-2019, 01:24 PM - Forum: General ScrapeBox Talk - Replies (5)

Hi,
the Email Scraper (Crawl Loaded List) does not work anymore.
This happens: the number of running work threads drops from 100 to 1, while sites in queue varies between 40 and 370.
I have a URL list of 99 to process (depth level 2).
What could be the reason? 
Thanks !

  Connections that never end...
Posted by: DigitalMu - 12-22-2019, 08:39 PM - Forum: General ScrapeBox Talk - Replies (3)

Probably THE biggest annoyance about Scrapebox for me has been situations where a job refuses to end (even when you press stop) due to open connections.  This monster rears its head in several places, but most often when running Check Links on a bunch of domains.

I've tried everything...
  - reducing the number of connections to a crawl
  - waiting for hours (and even a full day)
  - hitting stop and waiting
  - shutting down Scrapebox and trying it again (and again and again)
  - writing the vendor
...and more

Nothing seems to help.

Right now, for example, I have a list of about 100,000 URLs that I want to link check. The first pass made it through just fine and found about 7,000 successful links. I've found that I often need to run several more passes to check all the URLs, so I ran it a second time (with 150 threads); it choked up, leaving me 113 open threads when I returned a few hours later. I tried it again: same result. I tried it again with 90 threads: same result. I'm in the middle of some other gymnastics at the moment.

I wrote the creator a few months ago and his answer really didn't seem satisfying; it could be summed up as "Yeah, there's no way to close down threads that remain open on Windows." First and foremost, that seems almost inconceivable. Surely there is some software way to simply terminate threads (especially after a period of time, or after hitting stop). I can't imagine that Windows forces threads to remain open indefinitely.

But the second issue is this: even if the above were true and there's no way to force threads to close, I should at least be able to regain control of Scrapebox so I can save the data that just took hours to collect. When harvesting, I'm able to save the URLs on a periodic basis (every 10,000, for example), and there's always the files in the /Harvester_Sessions directory. With Check Links, though, it seems like I cannot get any such files. If the Active Threads count ceases to drop (as it often does), I'm just out of luck. I cannot get a listing of my successful/unsuccessful links. I simply have to start over... and over... and over... sometimes finally taking the time to split up my large lists and processing them in groups of 10,000 instead of 100,000+. This is very time consuming.

Surely there is some reasonable, better way?  Maybe I'm still not getting something fundamental?

Again, it's inconceivable to me that simply hitting stop doesn't... uhmmm... stop. It's inconceivable that Windows forces the threads to remain open with no option of forcibly closing them, and even more inconceivable that I cannot save my data when this happens (and have to simply kill the Scrapebox task).
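
For what it's worth, the vendor's answer matches common practice: forcibly killing a thread that is blocked inside a socket call is unsafe on Windows (and on most platforms), so the usual workaround is a hard timeout on every connection, plus incremental saving, so no worker can hang forever and no data is lost. A minimal sketch of that pattern in Python, with a hypothetical URL list:

Code:
import concurrent.futures as cf
import urllib.request

def check_link(url):
    """Fetch with a hard timeout so no worker can block indefinitely."""
    try:
        with urllib.request.urlopen(url, timeout=15) as resp:
            return url, resp.status
    except Exception as exc:
        return url, f"error: {exc}"

urls = ["https://example.com", "https://example.org"]  # hypothetical list

# Results are written incrementally as each check completes, so a crash
# or forced shutdown loses at most the in-flight batch.
with cf.ThreadPoolExecutor(max_workers=50) as pool, open("results.txt", "w") as out:
    for url, status in pool.map(check_link, urls):
        out.write(f"{url}\t{status}\n")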

So that's my rant today, as I'm now experimenting with the forty-leventh method that I'm hoping might skirt this issue.

Any thoughts, ideas?
