ScrapeBox Forum
How to scrape Bulk articles on one topic - Printable Version



How to scrape Bulk articles on one topic - Riz - 11-02-2018

Hi,

I want to know how to scrape articles in bulk on one topic, like "web design", with the new article scraper plugin.

In loopline's tutorial, he only shows how to scrape a single article.

Looking forward,

Kind Regards,
Riz


RE: How to scrape Bulk articles on one topic - loopline - 11-03-2018

I have a video I'll put below on the new article plugin and addon.

If you want to scrape bulk articles, just load in lots of urls.



https://www.youtube.com/watch?v=NHI9-0oNSlw&t=10s


RE: How to scrape Bulk articles on one topic - Riz - 11-03-2018

(11-03-2018, 04:12 AM)loopline Wrote: I have a video I'll put below on the new article plugin and addon.


I used the pattern you described and it works. I use my own footprint to scrape articles on the same keyword, e.g. site:ezinearticles.com "Christmas", and grab the URLs.

I used 20 connections, but when I press stop, the article scraper plugin hangs until I end it in Task Manager. I also tried the article scraper with 1 connection, but it still hung.

For more detail, see the attached screenshot.

Could you kindly let me know the solution?
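For anyone who wants to build the same kind of footprint for several keywords at once, here is a minimal sketch in Python. It is only an illustration: `build_queries` is a hypothetical helper and not part of ScrapeBox, and the second keyword is just an example. It simply produces one `site:` query per keyword, which could then be used as the keyword list for harvesting URLs.

```python
def build_queries(site, keywords):
    """Return one search query per keyword, restricted to a single site.

    Hypothetical helper for illustration; not a ScrapeBox function.
    """
    queries = []
    for keyword in keywords:
        keyword = keyword.strip()
        if not keyword:
            continue  # skip blank entries, e.g. empty lines from a keyword file
        queries.append(f'site:{site} "{keyword}"')
    return queries


if __name__ == "__main__":
    for query in build_queries("ezinearticles.com", ["Christmas", "web design"]):
        print(query)
```

Each printed line has the same shape as the footprint above, one per keyword.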


RE: How to scrape Bulk articles on one topic - loopline - 11-04-2018

That means that something has locked one or more of the threads. This can be security software such as anti-virus, malware checkers and firewalls. So you should whitelist ScrapeBox in all security software, and you can whitelist the entire ScrapeBox folder as well.

Further, any program that accesses the internet can lock threads, things like Skype, uTorrent etc. So you can try closing down any unneeded programs. Then, if it's working, you can turn programs back on one by one to find the culprit.

Further, PC optimization software can lock threads, so you can shut any such software down.

Take note that disabling security software (such as anti-virus, malware checkers and firewalls) often only stops new rules from forming, but allows existing rules to still fire. So you have to fully whitelist in the security software or uninstall the security software (as a test).

Further, some security software requires you to whitelist in more than one place before it takes effect.

Also note that disabling a router firewall does actually fully disable it.


Basically you have to sort out what is locking the threads, because ScrapeBox is forced to wait until all threads are released. On occasion it can be Windows that does it, so you can try restarting your machine.


RE: How to scrape Bulk articles on one topic - Riz - 11-04-2018

(11-04-2018, 06:29 AM)loopline Wrote: That means that something has locked 1 or more of the threads.  This can be security software such as  anti-virus, malware checkers and firewalls.   So you should whitelist scrapebox in all security software and then you can whitelist the entire scrapebox folder as well.  


I excluded ScrapeBox in my security software, but it didn't work.

The problem is that the ScrapeBox Article Scraper does not stop after I press the stop button; it keeps scraping articles.


RE: How to scrape Bulk articles on one topic - loopline - 11-04-2018

The article scraper is the same as anything else in ScrapeBox: it uses raw sockets and threads. These are not controlled by ScrapeBox but by Windows. So ScrapeBox asks Windows to open them and must wait for Windows to release them.

If something locks a thread, ScrapeBox must wait until it's unlocked. When you press stop, you are asking ScrapeBox to ask Windows to release all the threads; however, ScrapeBox cannot stop until all the threads are released. If something locks a thread, or Windows locks a thread, then ScrapeBox must wait, potentially indefinitely, until the thread is released. That's how it works.

ScrapeBox can't control Windows, so it must wait on Windows. So the issue is outside of ScrapeBox's control. This would hold true of any ScrapeBox function and any program out there that uses these types of sockets/threads.
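To illustrate the general idea (this is not ScrapeBox's code, just a toy sketch in Python), the example below starts a few worker threads that are held up by something outside their control, here an `Event` that nobody has set yet. The names `blocked_worker` and `stop_and_wait` are hypothetical. "Pressing stop" can only wait for the threads; they finish once whatever is holding them lets go.

```python
import threading


def blocked_worker(released):
    # Stands in for a connection thread held up by something outside the
    # program's control (simulated with an Event that has not been set).
    released.wait()


def stop_and_wait(threads, timeout):
    """Wait up to `timeout` seconds per thread; return the ones still running."""
    for t in threads:
        t.join(timeout)
    return [t for t in threads if t.is_alive()]


if __name__ == "__main__":
    released = threading.Event()
    threads = [
        threading.Thread(target=blocked_worker, args=(released,), daemon=True)
        for _ in range(3)
    ]
    for t in threads:
        t.start()

    stuck = stop_and_wait(threads, timeout=0.2)
    print(f"{len(stuck)} thread(s) still locked after pressing stop")

    released.set()  # whatever was holding the threads lets go
    stuck = stop_and_wait(threads, timeout=1.0)
    print(f"{len(stuck)} thread(s) still locked after release")
```

The sketch uses a timeout so it can report the stuck threads instead of hanging. A program that has to wait for every thread with no timeout would sit there for as long as the threads stay locked, which matches the hang described above.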


RE: How to scrape Bulk articles on one topic - jctopzeo - 08-29-2020

(11-04-2018, 06:29 AM)loopline Wrote: That means that something has locked 1 or more of the threads. This can be security software such as anti-virus, malware checkers and firewalls. So you should whitelist scrapebox in all security software and then you can whitelist the entire scrapebox folder as well.


Hello loopline,

I would like to know how to exclude software, such as my PC's antivirus, using the "white or black" list.


Thanks.


RE: How to scrape Bulk articles on one topic - loopline - 09-02-2020

(08-29-2020, 02:25 PM)jctopzeo Wrote:

Hello loopline,

I would like to know how to exclude software, such as my PC's antivirus, using the "white or black" list.


Thanks.

It depends. For the most part, you would need to check with your security software provider. They will probably have it in their manual or on their website, or you can contact them.

That said, you are generally looking to add an exception/whitelist entry. Different programs call it different things, but you want to exclude the entire ScrapeBox folder.