ScrapeBox Forum
Cleaning Up Harvested URL's - Printable Version

+- ScrapeBox Forum (https://www.scrapeboxforum.com)
+-- Forum: ScrapeBox Main Discussion (https://www.scrapeboxforum.com/Forum-scrapebox-main-discussion)
+--- Forum: General ScrapeBox Talk (https://www.scrapeboxforum.com/Forum-general-scrapebox-talk)
+--- Thread: Cleaning Up Harvested URL's (/Thread-cleaning-up-harvested-url-s)



Cleaning Up Harvested URL's - tirmizi - 11-28-2018

Hi there,

I am sure there is an option for this, but I'm not sure which one or how it's done. We harvest lots of URLs, and the first step is to remove duplicates.

Then I want to remove the URLs containing certain words, like:

youtube.
wiki
cnn
bbc


So what I want is perhaps to create a file with those words in it. I did find a blacklist file, edited it, put those words in it, and ran the removal, but those URLs still remained, so maybe there is something wrong with how I am doing it.

It would also be great to know how I can harvest so that URLs containing those stop words are not harvested in the first place.

Thanks again


RE: Cleaning Up Harvested URL's - loopline - 11-29-2018

You want to put those words in a file, one per line.

Then put your URLs in the urls harvested grid in the upper right-hand quadrant of ScrapeBox.

Then go to remove/filter >> remove urls containing entries from. Then select your file.
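For anyone curious what that filter is doing under the hood, here is a minimal sketch of the same logic in Python: load a blacklist file (one word per line) and keep only the URLs that contain none of the entries. The file name and function names are just placeholders for illustration, not anything from ScrapeBox itself.

```python
def load_blacklist(path):
    """Read blacklist words from a file, one per line, skipping blanks."""
    with open(path) as f:
        return [line.strip().lower() for line in f if line.strip()]

def filter_urls(urls, blacklist):
    """Keep only URLs that contain none of the blacklist entries (case-insensitive)."""
    return [u for u in urls if not any(word in u.lower() for word in blacklist)]

# Example: same word list as in the question above.
blacklist = ["youtube.", "wiki", "cnn", "bbc"]
urls = [
    "https://www.youtube.com/watch?v=abc",
    "https://en.wikipedia.org/wiki/ScrapeBox",
    "https://example.com/blog/post",
]
print(filter_urls(urls, blacklist))  # only the example.com URL survives
```

Note the matching is plain substring matching, so an entry like "wiki" also removes wikipedia.org URLs; that is usually what you want for a stop-word list, but keep it in mind when choosing entries.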


RE: Cleaning Up Harvested URL's - tirmizi - 11-29-2018

(11-29-2018, 05:17 AM)loopline Wrote: You want to put those words in a file, 1 per line.  

Then put your urls in the urls harvested grid in the upper right hand quadrant of scrapebox.  

Then go to remove/filter >> remove urls containing entries from.  Then select your file.

Perfect, mate. Thanks


RE: Cleaning Up Harvested URL's - loopline - 12-01-2018

You're welcome. Cheers!