So I've had ScrapeBox for a while now and have had good success with it. But I've been wondering how best to avoid harvesting the same sites over and over again.
When you harvest a new list of sites, how do you make sure you haven't harvested them before, so that you're always scraping fresh sites?
I've thought about keeping a master list of all scraped sites and checking each new list against it, but I don't know how to do that without merging the master list with the new list and removing duplicates. That would only give me the combined list, not a fresh list of forums I've never posted to.
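To make it concrete, what I'm after is a set difference rather than a merge: keep only the new URLs whose site isn't in the master list yet, then add those sites to the master. Here's a rough Python sketch of the idea. It is not a ScrapeBox feature, the names `site_key` and `split_fresh` are made up for illustration, and comparing by lowercased hostname with any leading "www." dropped is just my assumption about what counts as the "same site".

```python
from urllib.parse import urlsplit


def site_key(url):
    """Reduce a URL to a comparable site key: lowercased host, no leading 'www.'.

    Assumption: two URLs on the same host count as the same site.
    """
    url = url.strip()
    if "://" not in url:
        url = "http://" + url  # urlsplit only fills netloc when a scheme is present
    host = urlsplit(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def split_fresh(new_urls, master_keys):
    """Return the URLs from new_urls whose site is not yet in master_keys.

    master_keys (a set) is updated in place, so each site is returned only once
    and is treated as already seen on the next harvest.
    """
    fresh = []
    for url in new_urls:
        key = site_key(url)
        if key and key not in master_keys:
            master_keys.add(key)
            fresh.append(url.strip())
    return fresh


# Hypothetical example data; in practice both lists would be read from text files.
master = {site_key(u) for u in [
    "http://forum-a.example/index.php",
    "https://www.forum-b.example/",
]}
harvest = [
    "http://www.forum-a.example/viewtopic.php?t=1",  # site already in master
    "http://forum-c.example/board",                  # new site
    "http://forum-c.example/board?page=2",           # same site as the line above
]

fresh = split_fresh(harvest, master)
print(fresh)  # ['http://forum-c.example/board']
```

The fresh list and the updated master stay separate this way, which is the part I can't get from a plain merge plus duplicate removal.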
I'd like to hear how you take care of this issue. Thanks.