Looplines Scrapebox List

scraped url count
#1
Hi,

I scraped over 30 million URLs from various search engines. Once the scrape is complete, you are shown a summary screen that shows how many URLs were scraped from the various engines for the given keywords. After I clicked the exit button, all of the URLs were moved to the ScrapeBox harvester screen, but for some reason it's only showing 9 million URLs and not the 30 million I scraped.

Does ScrapeBox automatically delete duplicate URLs when they come into the harvester, or is there another reason for this?
#2
If you have the automatic duplicate removal enabled, then yes. But the URLs are all saved in the harvester sessions folder (which is in your main ScrapeBox folder).

The auto remove duplicates option is under the Options menu at the top of the main ScrapeBox window.
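
To check how much of the gap between the two counts is down to duplicates, you could count total versus unique URLs in the saved session files yourself. This is only a rough sketch: it assumes the session files are plain text with one URL per line and a .txt extension, and the function name and folder path are made up for illustration, not anything ScrapeBox provides.

```python
from pathlib import Path


def count_urls(folder):
    """Return (total, unique) URL counts across all .txt files in folder.

    Assumption: each file holds one URL per line. Blank lines are skipped.
    """
    total = 0
    seen = set()
    for path in sorted(Path(folder).glob("*.txt")):
        # errors="replace" keeps the loop going if a line has odd bytes in it.
        with open(path, "r", encoding="utf-8", errors="replace") as handle:
            for line in handle:
                url = line.strip()
                if not url:
                    continue
                total += 1
                seen.add(url)
    return total, len(seen)


# Hypothetical usage; point it at wherever your session files actually live:
# total, unique = count_urls(r"C:\path\to\scrapebox\harvester sessions")
# print(total, "scraped,", unique, "unique")
```

Keeping every unique URL in a set takes a fair amount of memory at 30 million lines, so this is meant as a quick check rather than something tuned for very large files.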
#3
Yes, I realized they all got saved in the harvester sessions folder. Lucky they did, as I had many more unique URLs that ScrapeBox did not pull into the window. I guess it's due to how large the file became.
#4
There may have been a load error, because the main grid should handle 30 million URLs. There may also be some characters in the file that caused ScrapeBox to think it had reached the end of the file. Hard to say.
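
If you want to test the stray-character theory, one way is to scan the saved file for control bytes that some programs treat as an end-of-file marker. The sketch below looks for NUL (0x00) and Ctrl-Z (0x1A). Those two bytes are just a guess at likely culprits, not a confirmed cause, and the function name is made up for illustration.

```python
def find_control_bytes(path, suspects=(0x00, 0x1A), chunk_size=1 << 20):
    """Return a list of (offset, byte_value) for each suspect byte in the file.

    Assumption: NUL and Ctrl-Z are the bytes worth checking; pass a
    different tuple in `suspects` to look for others.
    """
    hits = []
    offset = 0
    # Read in binary mode so no byte is decoded or silently dropped.
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            for i, value in enumerate(chunk):
                if value in suspects:
                    hits.append((offset + i, value))
            offset += len(chunk)
    return hits
```

An empty list means none of the suspect bytes are in the file, which would point back toward a plain load error. Checking byte by byte in Python is slow on a file with tens of millions of lines, so expect it to take a while.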