10-16-2011, 10:10 AM
This will scrape only indexed URLs, which is good, but I prefer to scrape as many URLs as possible, even the non-indexed ones. As the original poster said, he wants to send the URLs for indexing, so there is a high chance that the site has a lot of non-indexed URLs.
I load the ScrapeBox Links Extractor Addon, set the connections to 30-50, tick the Internal option only, and then load a file with the list of URLs whose pages I want to harvest. I hit Start, and when it is done I end up with a list of URLs. If the list is relatively small (under 100-200K), you can click show/edit links and delete the duplicates or unwanted links. Then save the list and repeat the procedure, but each time LOAD the new file you just saved. This way you harvest the URLs like a spider. When you stop getting new URLs, simply stop, remove duplicates, and you are done.
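For anyone who wants to see the same loop outside ScrapeBox, here is a rough Python sketch of the idea: extract internal links, feed the new ones back in, and stop when a round finds nothing new. This is not how the addon works internally, just an illustration. The names `internal_links`, `harvest`, `fetch` and `max_rounds` are made up for this example, and `fetch` is assumed to be any function you supply that takes a URL and returns the page HTML (or None on failure). One difference from the manual procedure: each round only fetches the URLs that were new in the previous round, so pages are not downloaded twice.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, urldefrag


class _LinkParser(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(page_url, html):
    """Return absolute same-host links found in html (the 'Internal only' filter)."""
    parser = _LinkParser()
    parser.feed(html)
    host = urlparse(page_url).netloc
    found = set()
    for href in parser.hrefs:
        # Resolve relative links and drop "#fragment" so duplicates collapse.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == host:
            found.add(absolute)
    return found


def harvest(seed_urls, fetch, max_rounds=20):
    """Repeat extraction on each round's new URLs until nothing new turns up.

    fetch is a caller-supplied function (hypothetical here): url -> HTML or None.
    max_rounds is a safety cap chosen arbitrarily for this sketch.
    """
    seen = set(seed_urls)
    frontier = set(seed_urls)
    for _ in range(max_rounds):
        new_urls = set()
        for url in frontier:
            html = fetch(url)
            if html:
                new_urls |= internal_links(url, html)
        new_urls -= seen          # keep only URLs not seen in earlier rounds
        if not new_urls:
            break                 # no new URLs: the harvest is complete
        seen |= new_urls
        frontier = new_urls
    return sorted(seen)           # a set was used, so duplicates are already gone
```

To try it without touching a real site, you can pass a dictionary's `get` method as `fetch`, with URLs as keys and HTML strings as values. For real pages you would plug in your own download function and probably add a delay or connection limit.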