Thread: Getting entire list of site URLs? (ScrapeBox Forum, https://www.scrapeboxforum.com, ScrapeBox Main Discussion, Scrapebox Footprints)
Getting entire list of site URLs? - andrewmp - 08-12-2015

Hi,

I want to get a list of all the URLs of a website, which I will then use to scan for external links. The problem is the limit on results. Take this example:

site:business.yahoo.com

Google shows 22,200 results. How can I get all of them for the given example?

Andrew

RE: Getting entire list of site URLs? - loopline - 08-17-2015

You can do it a few ways. All engines limit you to a max of 1,000 results per query, and Google often cuts you off with a soft limit of 400-600 results. So you would want to tack on keywords and force Google to return different sets of results from its database.

site:business.yahoo.com
site:business.yahoo.com a
site:business.yahoo.com b
site:business.yahoo.com c
site:business.yahoo.com 1
site:business.yahoo.com 2
site:business.yahoo.com 3
site:business.yahoo.com car
site:business.yahoo.com green
etc...

Then just remove duplicate URLs when done. You can then also use the link extractor to extract internal and external links.

RE: Getting entire list of site URLs? - andrewmp - 08-18-2015

Thank you loopline. I really like the product.

RE: Getting entire list of site URLs? - loopline - 08-20-2015

You're welcome. Glad you like it.
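
For anyone who wants to prepare the query list outside ScrapeBox, the keyword-suffix idea from loopline's reply could be sketched roughly like this in Python. This is only an illustration, not a ScrapeBox feature: the function names `build_queries` and `dedupe_urls` are made up for this example, and the choice of suffixes (single letters, digits, plus a few words) is an assumption modeled on the list in the thread.

```python
import string


def build_queries(domain, extra_keywords=()):
    """Return the bare site: query plus one variant per suffix.

    Suffixes are a-z, 0-9, and any extra keywords supplied
    (an assumed set, modeled on the examples in the thread).
    """
    base = f"site:{domain}"
    suffixes = (
        list(string.ascii_lowercase)
        + list(string.digits)
        + list(extra_keywords)
    )
    return [base] + [f"{base} {suffix}" for suffix in suffixes]


def dedupe_urls(urls):
    """Remove duplicate URLs while keeping first-seen order."""
    seen = set()
    unique = []
    for url in urls:
        cleaned = url.strip()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            unique.append(cleaned)
    return unique


if __name__ == "__main__":
    # Print one query per line, ready to paste into a keyword list.
    for query in build_queries("business.yahoo.com", ["car", "green"]):
        print(query)
```

The printed queries can be pasted in as the keyword list, and `dedupe_urls` mirrors the "remove duplicate URLs" step on whatever list of harvested URLs comes back.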