ScrapeBox Forum
Getting entire list of site URLs?

Getting entire list of site URLs? - andrewmp - 08-12-2015

Hi,

I want to get a list of all the URLs of a website, which I will then scan for external links. The problem is the limit on the number of results.

Take this example:
site:business.yahoo.com
Google shows 22,200 results

How can I do it for the given example?

Andrew


RE: Getting entire list of site URLs? - loopline - 08-17-2015

You can do it a few ways. All engines limit you to a max of 1000 results per query, and Google often cuts you off with a soft limit of 400-600 results.

So you would want to tack on keywords to force Google to return different sets of results from its database:
site:business.yahoo.com
site:business.yahoo.com a
site:business.yahoo.com b
site:business.yahoo.com c
site:business.yahoo.com 1
site:business.yahoo.com 2
site:business.yahoo.com 3
site:business.yahoo.com car
site:business.yahoo.com green
etc...
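
If you would rather generate that keyword list than type it by hand, here is a rough sketch in Python. It is not a ScrapeBox feature; the function name `build_site_queries` and the choice of suffixes (single letters, single digits, plus a few extra words) are just assumptions for illustration. The output is one query per line, which you could paste into the keyword box.

```python
import string

def build_site_queries(domain, extra_keywords=()):
    """Return the bare site: query plus one variant per appended keyword."""
    base = f"site:{domain}"
    # Assumed suffix set: a-z, 0-9, then any extra words supplied by the caller.
    suffixes = list(string.ascii_lowercase) + list(string.digits) + list(extra_keywords)
    queries = [base]
    seen = {base}
    for suffix in suffixes:
        query = f"{base} {suffix}"
        if query not in seen:  # skip repeats if an extra keyword duplicates a suffix
            seen.add(query)
            queries.append(query)
    return queries

queries = build_site_queries("business.yahoo.com", ["car", "green"])
print("\n".join(queries))
```

More varied keywords (words that actually appear on the site) will usually pull more distinct pages than single letters alone.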

Then just remove duplicate URLs when done.
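
ScrapeBox has its own remove-duplicates option, but if you are working with the harvested list outside the program, the same idea can be sketched in a few lines of Python. The function name `remove_duplicate_urls` is hypothetical, and this version assumes exact-match comparison after trimming whitespace (no other URL normalization).

```python
def remove_duplicate_urls(urls):
    """Drop blank lines and exact duplicates, keeping first-seen order."""
    cleaned = (u.strip() for u in urls)
    # dict.fromkeys keeps insertion order, so the first occurrence of each URL wins.
    return list(dict.fromkeys(u for u in cleaned if u))

harvested = [
    "http://business.yahoo.com/a",
    "http://business.yahoo.com/b ",
    "http://business.yahoo.com/a",
    "",
]
unique_urls = remove_duplicate_urls(harvested)
```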

Also you can then use the link extractor to extract internal and external links.
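
To show what the internal/external split means, here is a minimal Python sketch using only the standard library. It is not how the link extractor is implemented; `split_links` is a hypothetical name, and it assumes "internal" means the link's host exactly matches the host of the page it was found on.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class _LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def split_links(html, page_url):
    """Return (internal, external) lists of absolute http(s) links found in html."""
    collector = _LinkCollector()
    collector.feed(html)
    page_host = urlparse(page_url).netloc.lower()
    internal, external = [], []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links against the page
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        if parsed.netloc.lower() == page_host:
            internal.append(absolute)
        else:
            external.append(absolute)
    return internal, external

sample_html = (
    '<a href="/news">News</a>'
    '<a href="https://example.org/x">Elsewhere</a>'
    '<a href="mailto:someone@example.org">Mail</a>'
)
internal, external = split_links(sample_html, "https://business.yahoo.com/index.html")
```

Under this exact-host assumption a link to another subdomain of the same site counts as external; loosen the comparison if you want those treated as internal.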


RE: Getting entire list of site URLs? - andrewmp - 08-18-2015

Thank you loopline. I really like the product.


RE: Getting entire list of site URLs? - loopline - 08-20-2015

You're welcome. Glad you like it. :)