ScrapeBox Forum
Extracting Links that are not links


Extracting Links that are not links - ambromfg - 02-03-2020

I have a source that I need to crawl where the websites are listed like this: "website.com". There is no www, no http://, etc., and they are not actually links, just plain text. I would like to crawl the site and capture all of these website addresses even though they are plain text. Is there a way to do that?


RE: Extracting Links that are not links - loopline - 02-04-2020

Sure. Use the Link Extractor to crawl the sites:

https://www.youtube.com/watch?v=Ed3SGP_ch3Q

Then use the Custom Data Scraper to pull the plain-text addresses out of the crawled pages:

https://www.youtube.com/watch?v=X3Ep-NXg4kY

https://www.youtube.com/watch?v=X3Ep-NXg4kY
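
For anyone wondering what the pattern-matching step might look like, here is a rough sketch in plain Python. It is not ScrapeBox's own syntax or settings; the pattern `DOMAIN_RE` and the helper `extract_plain_domains` are made-up names for illustration, and the rule it assumes (dot-separated labels ending in an alphabetic TLD, not part of an email address or an existing URL) is only one possible way to define "a bare domain in text".

```python
import re

# Hypothetical pattern (an assumption, not ScrapeBox's own): one or more
# dot-terminated labels followed by an alphabetic TLD of 2 to 24 letters.
DOMAIN_RE = re.compile(
    r"(?<![\w@./-])"  # skip matches inside emails, URLs, or longer words
    r"((?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,24})"
    r"(?![\w-])",     # the TLD must end at a word boundary
    re.IGNORECASE,
)


def extract_plain_domains(text):
    """Return unique bare domains found in text, lowercased, in first-seen order."""
    seen = set()
    result = []
    for match in DOMAIN_RE.finditer(text):
        domain = match.group(1).lower()
        if domain not in seen:
            seen.add(domain)
            result.append(domain)
    return result


if __name__ == "__main__":
    sample = "Visit website.com or Example.org today. Contact: bob@mail.net."
    print(extract_plain_domains(sample))
```

A pattern this loose will also pick up things like file names ("report.pdf"), so the results would likely need filtering against a list of real TLDs before use.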