ScrapeBox Forum
Scrape a directory - Home advisor, Houzz, White Pages, Manta etc




Scrape a directory - Home advisor, Houzz, White Pages, Manta etc - exclusivescrape - 01-30-2017

Hello all

I run a small SEO Company.
We recently installed a predictive dialer, and I'm the CMO in charge of sourcing our leads.

I have tried scraping Google for keywords in certain cities, but I'm spending countless hours adding the URLs I don't want to my blacklist and URL filter.

With that said, I noticed you posted a video on customizing modules for specific sites or directories. I want to scrape name, business name, phone number, website URL and email.

I assume custom modules are the way to go. Correct?

Also, I'm not a programmer and didn't really understand the regex stuff recommended in the video. Can you offer me any advice on saving time with this so I don't pull my hair out?


What is the best way to scrape all these directories and avoid the lame results I get from Google scraping? It also takes too long to run the WHOISData plugin, only to get full lead info for about 10% of the URLs, since a lot of URLs are private.

Any help is appreciated


RE: Scrape a directory - Home advisor, Houzz, White Pages, Manta etc - loopline - 01-31-2017

You're likely to have to build a mask for each piece of data you scrape: one for name, another for business name, another for phone, etc.

Each mask outputs its data on its own line, so you would get something like:

name
business name
phone
website
email
name
business name
phone
website
email
name
business name
phone
website
email
etc...

But any pieces that are missing will be skipped, so you would need to do post-processing once it's done. Scraping phone numbers will likely require a regex. I'm no expert on regex either, but you could probably pay someone $5 on Fiverr to make you some regexes, if you give them some instruction on what you want.
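
For anyone who wants to try the phone regex themselves, here is a rough sketch in Python. It assumes North American style numbers, and the names `PHONE_RE` and `find_phones` are made up for illustration. The pattern may need adjusting for whatever regex flavor the mask field accepts, and for the way each directory formats its numbers.

```python
import re

# Assumption: North American numbers such as (555) 123-4567, 555-123-4567,
# 555.123.4567 or +1 555 123 4567. Other formats need a different pattern.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})(?!\d)"
)


def find_phones(text):
    """Return every phone number found in text, normalized to 555-123-4567."""
    return ["-".join(match.groups()) for match in PHONE_RE.finditer(text)]
```

The lookarounds at both ends keep the pattern from grabbing ten digits out of the middle of a longer number, such as an order ID.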

For a plug-and-play solution, you could try ScrapeBox's Yellow Pages plugin. Assuming scraping Yellow Pages is an option for you, it's the only real plug-and-play solution there is.


RE: Scrape a directory - Home advisor, Houzz, White Pages, Manta etc - exclusivescrape - 01-31-2017

So after I create the actual module and then run it to scrape, the results don't get generated to an Excel sheet? I'm a little bit confused.


RE: Scrape a directory - Home advisor, Houzz, White Pages, Manta etc - loopline - 02-01-2017

No. This is a basic scraper, intended for basic scraping. It's a long story, but the components needed for a full-fledged scraper don't exist.

So each mask saves its data on its own line and outputs as a text file. You would have to do post-processing in Excel etc. to get it into columns however you want it.
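
If Excel gets tedious, the post-processing could also be scripted. Below is a minimal Python sketch that regroups one-field-per-line output into CSV rows. It rests on several assumptions that are not part of ScrapeBox itself: the fields always come out in the order name, business name, phone, website, email; emails, URLs and phone numbers can be recognized by simple patterns; and a new record begins whenever a field shows up that is not later in that order than the previous one. All the names here (`FIELDS`, `classify`, `lines_to_records`, `to_csv_text`) are hypothetical.

```python
import csv
import io
import re

# Assumed output order of the masks.
FIELDS = ["name", "business name", "phone", "website", "email"]

# Deliberately loose patterns, good enough to tell the field types apart.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
URL_RE = re.compile(r"^(?:https?://|www\.)\S+$", re.IGNORECASE)
PHONE_RE = re.compile(r"^\+?[\d\s().-]{7,}$")


def classify(line):
    """Guess the type of one output line: email, website, phone or text."""
    if EMAIL_RE.match(line):
        return "email"
    if URL_RE.match(line):
        return "website"
    if PHONE_RE.match(line):
        return "phone"
    return "text"


def lines_to_records(lines):
    """Group lines into dicts, starting a new record when field order resets."""
    records, current, last = [], {}, -1
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        kind = classify(line)
        if kind == "text":
            # Text right after a name is taken as the business name;
            # any other text line is taken as a name.
            idx = 1 if last == 0 else 0
        else:
            idx = FIELDS.index(kind)
        if idx <= last:
            records.append(current)
            current = {}
        current[FIELDS[idx]] = line
        last = idx
    if current:
        records.append(current)
    return records


def to_csv_text(records):
    """Render the records as CSV text with one column per field."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()
```

To use it, read the harvested text file into a list of lines, pass it to `lines_to_records`, and write the result of `to_csv_text` to a `.csv` file that Excel can open. The guess is imperfect: a listing with a business name but no person name ends up in the name column, and a record consisting of only a name runs into the next record's name. A manual check of the output is still worthwhile.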

Here is a video:
https://www.youtube.com/watch?v=X3Ep-NXg4kY