Scraping Emails from behance.net - ScrapeBox Forum, General ScrapeBox Talk
Scraping Emails from behance.net - gavner25 - 10-20-2019

Hi guys,

I think I need some help here with this site. I have scraped 150K URLs from Behance.net, but none of the email scraper options will scrape emails from the pages. I have the Email Scraper Plugin and I have tried the grab/crawl option, but nothing works. All other sites are fine except Behance, but ironically this is the most important site for me.

Here is an example list of URLs below. If anyone can scrape emails from these pages, tell me.

https://www.behance.net/jessiewhitmill/followers
https://www.behance.net/michelapicchi
https://www.behance.net/anusard/resume
https://www.behance.net/Carelessconundrum/resume
https://www.behance.net/Maryfergin
https://www.behance.net/insborges
https://www.behance.net/mrkc/resume
https://www.behance.net/yousefah/followers
https://www.behance.net/saysomething
https://www.behance.net/wallisonmedeiros
https://www.behance.net/Arcy22/appreciated
https://www.behance.net/AlexiaLou/collections_following
https://www.behance.net/AJVillacentino
https://www.behance.net/frederiquegravier/resume
https://www.behance.net/RickyDP
https://www.behance.net/zoshuacolah/followers

RE: Scraping Emails from behance.net - loopline - 10-21-2019

I'm checking with support. Behance has some sort of JavaScript check and is displaying a generic message of

<h1 id="we-noticed">We notice you are using an outdated version of Internet Explorer.</h1> <h2 id="browser-not-supported">This version is not supported by Behance.</h2>

when you run the email scraper. I tried Apple iPhone user agents and the latest Chrome user agent, and it always just gives the above message. So I suspect there is some JavaScript qualifier that is going to prohibit it from working with ScrapeBox. This is basically scraping protection on the site, so it probably won't work. However, I am not 100% certain yet.

RE: Scraping Emails from behance.net - gavner25 - 10-21-2019

Great, thanks for the reply.
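
For anyone checking their own harvested pages, the symptom described above can be detected before running an email pattern over the HTML. The following is a minimal Python sketch, not ScrapeBox's own logic: the marker ids are copied from the message quoted in the thread, while the function names and the deliberately simple email pattern are assumptions made for illustration.

```python
import re
from typing import List

# Marker ids copied from the fallback page quoted in the thread.
FALLBACK_MARKERS = ('id="we-noticed"', 'id="browser-not-supported"')

# Deliberately simple email pattern (an assumption, not ScrapeBox's own).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def is_fallback_page(html: str) -> bool:
    """Return True if the HTML contains either fallback marker."""
    return any(marker in html for marker in FALLBACK_MARKERS)


def extract_emails(html: str) -> List[str]:
    """Return unique email-like strings in order of first appearance."""
    seen = {}
    for match in EMAIL_RE.findall(html):
        # Compare case-insensitively, keep the first spelling seen.
        seen.setdefault(match.lower(), match)
    return list(seen.values())
```

If `is_fallback_page` returns True for a saved response, an empty email list most likely means the profile content was never delivered, not that the profile has no email address.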

RE: Scraping Emails from behance.net - loopline - 10-22-2019

So just to confirm: if you send the exact same headers as a regular browser sends, the same thing happens. So this is their generic response to JavaScript being turned off. This site will not work, as ScrapeBox uses raw sockets and threads, and these do not support JavaScript.

RE: Scraping Emails from behance.net - gavner25 - 10-25-2019

(10-22-2019, 10:16 PM) loopline wrote: So just to confirm: if you send the exact same headers as a regular browser sends, the same thing happens. So this is their generic response to JavaScript being turned off. This site will not work, as ScrapeBox uses raw sockets and threads, and these do not support JavaScript.

I appreciate the info. Thanks anyway.

RE: Scraping Emails from behance.net - loopline - 10-26-2019

You're welcome, cheers!
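
The header test loopline describes can be reproduced outside ScrapeBox with a plain HTTP client, which also does not execute JavaScript. The sketch below uses Python's standard `urllib.request`. The header values are illustrative assumptions, not captured from a real browser session, and `build_request`, `fetch_html` and `needs_javascript` are hypothetical helper names. Behance's behaviour may also have changed since the thread was written.

```python
import urllib.request

# Illustrative browser-like headers (assumed values, not a real capture).
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
}


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request that carries the browser-like headers."""
    return urllib.request.Request(url, headers=BROWSER_HEADERS)


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Fetch the raw HTML; no JavaScript is executed at any point."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def needs_javascript(html: str) -> bool:
    """True if the response is the fallback page quoted in the thread."""
    return 'id="browser-not-supported"' in html
```

Calling `needs_javascript(fetch_html(url))` on one of the profile URLs from the first post would show whether a raw-socket style client still receives the fallback page. If it does, only a client that actually runs JavaScript could reach the profile content.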