Scraping Google News
Thread in General ScrapeBox Talk, ScrapeBox Forum (https://www.scrapeboxforum.com)
Scraping Google News - Nosh - 08-30-2020
Hi everybody! I would like to scrape results from Spanish Google News. I tried this query string but I get 0 results. Can anybody help?
https://www.google.com/search?q={KEYWORD}&complete=0&hl=es&pws=0&sxsrf=ALeKk03HvZBF2uo3jX6CcGFg93SSRRAksg:1598728775740&source=lnms&tbm=nws&sa=X&ved=2ahUKEwizr5TmkMHrAhXfyTgGHeIiBcAQ_AUoAnoECGwQBA&biw=1440&bih=489

RE: Scraping Google News - serialscraper - 08-30-2020
Start here and let me know if you run into any problems:
https://news.google.com/search?q={KEYWORD}&hl=es

RE: Scraping Google News - Nosh - 08-31-2020
Hi, there is no "real" Google News in Spain, that's the problem.

RE: Scraping Google News - serialscraper - 08-31-2020
Are you looking for a Spanish news website? If so, let me know which one so that we have a starting point.

RE: Scraping Google News - Nosh - 09-01-2020
I want to scrape this:
https://www.google.es/search?q=bla+bla&client=safari&sxsrf=ALeKk01TSBVY3_J38pz6FucI2kv5X6zYFw:1598959266718&source=lnms&tbm=nws&sa=X&ved=2ahUKEwjRlOm468frAhVTTcAKHZ_9A2wQ_AUoBHoECHgQBg&biw=1694&bih=984
In Spain it's not a "real" Google News section because of this:
https://www.newsmediaalliance.org/google-news-shutdown-in-spain-not-as-bad-as-google-would-have-you-believe/

RE: Scraping Google News - Nosh - 09-08-2020
It should be something like this:
https://www.google.se/search?q={KEYWORD}&source=lnms&tbm=nws
but I get 0 results.

RE: Scraping Google News - Nosh - 09-11-2020
Can anybody help me out with this one?

RE: Scraping Google News - loopline - 09-11-2020
When you do the engine test, what happens? It can also help to save the raw HTML after the test, from the test page (which is on the screen where you set up the engine), because sometimes the HTML that ScrapeBox ultimately sees is different than what you see in a browser.

RE: Scraping Google News - Nosh - 09-12-2020
It says: "Error 0. No links could be retrieved"

RE: Scraping Google News - loopline - 09-12-2020
OK, so then save the raw HTML.
Once you do that, you can find the markers, because it's one of three things:
1 - You're getting some sort of general error, like a 404 or an IP-block type error, in which case looking at the raw HTML should show that pretty easily.
2 - The links are rendered with JavaScript, in which case ScrapeBox won't be able to see them. So check the raw HTML for the links and see if they are indeed there.
3 - The links are there but your before/after markers are wrong. So you can look at the raw HTML and determine the correct before and after markers.
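
The three checks can also be run outside ScrapeBox on the saved raw HTML. The following is only a rough sketch in Python, not how ScrapeBox itself works internally: the function names `extract_between` and `diagnose`, the `BLOCK_HINTS` phrases, and the sample markers are all made up for illustration, and the link check is deliberately crude.

```python
import re

# Hypothetical phrases that tend to show up on error or block pages.
# Adjust these to whatever the saved raw HTML actually contains.
BLOCK_HINTS = ("unusual traffic", "captcha", "404 not found", "403 forbidden")


def extract_between(html, before, after):
    """Return every substring that sits between a `before` marker
    and the next `after` marker, scanning left to right."""
    if not before or not after:
        raise ValueError("both markers must be non-empty")
    results = []
    pos = 0
    while True:
        start = html.find(before, pos)
        if start == -1:
            break
        start += len(before)
        end = html.find(after, start)
        if end == -1:
            break
        results.append(html[start:end])
        pos = end + len(after)
    return results


def diagnose(html, before, after):
    """Classify the saved HTML into one of the three cases, or 'ok'."""
    lowered = html.lower()
    # Case 1: a general error or IP-block page.
    if any(hint in lowered for hint in BLOCK_HINTS):
        return "blocked_or_error"
    # Case 2: no href pointing at an http(s) URL, so the links are
    # probably rendered by JavaScript (assumption: crude heuristic).
    if not re.search(r'href=["\'][^"\']*https?://', html):
        return "no_links_in_html"
    # Case 3: links exist, but the markers do not match anything.
    if not extract_between(html, before, after):
        return "markers_wrong"
    return "ok"
```

To use it, read the saved file into a string and call `diagnose(html, before, after)` with the markers from your engine definition. If it returns "markers_wrong", try candidate markers with `extract_between` until it returns the result URLs.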