ScrapeBox Forum
Scrape Google news and shop section?


Scrape Google news and shop section? - walkallone - 02-03-2023

Hello,

Is it possible to scrape Google's news and shop sections?

I tried adding tbm=nws for news and tbm=shop for shop, but it does not seem to work.

Any help?
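
For reference, here is a minimal Python sketch of how the tbm parameter from the post ends up in a search URL. The function name google_vertical_url is hypothetical, and this only builds the URL string; it says nothing about how ScrapeBox constructs its own queries, and Google may change or ignore these parameters at any time.

```python
from urllib.parse import urlencode

# Hypothetical helper: builds a Google search URL, optionally with a
# tbm value such as "nws" (news) or "shop" (shopping), as in the post.
def google_vertical_url(query, vertical=None):
    params = {"q": query}
    if vertical:
        params["tbm"] = vertical
    return "https://www.google.com/search?" + urlencode(params)

if __name__ == "__main__":
    print(google_vertical_url("leather boots", "nws"))
    print(google_vertical_url("leather boots", "shop"))
```

Opening the printed URLs in a browser with JavaScript off is a quick way to check what such a page contains before touching any harvester settings.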


RE: Scrape Google news and shop section? - loopline - 02-03-2023

Well, it's probably possible, but you might also need to adjust the before and/or after URL markers. You can test those pages in your browser with JavaScript off: can you see the URLs you want? I know it has worked in the past, though.


RE: Scrape Google news and shop section? - walkallone - 02-06-2023

I have tested without JavaScript and yes, the links are clearly visible in the page, even though the layout differs a bit.

Is it only a URL problem in this case? (Am I not setting it correctly?)


RE: Scrape Google news and shop section? - walkallone - 02-06-2023

So, I tried to make it work, but so far I have failed.

The URL I want to extract is in this tag:
<a class="Lq5OHe eaGTj translate-content" data-what="1" href="/url?url=https://bottino.ro/produs/ghete-casual-dama-din-piele-naturala-de-culoare-negru-cu-fermoar-3/%3Fattribute_pa_culoare%3Dnegru-box%26attribute_pa_marimi%3D38%26utm_source%3DGoogle%2520Shopping%26utm_campaign%3DFeed%2520-%2520Sorin%2520G%2520Ads%26utm_medium%3Dcpc%26utm_term%3D40579&amp;rct=j&amp;q=&amp;esrc=s&amp;sa=U&amp;ved=0ahUKEwjt-sDQuoH9AhVjQfEDHaAsAMQQ2SkIgAw&amp;usg=AOvVaw2PLqJcCBzkJuSag_0DoURr" jsaction="trigger.HWpvL">

I tried putting href="/url?url= in the "just before" field and % in the "right after" field, but it didn't work.

Any idea?
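
To make the marker idea concrete, here is a rough Python sketch of extracting text between a "before" and an "after" marker. It is only an illustration of the concept, not ScrapeBox's actual implementation; extract_between is a hypothetical helper, and the sample tag is a shortened, made-up version of the one quoted above (the /produs/example/ path is a placeholder).

```python
from urllib.parse import unquote

# Hypothetical helper: returns every substring found between `before`
# and the next occurrence of `after`, scanning left to right.
def extract_between(page, before, after):
    results = []
    pos = 0
    while True:
        start = page.find(before, pos)
        if start == -1:
            break
        start += len(before)
        end = page.find(after, start)
        if end == -1:
            break
        results.append(page[start:end])
        pos = end + len(after)
    return results

# Shortened, made-up sample in the shape of the tag from the post.
sample = ('<a class="Lq5OHe" href="/url?url=https://bottino.ro/produs/example/'
          '%3Futm_medium%3Dcpc&amp;rct=j&amp;sa=U">')

# "%" as the after marker cuts the URL before its encoded query string.
short_urls = extract_between(sample, 'href="/url?url=', '%')

# "&amp;" as the after marker keeps the query string, which is then decoded.
full_urls = [unquote(u) for u in extract_between(sample, 'href="/url?url=', '&amp;')]

if __name__ == "__main__":
    print(short_urls)
    print(full_urls)
```

On a string shaped like this, both marker pairs find a match, so if nothing at all is extracted, the markers may not be the only problem; the HTML being parsed could differ from what the browser shows.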


RE: Scrape Google news and shop section? - walkallone - 02-17-2023

Also, I did a few tests.
I played around with the before and/or after URL markers, but I was not able to extract anything at all from the page.

I believe ScrapeBox is somehow not able to read the page.

Is there any way to "see" the page ScrapeBox is extracting from?


RE: Scrape Google news and shop section? - loopline - 02-27-2023

Yes, just go to settings >> harvester engine configuration and click on the engine you're working with.

Then at the bottom click test engine. It will prompt you to save the raw HTML as you go through the test. Save that and you can see what ScrapeBox sees.
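
Once the raw HTML is saved, one way to check it is to count how often each candidate marker appears in the file. The sketch below is a small Python helper for that, under assumptions: marker_report and report_for_file are hypothetical names, the file path is whatever you saved the test output as, and the second marker in the example is just a guess at an alternative link format to compare against.

```python
from pathlib import Path

# Hypothetical helper: counts occurrences of each marker string in a page.
def marker_report(page, markers):
    return {m: page.count(m) for m in markers}

# Hypothetical helper: same check, reading the saved raw HTML from disk.
def report_for_file(path, markers):
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return marker_report(text, markers)

if __name__ == "__main__":
    # Tiny inline page for illustration; use report_for_file with your saved file.
    page = '<a href="/url?url=https://example.com/&amp;sa=U">link</a>'
    print(marker_report(page, ['href="/url?url=', 'href="/url?q=']))
```

A count of zero for the "before" marker would suggest the saved page is not the same HTML you saw in the browser, for example a consent or block page.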