ScrapeBox Forum
Similar site - Printable Version

+--- Thread: Similar site (/showthread.php?tid=45191)

Similar site - Nosh - 06-24-2019

I want to harvest from Similar site, but I don't get any results. Could somebody help? Please check the screenshot.
Thanks!

RE: Similar site - Nosh - 06-24-2019

[Image: similarsite.png]

RE: Similar site - loopline - 06-25-2019

What happens when you click test engine?

If it doesn't work, it will give you the option to save the raw HTML, so you can see what ScrapeBox sees.

RE: Similar site - Nosh - 06-25-2019

I get this:
[Image: simsite.png]

RE: Similar site - loopline - 06-25-2019

Then you need to save off the raw HTML, look at it, and see where it's redirecting to, because a 301 is a redirect. It's quite possible they are sending you to a block page with a captcha as well.
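
If you want to check the redirect outside ScrapeBox, here is a minimal Python sketch. It makes one request without following redirects and reports the status code and where the Location header points. The function names are made up for illustration, the URL in the usage comment is a placeholder, and this is not how ScrapeBox itself does it.

```python
import http.client
from urllib.parse import urlsplit, urljoin

# Status codes that mean "go look somewhere else".
REDIRECT_CODES = {301, 302, 303, 307, 308}

def resolve_redirect(request_url, status, location):
    """Return the absolute redirect target, or None if this is not a redirect."""
    if status not in REDIRECT_CODES or not location:
        return None
    # Location may be relative, so resolve it against the requested URL.
    return urljoin(request_url, location)

def fetch_once(url, timeout=10):
    """Make one GET request without following redirects.

    Returns (status, Location header or None). http.client never follows
    redirects on its own, which is what we want here.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": "Mozilla/5.0"})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()

# Usage (placeholder URL, needs network access):
#   status, location = fetch_once("http://example.com/some/page")
#   print(status, resolve_redirect("http://example.com/some/page", status, location))
```

If the target turns out to be a captcha or block page, the engine URL itself is probably fine and the site is blocking the request.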

RE: Similar site - Nosh - 06-26-2019

I don't know how to "save off the raw HTML". Could you explain a bit?

RE: Similar site - loopline - 06-29-2019

Go to settings >> harvester engine configuration, then click on your engine and click test engine (down at the bottom). Once it has run the test, there will be a button to save the raw HTML. If you do that, you can see the exact HTML that ScrapeBox sees.

RE: Similar site - Nosh - 06-30-2019

Sorry to insist, but I really cannot find it... Where can I save the raw HTML?
[Image: Captura-de-pantalla-2019-06-30-a-las-10.57.27.png]

RE: Similar site - loopline - 06-30-2019

It should be like this

My guess is that it needs a 200 response to save the raw HTML, but you're getting a 301.

I will mail support.

What you can try is a program called HTTP Debugger Pro. You can google it; it has a 7-day trial, and I use it regularly. It has a bit of a learning curve, but by clicking on the URL you can easily see the HTTP response code that ScrapeBox sees.

RE: Similar site - loopline - 07-02-2019

A 301 is a permanent redirect, it’s not a page that gets loaded. Just adding this URL to will work{KEYWORD}

There is no pagination; they just return around 11 results, so there is no need for the pagenum variable.
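
For anyone wondering what a keyword-only engine URL amounts to, here is a rough Python sketch of the substitution: one request per keyword, with no page number in the URL. The template below uses a placeholder domain and path (it is not the real engine URL), `build_query_url` is a made-up helper, and the encoding ScrapeBox applies internally may differ.

```python
from urllib.parse import quote

def build_query_url(template, keyword):
    """Substitute a URL-encoded keyword into a template containing {KEYWORD}."""
    if "{KEYWORD}" not in template:
        raise ValueError("template must contain the {KEYWORD} placeholder")
    # safe="" also encodes "/" so the keyword cannot change the URL path.
    return template.replace("{KEYWORD}", quote(keyword, safe=""))

# Hypothetical template: placeholder domain and path, not the real engine URL.
TEMPLATE = "https://example.com/similar/{KEYWORD}"

# One URL per keyword; with no pagination there is no page loop.
urls = [build_query_url(TEMPLATE, kw) for kw in ["example.org", "blue widgets"]]
```

With only around 11 results per keyword, the number of results scales with the number of keywords, not with how deep you page.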