Similar site - Printable Version

+- ScrapeBox Forum (https://www.scrapeboxforum.com)
+-- Forum: ScrapeBox Main Discussion (https://www.scrapeboxforum.com/Forum-scrapebox-main-discussion)
+--- Forum: General ScrapeBox Talk (https://www.scrapeboxforum.com/Forum-general-scrapebox-talk)
+--- Thread: Similar site (/Thread-similar-site)
RE: Similar site - Nosh - 07-18-2019

(07-02-2019, 01:47 AM)loopline Wrote: A 301 is a permanent redirect, it’s not a page that gets loaded. Just adding this URL to similarsitesearch.com will work

You mean like this?

RE: Similar site - loopline - 07-19-2019

Yes, if the before/after markers are correct, that should work.

RE: Similar site - Nosh - 07-19-2019

(07-19-2019, 06:07 PM)loopline Wrote: Yes if the markers are correct for before/after that should work.

What do you mean exactly? With this configuration I don't get any results.

RE: Similar site - loopline - 07-20-2019

It worked at the time I posted it, which was some time ago. I think you had posted your setup, and the URL was the only thing that needed to be changed. However, the site may have changed its HTML since then, so double check that the before and after markers still match the current HTML.

RE: Similar site - Nosh - 07-21-2019

(07-20-2019, 09:33 PM)loopline Wrote: It worked at the time I posted it, which was some time ago.

Do you mean something like in the screenshot?

RE: Similar site - loopline - 07-22-2019

Maybe. It's been a while since this thread was started. What exact element are you trying to extract?

RE: Similar site - Nosh - 07-22-2019

(07-22-2019, 08:01 PM)loopline Wrote: Maybe. Its been a while since this thread was started, what exact element are you trying to extract?

Only the URLs of the links (the <a href= values).

RE: Similar site - loopline - 07-26-2019

So, similar site only uses 1 page of results, I believe, so there is no point in a next page marker, and following next pages is the point of the harvester. So why not just use the merge feature, merge all your keywords into the URLs, and load them all into the link extractor? Then you don't have to mess around with a custom harvester engine that you have to update every time similar site changes something.
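For readers who want to try the merge-then-extract idea outside ScrapeBox, here is a minimal Python sketch. It is not how ScrapeBox implements the merge feature or the link extractor. The `{keyword}` placeholder, the example.com URL pattern, and the function names are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import quote_plus


def merge_keywords(template, keywords):
    """Build one URL per keyword by substituting it into the template.

    The "{keyword}" placeholder is a hypothetical token for this sketch,
    not ScrapeBox's own merge token.
    """
    return [
        template.replace("{keyword}", quote_plus(kw.strip()))
        for kw in keywords
        if kw.strip()  # skip blank lines in the keyword list
    ]


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link URLs found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    # Placeholder URL pattern; the real site's pattern is not given here.
    urls = merge_keywords("https://example.com/search?q={keyword}",
                          ["first keyword", "second.com"])
    print(urls)
    print(extract_links('<a href="http://a.example/">A</a>'))
```

Because the link extraction relies on the `<a>` tags themselves rather than on site-specific before/after markers, it does not need updating when the surrounding page layout changes.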
Merge info: http://scrapeboxfaq.com/how-do-i-use-tokens-with-the-m-merge-option
Link extractor: https://www.youtube.com/watch?v=t6pxt-4C6Xc&t=2s

RE: Similar site - Nosh - 07-27-2019

(07-26-2019, 09:28 PM)loopline Wrote: so, similar site only uses 1 page of results I believe, so there is no point in a next page marker, and thats the point of the harvester. So why not just use the merge feature and merge all your keywords

Sounds good, but it does not work.

RE: Similar site - loopline - 07-28-2019

The site is blocking you. I just rebuilt the entire engine from scratch and saved off the test HTML, and similar sites is returning just this:

Code:
<!DOCTYPE html>

Code:
<div id="distilIdentificationBlock"> </div>

It's blocked. I tried a different user agent, but that didn't do it. You can monkey around with the header data and user agent and see if you can get it to work, but otherwise they may simply have a good enough blocking system that it's not going to work.
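When an engine returns no results, it helps to check the saved HTML for the block page first and only then look at the before/after markers. A minimal Python sketch of that check follows. The `distilIdentificationBlock` id comes from the response quoted in the thread; the function names and the example marker strings are assumptions for illustration, not ScrapeBox settings.

```python
# id found in the blocked response quoted in the thread
BLOCK_SIGNATURE = "distilIdentificationBlock"


def extract_between(html, before, after):
    """Return every substring between a `before` marker and the next `after` marker."""
    if not before or not after:
        raise ValueError("both markers must be non-empty")
    results = []
    pos = 0
    while True:
        start = html.find(before, pos)
        if start == -1:
            break
        start += len(before)
        end = html.find(after, start)
        if end == -1:
            break
        results.append(html[start:end])
        pos = end + len(after)
    return results


def diagnose(html, before, after):
    """Classify a saved response: block page, stale markers, or usable."""
    if BLOCK_SIGNATURE in html:
        return "blocked"
    if not extract_between(html, before, after):
        return "markers not found"
    return "ok"


if __name__ == "__main__":
    blocked = '<!DOCTYPE html><div id="distilIdentificationBlock"> </div>'
    # Hypothetical markers: text before and after each link URL.
    print(diagnose(blocked, '<a href="', '"'))
```

If the result is "blocked", changing the markers will not help; if it is "markers not found", the page loaded but its HTML no longer matches the configured markers.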