Faster WordPress Hosting

How to scrape archive.org
#1
Hello,
How can I use ScrapeBox to scrape a site on archive.org?

Thank you
Reply
#2
Hello friends,
I need your help. I need a footprint to do that.
Reply
#3
I mean, what are you wanting to scrape? The expired domain finder has an archive.org downloader, but not a scraper.

Can you be more specific about what you want to scrape? Examples would be helpful.
Reply
#4
Hello, I want to scrape the content of a site.

Thank you
Reply
#5
Thank you, I will buy the expired domain finder.
Reply
#6
Hello,
With this plugin, can I download all articles from a site on archive.org?
Reply
#7
No, it does not download articles. Nothing in ScrapeBox will download articles from archive.org.

If you buy an expired domain, the archive.org downloader in the expired domain finder will download the entire site. You can then upload it again and use that site, but it downloads whole sites, not just articles.
Reply
#8
And while we're at it: I used the archive.org grabber to download a site yesterday, but it didn't download everything that's available on web.archive.org for that domain.
I'm trying to get the rest, too. So I thought I'd rename the download folder and run it again with a different date in the snapshot date fields.
But now it's only downloading the home page, not any other pages. What can I do to steer it to a specific missing page?
Reply
#9
There isn't really anything you can do to steer it to a specific page. You could drop a line to ScrapeBox support with the specific link to the page that's missing, along with any other helpful details and perhaps screenshots, and see whether it's something they can compensate for.

I've run into a couple of cases where it just doesn't come out perfect. There are too many possibilities out there, and too many things people can do wrong to a site, that on occasion keep the downloader from getting some part of the site.
Reply
#10
Thanks for your answer, @loopline. It's not a tragedy; I will download those few pages manually and reintegrate them. It's helpful to know it's not always me ;-)
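
For anyone fetching missing pages by hand, here is a rough sketch of one way to list what web.archive.org holds for a domain, using the Wayback Machine's public CDX endpoint. This is not part of ScrapeBox or the expired domain finder. The function names and `example.com` are made up for illustration, and the query parameters (`fl`, `filter`, `collapse`) are used on the assumption that the CDX server behaves as its public documentation describes.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Wayback Machine CDX endpoint (assumed available and unchanged).
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain):
    """Build a CDX query URL listing captures under a domain."""
    params = {
        "url": domain + "/*",         # everything under the domain
        "output": "json",             # JSON rows; the first row is a header
        "fl": "timestamp,original",   # only the fields we need
        "filter": "statuscode:200",   # skip redirects and errors
        "collapse": "urlkey",         # one capture per distinct URL
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def snapshot_urls(rows):
    """Turn CDX JSON rows into replay URLs on web.archive.org."""
    if not rows:
        return []
    header, *captures = rows
    ts = header.index("timestamp")
    orig = header.index("original")
    return [
        "https://web.archive.org/web/%s/%s" % (row[ts], row[orig])
        for row in captures
    ]


def fetch_rows(domain):
    """Hypothetical helper: query the CDX endpoint over the network."""
    with urlopen(build_cdx_query(domain)) as resp:
        body = resp.read().decode("utf-8")
    # An empty body means no captures matched.
    return json.loads(body) if body.strip() else []


# Usage (needs network access):
#   for url in snapshot_urls(fetch_rows("example.com")):
#       print(url)
```

Comparing that list against the files the downloader already saved should show which pages are missing, and each printed URL can be opened and saved manually.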
Reply