View Full Version : myspace question
Beyerstein00
04-28-2007, 04:11 PM
Is there a web crawler (other than Google) that caches Myspace profiles?
Troll
04-28-2007, 05:14 PM
Myspace uses robots.txt to tell automated bots not to crawl its site.
http://www.myspace.com/robots.txt
However, search engines still cache some Myspace pages. Since Google probably has the biggest cache of Myspace pages, I can only suggest using Google (or a similar search engine, such as MSN).
Archive.org could work (I haven't checked), but as Myspace uses a robots.txt, it's highly unlikely to work...
Sorry I couldn't help much.
Ezekiel
04-28-2007, 05:31 PM
The file Troll mentioned (robots.txt) is a way of asking bots not to crawl your website. It only works for crawlers that choose to obey it. You can disallow all crawlers, or only those with a specific user agent.
In Myspace's robots.txt, they block ia_archiver. I just Googled it, and it's the user agent string of www.archive.org. I was going to suggest that as a place to check, but they're blocked from caching Myspace pages.
Other than that, try Coral:
http://www.coralcdn.org/
... or, just search Google for "search engine" and try the cached versions on all the search engines you can find. Yahoo, Live Search, et al.
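If you want to see how the robots.txt blocking described above plays out, here is a rough sketch using Python's standard urllib.robotparser. The rules in SAMPLE_ROBOTS_TXT are made up for illustration (they are not a copy of Myspace's real file), and can_crawl and the example.com URLs are just hypothetical names for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, invented for illustration.
# This is NOT a copy of Myspace's actual file.
SAMPLE_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = SAMPLE_ROBOTS_TXT) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # ia_archiver is disallowed everywhere by the sample rules.
    print(can_crawl("ia_archiver", "http://www.example.com/profile"))
    # Other crawlers fall under the "*" rules and are only kept out of /private/.
    print(can_crawl("Googlebot", "http://www.example.com/profile"))
    print(can_crawl("Googlebot", "http://www.example.com/private/page"))
```

To check a live site instead, you could point the parser at the real file with set_url() and read(). Keep in mind that robots.txt is only a request, so this tells you what a well-behaved crawler would do, not what any particular cache actually holds.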