Watched a video where James Corbett explained a little bit about how to find other sources for dead links and thought i'd expand on it, so here's Link Rot: How to find new sources for busted links.
There are a variety of ways to find what you're looking for when you come across a broken link on the interwebs. Here are a few methods i like to use.
The first thing you should know is how to use a search engine. Various search engines attach a special meaning to certain characters, and these 'search operators', as they're called, can be really helpful. Here are some handy examples that work for Google as well as some other search engines (and no, you shouldn't be using Google directly):
OR : 'OR', or the pipe "|" character, tells the search engine you want to search for this OR that. For example, cat|dog will return results containing 'cat' or 'dog', as will cat OR dog.
( ) : Putting words in a group separated by | produces the same result as just described, however you can then add words outside of the group that you always want to see in the results. For example, (red|pink|orange) car will return results that have 'car' in them, as well as either red or pink or orange.
" ": If you wrap a "word" in double quotes, you are telling the search engine that the word is really important. If you wrap multiple words in double quotes, you are telling the search engine to look for pages containing "that exact phrase."
site: : If you want to search only a particular domain, such as 12bytes.org, append site:12bytes.org to your query, or don't include any search terms if you want it to return a list of pages for the domain. You can do the same when performing an image search if you want to see all the images on a domain. You can also search a TLD (Top-Level Domain) using this operator. For example, to search the entire .gov TLD, just append site:.gov to your query.
-: If you prefix a word with a -hyphen, you are telling the search engine that you are not interested in results containing that word. You can do the same -"with a phrase" also.
cache: : Prefixing a domain with cache:, such as cache:12bytes.org, will return the most recent cached version of a page.
intitle: : If you prefix a word or phrase with intitle:, you are telling the search engine that the word or phrase must be contained in the titles of the results.
allintitle: : Prefixing your words with allintitle: tells the search engine that all words following this operator must be contained in the titles of the search results.
See this page for more examples.
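The operators above can also be strung together by a small script. The following is only a rough sketch: build_query and its parameter names are made up for illustration, and which operators actually work depends on the search engine you feed the result to.

```python
def build_query(any_of=(), words=(), phrase=None, in_title=None, exclude=(), site=None):
    """Assemble a search query string from the operators described above.

    All parameter names are hypothetical; operator support varies by engine.
    """
    parts = []
    if any_of:
        # (red|pink|orange): match any one word in the group
        parts.append("(" + "|".join(any_of) + ")")
    # Plain words that should always appear in the results
    parts.extend(words)
    if phrase:
        # "exact phrase" search
        parts.append(f'"{phrase}"')
    if in_title:
        # The word or phrase must appear in the page title
        parts.append(f'intitle:"{in_title}"')
    for term in exclude:
        # -word or -"a phrase": drop results containing the term
        parts.append(f'-"{term}"' if " " in term else f"-{term}")
    if site:
        # Restrict results to one domain or TLD
        parts.append(f"site:{site}")
    return " ".join(parts)


print(build_query(any_of=["red", "pink", "orange"], words=["car"], exclude=["toy"]))
# (red|pink|orange) car -toy
print(build_query(phrase="Why does my kitten hate me?", site="example.com"))
# "Why does my kitten hate me?" site:example.com
```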
Searching the archives
One of the simplest methods of finding the original target of a busted link is to copy the link location (right click the link and select 'Copy Link Location') and plug that into one of the web archive services. The two most popular general-purpose archives that i'm aware of are the Internet Archive and Archive.is. The Internet Archive provides options to filter your search results for particular types of content, such as web pages, videos, etc. In either case, just paste the copied link in the input field they provide and press your Enter key. If the link is 'dirty', cleaning it up may provide better results. For example, let's say the link is something like:
The archive may not return any results for the URL, but it might if you clean it up by removing everything after 'hunt'.
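Cleaning a link can be scripted too. This is a minimal sketch that assumes the junk lives in the query string or fragment (tracking parameters and the like); the sample URL is made up, and clean_url and wayback_lookup_url are hypothetical helper names. The web.archive.org/web/*/ address is the Wayback Machine's page listing the captures it holds for a URL.

```python
from urllib.parse import urlsplit, urlunsplit


def clean_url(url):
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def wayback_lookup_url(url):
    """Build the Wayback Machine address that lists captures of a page."""
    return "https://web.archive.org/web/*/" + url


# Hypothetical 'dirty' link with tracking parameters and a fragment
dirty = "https://example.com/articles/some-story?utm_source=feed&ref=abc#comments"
clean = clean_url(dirty)
print(clean)
# https://example.com/articles/some-story
print(wayback_lookup_url(clean))
# https://web.archive.org/web/*/https://example.com/articles/some-story
```

Some sites keep the page identifier in the query string itself (YouTube's v= parameter, for one), so stripping everything is not always the right call. It can be worth trying both the dirty and the clean form.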
There are also web browser extensions you can install to make accessing the archive services easier. For Firefox i like the View Page Archive & Cache add-on by 'Armin Sebastian'. When you find a dead link, just right-click it and from the 'View Page Archive' context menu you can select to search all of the enabled archives or just a specific one. Even if the page isn't dead you can right-click in the page and retrieve a cached copy or archive the page yourself. Another cool feature of this add-on is that it will place an icon in the address bar if you land on a dead page and you can just search for an archived version from the icon context menu.
Of these two services, the Internet Archive has a far more extensive library, but it comes with a very annoying caveat that defeats the purpose of an archive, which is why i much prefer Archive.is: the Internet Archive follows robots.txt directives. I won't go into why i think this is stupid; suffice it to say that content stored on the Internet Archive can be removed even if it does not break any of their rules.
Dead links and no clues
If all you have is a dead link with no title or description and you can't find a cached copy in one of the archives, you may still be able to find a copy of the document somewhere. For example, let's say the link is https://example.com/pages/my-monkey-stole-my-car.html. The likely title of the document you're looking for is right in the URL (my-monkey-stole-my-car), and you can plug that into a search engine just as it is, or remove the hyphens and wrap the title in double quotes to perform a phrase search. Also see some of the other examples here.
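Pulling the likely title out of a URL is easy to automate. Here's a small sketch under the assumption that the last part of the path is a hyphen- or underscore-separated slug; title_guess and phrase_query are made-up names.

```python
import posixpath
import re
from urllib.parse import unquote, urlsplit


def title_guess(url):
    """Guess a document title from the last segment of a URL path."""
    path = unquote(urlsplit(url).path).rstrip("/")
    slug = posixpath.basename(path)
    # Drop a trailing extension such as .html
    slug = posixpath.splitext(slug)[0]
    # Treat hyphens, underscores and plus signs as word separators
    words = re.split(r"[-_+]+", slug)
    return " ".join(word for word in words if word)


def phrase_query(url):
    """Wrap the guessed title in double quotes for a phrase search."""
    return f'"{title_guess(url)}"'


url = "https://example.com/pages/my-monkey-stole-my-car.html"
print(title_guess(url))
# my monkey stole my car
print(phrase_query(url))
# "my monkey stole my car"
```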
Dead links with some clues
If you come across a dead link that has a title or description, but isn't cached in an archive, you can use that to perform a search. Just select the title, or a short but unique phrase from the description (which preferably doesn't contain any punctuation), then wrap it in double quotes and perform a phrase search.
Dead internal website links
If you encounter a website that contains a broken link to another page on the same site and you have some information about the document, like a title or excerpt, you can do a domain search to see if a search engine may link to a working copy. For example, let's assume the title of the page we're looking for is 'Why does my kitten hate me?' on the domain 'example.com'. Copy the title, wrap it in double quotes and plug it into a search engine that supports phrase searches, add a space, then append site:example.com. This will tell the search engine to look for results only on example.com. Also see some of the other examples here.
YouTube videos you know exist but can't find
Because there is a remarkable amount of censorship taking place at YouTube, they will sometimes hide sensitive videos from their search results when you use the search engine provided by YouTube. To get around this, use another search engine to perform a domain search as described in the 'Dead internal website links' section.
In some cases, such as with a link that points to a removed YouTube video, you may not have any information other than the URL itself, not even a page title. Taking the YouTube link youtube.com/watch?v=abc123xyz as an example, wrap it in double quotes and plug that into your preferred search engine. You will often find a forum or blog post somewhere that provides helpful clues, such as the video title or description, which you can use to search for a working copy of the video. And the first place to look for deleted YouTube videos is YouTube! You can also search the Internet Archive as well as other video platforms that are more censorship resistant than YouTube, including Dailymotion, BitChute, DTube, LEEKWire and many others.
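If you have a pile of dead YouTube links, the quoted queries can be generated for you. This sketch uses hypothetical helper names (youtube_id, search_queries). Searching for the short youtu.be form of the link as well is my own assumption, on the theory that people often share that form instead.

```python
from urllib.parse import parse_qs, urlsplit


def youtube_id(url):
    """Extract the video ID from a youtube.com/watch or youtu.be link."""
    if "://" not in url:
        # Links are often pasted without a scheme
        url = "https://" + url
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        return parts.path.lstrip("/") or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        values = parse_qs(parts.query).get("v")
        return values[0] if values else None
    return None


def search_queries(url):
    """Return quoted queries to paste into a search engine, or [] if no ID is found."""
    video_id = youtube_id(url)
    if video_id is None:
        return []
    return [
        f'"youtube.com/watch?v={video_id}"',
        f'"youtu.be/{video_id}"',  # assumption: the short link form may be quoted too
    ]


print(youtube_id("youtube.com/watch?v=abc123xyz"))
# abc123xyz
for query in search_queries("youtube.com/watch?v=abc123xyz"):
    print(query)
# "youtube.com/watch?v=abc123xyz"
# "youtu.be/abc123xyz"
```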
Broken links on your own website
I don't know about you, but i have nearly 4,000 links on 12bytes.org as of this writing and many of them point to resources which TPTB (The Powers That [shouldn't] Be) would rather you knew nothing about. As such, many of the resources i link to are taken down and so i have to deal with broken links constantly, many of them deleted YouTube videos. If you run WordPress (self-hosted; i don't know about a wordpress.com site) you will find Broken Link Checker by 'ManageWP' in the WordPress plugin repository, and its job is to constantly scan your site for broken links. While it is not a bug-free plugin (the developer is not at all responsive and doesn't seem to fix anything in a timely manner), it is by far the most comprehensive tool of its type that i'm aware of. There are also many external services you could use whether you run WordPress or not.
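For a site that doesn't run WordPress, a very basic checker can be sketched with nothing but Python's standard library. To be clear, this is not how the Broken Link Checker plugin works; it's just an illustration, and http_status and find_broken are made-up names. It treats any HTTP status of 400 or higher, or no response at all, as broken.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response was received."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-check-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error code such as 404 or 410
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout or malformed URL
        return None


def find_broken(urls, get_status=http_status):
    """Return (url, status) pairs for links that look dead."""
    broken = []
    for url in urls:
        status = get_status(url)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken


# Demo with a fake status lookup so nothing touches the network
fake = {"https://example.com/ok": 200, "https://example.com/gone": 404}
print(find_broken(fake, get_status=fake.get))
# [('https://example.com/gone', 404)]
```

Some servers refuse HEAD requests, and a removed video page may still answer with a success code, so a status check like this will miss some dead links and flag some live ones.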