Link Rot: How to find new sources for busted links

There are a variety of ways to find what you're looking for when you come across a broken link on the interwebs. Here are a few methods i like to use.

Search operators

The first thing you should know is how to use a search engine. Various search engines attach a special meaning to certain characters, and these 'search operators', as they're called, can be really helpful. Here are some handy examples that work for Google as well as some other search engines (and no, you shouldn't be using Google directly):

OR : 'OR', or the pipe ( | ) character, tells the search engine you want to search for this OR that. For example cat|dog will return results containing 'cat' or 'dog', as will cat OR dog.

( ) : Putting words in a group separated by OR or | produces the same result as just described, however you can then add words outside of the group that you always want to see in the results. For example, (red|pink|orange) car will return results that have red, pink or orange cars.

" ": If you wrap a "word" in double quotes, you are telling the search engine that the word is really important. If you wrap multiple words in double quotes, you are telling the search engine to look for pages containing "that exact phrase."

site: : If you want to search only a particular domain, append site: followed by the domain to your query, or don't include any search terms if you want it to return a list of pages for the domain. You can do the same when performing an image search if you want to see all the images on a domain. You can also search a TLD (Top-Level Domain) using this operator. For example, to search the entire .gov TLD, just append site:.gov to your query.

-: If you prefix a word with a -hyphen, you are telling the search engine to omit results containing this word. You can do the same -"with a phrase" also.

cache:: Prefixing a domain with cache: will return the most recent cached version of a page.

intitle: : If you prefix a word or phrase with intitle:, you are telling the search engine that the word or phrase must be contained in the titles of the results.

allintitle: : Prefixing words with allintitle: tells the search engine that all words following this operator must be contained in the titles of the search results.

See this page for more examples.
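
The operators above can also be combined in a script. The following is only a minimal Python sketch of how such query strings could be assembled; the helper names (phrase, any_of, exclude, on_site, build_query) and the example.com domain are made up for illustration, and how a given search engine actually interprets the result may differ.

```python
def phrase(text: str) -> str:
    """Wrap text in double quotes for an exact-phrase search."""
    return f'"{text}"'


def any_of(*terms: str) -> str:
    """Group alternatives with OR, e.g. (red OR pink OR orange)."""
    return "(" + " OR ".join(terms) + ")"


def exclude(term: str) -> str:
    """Prefix a word with a hyphen; multi-word terms are quoted first."""
    return "-" + (phrase(term) if " " in term else term)


def on_site(domain: str) -> str:
    """Restrict results to one domain (or a TLD such as .gov)."""
    return f"site:{domain}"


def build_query(*parts: str) -> str:
    """Join the non-empty parts with single spaces."""
    return " ".join(p for p in parts if p)


query = build_query(
    any_of("red", "pink", "orange"),
    "car",
    exclude("truck"),
    on_site("example.com"),  # hypothetical domain
)
print(query)  # (red OR pink OR orange) car -truck site:example.com
```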

Searching the archives

One of the simplest methods of finding the original target of a busted link is to copy the link location (right-click the link and select 'Copy Link Location') and plug that into one of the web archive services. The two most popular, general archives that i'm aware of are the Internet Archive and The Internet Archive provides options to filter your search results for particular types of content, such as web pages, videos, etc. In either case, just paste the copied link in the input field they provide and press your Enter key. If the link is 'dirty', cleaning it up may provide better results. For example, let's say the link is something like:

The archive may not return any results for the URL, but it might if you clean it up by removing everything after 'hunt'.
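
The cleanup described above can be automated. Here is a rough Python sketch that strips the query string and fragment from a link and then builds a lookup address for the Internet Archive's Wayback Machine availability endpoint. The example link is invented for this illustration (it is not the one from the original example), and the function names are hypothetical.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def clean_url(url: str) -> str:
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def wayback_availability_url(url: str) -> str:
    """Build a request address for the Wayback Machine availability endpoint."""
    return "https://archive.org/wayback/available?url=" + quote(url, safe="")


# Invented 'dirty' link with tracking parameters and a fragment.
dirty = "https://example.com/articles/treasure-hunt?utm_source=feed&ref=123#comments"
clean = clean_url(dirty)

print(clean)  # https://example.com/articles/treasure-hunt
print(wayback_availability_url(clean))
```

Fetching the second address should return a small JSON document describing the closest snapshot, if one exists. Some sites keep meaningful data in the query string, so it is worth trying both the dirty and the clean versions.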

There are also web browser extensions you can install to make accessing the archive services easier. For Firefox i like the View Page Archive & Cache add-on by 'Armin Sebastian'. When you find a dead link, just right-click it and from the 'View Page Archive' context menu you can select to search all of the enabled archives or just a specific one. Even if the page isn't dead you can right-click in the page and retrieve a cached copy or archive the page yourself. Another cool feature of this add-on is that it will place an icon in the address bar if you land on a dead page and you can just search for an archived version from the icon context menu.

Of these two services, the Internet Archive has a far more extensive library, but there's a very annoying caveat with it that defeats the purpose of an archive which is why i much prefer The Internet Archive follows robots.txt directives. I won't go into why i think this is stupid, suffice to say that content that is stored on the Internet Archive can be removed even if it does not break any of their rules.
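
To see how a robots.txt directive can shut a crawler out, here is a small Python sketch using the standard library's robots.txt parser. The rules are a made-up example, and the 'ia_archiver' user-agent token is an assumption about the name the archive's crawler has answered to; check the archive's own documentation before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt that blocks one crawler from the whole site.
robots_txt = """\
User-agent: ia_archiver
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

page = "https://example.com/some-article"  # hypothetical page

# The named crawler is refused, everyone else is still allowed.
blocked = parser.can_fetch("ia_archiver", page)
others = parser.can_fetch("SomeOtherBot", page)
print(blocked)  # False
print(others)   # True
```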

Dead links and no clues

If all you have is a dead link with no title or description and you can't find a cached copy in one of the archives, you may still be able to find a copy of the document somewhere. For example let's say the link is The likely title of the document you're looking for is right in the URL -- my-monkey-stole-my-car -- and you can plug that into a search engine just as it is, or remove the hyphens and wrap the title in double quotes to perform a phrase search. Also see some of the other examples here.
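
Pulling the likely title out of a URL is easy to script. This is a minimal Python sketch; the full address is invented (only the my-monkey-stole-my-car slug comes from the example above) and title_from_url is a hypothetical helper name.

```python
from urllib.parse import unquote, urlsplit


def title_from_url(url: str) -> str:
    """Guess a document title from the last path segment of a URL."""
    path = urlsplit(url).path.rstrip("/")
    slug = unquote(path.rsplit("/", 1)[-1])
    # Drop a file extension such as .html if there is one.
    if "." in slug:
        slug = slug.rsplit(".", 1)[0]
    # Hyphens and underscores usually stand in for spaces.
    return slug.replace("-", " ").replace("_", " ").strip()


url = "https://example.com/blog/2019/my-monkey-stole-my-car.html"
title = title_from_url(url)
print(f'"{title}"')  # "my monkey stole my car"
```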

Dead links with some clues

If you come across a dead link that has a title or description, but isn't cached in an archive, you can use that to perform a search. Just select the title, or a short but unique phrase from the description (which preferably doesn't contain any punctuation), then wrap it in double quotes and perform a phrase search.
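
Picking a punctuation-free phrase out of a description can be sketched in Python as well. The description below is invented, candidate_phrases is a hypothetical helper, and the word limits are arbitrary choices rather than anything a search engine requires.

```python
import re


def candidate_phrases(description, min_words=4, max_words=8):
    """Return quoted, punctuation-free word runs, longest first."""
    phrases = []
    # Split wherever punctuation occurs so no phrase contains any.
    for chunk in re.split(r"[^\w\s]+", description):
        words = chunk.split()
        if len(words) >= min_words:
            phrases.append('"' + " ".join(words[:max_words]) + '"')
    return sorted(phrases, key=len, reverse=True)


description = (
    "Last night, my monkey stole my car and drove it to the beach. "
    "Police were baffled!"
)
phrases = candidate_phrases(description)
print(phrases)  # ['"my monkey stole my car and drove it"']
```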

Dead internal website links

If you encounter a website that contains a broken link to another page on the same site and you have some information about the document, like a title or excerpt, you can do a domain search to see if a search engine may link to a working copy. For example, let's assume the title of the page we're looking for is 'Why does my kitten hate me?'. Copy the title, wrap it in double quotes and plug it into a search engine that supports phrase searches, add a space, then append site: followed by the domain. This will tell the search engine to look for results only on that domain. Also see some of the other examples here.
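
If you have the broken link itself, the domain for the search can be taken straight from it. A small Python sketch, assuming a made-up example.com address and hypothetical helper names:

```python
from urllib.parse import urlsplit


def domain_of(url: str) -> str:
    """Return the host name of a URL without a leading 'www.'."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def internal_link_query(broken_url: str, title: str) -> str:
    """Build a phrase search limited to the broken link's own domain."""
    return f'"{title}" site:{domain_of(broken_url)}'


broken = "https://www.example.com/pets/kitten-hate"  # invented link
query = internal_link_query(broken, "Why does my kitten hate me?")
print(query)  # "Why does my kitten hate me?" site:example.com
```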

YouTube videos you know exist but can't find

Because there is a remarkable amount of censorship taking place at YouTube, they will sometimes hide sensitive videos from their search results when you use the search engine provided by YouTube. To get around this, use another search engine to perform a domain search as described in the 'Dead internal website links' section.

Deleted videos

In some cases, such as with a link that points to a removed YouTube video, you may not have any information other than the URL itself, not even a page title. Using a YouTube link as an example, copy the video ID from the link, wrap it in double quotes and plug that into your preferred search engine. You will often find a forum or blog post somewhere that will provide helpful clues, such as the video title or description which you can use to search for a working copy of the video. And the first place to look for deleted YouTube videos is YouTube! You can also search the Internet Archive as well as other video platforms that are more censorship resistant than YouTube, including Dailymotion, BitChute, DTube, LEEKWire and many others.
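
One part of a YouTube link that tends to be unique enough to search for is the video ID. Below is a Python sketch that extracts it from a few common link shapes; the ID used is a made-up placeholder and youtube_video_id is a hypothetical helper, so other link formats may need extra handling.

```python
from urllib.parse import parse_qs, urlsplit


def youtube_video_id(url):
    """Return the video ID from a YouTube link, or None if not found."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.endswith("youtu.be"):
        # Short links carry the ID as the first path segment.
        return parts.path.lstrip("/").split("/")[0] or None
    if host.endswith("youtube.com"):
        if parts.path == "/watch":
            ids = parse_qs(parts.query).get("v")
            return ids[0] if ids else None
        for prefix in ("/embed/", "/v/"):
            if parts.path.startswith(prefix):
                return parts.path[len(prefix):].split("/")[0] or None
    return None


video_id = youtube_video_id("https://www.youtube.com/watch?v=abc123XYZ_-&t=42")
print(f'"{video_id}"')  # "abc123XYZ_-"
```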

Broken links on your own website

I don't know about you, but i have nearly 4,000 links on as of this writing and many of them point to resources which TPTB (The Powers That [shouldn't] Be) would rather you knew nothing about. As such, many of the resources i link to are taken down and so i have to deal with broken links constantly, many of them deleted YouTube videos. If you run WordPress (self-hosted - i don't know about a site) you will find Broken Link Checker by 'ManageWP' in the WordPress plugin repository and its job is to constantly scan your site to look for broken links. While it is not a bug-free plugin (the developer is not at all responsive and doesn't seem to fix anything in a timely manner), it is by far the most comprehensive tool of its type that i'm aware of. There are also many external services you could use whether you run WordPress or not.
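
If you would rather roll your own check, the basic idea fits in a few lines of Python. This is only a sketch and not how Broken Link Checker or any external service works: it sends a HEAD request to each link and reports anything that fails or returns a status of 400 or higher. The function names and the User-Agent string are made up, and the demonstration uses a fake fetcher so it runs without touching the network.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request failed."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-check-sketch/0.1"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, ...


def find_broken(urls, fetch=http_status):
    """Map each broken URL to its status code (None means no response)."""
    broken = {}
    for url in urls:
        status = fetch(url)
        if status is None or status >= 400:
            broken[url] = status
    return broken


# Offline demonstration with canned results instead of real requests.
canned = {"https://example.com/ok": 200, "https://example.com/gone": 404}
report = find_broken(list(canned) + ["https://example.invalid/"], fetch=canned.get)
print(report)
```

Some servers reject HEAD requests or answer them differently than GET, so a link flagged here deserves a manual look before you delete it.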

Alternative Search Engines That Respect Your Privacy

It's time to stop relying on corporations which do not respect our privacy. Here are some search engines that, unlike Google, Bing and Yahoo, have a stronger focus on protecting your privacy.

See the recent changes at the end.

Following are some search engines which are more privacy-centric than those offered by the privacy-hating mega-corporations like Google, Bing and Yahoo. Note that several of those listed here are partially or wholly 'meta' search engines, meaning they do not index (crawl) the web themselves and instead rely either partially or entirely upon third parties for their search results, especially Google and Bing. Although these meta search engines are often referred to as "alternative search engines", they are not true alternatives. They can, however, provide a valuable service by acting as a proxy between you and the third party services, thus insulating you from the privacy risks associated with using the big search companies directly.

If you have any search engines you would like to suggest, please leave a comment (you need not be logged in).

To install the search engines for Firefox, see Firefox Search Engine Cautions, Recommendations.


  • Decentralized: whether the service is controlled by a single entity, such as Google, or distributed among its users, such as YaCy
  • Type: meta: uses 3rd party search indexes, such as Google, to deliver search results
    index: crawls the web and indexes content without relying on 3rd party search engines
    hybrid: a combination of both the above
  • Requires JS / Cookies: whether the web interface requires JavaScript and/or cookies (web storage) in order to function
  • Self Contained: whether the website uses 3rd party resources, such as Google fonts, etc.
  • Client Required: whether you have to install client software in order to use the service
  • Privacy Policy: a link to their privacy policy
  • Privacy / Overall: 1 to 4 star ratings for both the strength of their privacy policy and the overall usability of the search engine


Disconnect

  • Decentralized: no
  • Type: meta
  • Requires JS: no
  • Requires cookies: no

Disconnect apparently pulls results from Yahoo, Bing and DuckDuckGo, however when i tested it, Disconnect forwarded all searches to DuckDuckGo. Their privacy policy is OK, but nothing to brag about. Personally i see no advantage to using Disconnect over any of the other meta search engines listed here.


DuckDuckGo

  • Decentralized: no
  • Type: hybrid
  • Requires JS: yes
  • Requires cookies: no

DuckDuckGo claims to pull its search results from over 400 sources including Wikipedia, Bing, Yahoo and Yandex, as well as from its own crawler, however its primary search results come from Bing and Yandex. The interface is similar to Google. Revenue is generated by displaying ads in the search results, though this can be disabled in the settings, and also by inserting affiliate links for certain e-commerce sites. DuckDuckGo offers a 'lite' version which does not use JavaScript or cookies. Worth watching is their promotional video, Google vs. DuckDuckGo.

If you want to save search settings without storing cookies, you'll find their URL parameters here. You might want to use a browser extension that will redirect your searches and load the parameters automatically.
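
As a rough illustration of what passing settings in the URL looks like, here is a Python sketch that builds a DuckDuckGo search address with extra parameters. The setting names used (kl for region, kp for safe search) are recalled from memory and are assumptions; check them against DuckDuckGo's own parameter list before relying on them. ddg_url is a hypothetical helper.

```python
from urllib.parse import urlencode


def ddg_url(query, settings=None):
    """Build a DuckDuckGo search URL carrying optional settings."""
    params = {"q": query}
    params.update(settings or {})
    return "https://duckduckgo.com/?" + urlencode(params)


# Assumed setting names and values; verify before use.
my_settings = {"kl": "us-en", "kp": "-2"}
url = ddg_url('"a nice country drive"', my_settings)
print(url)
```

Saving an address like this as a bookmark or as a custom search engine entry keeps your preferences without the site having to store a cookie.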


Gigablast

  • Decentralized: no
  • Type: index
  • Requires JS: yes
  • Requires cookies: no
  • Self Contained: yes
  • Client Required: no
  • Privacy Policy: ?
  • Privacy / Overall: ? / 2

Gigablast is a free and open source search engine that maintains its own index. While you can install and run Gigablast on your own server, it appears the source code may be outdated. The search interface offers some useful options, such as selecting the format of the output, several interesting sorting options, time span options and file type options. I couldn't find a privacy policy, but decided to include it anyway since it is open source.

Good Gopher

  • Decentralized: no
  • Type: hybrid?
  • Requires JS: no
  • Requires cookies: no

Good Gopher was apparently developed by Mike Adams, editor of the website, and appears to be unmaintained. As stated in the Good Gopher privacy policy, their search results are censored in that they filter out what they and their users consider to be "corporate propaganda and government disinfo", while simultaneously promoting the undisputed heavyweight king of alt-media propaganda and disinformation, Alex "Bullhorn" Jones of InfoWars. It is unclear whether Good Gopher indexes the web or is a hybrid. The core of their privacy policy consists of a few vague paragraphs, the bulk of which has nothing to do with user privacy. Revenue is generated by displaying ads in the search results, though they state they are very particular about who may advertise on the platform.


LookSeek

  • Decentralized: no
  • Type: index
  • Requires JS: no
  • Requires cookies: no

LookSeek appears to be owned by Applied Theory Networks LLC and apparently has been around a while. The software seems to be proprietary, but they do have a decent, clear and brief privacy policy. The search interface is rudimentary, to say the least, and there don't appear to be any configuration options. What is most attractive is that LookSeek operates its own crawler and apparently does not rely on any 3rd party indexes. They also state that they have "no political or social bias".


MetaGer

  • Decentralized: no
  • Type: hybrid?
  • Requires JS: no
  • Requires cookies: no

MetaGer, which has been around for a couple decades, is a free, open source search engine that claims to pull results from up to 50 other search engines. The search interface is plain and there are very few options for tailoring your search. Their privacy policy is lengthy, but strong and clear. Revenue is generated by displaying ads in the search results as well as donations.


Mojeek

  • Decentralized: no
  • Type: index
  • Requires JS: no
  • Requires cookies: no

Mojeek promises to return unbiased results. Though the search interface is plain, they do offer some options to customize how searching works. Mojeek is a UK based company that has a solid privacy policy.


Oscobo

  • Decentralized: no
  • Type: meta
  • Requires JS: no
  • Requires cookies: no

Oscobo has a plain interface with no options other than to search the web, images, etc. Their privacy policy is vague regarding exactly what information they collect. Oscobo pulls its results from Bing and revenue is generated by displaying ads in the search results.


Peekier

  • Decentralized: no
  • Type: meta
  • Requires JS: yes
  • Requires cookies: no

Peekier provides an interesting, though feature-limited interface in the form of zoomable text and thumbnail images of the web pages corresponding to your search results, thus allowing you to browse the results before actually visiting the page. Peekier appears to pull its results from Bing only, which is unfortunate. Their privacy policy is clear, strong and brief.


Qwant

  • Decentralized: no
  • Type: hybrid
  • Requires JS: yes
  • Requires cookies: no

Qwant, based in France, is an interesting search engine and index that claims to provide unbiased results. It is currently a hybrid in that they use crawlers, but they pull some results from Bing until their index is more complete. The interface is pleasant, colorful and easy to use, though there are not many configuration options. Qwant also has a lite version that does not require JavaScript. Their privacy policy is solid. Revenue is generated by displaying ads in the search results.

'The Hated One' posted an interesting audio interview with European Mozilla founder and Qwant CEO Tristan Nitot on his BitChute channel: Qwant CEO and Mozilla founder Tristan Nitot on Microsoft monopoly, surveillance and Internet freedom. Discussion about Qwant starts around the 46:00 mark.


Searx

  • Decentralized: no
  • Type: meta
  • Requires JS: no
  • Requires cookies: no
  • Self Contained: yes
  • Client Required: no
  • Privacy Policy: ?
  • Privacy / Overall: ? / 3

Searx is a free, open source meta search engine which i have found to be the best of its type because of its ability to pull results from a wide array of third party services and the comprehensive options it offers. The Searx interface is clean, highly customizable and intuitive. Anyone can run a Searx instance on their own server (see their GitHub page) if they wish, or use any of the existing Searx instances run by others.


Startpage

  • Decentralized: no
  • Type: meta
  • Requires JS: no
  • Requires cookies: no

Startpage pulls its search results primarily from Google, though they are apparently in the process of making some changes. They have a strong privacy policy and an extensive Q&A page regarding privacy. Revenue is generated by displaying ads in the search results. Startpage also offers an email service called Startmail. Although Startpage uses 1x1 pixel GIF images, i was told they are not used for tracking purposes. [1]


Swisscows

  • Decentralized: no
  • Type: index?
  • Requires JS: yes
  • Requires cookies: no

The Swisscows servers are located in... take a guess... and the company has a solid privacy policy. The search interface is modern and interesting in that they use machine learning to evaluate your search terms in order to provide better results. Swisscows is described as "... the first intelligent answer engine because it is based on semantic information recognition and offers users intuitive help in their search for answers." The results are censored in that violent and pornographic content is filtered with no apparent option to disable this. Swisscows is funded by donations and advertising.


YaCy

  • Decentralized: yes
  • Type: index
  • Requires JS: yes
  • Requires cookies: no
  • Self Contained: yes
  • Client Required: optional
  • Privacy Policy: ?
  • Privacy / Overall: ? / 2

While YaCy doesn't produce a lot of search results since not enough people use it yet, i think it's the most interesting search engine listed here. YaCy is a decentralized, distributed, censorship resistant search engine and index powered by free, open-source software. For those wanting to run their own instance of YaCy, see their home and GitHub pages. This article from Digital Ocean may also be helpful.

My personal choices

I tend to use Startpage as my primary search engine and fall back to DuckDuckGo depending on what kind of content i'm looking for. Though largely devoid of advanced settings, Startpage offers a clean, themed interface and returns reasonably relevant results. Another advantage with Startpage is that it handles phrase searches, so searching for "a nice country drive" (double quotes included) tends to return pages that contain that exact phrase, whereas with search engines that use Bing, such as DuckDuckGo, phrase searches are largely ignored. If you're looking for more obscure content however, engines that use Bing will often return relevant results where Startpage may return few or none at all.

Upcoming search engines

  • Presearch: A decentralized search engine powered by the community. UPDATE, 15-Apr-2020: I'm not sure what to think about Presearch. Their domain,, which uses resources from Google and Cloudflare, forwards to which uses resources from,, flowcdn and a couple of others. Then there's another domain,, which uses resources from Bing. Both domains require JavaScript and both return very different results. When i test search engines the first search term i generally use is "holohoax" because that key word is very politically charged and, given what i know about the Jewish holocaust, i think it's a great key word to test the accuracy of the results. Searching from returns some of the mainstream results including the potential mother of all political disinformation, Wikipedia, however running the same search on returns very different results and what one should discover when searching for that term.

Please leave a comment if you know of any others.

Search engines that didn't make the cut

  • Cliqz - The Cliqz search engine, which is an index and not a proxy, is largely owned by Hubert Burda Media. The company offers a "free" web browser built on Firefox. It appears there are two primary privacy policies which apply to the search engine and both are a wall of text. As is often the case, they begin by telling readers how important your privacy is ("Protecting your privacy is part of our DNA") and then spend the next umpteen paragraphs iterating all the allegedly non-personally identifying data they collect and the 3rd party services they use to process it, which have their own privacy policies. In 2017 Mozilla made the mistake of partnering with Cliqz and suffered significant backlash when it was discovered that everything users typed in their address bar, along with a lot of other data, was being sent to Cliqz, not that this behavior is entirely unique to Cliqz. You can read more about this on HN, as well as a reply from Cliqz, also on HN.
  • Gibiru - I was anxious to try this engine after seeing it listed in NordVPN's article, TOP: Best Private Search Engines in 2019! and so i loaded the website and i liked what they had to say. Unfortunately, Gibiru not only depends on having JavaScript enabled, it depends on having it enabled for Google as well. Fail! It seems Gibiru is little more than a Google front-end and a poor one at that.
  • Search Encrypt - I added Search Encrypt to the list and later removed it because, to put it bluntly, it's f'n garbage. Their selling point is "Search Encrypt encrypts your search terms between your computer and". Well imagine the novelty of that. Or not, because so does every other search engine and website that uses SSL. The site uses cookies and JavaScript by default. Their ToS is a wall of corporate gibberish and their privacy policy is weak. Lastly, Search Encrypt doesn't seem to provide any information about how they obtain their search results, though both the results and interface reek of Google and reading between the lines clearly indicates it is a meta search engine. Search Encrypt is also recommended by NordVPN.
  • Yippy - Like Search Encrypt, Yippy is another typical ethically retarded company with a poor privacy policy looking to attract investors. Yippy uses cookies by default and won't function without JavaScript. This one is also recommended by NordVPN.

Other search engines

There are many general and specialized search engines listed in the Searx wiki.

Lessons learned from the Findx shutdown

The founder of the Findx search engine, Brian Rasmusson, shut down operations and detailed the reasons for doing so in a post titled, Goodbye – Findx is shutting down. I think the post is of significant interest not only to the end user seeking alternatives to the ethically corrupt mega-giants like Google, Bing, Yahoo, etc., but also to developers who have an interest in creating a privacy-centric, censorship resistant search engine index from scratch. Following are some highlights from the post:

Many large websites like LinkedIn, Yelp, Quora, Github, Facebook and others only allow certain specific crawlers like Google and Bing to include their webpages in a search engine index (maybe something for European Commissioner for Competition Margrethe Vestager to look into?) Other sites put their content behind a paywall.


Most advertisers won’t work with you unless you either give them data about your users, so they can effectively target them, or unless you have a lot of users already.

Being a new and independent search engine that was doing the time-consuming work of growing its index from scratch, and being unwilling to compromise on our user’s privacy, Findx was unable to attract such partners.


We could not retain users because our results were not good enough, and search feed providers that could improve our results refused to work with us before we had a large userbase … the chicken and the egg problem.


From forbidding crawlers to index popular and useful websites and refusing to enter into advertising partnerships without large user numbers, to stacking the requirements for search extension behaviour in browsers, the big players actively squash small and independent search providers out of their market.

I think the reasons for the Findx shutdown highlight the need for decentralized, peer-to-peer solutions like YaCy. If we consider the problems Findx faced with the data harvesting, social engineering giants like Google, Facebook and the various CDN networks like Cloudflare, i think they are the sort of problems that can be easily circumvented with crowdsourced solutions. Any website can block whatever search crawler they want and there can be good reasons for doing so, but as Brian points out, there are also stupid and unethical reasons for doing so. With a decentralized P2P solution anyone could run a crawler and this could mitigate a lot of problems, plus force the walled garden giants such as Facebook to have their content scraped.



[1] Startpage uses 1x1 pixel transparent GIF images in the page that serves search results. I had assumed these were tracking pixels and originally stated so in the notes above, however a representative from Startpage contacted me and explained that i was incorrect. Following is a Q&A from a couple of emails i exchanged with them:

Startpage: BTW StartPage/Ixquick do *not* use tracking images. What you noted are non-tracking clear GIFs. Here's a KB article about that.

Me: regarding the 1x1 gif images, i don't understand how an image can be used to prevent a 3rd party from setting a cookie - can you explain?

Startpage: We have a proxy service that lets you view a result anonymously (by clicking `Proxy` near a result). When you view a webpage this way, our servers load the page on your behalf, and then provide the content to you. That way the website you are viewing won't see you. Their website content is served through our domain. Webpages have many ways to set cookies - through Javascript and otherwise. When we proxy the webpage on your behalf, we take many steps to prevent them from doing so. (If they did successfully set a cookie, the cookie would be stored on our domain.) To add extra protection, we then display this extra 1x1 image from our domain that includes cookie headers to *clear* any such cookies. That way, if any external website you viewed through our proxy manages to set a cookie on our proxy's domain, we immediately clear that cookie.

Me: why several 1x1 images are used - why not just 1?

Startpage: It is simpler to offer a different image for each different aggregate count we are keeping.

Me: why do the file names appear to contain a UIN that changes with every search apparently?

Startpage: There is no identifier. Rather, there is something called an "anticache" parameter that has a random number. This prevents the image from being "cached" by the browser - as browser caching would prevent the loading - hence would prevent the aggregate counts from being correct.

Me: why are these clear gif's are not loaded when 0 results are returned?

Startpage: A different part of the code is used when there are no results, so it might not include the same aggregate counts.

Recent changes

  • removed unnecessary footnotes
  • added section 'Other search engines'
  • added link to 'The Search Engine Map' website
  • minor edits

New tut: Firefox Search Engine Cautions and Recommendations

A new tutorial has been published titled Firefox Search Engine Cautions and Recommendations which covers the risks to your privacy when using any of the major search engines in general, but specifically when using the default search engine plugins that are packaged with the Firefox web browser, though this problem is certainly not limited to Firefox. I also cover how to circumvent the risks to your privacy when using the default Firefox search engine plugins, as well as make suggestions for alternative search engines.

I have to say that i'm becoming more and more disillusioned with the multi-million dollar Mozilla corporation and its flagship product, Firefox. Firefox was never a great web browser in my opinion, but it is/was appealing to many because of how completely customizable it is. In its earlier days it was just a little slow and buggy, but more recently Mozilla is making highly unethical choices with regard to the privacy-hating corporations they willingly partner with, and how these partnerships have manifested and have been monetized in Firefox is a result of utter stupidity and greed in my opinion. I stuck with Firefox all these years because it has always been one of the most hackable browsers out there, but these days i stick with it primarily because i'm not (yet) able to reproduce the functionality i have added to it via add-ons with any other browser, and Chrome is out of the question, much less Google's spyware version of it.

It's sad and frustrating that a company that produced a decent, super-highly customizable browser for a niche market has lost its way and turned its back on the very market it once served by deciding to become a Google Chrome clone in order to appeal to the masses.

Screw you Mozilla.

But let's end on a lighter note, shall we? Here, have a look.