See the recent changes at the end.
Unlike meta search engines such as DuckDuckGo, Startpage, etc., which rely either partially or entirely upon third parties for their results (primarily Bing and Google), all search engines listed here maintain their own indexes, meaning they actively crawl the web in search of new and updated content to add to their catalogs. A few are hybrids, meaning they rely partially upon a 3rd party engine.
Although meta search engines are often referred to as alternative search engines, they are not true alternatives since they are subject to the same censorship/de-ranking practices as the company upon which they rely. Such search engine companies are really proxies: they may provide a valuable service by insulating you from privacy-intrusive third party services, but this is not always the case.
If you are going to use a meta search engine which relies upon a 3rd party, those which depend on Microsoft's Bing will generally return better results than those which rely on Google, especially when searching for sensitive and censored information.
If you have any indexing search engines you would like to suggest, leave a comment (you need not be logged in). To install search engine plugins for Firefox, see Firefox Search Engine Cautions, Recommendations.
- Decentralized: (yes/no) whether or not the service depends upon centralized servers or is distributed among its users, such as YaCy
- Type: (index/hybrid) indexing search engines crawl the web and index content without relying on a 3rd party, whereas hybrid search engines are a combination of both meta and index
- Self Contained: (yes/no) whether or not the website uses 3rd party resources, such as Google fonts, etc.
- Client Required: (yes/no) whether or not you have to install client software in order to use the service
- License: (proprietary/<license type>) whether the source code is available and, if so, the license type
|search page||no||hybrid||JS: no 1
Cookies: no 2
Brave Search is in the process of building its own index. Until that is complete, it also pulls results from 3rd parties, though it is unclear which 3rd parties are queried.
The search interface is attractive and intuitive. Unfortunately there are few options for tailoring the search results or the interface, though some of the more important ones are in place, including regional and date search options.
|search page||no||index||JS: yes
|yes||no||Apache License 2.0||?|
Gigablast is a free and open source search engine that maintains its own index.
The search interface offers some useful options, such as selecting the format of the output, several interesting sorting options, time span options, file type options and plenty of advanced syntax options.
You can install and run Gigablast on your own server.
|search page||no||index||JS: no
Good Gopher was apparently developed by Mike Adams, editor of the NaturalNews.com website, and appears to be unmaintained.
Revenue is generated by displaying ads in the search results, though they state they are very particular about who may advertise on the platform.
|search page||no||index||JS: no
The search interface is rudimentary, to say the least, and there don't appear to be any configuration options.
LookSeek states they have "no political or social bias".
|search page||no||index||JS: no
Marginalia Search is a very interesting, open source, niche search engine which describes itself as "an independent DIY search engine that focuses on non-commercial content, and attempts to show you sites you perhaps weren't aware of in favor of the sort of sites you probably already knew existed".
One very useful aspect of Marginalia Search is that it allows you to choose the ranking algorithm, which compiles the search results in different ways, such as by focusing on blogs and personal websites, academic sites, popular sites, etc.
|search page||no||index||JS: no
Cookies: no 2
Mojeek is a UK-based company which operates its own crawler and promises to return unbiased results. I think Mojeek is the most usable search engine listed here and future enhancements are in the works.
The search interface is clean and they offer several options to customize how searching works and how the interface looks. One can also configure how many search results per domain are returned and if more than that number are available, Mojeek adds a link under the results which will open a new page with those results when clicked. If you enter a domain as the search term, Mojeek offers the option to search within that domain.
Mojeek offers advanced search options which add some useful parameters to configure what results are returned.
The engine supports some search operators, including since:, which is similar to the date: operator used by Google.
Q: Are your results provided entirely by your crawler, or are they supplemented by any 3rd parties, such as Bing, etc.?
A: Our organic search results are provided entirely by our own crawler; Mojeek is dependent upon no other search engines when it comes to the act of providing organic search results.
Q: Can you give me any details/statistics regarding your index, such as its size or any other information you care to share.
A: Currently we have over 4.7-billion pages indexed and we are growing at a rate of about 5-million pages a day.
Q: If you care to comment, i'd be interested in knowing the challenges Mojeek faces in light of 'big tech' - i understand that the Google's of the world are making it very difficult for the little guy to actually crawl the www.
A: A relevant issue to this would be that we follow the rules of robots.txt, and every now and again this means we come across sites that are set up with Google and/or Bing crawlers okayed in their robots.txt file but often not other crawlers. As we are a good actor this is normally a case of contacting the site owner and getting it changed, but this can be quite a lot of work for a small search engine such as ourselves.
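The robots.txt situation described in the answer above can be illustrated with a short sketch. The robots.txt content, the `SmallEngineBot` user agent and the example.com URL below are all hypothetical and are not taken from any real site. The sketch uses Python's standard `urllib.robotparser` to show how a file that names only the Google and Bing crawlers locks out every other well-behaved crawler:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: only Google's and Bing's crawlers are allowed,
# every other crawler is told to stay out.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: bingbot
Allow: /

User-agent: *
Disallow: /
"""


def crawl_permissions(robots_txt, user_agents, url):
    """Return a dict mapping each user agent to whether it may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in user_agents}


# "SmallEngineBot" is a made-up name standing in for any independent crawler.
result = crawl_permissions(
    ROBOTS_TXT,
    ["Googlebot", "bingbot", "SmallEngineBot"],
    "https://example.com/some/page",
)

for agent, allowed in result.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

A crawler that honors robots.txt, as Mojeek says it does, would have to skip such a site entirely until the site owner adds a rule for it.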
|search page||no||index||JS: yes
|search page||no||index||JS: no
Right Dao is a U.S.-based company.
The search interface is bare and there are no options other than the ability to perform an advanced search. There are only two search scopes: web and news.
Right Dao searches seem to be fairly comprehensive and so this search engine is a solid choice when looking for politically sensitive information that Google and others censor. While the engine accepts phrase searches, that functionality seems to be very broken.
|search page||no||index||JS: no
Wiby is an interesting, open-source search engine which is building an index of personal, rather than corporate, websites. The interface is very plain and there was only one option in the settings, but it was designed to work well with older hardware.
|search page||yes||index||JS: yes
While YaCy doesn't produce a lot of search results since not enough people use it yet, i think it's one of the more interesting search engines listed here.
YaCy is a decentralized, distributed, censorship resistant search engine and index powered by free, open-source software. For those wanting to run their own instance of YaCy, see their home and GitHub pages. This article from Digital Ocean may also be helpful.
- Refusing to accept cookies may result in settings not being saved.
Upcoming search engines
Alexandria is a very new, open-source search engine with its own index, though it's currently built using a 3rd party. The first version of the source code appeared on GitHub in late 2021. The index is very small at the moment and therefore the service isn't really useful yet.
The interface is sparse and there are currently no options for customizing anything, though there are plans to improve the service.
Alexandria is worth keeping an eye on.
I contacted Alexandria in April of 2022 with some questions. Following is our exchange:
Q: what are your values regarding user privacy?
A: We care a lot about user privacy and plan to let users decide how much they want to share. We run Alexandria.org as a non-profit so we have no incentive to store any info other than to make the search results better.
Q: i see that you have a dependency on rsms.me - depending on 3rd parties is always a privacy and security concern and i think it is often unnecessary - it looks like it's only css that's being imported at the moment, but do you plan on adding any other 3rd party dependencies?
A: Yes we use the Inter font which is open source, we just think it is a nice looking font. We generally have a high threshold for using a 3rd party dependencies but I think it is impossible to build everything ourselves so if there are things other people are better at than us and it is not in our core mission to build it we will use third party solutions. For example we depend on Hetzner for servers, we depend on commoncrawl for high volume scraping. But it's quite likely that we remove that dependency when we redesign the website next time.
Q: what are the long-term goals for Alexandria?
A: The long terms goal is to make knowledge as accessible as possible to as many people as possible. We want to give the users of alexandria.org info that are in their best interest without having to think about advertisers or other third parties.
Q: will you offer unbiased results?
A: Our bias should be to show the results that are likely to be the most useful for users, so that is what we are aiming for.
Q: do you respect robots.txt? personally i'm fine with it if you do not since it seems Big Tech is making it difficult for the little guy to compete in this market
A: Our index is primarily built with data from Common Crawl. But when we do crawling our self we respect robots.txt. Our main problem with scraping is not robots.txt, but that many big/valuable sources of information are behind cloudflare and similar services or otherwise closed to scarping.
Q: how do you plan to finance the project?
A: In the long term we hope to be able to finance it with donations.
Q: what is the current size of your index roughly (pages) and at what rate is it growing?
A: Right now we are just using a very small index while rebuilding big parts of the system. The current index is around 100 million urls. Pretty soon we plan to have 10 billion urls indexed.
Q: what search operators will you/do you support (site:, title:, date: etc.)?
A: None right now. The first one we will implement is site: since it is quite simple.
Q: because the code is available, will anyone be able to run Alexandria on their own server and how will that work? will each instance be independent, or might the indexes be shared across all servers?
A: Our index is not open source at the moment. So anyone who want's to create their own search engine will have to create their own index by crawling the web themselves or downloading data from common crawl or similar.
Presearch is (currently) yet another meta search engine which is ultimately powered by Big Tech in that it relies on multiple corporate giants for its search results.
Presearch is currently centralized, though decentralization is a stated goal. In the future Presearch is to be powered largely or entirely by the community in that anyone can run a node and help build an index with content curated by users.
Presearch uses code from several 3rd parties including bootstrapcdn.com, coinmarketcap.com, cloudfront.net and hcaptcha.com. Such dependencies are often unnecessary, resulting in bloated and potentially insecure platforms which may not be privacy friendly.
Presearch incorporates "PRE" tokens, yet another form of digital currency apparently used for a variety of purposes, one of them being to incentivize the growth of infrastructure and another being to ensure the integrity of the platform. I'm not convinced this is the right path, but then i don't fully understand it either.
I think Presearch has potential, but the realization of its goals of decentralization and the building of its own index need to be met before it becomes a viable service.
De-listed search engines
DuckDuckGo has openly admitted to censoring and de-ranking search results as well as working with Microsoft's Bing in order to influence their results (DuckDuckGo relies heavily on Bing). In one instance they blacklisted voat.co, a former free speech social platform, and on March 10, 2022, DuckDuckGo's CEO, Gabriel Weinberg, tweeted the following:
Like so many others I am sickened by Russia’s invasion of Ukraine and the gigantic humanitarian crisis it continues to create. #StandWithUkraine️ At DuckDuckGo, we've been rolling out search updates that down-rank sites associated with Russian disinformation.
Weinberg apparently had no problem when the U.S. invaded Iraq, Syria, Libya, etc., nor any problem with Black Lives Matter and Antifa terrorists burning and looting cities throughout the U.S., but he suddenly developed a selective crises of conscious when Russia invades Ukraine, which happens to be full of U.S. supported terrorists.
DuckDuckGo also admitted to influencing Microsoft's Bing search results according to a New York Times article:
DuckDuckGo said it "regularly" flagged problematic search terms with Bing so they could be addressed.
DuckDuckGo continues its race to the bottom. From an April 15, 2022, TorrentFreak article:
Privacy-centered search engine DuckDuckGo has completely removed the search results for many popular pirates sites including The Pirate Bay, 1337x, and Fmovies. Several YouTube ripping services have disappeared, too and even the homepage of the open-source software youtube-mp3 is unfindable.
On or around 25 May, 2022, it was discovered that DuckDuckGo was allowing tracking by Microsoft:
DuckDuckGo's founder Gabriel Weinberg has admitted to the company's agreement with Microsoft for allowing them to track the user's activity. He further stated that they are taking to Microsoft to change their agreement clause for users' confidentiality.
The trouble with DuckDuckGo began much earlier with its founder, Gabriel Weinberg:
DDG's founder (Gabriel Weinberg) has a history of privacy abuse, starting with his founding of Names DB, a surveillance capitalist service designed to coerce naive users to submit sensitive information about their friends. (2006)
#UkraineRussiaWar In accordance with the EU sanctions, we have removed the Russian state media RT and Sputnik from our results today. The neutral web should not be used for war propaganda.
For more information see:
- Search Engines - which one to choose? (search for "Qwant")
- Qwant admits censorship
As of somewhere around 2018 or 2019, Startpage was partially bought out by Privacy One Group/System1 which appears to be a data collection/advertising company. Source: Software Removal | Startpage.com
Other search engines
The Search Engine Party website by Andreas is well worth visiting. He has done an excellent job of compiling a large list of search engines and accompanying data. Also see the 'A look at search engines with their own indexes' page by Rohan Kumar, who did an excellent job of compiling a list of engines that maintain their own index, though note that privacy was not considered.
Reader suggested search engines that didn't make the cut
The Cliqz search engine, which is an index and not a proxy, is largely owned by Hubert Burda Media. The company offers a "free" web browser built on Firefox.
It appears there are two primary privacy policies which apply to the search engine and both are a wall of text. As is often the case, they begin by telling readers how important your privacy is ("Protecting your privacy is part of our DNA") and then spend the next umpteen paragraphs iterating all the allegedly non-personally identifying data they collect and the 3rd party services they use to process it, which then have their own privacy policies.
In 2017 the morons at Mozilla corporate made the mistake of partnering with Cliqz and suffered significant backlash when it was discovered that everything users typed in their address bar was being sent to Cliqz. You can read more about this on HN, as well as a reply from Cliqz, also on HN.
Lastly, Search Encrypt doesn't seem to provide any information about how they obtain their search results, though both the results and interface reek of Google and reading between the lines clearly indicates it is a meta search engine.
Search Encrypt was also recommended by NordVPN who seems happy to promote such garbage.
Evaluating search engines
There are several tests that you can perform in order to determine the viability of a search engine. To get a sense of whether the results are biased, i often search for highly controversial subjects such as "holocaust revisionism". If you preform such a search using Google, Bing or DuckDuckGo, with or without quoting it, most or all of the first results link only to mainstream sources which attempt to debunk the subject rather than provide information regarding it. If you perform the same query using Mojeek however, the difference quite dramatic. Rohan Kumar also offers several great tips for evaluating search engines in his article, A look at search engines with their own indexes:
- "vim", "emacs", "neovim", and "nvimrc": Search engines with relevant results for "nvimrc" typically have a big index. Finding relevant results for the text editors "vim" and "emacs" instead of other topics that share the name is a challenging task.
- "vim cleaner": should return results related to a line of cleaning products rather than the Correct Text Editor.
- "Seirdy": My site is relatively low-traffic, but my nickname is pretty unique and visible on several of the highest-traffic sites out there.
- "Project London": a small movie made with volunteers and FLOSS without much advertising. If links related to the movie show up, the engine’s really good.
- "oppenheimer": a name that could refer to many things. Without context, it should refer to the physicist who worked on the atomic bomb in Los Alamos. Other historical queries: "magna carta" (intermediate), "the prince" (very hard).
Lessons learned from the Findx shutdown
The founder of the Findx search engine, Brian Rasmusson, shut down operations and detailed the reasons for doing so in a post titled, Goodbye – Findx is shutting down. I think the post is of significant interest not only to the end user seeking alternatives to the ethically corrupt mega-giants like Google, Bing, Yahoo, etc., but also to developers who have an interest in creating a privacy-centric, censorship resistant search engine index from scratch. Following are some highlights from the post:
Many large websites like LinkedIn, Yelp, Quora, Github, Facebook and others only allow certain specific crawlers like Google and Bing to include their webpages in a search engine index (maybe something for European Commissioner for Competition Margrethe Vestager to look into?) Other sites put their content behind a paywall. [...]
Most advertisers won’t work with you unless you either give them data about your users, so they can effectively target them, or unless you have a lot of users already. Being a new and independent search engine that was doing the time-consuming work of growing its index from scratch, and being unwilling to compromise on our user’s privacy, Findx was unable to attract such partners. [...]
We could not retain users because our results were not good enough, and search feed providers that could improve our results refused to work with us before we had a large userbase … the chicken and the egg problem. [...]
From forbidding crawlers to index popular and useful websites and refusing to enter into advertising partnerships without large user numbers, to stacking the requirements for search extension behaviour in browsers, the big players actively squash small and independent search providers out of their market.
I think the reasons for the Findx shutdown highlight the need for decentralized, peer-to-peer solutions like YaCy. If we consider the problems Findx faced with the data harvesting, social engineering giants like Google, Facebook and the various CDN networks like Cloudflare, i think they are the sort of problems that can be easily circumvented with crowdsourced solutions. Any website can block whatever search crawler they want and there can be good reasons for doing so, but as Brian points out, there are also stupid and unethical reasons for doing so. With a decentralized P2P solution anyone could run a crawler and this could mitigate a lot of problems, plus force the walled garden giants such as Facebook to have their content scraped.
- 5 Best Search Engines That Respect Your Privacy - BestVPN.com
- 12 Private Search Engines that Do Not Track You - Hongkiat
- Alternative Search Engines | Oregon Computer Solutions
- Distributed Search Engines - P2P Foundation
- P2P Search as an Alternative to Google: Recapturing network value through decentralized search » The Journal of Peer Production
- Search Engine Party
- Search Engines - which one to choose?
- The Search Engine Map
- added "License" table columns
- minor edit
- added Marginalia Search
- added more info about DuckDuckGo
- added Wiby
- de-listed Qwant
- minor edits