Somewhere around 2002 i sold a PC to a very nice, older fella who said he had worked for the government either directly or as a contractor. I don’t recall which and he didn't state what department he worked for. He said he had a security clearance and, as i recall, it was a crypto clearance. He left me with the strong impression that he wasn't going to provide a lot of detail as to what exactly he did, however i had no reason to disbelieve anything he said since he seemed genuine and very matter-of-fact. Our time together was short because he had to be somewhere, but we chatted a while and he touched upon some very interesting topics that i wanted to know more about and so i suggested we continue our conversation through encrypted email. He looked at me and responded with, "Encryption is useless." Those words have stuck with me ever since.
Obviously encryption is not useless, but i suspect what he meant was that the "intelligence" community has the ability to break possibly any encryption that existed at the time. While i was somewhat skeptical about his statement, that skepticism has since evaporated. First of all we have to consider the computing power that the intelligence communities have access to. Let's assume that you're encrypting an email using a modern encryption algorithm along with a very long and secure passphrase, and let's further assume that it would take roughly 10,000 years for the average computer to break it. Would you feel confident using such encryption? Well, what happens if that code-breaking computer is 100,000 times more powerful than your PC? And what if you chain together 100 of those computers? Assuming the work divides evenly among them, that is a ten-million-fold speedup, which cuts 10,000 years down to roughly nine hours. Add more hardware and it could be minutes or seconds. Does the NSA not have access to computers that are orders of magnitude more powerful than anything in the public sphere? And what might they have that we don't know about? What about quantum computers? Without the ability to know what the enemy possesses, one must assume that no encryption is safe.
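To make the arithmetic in that scenario concrete, here is a minimal sketch. It assumes the attack scales perfectly linearly with computing power, which is an idealization, and the function name and figures are only the illustrative ones from the paragraph above, not measurements of any real system.

```python
# Back-of-the-envelope estimate: how long does a brute-force attack take
# if the attacker has faster machines and more of them?
# Assumption: perfect linear scaling (real attacks rarely parallelize this cleanly).

HOURS_PER_YEAR = 365.25 * 24  # 8766 hours


def time_to_break_hours(baseline_years, speedup_per_machine, machine_count):
    """Return the estimated attack time in hours.

    baseline_years      -- time an average computer would need
    speedup_per_machine -- how many times faster each attacking machine is
    machine_count       -- number of attacking machines working in parallel
    """
    total_speedup = speedup_per_machine * machine_count
    return baseline_years * HOURS_PER_YEAR / total_speedup


# Figures from the text: 10,000 years, 100,000x faster, 100 machines.
hours = time_to_break_hours(10_000, 100_000, 100)
print(f"{hours:.1f} hours")  # prints "8.8 hours"
```

The point is not the exact number but how quickly a seemingly absurd time span collapses once the attacker's resources are unknown.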
Whether encryption is useless or not depends upon the threat we want to mitigate. For example, if you wanted to download copyrighted content whilst avoiding having your ISP send you nasty-grams, then encryption is certainly not useless. However given what i have read and heard over the years, i strongly suspect that encryption is not effective if, for example, it is the NSA that decides to target you and i think that multiple statements and documents released by Edward Snowden and Bill Binney strongly suggest this. There is perhaps another possibility here though. What if, as some suspect, Snowden was allowed to leak what he did, sort of as a limited hangout? Personally i think Snowden is genuine, but that doesn't mean that the information in the documents he released wasn't intended to be released. Furthermore, there is certainly classified and compartmentalized technology that Snowden knows absolutely nothing about. What if the U.S. intelligence community wanted to quell a potential uprising by 'we the people'? It is apparently a historical fact that one way to accomplish this is to make people think they are being surveilled which, in turn, compromises their ability to communicate effectively due to self-censorship.
While i think it is smart to assume that everything we say or do over a network, or while in the presence of electronic devices capable of recording us such as smartphones or smart assistants, is being collected, there is a further problem: even if the encrypted data we send and receive is secure today, that data can be stored indefinitely until some time in the future when the encryption can be broken. One may assume that the immediate problem with storing that amount of data is processing it and developing coherent intelligence, however this is seemingly quite doable with advanced technology such as quantum computing and artificial intelligence (AI). Both Binney and Snowden have stated that the massive, ongoing and patently illegal and unconstitutional data collection practices as employed by intelligence communities are not effective in preventing threats because of the wide net cast by the programs, but i'm not sure they considered AI or other advancements in technology, or even know about some of the hardware which may be in play.
If you're a target i don't believe there's any way you can eliminate the risk. I mean in fact i don't think there's anything you can do to stop it. If they're after you they're going to get you one way or the other. I mean there's so many...if they can't get it through the internet, through the tapping of the lines, or anything like that through a commercial means, and they're unsure about you, they can get it by close access means, uh coming in and actually bugging your house or bugging your, um, or putting monitors in your system...in your house or on your computer, they can use your computer video to look back at you, or they can monitor um, within a certain distance the keystrokes you're making on your computer or what you're putting on your computer screen and if that's not enough they can come in through the firewall you think you have but don't and go through your operating system that you think protects you but doesn't and read your uh, encrypted email that you thought was secure but isn't, or, they can simply wait for you to do decrypts if you've done that and pull them off and use your unused CPU while you're on the computer to drain it. It's called active attack. So if you're a target there's virtually nothing you can do. And if they fail in their electronic means they can always send the FBI at you to do a sneak-and-peek and take your photograph or do whatever they want.
New documents released by Edward Snowden show that the NSA and its British equivalent, GCHQ, have cracked VPNs, SSL, and TLS -- the encryption technologies that keep your data secure on the internet. The NSA program, dubbed Bullrun, took 10 years to crack the web's encryption technologies, before finally reaching a breakthrough in 2010 that made "vast amounts" of previously unreadable data accessible. Perhaps more worryingly, the NSA has an ongoing program to place backdoors in commercial products (websites, routers, encryption programs, etc.) to enable easy snooping on encrypted communications. The documents, which contain some choice phrases such as, "work has predominantly been focused this quarter on Google due to new access opportunities being developed," almost completely undermines the very basis of the internet, obliterating the concept of trust online.
The documents outline a three-pronged plan to ensure the NSA can access the bulk of the internet's encrypted traffic: Influencing the development of new encryption standards to introduce weaknesses, using supercomputers to break encryption, and collaborating with ISPs and tech companies to gain backdoor access.
Despite the threats we face we must never be dissuaded from communicating. We must have dialog because without it, as Binney states, society stagnates and self-destructive behavior is one of the results.
video: NSA Whistleblower William Binney The Future of Freedom
video: They're Watching You
video: NSA Whistleblower: Government Collecting Everything You Do
Where did you buy that latest gadget from? Is it fake? Here's why you should be very careful about where you spend your clams.
If you buy stuff from online stores such as Banggood, GearBest, Newegg, or any other of a laundry list of China-based wholesalers, you may be surprised at what is going on behind closed doors with these businesses. I wasn't aware of the scale of this problem until recently when i came across a video about electric motors for the remote-controlled hobby industry while researching parts for a build-it-yourself multi-rotor/drone aircraft. But first, a little of the back-story...
Remember when the big-box stores such as Sears, Grants, and later, Walmart, actually sold quality products? Unless you're in your 50s or older, probably not, but i happen to have been around long enough to be able to observe the massive decline in the quality of the products they sell. You see, it used to be that when you bought a refrigerator, a washer, or a lawn mower from a store like Sears, your kids might have inherited it because it was built to last. I have seen many appliances that were still operating fine after three to five decades of use. Today however, in our "consumer" driven society, it is common to have to replace our appliances, tools and electronics every few years because we live in a world of planned obsolescence where products are specifically designed to fail. Worse, it seems the public at large doesn't mind spending their cash on the same gadget over and over again, as though this is the new normal. Of course planned obsolescence makes perfect monetary sense from a business point of view because it is obviously far more profitable to sell consumers the same product multiple times than it is to design products they never have to replace.
Ignoring the ethical problem of planned obsolescence for a moment, we can realize a much larger problem in the fact that we live and depend upon a precious little planet which holds a finite amount of resources such as coal and oil, both of which are critical in the manufacturing of the widgets we buy, and yet our economic systems are based on an infinite growth model. Obviously this cannot continue because it is simply not possible and yet manufacturers and wholesalers, like Amazon, Banggood and many others, continue to add to the problem by designing and marketing junk which is largely manufactured by cheap Chinese labor. And it gets worse. It isn't just that they sell a lot of junk, they also sell a lot of fake junk that is marketed as the real junk. These Chinese wholesalers are targeting products which are in high demand and hiring China-based companies to clone them using sub-par components and then selling the copies on their international websites for the same price as the originals. Like myself, you may have known that the Chinese are great at replicating things, but what you may not have known, and i certainly didn't, is the scope of the problem. These cheap forgeries are showing up en masse among some of the largest on-line retailers on the web and it gets even funnier, or sadder, depending on how you look at it.
As i wrote earlier, i learned about these shenanigans whilst watching a video of a guy talking about the electric motors that are used in the hobby industry, specifically the multi-rotor aspect of it (so-called "drones" or "quad copters"), but in no way whatsoever is this problem limited to the hobby industry. In the video below the author discloses information about these Chinese wholesalers that was obtained directly from various engineers, some of whom live and work in China. The video picks up at the 26:50 mark where he begins discussing the topic, but if you happen to be an RC enthusiast who has a deep interest in electric motor design, you might want to watch it in its entirety.
Update, 14-Mar-20: In addition to this update, there is an earlier update further down. As for this most recent update, i no longer recommend Waterfox, nor any other fork of Firefox. In the interest of privacy i recommend using only the official release of Firefox as provided by Mozilla. This is absolutely not because i think Firefox is a great piece of work (it isn't), or because i like Mozilla (i don't), but rather because it's the only mainstream and capable web browser that is suitable for the degree of privacy hardening that i subject it to. You can read more about that in one of my Firefox guides. Now, on with the story...
Back in the day, Firefox was sort of a hacker's power browser that fit a niche market. It was probably the most tweakable mainstream web browser on the planet for both geeks and average users alike. Although it is still highly customizable, it has become less so since Mozilla decided to terminate support for so-called "legacy" add-ons and replace them with WebExtensions of the same type as used by Google Chrome. As a matter of fact, Firefox has become a Google Chrome clone as far as i'm concerned and some of us -- a core Firefox audience that liked running something different and something that wasn't 'Googlized' -- didn't want anything to do with Google, much less their Chrome web browser.
In its [not so] slow, steady decline and separation from its core values, Mozilla has dumbed-down Firefox to the point where it is hardly recognizable and changed its add-on API several times, thus forcing developers to rewrite their code in order to comply with yet another new standard. The developer of the much loved Search WP extension had this to say:
I'd love to support Firefox 57 (with all my extensions) but
1) Webextensions are just *too* limited. You simply can't do anything useful with them until somebody adds an API just for you. It already starts with the most basic functionality of SearchWP: there does not seem to be a way to modify the search bar.
2) Mozilla ruthlessly breaking all existing extensions on purpose and removing customization possibilities with every new version of Firefox made me loose trust in the foundation and the browser itself - I'm not willing to spend my spare time on a project that has set a course that goes against everything Firefox once stood for.
And the stupidity continues...
For some time Mozilla has been packaging extensions with Firefox in the form of system add-ons, or "features" as Mozilla calls them. Not only is the option to uninstall these add-ons absent from the user interface, but most people aren't even aware they exist since they're hidden from the Add-Ons panel (if you want to know more about system add-ons and how to remove them, read the article, Firefox Configuration Guide for Privacy Freaks and Performance Buffs).
In its latest burst of stupidity, Mozilla is now installing yet another add-on without consulting users, but this time, to their undeserved credit, they have made it removable apparently. 'Looking Glass' appears to be some kind of metrics collection add-on disguised as an augmented reality game created by the PUG Experience Group, whoever the hell they are, and it is part of a series of "Shield Studies" conducted by Mozilla. To see what studies Mozilla has foisted upon you that you didn't agree to, enter about:studies in the address bar and then about:preferences#privacy to opt out. Better yet, stop using the Mozilla version of Firefox altogether.
I no longer suggest using Firefox, at least not the version distributed by Mozilla. If you want Firefox with the privacy disrespecting garbage removed, consider using Waterfox, which is a more privacy-centric, 64-bit fork of Firefox that will apparently continue to support XUL (legacy) extensions in addition to the newer WebExtensions. Some of the features of Waterfox are:
Disabled Encrypted Media Extensions (EME)
Removed data collection
Removed startup profiling
Allow running of all 64-Bit NPAPI plugins
Allow running of unsigned extensions
Removal of Sponsored Tiles on New Tab Page
UPDATE: Mozilla apologizes.
On 18-Dec., after many users complained about the inclusion of the Looking Glass add-on, about which almost nothing was known at the time it was distributed, Mozilla published an apology, moved the add-on to the Mozilla add-on repository and published the source code. The post opened with the following nonsensical statement which raises more questions than it answers:
Over the course of the year Firefox has enjoyed a growing relationship with the Mr. Robot television show and, as part of this relationship, we developed an unpaid collaboration to engage our users and viewers of the show in a new way: Fans could use Firefox to solve a puzzle as part of the alternate reality game (ARG) associated with the show.
Does this sound remotely like anything that should be included in an internet web browser? What is the nature of Mozilla's relationship with Mr. Robot? We already know that Mozilla has a habit of adding unnecessary functionality through its inclusion of 3rd party services for monetary gain and using its relationships with many privacy destroying corporations, such as Google, to monetize necessary functionality, yet they packaged the Looking Glass add-on with Firefox for no reason other than, what? They like Mr. Robot? They wanted to make sure you weren't bored by giving you a game to play? Utter bullshit. And why wasn't the source code published before the add-on was shipped? And how do we know that the published code is identical to the unpublished code?
The rollout did not meet the standards to which we hold ourselves causing concern that was surfaced through our Firefox community.
Yes it did because Mozilla sacrificed its standards long ago. The only reason they published this apology is because enough users complained.
We received feedback regarding the transparency of the rollout and the processes that govern our auto-install mechanism for add-ons. In response we immediately started our internal review, [...]
Good thing most users have no clue about the several system add-ons and "features" that ship with Firefox which are forcefully installed, activated, not easily uninstalled, and are used to collect data. Of course we know that no internal review will be performed to address this glaring privacy issue.
We’re sorry for the confusion and for letting down members of our community. While there was no intention or mechanism to collect or share your data or private information [...]
When one considers exactly what Mozilla defines as "user data" and "private information", one realizes how hollow this misleading claim rings. If they're so concerned about their users, why aren't they concerned about the data that is still being collected by the forcefully installed system add-ons of which users are largely unaware? Why aren't these add-ons removed and placed in the add-on repository?
Unlike meta search engines such as DuckDuckGo, Startpage, etc., which rely either partially or entirely upon third parties for their results (primarily Bing and Google), all search engines listed here maintain their own indexes meaning they actively crawl the web in search of new and updated content to add to their catalogs. A few are hybrids, meaning they rely partially upon a 3rd party engine.
Although meta search engines are often referred to as "alternative" search engines, they are not true alternatives since they are subject to the same censorship/de-ranking practices of the companies upon which they rely. Such search engine companies are really proxies in that they may provide a valuable service by insulating you from privacy intrusive third party services, however this is not always the case. To gain some insight as to the relationships between search engines, see the excellent info-graphic provided by The Search Engine Map website.
If you are going to use a meta search engine which relies upon a 3rd party, those which depend on Microsoft's Bing seem to return generally better results than those which rely upon Google, especially when searching for sensitive and censored information, though i don't expect this to last since Bing and DuckDuckGo are working together to censor Bing's results.
Brave Search maintains its own index. The search interface is attractive and intuitive. Unfortunately there are few options for tailoring the search results or the interface, however some of the more important options are in place, including regional and date search options.
Good Gopher was apparently developed by Mike Adams, editor of the NaturalNews.com website, and appears to be unmaintained.
Revenue is generated by displaying ads in the search results, though they state they are very particular about who may advertise on the platform.
Marginalia Search is a very interesting, open source, niche search engine which describes itself as "an independent DIY search engine that focuses on non-commercial content, and attempts to show you sites you perhaps weren't aware of in favor of the sort of sites you probably already knew existed".
One very useful aspect of Marginalia Search is that it allows you to choose the search result ranking algorithm which compiles the search results in different ways, such as by focusing on blogs and personal websites, academic sites, popular sites, etc.
Mojeek is a UK-based company founded in 2004. The company operates its own crawler and promises to return unbiased results. I think Mojeek is currently the most usable and one of the most promising of all the search engines listed here. Mojeek is very open about how they operate, and development of the search engine and its algorithms is driven in part by soliciting input from users.
The search interface is clean and they offer quite a few options to customize how searching works and how the interface looks. Also available are advanced search options and another tool it calls 'Focus' which can direct search terms to specific domains. One can also configure how many search results per domain are returned and if more than that number are available, Mojeek adds a link under the result which will open a new page with those results when clicked. If you enter a domain as the search term, Mojeek offers the option to search within that domain. The engine also supports some search operators including site: and since:, the latter of which is similar to the date: operator used by Google.
The search interface is bare and there are no options other than the ability to perform an advanced search. There are only two scopes of searches, they being web and news.
Right Dao searches seem to be fairly comprehensive and so this search engine is a solid choice when looking for politically sensitive information that Google and others censor. While the engine accepts phrase searches, that functionality seems to be very broken.
Wiby is an interesting, open-source search engine which is building an index of personal, rather than corporate websites. The interface is very plain and there was only one option in the settings, however it was designed to work well with older hardware.
While YaCy doesn't produce a lot of search results since not enough people use it yet, i think it's one of the more interesting search engines listed here.
YaCy is a decentralized, distributed, censorship resistant search engine and index powered by free, open-source software. For those wanting to run their own instance of YaCy, see their home and GitHub pages. This article from Digital Ocean may also be helpful.
Refusing to accept cookies may result in settings not being saved.
Alexandria is a very new, open-source search engine with its own index, though it's currently built using a 3rd party. The first version of the source code appeared on GitHub in late 2021. The index is very small at the moment and therefore the service isn't really useful yet.
The interface is sparse and there are currently no options for customizing anything, however there are plans to improve the service.
Alexandria is worth keeping an eye on.
I contacted Alexandria in April of 2022 with some questions. Following is our exchange:
Q: what are your values regarding user privacy? A: We care a lot about user privacy and plan to let users decide how much they want to share. We run Alexandria.org as a non-profit so we have no incentive to store any info other than to make the search results better.
Q: i see that you have a dependency on rsms.me - depending on 3rd parties is always a privacy and security concern and i think it is often unnecessary - it looks like it's only css that's being imported at the moment, but do you plan on adding any other 3rd party dependencies? A: Yes we use the Inter font which is open source, we just think it is a nice looking font. We generally have a high threshold for using a 3rd party dependencies but I think it is impossible to build everything ourselves so if there are things other people are better at than us and it is not in our core mission to build it we will use third party solutions. For example we depend on Hetzner for servers, we depend on commoncrawl for high volume scraping. But it's quite likely that we remove that dependency when we redesign the website next time.
Q: what are the long-term goals for Alexandria? A: The long terms goal is to make knowledge as accessible as possible to as many people as possible. We want to give the users of alexandria.org info that are in their best interest without having to think about advertisers or other third parties.
Q: will you offer unbiased results? A: Our bias should be to show the results that are likely to be the most useful for users, so that is what we are aiming for.
Q: do you respect robots.txt? personally i'm fine with it if you do not since it seems Big Tech is making it difficult for the little guy to compete in this market A: Our index is primarily built with data from Common Crawl. But when we do crawling our self we respect robots.txt. Our main problem with scraping is not robots.txt, but that many big/valuable sources of information are behind cloudflare and similar services or otherwise closed to scarping.
Q: how do you plan to finance the project? A: In the long term we hope to be able to finance it with donations.
Q: what is the current size of your index roughly (pages) and at what rate is it growing? A: Right now we are just using a very small index while rebuilding big parts of the system. The current index is around 100 million urls. Pretty soon we plan to have 10 billion urls indexed.
Q: what search operators will you/do you support (site:, title:, date: etc.)? A: None right now. The first one we will implement is site: since it is quite simple.
Q: because the code is available, will anyone be able to run Alexandria on their own server and how will that work? will each instance be independent, or might the indexes be shared across all servers? A: Our index is not open source at the moment. So anyone who want's to create their own search engine will have to create their own index by crawling the web themselves or downloading data from common crawl or similar.
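For readers unfamiliar with the robots.txt question raised in the exchange above, robots.txt is the file a website uses to tell crawlers which paths they may fetch. The following is only a rough sketch of how a well-behaved crawler might check it, using Python's standard urllib.robotparser module. It is not Alexandria's code, and the bot name ExampleBot, the example.com URLs and the rules themselves are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. A real crawler would download this
# from https://<site>/robots.txt before fetching anything else.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Parse robots.txt text held in memory and return a ready parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a made-up crawler name used only for this sketch.
allowed_public = parser.can_fetch("ExampleBot", "https://example.com/public/page.html")
allowed_private = parser.can_fetch("ExampleBot", "https://example.com/private/page.html")

print(allowed_public, allowed_private)
```

Note that robots.txt is purely advisory: nothing technically stops a crawler from ignoring it, which is why sites that really want to keep crawlers out rely on services like Cloudflare instead.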
Presearch is (currently) yet another meta search engine which is ultimately powered by Big Tech in that it relies on multiple corporate giants for its search results.
Presearch appears to be largely centralized at the moment, though decentralization is a stated goal. In the future Presearch is to be powered largely or entirely by the community in that anyone can run a node and help build an index with content curated by users.
Presearch uses code from several 3rd parties including bootstrapcdn.com, coinmarketcap.com, cloudfront.net and hcaptcha.com. Such dependencies are often unnecessary, resulting in bloated and potentially insecure platforms which may not be privacy friendly.
Presearch incorporates "PRE" tokens, yet another form of digital currency which is apparently used for a variety of purposes including incentivizing people to use Presearch, financing the growth of infrastructure and ensuring the integrity of the platform. While people can apparently earn "PRE" when using the search engine, withdrawing their earnings appears to be a convoluted process which is not always successful (see here and here for example).
While Presearch may have potential, the realization of its goals of decentralization and the building of its own index need to be met before it becomes a viable service.
De-listed search engines
DuckDuckGo has openly admitted to censoring and de-ranking search results as well as working with Microsoft's Bing in order to influence their results (DuckDuckGo relies heavily on Bing). In one instance they blacklisted voat.co, a former free speech social platform, and on March 10, 2022, DuckDuckGo's CEO, Gabriel Weinberg, tweeted the following:
Like so many others I am sickened by Russia’s invasion of Ukraine and the gigantic humanitarian crisis it continues to create. #StandWithUkraine️ At DuckDuckGo, we've been rolling out search updates that down-rank sites associated with Russian disinformation.
Weinberg apparently had no problem when the U.S. invaded Iraq, Syria, Libya, etc., nor any problem with Black Lives Matter and Antifa terrorists burning and looting cities throughout the U.S., but he suddenly developed a selective crisis of conscience when Russia invaded Ukraine, which happens to be full of U.S. and Israel sponsored terrorists.
DuckDuckGo also admitted to influencing Microsoft's Bing search results according to a New York Times article:
DuckDuckGo said it "regularly" flagged problematic search terms with Bing so they could be addressed.
DuckDuckGo continues its race to the bottom. From an April 15, 2022, TorrentFreak article:
Privacy-centered search engine DuckDuckGo has completely removed the search results for many popular pirates sites including The Pirate Bay, 1337x, and Fmovies. Several YouTube ripping services have disappeared, too and even the homepage of the open-source software youtube-mp3 is unfindable.
On or around 25 May, 2022, it was discovered that DuckDuckGo was allowing tracking by Microsoft:
DuckDuckGo's founder Gabriel Weinberg has admitted to the company's agreement with Microsoft for allowing them to track the user's activity. He further stated that they are taking to Microsoft to change their agreement clause for users' confidentiality.
DDG's founder (Gabriel Weinberg) has a history of privacy abuse, starting with his founding of Names DB, a surveillance capitalist service designed to coerce naive users to submit sensitive information about their friends. (2006)
#UkraineRussiaWar In accordance with the EU sanctions, we have removed the Russian state media RT and Sputnik from our results today. The neutral web should not be used for war propaganda.
As of somewhere around 2018 or 2019, Startpage was partially bought out by Privacy One Group/System1 which appears to be a data collection/advertising company. Source: Software Removal | Startpage.com
Other search engines
The Search Engine Party website by Andreas is well worth visiting. He has done an excellent job of compiling a large list of search engines and accompanying data. Also see the 'A look at search engines with their own indexes' page by Rohan Kumar who did an excellent job of compiling a list of engines that maintain their own index, however do note that privacy was not considered.
Reader suggested search engines that didn't make the cut
The Cliqz search engine, which is an index and not a proxy, is largely owned by Hubert Burda Media. The company offers a "free" web browser built on Firefox.
It appears there are two primary privacy policies which apply to the search engine and both are a wall of text. As is often the case, they begin by telling readers how important your privacy is ("Protecting your privacy is part of our DNA") and then spend the next umpteen paragraphs iterating all the allegedly non-personally identifying data they collect and the 3rd party services they use to process it, which then have their own privacy policies.
In 2017 the morons at Mozilla corporate made the mistake of partnering with Cliqz and suffered significant backlash when it was discovered that everything users typed in their address bar was being sent to Cliqz. You can read more about this on HN, as well as a reply from Cliqz, also on HN.
Lastly, Search Encrypt doesn't seem to provide any information about how they obtain their search results, though both the results and interface reek of Google and reading between the lines clearly indicates it is a meta search engine.
Evaluating search engines
There are several tests that you can perform in order to determine the viability of a search engine. To get a sense of whether the results are biased, i often search for highly controversial subjects such as "holocaust revisionism". If you perform such a search using Google, Bing or DuckDuckGo, with or without quoting it, most or all of the first results link only to mainstream sources which attempt to debunk the subject rather than provide information regarding it. If you perform the same query using Mojeek however, the difference is quite dramatic. Rohan Kumar also offers several great tips for evaluating search engines in his article, A look at search engines with their own indexes:
"vim", "emacs", "neovim", and "nvimrc": Search engines with relevant results for "nvimrc" typically have a big index. Finding relevant results for the text editors "vim" and "emacs" instead of other topics that share the name is a challenging task.
"vim cleaner": should return results related to a line of cleaning products rather than the Correct Text Editor.
"Seirdy": My site is relatively low-traffic, but my nickname is pretty unique and visible on several of the highest-traffic sites out there.
"Project London": a small movie made with volunteers and FLOSS without much advertising. If links related to the movie show up, the engine’s really good.
"oppenheimer": a name that could refer to many things. Without context, it should refer to the physicist who worked on the atomic bomb in Los Alamos. Other historical queries: "magna carta" (intermediate), "the prince" (very hard).
Lessons learned from the Findx shutdown
The founder of the Findx search engine, Brian Rasmusson, shut down operations and detailed the reasons for doing so in a post titled, Goodbye – Findx is shutting down. I think the post is of significant interest not only to the end user seeking alternatives to the ethically corrupt mega-giants like Google, Bing, Yahoo, etc., but also to developers who have an interest in creating a privacy-centric, censorship resistant search engine index from scratch. Following are some highlights from the post:
Many large websites like LinkedIn, Yelp, Quora, Github, Facebook and others only allow certain specific crawlers like Google and Bing to include their webpages in a search engine index (maybe something for European Commissioner for Competition Margrethe Vestager to look into?) Other sites put their content behind a paywall. [...] Most advertisers won’t work with you unless you either give them data about your users, so they can effectively target them, or unless you have a lot of users already. Being a new and independent search engine that was doing the time-consuming work of growing its index from scratch, and being unwilling to compromise on our user’s privacy, Findx was unable to attract such partners. [...] We could not retain users because our results were not good enough, and search feed providers that could improve our results refused to work with us before we had a large userbase … the chicken and the egg problem. [...] From forbidding crawlers to index popular and useful websites and refusing to enter into advertising partnerships without large user numbers, to stacking the requirements for search extension behaviour in browsers, the big players actively squash small and independent search providers out of their market.
I think the reasons for the Findx shutdown highlight the need for decentralized, peer-to-peer solutions like YaCy. If we consider the problems Findx faced with the data harvesting, social engineering giants like Google, Facebook and the various CDN networks like Cloudflare, i think they are the sort of problems that can be easily circumvented with crowdsourced solutions. Any website can block whatever search crawler they want and there can be good reasons for doing so, but as Brian points out, there are also stupid and unethical reasons for doing so. With a decentralized P2P solution anyone could run a crawler and this could mitigate a lot of problems, plus force the walled garden giants such as Facebook to have their content scraped.