(top picture by @firstname.lastname@example.org)
The following poem, which, according to the poet, @email@example.com, has been “floating around the fediverse”, perfectly describes what’s wrong with the World Wide Web today:
roses are red
violets are blue
in surveillance capitalism
a poem reads you
and shows you ads
for flower shops
and tracks your clicks
and never stops
it cares not about
if privacy's harmed
the money is green
when people are farmed
twitter is cyan
facebook is blue
your friends are the product
and so are you
In surveillance capitalism, companies compete by collecting as much data as possible through surveillance. They sell that data, or they use the data to predict who is most likely to buy a product or service.
Why do companies use surveillance? They have to explain this in their terms of service: you know, the lengthy disclaimer that you accepted without reading. Typically, the terms of service of companies like Google and Facebook claim that collecting user data is necessary for at least two reasons: 1) to improve the service and 2) to show targeted advertisements.
1. To improve the service
An example of a search service that is improved with user data is Google’s query autocompletion service. Google stores the queries of all its users to predict what you are searching for. Autocompletions are an effective feature. They help users formulate better queries, and they save users keystrokes, which is very helpful if you are searching on your mobile phone. The image below shows autocompletions for the prefix “surveillance”.
2. To show targeted advertisements
The main reason for collecting user data is, of course, to show targeted advertisements. Why are targeted advertisements so profitable? This vintage newspaper advertisement for a painkiller is only effective for people who are in pain. In the old days, the only way to target such an advertisement was to choose a newspaper that is read more by people in pain, and to target the pages in that newspaper that are read most by the same people, for instance the health pages. Today, Facebook and Google are able to find the individuals among their users who are likely in pain, and show the advertisement only to them. This makes targeted advertisements much more profitable.
Companies like Facebook and Google would like you to think that they must collect data to improve services and show targeted advertisements. We work at Searsia to debunk this.
Query autocompletions without tracking users
Let’s have a closer look at query autocompletions. Using user data to predict query autocompletions causes several problems. For instance, Google actively promoted Nazi propaganda for innocent queries like “did the h”, and even worse, the first results for those queries would be pages from neo-Nazi sites like Stormfront that deny that the Holocaust happened.
Many small organizations and individuals were damaged by autocompletions. There have been lawsuits in France, Italy, Germany, and Japan, where, for instance, individuals could not get a job because Google suggested offensive completions for their name. Google started to actively filter completions for person names after these lawsuits, but its autocompletions continue to suggest terms that could be viewed as racist, sexist or homophobic.
At Searsia, we do not track user queries for autocompletions. Instead, we use the anchor texts of web pages, that is, the text of hyperlinks. We compared the performance of this approach with autocompletions based on user queries for a general web search task. Our approach that uses hyperlinks is about as effective as the approach that uses user data. The results below show that for short queries, we need one more keystroke to predict the full user query. For long queries, our approach outperforms the approach based on user data.
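To make the idea concrete, here is a minimal sketch in Python of how completions could be predicted from anchor texts alone. It is not Searsia's actual implementation: the names AnchorTextParser, build_index, complete and keystrokes_needed are made up for this illustration, ranking by raw frequency is an assumption, and the keystroke count is just one plausible way to measure how quickly a full query is predicted.

```python
from collections import Counter
from html.parser import HTMLParser


class AnchorTextParser(HTMLParser):
    """Collects the text of every <a> element that has an href attribute."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._buffer = None  # None while we are outside a hyperlink

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href"):
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            # Lowercase and collapse whitespace so equal texts are counted together.
            text = " ".join("".join(self._buffer).lower().split())
            if text:
                self.anchors.append(text)
            self._buffer = None


def build_index(pages):
    """Counts how often each anchor text occurs in a collection of HTML pages."""
    counts = Counter()
    for html in pages:
        parser = AnchorTextParser()
        parser.feed(html)
        parser.close()
        counts.update(parser.anchors)
    return counts


def complete(counts, prefix, k=3):
    """Returns up to k anchor texts starting with prefix, most frequent first."""
    prefix = prefix.lower().lstrip()
    matches = [(text, n) for text, n in counts.items() if text.startswith(prefix)]
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [text for text, _ in matches[:k]]


def keystrokes_needed(counts, query, k=3):
    """Hypothetical metric: characters typed before query shows up in the top k."""
    query = query.lower()
    for n in range(1, len(query) + 1):
        if query in complete(counts, query[:n], k):
            return n
    return len(query)


# Two tiny example pages (invented for this sketch); no user query is ever stored.
pages = [
    '<a href="/a">Surveillance capitalism</a> <a href="/b">surveillance camera</a>',
    '<a href="/c">surveillance capitalism</a> <a>not a link</a>',
]
counts = build_index(pages)
suggestions = complete(counts, "surv")
```

The only input is public web content: the index is built from what page authors wrote in their links, so nothing about who typed which query needs to be logged. A real system would use a prefix tree or a search index instead of scanning every anchor text.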
Targeted advertisements without tracking
Now, you might think: sure, you can improve services without tracking people, but it is impossible to target advertisements without tracking people. Not quite. Remember, Searsia is a search engine. Users tell a search engine what they are looking for. If you are looking for a car, Searsia can show you advertisements for cars. The advertisements are based on the search terms, not on the person.
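As a rough sketch of what targeting on search terms could look like, the Python snippet below picks advertisements by keyword overlap with the current query. The ad inventory, the keyword sets and the function name select_ads are all hypothetical and chosen for illustration; this is not a description of how Searsia selects advertisements.

```python
# Hypothetical ad inventory: advertisement text mapped to its target keywords.
ADS = {
    "Compare prices on used cars": {"car", "cars", "vehicle"},
    "Fresh flowers delivered today": {"flowers", "roses", "bouquet"},
}


def select_ads(query, ads=ADS):
    """Returns ads whose keywords overlap with the query terms, best match first.

    The only input is the query itself: no user id, no cookie, no history.
    """
    terms = set(query.lower().split())
    scored = [(len(terms & keywords), text) for text, keywords in ads.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [text for score, text in scored if score > 0]


car_ads = select_ads("cheap used cars")
no_ads = select_ads("weather tomorrow")
```

Because the function sees nothing but the query, two different people typing the same search get the same advertisements, and a query that matches no keyword gets none.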
Searsia provides free, open-source software for federated search. Federated systems are the answer to global corporations like Twitter, Facebook and Google that profit from tracking their users. In a federated system, no one owns the complete service, so no one is able to profit from tracking all user interactions. The poem that started this post was taken from Mastodon, a federated alternative to Twitter and Facebook. Just as Mastodon is a federated alternative to Twitter, Searsia’s technology might one day be an alternative to Google. If that happens, we know that there is no reason to track users, not even for providing autocompletions or targeted advertisements.