
The mermaid is taking over Google search in Norway

 4 years ago
source link: https://alexskra.com/blog/the-mermaid-is-taking-over-google-search-in-norway/

Recently I’ve started seeing a lot of spam in Google Search in Norwegian. I’m not talking about the odd bad result that ranks poorly. No, I’m talking about a large-scale spam operation that I’ve noticed more and more in recent days.

It’s so bad that I’m convinced that this one spam domain is getting a large cut of all Google Search traffic in Norway. I can search for basically anything and find it in the first few pages with a very high probability.

The domain they use right now is a Danish one, havfruen4220.dk (“havfruen” means “the mermaid”).

You’re greeted with this image if you visit it directly:

Silly image from havfruen4220.dk. It seems like whoever is behind the site has been expecting us.

So I found a site with a silly domain and a silly image that ranks for things on Google Search. Do I have some examples? I sure do.

Example searches

Let me demonstrate the scale of the issue with some example searches.

A search for one of my brands? Yep, the spam site ranks as number 10. A large IT consultant firm in Norway (and some other countries)? Number 11. Local newspaper? Top of page 3.

Let’s try another one. Let’s try “REMA 1000” (the largest grocery store chain in Norway). Sure enough, on top of page 5, we have this:

A Google search result for “REMA 1000” that points to the spammer’s domain.

Let’s try something completely different and random. Maybe something people are wondering about. Let’s try “How often” and let Google pick a thing for us.

Google suggestions for “How often” in Norwegian.

So the first suggested result is “how often should you shower” in Norwegian. Let’s try it.

Google search result for “How often should you shower”.

Sure enough, it’s on the first page.

What about “How to calculate percentage” in Norwegian (Google’s top suggestion when you start typing “Hvordan”)?

“How to calculate percentage” is Google’s top suggestion for “How”.

Of course, on the first page, we get:

The mermaid has content on calculating percentages too.

What about “How often does Apple update iOS” (“hvor ofte oppdaterer apple iOS”)?

Result nine and ten:

The mermaid even has content about iOS.

Let’s take a look at results from just the mermaid:

Just a casual 9.95 million pages, according to Google.

There are a lot of pages. Most of them seem to be relatively new and created in the last few days.

How is the content generated?

Just by looking at the results, the content doesn’t make much sense. It’s scraped from a bunch of different places. Some of it is from Twitter, some is from news sites, and some is from other random websites. Content seems to be combined from multiple sources. The page is served through Cloudflare.

Searching for exact strings reveals that there are more domains used. All of them use the TLD .dk.

The thing is, there is no actual content available to us if we visit the page. The page uses cloaking and probably only shows content if you’re visiting from Google crawler IPs.

If I pretend to be Google by changing my user agent, I just get the silly image I showed you. If I remove the Googlebot user agent, there is one difference: JavaScript is inserted that redirects the user to another page:

var b = "https://havfruen4220.dk/3_5_no_14_-__1627506246/gotodate";
    (/google|yahoo|facebook|vk|mail|alpha|yandex|search|msn|DuckDuckGo|Boardreader|Ask|SlideShare|YouTube|Vimeo|Baidu|AOL|Excite/.test(document.referrer) && location.href.indexOf(".") != -1) && (top.location.href = b);
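To make the condition in that snippet easier to read, here is a minimal sketch that pulls the same test into a standalone function. The regular expression is copied from the snippet above; the function name wouldRedirect, its two parameters and the example page URL are made up for illustration.

```javascript
// Sketch only: the same condition as the injected script, as a pure function.
// `wouldRedirect`, `referrer`, `href` and the example URL are illustrative names.
const REFERRER_PATTERN =
  /google|yahoo|facebook|vk|mail|alpha|yandex|search|msn|DuckDuckGo|Boardreader|Ask|SlideShare|YouTube|Vimeo|Baidu|AOL|Excite/;

function wouldRedirect(referrer, href) {
  // The pattern has no "i" flag, so matching is case-sensitive.
  const cameFromListedSite = REFERRER_PATTERN.test(referrer);
  // Any ordinary URL with a dotted hostname passes this second check.
  const hrefHasDot = href.indexOf(".") !== -1;
  return cameFromListedSite && hrefHasDot;
}

const examplePage = "https://havfruen4220.dk/some-page"; // hypothetical path
console.log(wouldRedirect("https://www.google.com/", examplePage)); // true
console.log(wouldRedirect("", examplePage)); // false: direct visit, empty referrer
```

Since the check depends on document.referrer, a direct visit with an empty referrer would not trigger the redirect, which is consistent with only seeing the silly image when opening the domain directly.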

How do they profit from this?

After the first redirect, that page redirects the user, again, to other scam domains. Some are fake news sites that pretend to be one of the most popular Norwegian news sites; others are basic “want to earn money fast online” sites.

This is one of the pages asking if I’ve ever earned money online and telling me that registration is free right now. After a few “do you want to earn £25k a month” questions, you’re redirected to another domain.

This site is the one the previously shown one redirected me to. Clearly, they want me to purchase bitcoin with them, and that’s a great idea as it’s hitting $500K soon.

Other sites are news sites like this:

The website pretends to be “Dagbladet” (a Norwegian news site).

The fake news article is usually something like “this Norwegian celebrity reveals how he got rich”. And it usually ends with some crypto scheme.

These scams are old, but they usually don’t rank well. I’ve never seen anything like this. It’s currently a top result for nearly anything on Google search in Norway.

How are they ranking so well?

We have all probably heard that Google’s ranking is advanced and that it’s pretty hard to fool it. Someone managed it anyway. I’m not going to pretend I know how they did this, but I think I have some ideas.

First of all, the content looks somewhat decent in Google Search at times, and I’ve clicked it multiple times myself when searching for things. When you click it, it does what it can to block browser navigation so you can’t return to Google. The content is also so clearly a scam that I had to read some of it for fun.

I think that Google uses stats on whether the user continued checking more results for that specific search query to determine if the visited result answered the user.

When the website blocks you from going back, Google might think you found the thing you were looking for and use this as a positive signal. This way, the site ranks even higher. I often forget my exact search query, so I usually don’t search again with that exact phrase if I’m blocked from going back.
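The idea can be illustrated with a toy model. This is only a sketch of the hypothesis above, not a description of how Google actually ranks anything: the function returnRate and the session fields are hypothetical, and the sample sessions are invented for the example.

```javascript
// Toy model only: Google's real ranking signals are not public.
// `returnRate` and the session shape are hypothetical.
function returnRate(sessions) {
  // Each session looks like { clicked: boolean, returnedToResults: boolean }.
  const clicks = sessions.filter((s) => s.clicked);
  if (clicks.length === 0) return 0;
  const returns = clicks.filter((s) => s.returnedToResults).length;
  return returns / clicks.length;
}

// Invented data: an unhelpful page where visitors can go back to the results.
const backButtonWorks = [
  { clicked: true, returnedToResults: true },
  { clicked: true, returnedToResults: true },
  { clicked: true, returnedToResults: true },
  { clicked: true, returnedToResults: false },
];

// The same visitors when the page blocks back navigation: nobody returns.
const backButtonBlocked = backButtonWorks.map((s) => ({
  ...s,
  returnedToResults: false,
}));

console.log(returnRate(backButtonWorks)); // 0.75
console.log(returnRate(backButtonBlocked)); // 0
```

If a low return rate were read as "the user was satisfied", the page that traps its visitors would look like the better result.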

How can Google fix this?

The simple solution would be to test sites regularly from an unknown IP with a common user agent, to check that a site isn’t showing content only to Google while giving real users something completely different. That would stop this.
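A minimal sketch of what such a comparison could look like, assuming the two page texts have already been fetched (once as the crawler, once as an ordinary visitor). The function names, the word-overlap measure and the 0.5 threshold are all assumptions made for this example, not anything Google is known to use.

```javascript
// Sketch only: compares the text a crawler was served with the text a
// regular visitor was served. All names and the threshold are hypothetical.
function wordSet(text) {
  // Split on anything that is not a letter or digit (keeps æ, ø, å intact).
  return new Set(
    text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean)
  );
}

// Jaccard similarity: shared words divided by all distinct words.
function jaccard(textA, textB) {
  const a = wordSet(textA);
  const b = wordSet(textB);
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  for (const word of a) {
    if (b.has(word)) shared += 1;
  }
  return shared / (a.size + b.size - shared);
}

function looksCloaked(crawlerText, visitorText, threshold = 0.5) {
  return jaccard(crawlerText, visitorText) < threshold;
}

// Invented example: the crawler sees an article, the visitor sees no text.
const crawlerText = "Hvor ofte bør du dusje hver dag eller annenhver dag";
console.log(looksCloaked(crawlerText, "")); // true
console.log(looksCloaked(crawlerText, crawlerText)); // false
```

Word overlap is a crude measure; it is used here only because it is easy to follow, and a real check would also need to account for pages that legitimately vary between visits.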

Another thing is that an alarm should probably go off when a new domain takes off like this. The havfruen4220.dk domain is shown for basically anything, so it wouldn’t surprise me if it’s the most shown domain in Google search in Norway right now.
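Such an alarm could be as simple as the sketch below. Everything in it is hypothetical: the function name, the fields, the thresholds and the sample numbers are invented to show the shape of the check, not taken from any real system or measurement.

```javascript
// Sketch only: flag domains that are both young and shown for an unusually
// large share of queries. Field names and thresholds are hypothetical.
function flagSuddenDomains(stats, { maxAgeDays = 30, minQueryShare = 0.05 } = {}) {
  return stats
    .filter((d) => d.ageDays <= maxAgeDays && d.queryShare >= minQueryShare)
    .map((d) => d.domain);
}

// Invented numbers, using reserved example domains.
const sampleStats = [
  { domain: "old-newspaper.example", ageDays: 4000, queryShare: 0.08 },
  { domain: "new-hobby-blog.example", ageDays: 10, queryShare: 0.0001 },
  { domain: "new-everything.example", ageDays: 10, queryShare: 0.2 },
];

console.log(flagSuddenDomains(sampleStats)); // [ 'new-everything.example' ]
```

An established site with a large share and a new site with a tiny share both pass; only the combination of "new" and "shown for nearly everything" is flagged for review.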

How would you make a profit if you could rank for basically everything on Google search?

Thanks for reading this weird and messy blog post. I just wanted to share the weirdness. Have a great day!

