Address 1st-party tracker blocking · Issue #780 · uBlockOrigin/uBlock-issues · G...

source link: https://github.com/uBlockOrigin/uBlock-issues/issues/780

Address 1st-party tracker blocking #780

Closed

aeris opened this issue on 10 Nov 2019 · 109 comments

aeris commented on 10 Nov 2019

edited by gorhill

Hello here!

Since Friday, we have hit a case of 1st-party tracking that seems to be unblockable.

This occurs on https://www.liberation.fr/, which embeds a 1st-party tracker, f7ds.liberation.fr, that points to an ugly tracking provider, Eulerian, via the CNAME liberation.eulerian.net.

This provider clearly states that it provides an unblockable tracker.

[Two screenshots: EJAeTXvWwAAqTPz, EJAwd5wWkAAjmsN]

It seems Criteo is starting to ask the same of its customers, with 1st-party trackers pointing to *.dnsdelegation.io subdomains.

In this case, it seems really difficult to block such trackers with tools like uBlock:

  • the subdomain is mostly random (f7ds.example.org), even if we found some ea.* patterns
  • detection can sometimes be done with CNAME resolution (to *.eulerian.net or *.dnsdelegation.io), but this is difficult to integrate into the browser (those steps are internal to the DNS client resolver)
  • IP filtering is not efficient: the tracker provider can easily change IPs without notifying its customers. A CNAME change is more complex, but the provider can generate a bunch of random subdomains in advance and ask its customers to change the subdomain if blocking gets too high (or proactively trigger a rotation every X days).

Do you have any way to detect and then block such content from the browser?
The only (not so) efficient way I have at the moment is using DNS tools like PiHole to blacklist IP ranges and CNAME resolution patterns. And even this way, it doesn't cover all the possible cases… Even tools like µMatrix seem totally ineffective against such trackers…

Member

uBlock-user commented on 10 Nov 2019

edited

Do not post any filter list issues or issues where website's functionality is broken. We have uAssets issue tracker for that, post there instead.

https://github.com/uBlockOrigin/uBlock-issues#ublock-issues

uBlockOrigin

locked and limited conversation to collaborators

on 10 Nov 2019

uBlockOrigin

unlocked this conversation

on 10 Nov 2019

Member

gorhill commented on 10 Nov 2019

It's a technique used to bypass filters/rules, it's something which needs to be investigated.

Member

uBlock-user commented on 10 Nov 2019

edited

Aren't they lying to the PSL with these first-party domain entries?

Edit: It's an inline script, so it should be possible to defuse it via a scriptlet.

liberation.fr##+js(aopw, EA_data) works.
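
For readers unfamiliar with the scriptlet, aopw is the short name of uBO's abort-on-property-write. The following is only a simplified sketch of the idea, not uBO's actual implementation: a throwing setter is installed on the property so that the inline script aborts when it tries to assign it. The function name `abortOnPropertyWrite` and the `fakeWindow` object (standing in for `window`) are made up for illustration.

```javascript
// Simplified illustration of an "abort on property write" defuser.
// `abortOnPropertyWrite` is a hypothetical name; `target` stands in for `window`.
function abortOnPropertyWrite(target, prop) {
  const token = 'aborted-' + Math.random().toString(36).slice(2);
  Object.defineProperty(target, prop, {
    configurable: true,
    get() { return undefined; },
    set() {
      // Throwing here interrupts the script that tries to assign the property.
      throw new ReferenceError(token);
    },
  });
  return token;
}

// Usage: a plain object plays the role of the page's global object.
const fakeWindow = {};
abortOnPropertyWrite(fakeWindow, 'EA_data');

let aborted = false;
try {
  // This is what the tracker's inline script would attempt to do.
  fakeWindow.EA_data = ['some', 'tracking', 'payload'];
} catch (e) {
  aborted = e instanceof ReferenceError;
}
console.log(aborted); // true
```

The drawback of this approach, as the rest of the thread makes clear, is that it depends on the tracker keeping the same global name on every site.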

Member

uBlock-user commented on 10 Nov 2019

Here's a crude dump of sites using Eulerian Analytics inline-script -- https://publicwww.com/websites/EA_data/

This comment was marked as off-topic.

Author

aeris commented on 11 Nov 2019

New detections:
keyade.com, on rueducommerce.fr
omtrdc.net, on sfr.fr

This comment was marked as off-topic.

Member

gwarser commented on 11 Nov 2019

Wondering if #44 would apply here if implemented.

Member

gorhill commented on 11 Nov 2019

Can't apply, the case given as an example makes use of legitimate subdomains, statics.liberation.fr, medias.liberation.fr.

I am looking at https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/dns/resolve, it can be used to expose the CNAME:

browser.dns.resolve('f7ds.liberation.fr', [ "canonical_name" ]).then(r => { console.log(r); });
Promise { <state>: "pending" }
Object { addresses: (1) […], canonicalName: "atc.eulerian.net", isTRR: false }

I will prototype and evaluate how to optimally use this in uBO with the utmost care.
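
As a rough sketch of how the call above could be wrapped (this is not uBO's code), the helper below returns the canonical name only when it differs from the requested hostname. The names `uncloak` and `fakeResolve` are hypothetical. The resolver is passed in as a parameter so the snippet runs outside Firefox; in an extension holding the "dns" permission it would be `(hostname, flags) => browser.dns.resolve(hostname, flags)`.

```javascript
// Sketch only: `uncloak` and `fakeResolve` are hypothetical names.
// `resolve` must return a promise for a record shaped like the one
// browser.dns.resolve() yields: { addresses, canonicalName, isTRR }.
async function uncloak(hostname, resolve) {
  const record = await resolve(hostname, ['canonical_name']);
  const cname = record && record.canonicalName;
  // Report an alias only when a canonical name exists and differs
  // from the hostname that was asked for.
  return cname && cname !== hostname ? cname : null;
}

// Stand-in resolver that mimics the record shown in the comment above.
const fakeResolve = async (hostname) => ({
  addresses: ['192.0.2.1'], // documentation-range address, not a real one
  canonicalName: hostname === 'f7ds.liberation.fr' ? 'atc.eulerian.net' : hostname,
  isTRR: false,
});

const results = {};
const done = Promise.all([
  uncloak('f7ds.liberation.fr', fakeResolve).then((c) => { results.tracker = c; }),
  uncloak('www.liberation.fr', fakeResolve).then((c) => { results.site = c; }),
]);
done.then(() => console.log(results));
```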

Member

uBlock-user commented on 11 Nov 2019

Will this be applied in uMatrix too ?

Member

gorhill commented on 11 Nov 2019

Member

uBlock-user commented on 11 Nov 2019

edited

You will need to add a new permission named 'dns' in the manifest to use this API - https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/dns and since this is a Firefox-only API, how will you address this in Chromium?

This comment was marked as off-topic.

Member

uBlock-user commented on 11 Nov 2019

I meant how will you fix this in Chromium..

Member

gorhill commented on 11 Nov 2019

Best to assume it can't be fixed on Chromium if it does not support the proper API.

This comment was marked as off-topic.

Either way, when I first read NextDNS' blog post, the timing of it clearly indicated they are fans of uBO. I'm interested in hearing this discussion on the mechanics discussed between you two. We all learn something from whoever has thought about a particular thing before... which is what this whole commit is about :). I've always ignored CNAME records because I didn't think of the trickery they might be used for. It's nice to see people have read the advertising companies' documentation to see what new techniques they're employing.

Member

uBlock-user commented on 19 Dec 2019

edited

Exploiting CNAME is not as new a phenomenon as you might think. washingtonpost, cnn and others have been deploying this for a while and have had their ubiquitous domains blocked by filter lists; the only difference is that a French company made an announcement and gathered all the attention.

Member

gorhill commented on 19 Dec 2019

edited

significant difference performance-wise

Again: uBO caches on its own side, according to its own TTL, which is currently 60 minutes (which I will probably put back at 120 minutes as it was in the first draft of the code). To describe a (potential and which would be contrary to what the MDN doc suggests) double lookup once every 60 minutes as "significant performance-wise" is quite exaggerated. I ran benchmarks for the new code and I found this was a complete non-issue.

The Firefox profiler is free for all to use, so there is no need to speculate if one wants to make the case that the new capability in uBO is causing significant performance issues.

For the opening comment case, https://www.liberation.fr/, I find 25 distinct hostnames, of which 15 were blocked by uBO. This means 10 distinct hostnames will require an extra dns-lookup every 60 minutes assuming MDN documentation is wrong. How can this be construed as a "significant" performance issue?

Note that with DNS-based blocking, there are an extra 15 DNS lookups (not present in uBO) in the above case just so the browser can find out that the network requests need to be blocked. And since Firefox's TTL for DNS results is only two minutes (as per about:networking#dns), these DNS lookups will keep being made far more often than uBO's own (there is no need for uBO's own TTL to be so small; keeping in mind that filter lists update every ~4 days, it would still be perfectly reasonable for me to set uBO's TTL for DNS lookups to 4 or even 6 hours).
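
To make the caching argument concrete, here is a minimal sketch of a TTL cache keyed by hostname, assuming the 60-minute TTL mentioned above. It is not uBO's code: `createTtlCache` is a hypothetical helper, and the clock is injectable only so that the expiry logic can be exercised without waiting.

```javascript
// Sketch of a TTL cache keyed by hostname. `createTtlCache` is a
// hypothetical helper; `now` is injectable to make expiry testable.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map();
  return {
    get(key) {
      const entry = entries.get(key);
      if (entry === undefined) return undefined;
      if (now() - entry.storedAt >= ttlMs) {
        entries.delete(key); // expired: the caller has to look it up again
        return undefined;
      }
      return entry.value;
    },
    set(key, value) {
      entries.set(key, { value, storedAt: now() });
    },
  };
}

let clock = 0; // fake time in milliseconds
const cache = createTtlCache(60 * 60 * 1000, () => clock);
cache.set('f7ds.liberation.fr', 'atc.eulerian.net');

clock = 59 * 60 * 1000;
const beforeExpiry = cache.get('f7ds.liberation.fr'); // still cached

clock = 60 * 60 * 1000;
const afterExpiry = cache.get('f7ds.liberation.fr'); // expired
```

With such a cache and the figures quoted above, the 10 non-blocked hostnames cost at most 10 extra lookups per hour, whereas a resolver-side TTL of two minutes allows up to 30 re-resolutions per hour for each hostname that keeps being requested.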

h1z1 commented on 31 Dec 2019

As this will bite operations people in the ass, it's worth pointing out that the entire thing is BULLSHIT. A browser should not be able to do this. I understand what is trying to be solved here, but DNS records have TTLs for very valid operational reasons. Can't wait for this to affect Amazon or any site that actively changes entire datacenters, randomly, multiple times per day. At the very least, it is a very common practice to drop the TTL before maintenance for failovers. Maybe tech sites will finally realize the cluster fuck brewing.

Member

gorhill commented on 31 Dec 2019

edited

Let's not dramatize.

uBO's TTL is for blocking filter purpose, not for web resource-fetching purpose, so not the same purpose as the browser's own TTL. The worst to expect is that an alias which could need to be blocked through a hit to its cname is not being blocked for an hour or two at most. uBO's block lists update every few days on average, meaning something which could need to be blocked may not be blocked for a few days (assuming a fix was made to the filter list) at most.

But I am just repeating what I said above, so I may resort to again lock this thread if the unwarranted points get made over and over again.

h1z1 commented on 2 Jan

edited

Let's not dramatize.

There's no drama, unless facts are? It wasn't meant solely at you nor uBO. You are doing what you can given the circumstances.

But I am just repeating what I said above, so I may resort to again lock this thread if the unwarranted points get made over and over again.

Then let me be clear - the above was not a personal attack on you nor the project. It was a reflection on the fact that browsers should not be screwing with TTLs, CNAMES, nor anything infrastructure related. If some site is doing asinine things, those sites need to be identified and users warned. It's no different from the myriad of hoops we have to jump through with SSL.. It actually goes against your own philosophy.

The worst to expect is that an alias which could need to be blocked through a hit to its cname is not being blocked for an hour or two at most.

You're assuming a lot of things there but I'll leave that alone. Rest assured that is not the worst case.

Member

gorhill commented on 3 Jan

It was a reflection on the fact that browsers should not be screwing with TTLs, CNAMES, nor anything infrastructure related. If some site is doing asinine things, those sites need to be identified and users warned.

Alright then, never mind -- I thought you were arguing that uBO's TTL for its own specific use (that of blocking purpose) of canonical names was inadequate given your comment "any site that actively changes entire datacenters, randomly, multiple times per day".

h1z1 commented on 3 Jan

Correct. There is, however, a very big difference between why uBO does it and why the browser does. That doesn't change the original point though, which is that the browser is doing something it shouldn't.

uBO makes a DNS lookup only for non-blocked network requests. A DNS lookup is required only for non-blocked request.

.. should still not happen.

Regarding the workaround itself, I intercept DNS locally for similar reasons, the difference there is I use a low TTL and it's on a network level not browser. Makes it fairly obvious when sites do crap like this. Either way it needs to be punted back to Mozilla'n co as not acceptable. What's to stop some dumbasses from exfiltrating internal DNS through the browser?

Every cache, be it disk, web or DNS, is a vector for side-channel attacks. There's nothing that can be done about it. Doesn't mean caches aren't useful, hope you understand the difference. Overriding the policies that 99% of the internet follows because some sites do bad things is like throwing the cache away entirely because it can be abused.

This does also open an oddity wrt dns blacklists since I could spoof a result I know to be blocked and cause the browser to .. well, block it. DNS RBL's are nothing new, in a browser though that is a bit of a yikes.

# cat test.conf 
local=/google.com/
local=/bbc.co.uk/
cname=google.com,blackhole666
cname=www.google.com,blackhole666
cname=bbc.co.uk,blackhole666
cname=www.bbc.co.uk,blackhole666
#
# host  www.bbc.co.uk
www.bbc.co.uk is an alias for blackhole666.
blackhole666 has address 6.6.6.6
blackhole666 mail is handled by 666 6.6.6.6.
#
dnsmasq[23736]: 23 172.16.2.1/53316 query[A] www.bbc.co.uk from 127.0.0.1
dnsmasq[23736]: 23 172.16.2.1/53316 config www.bbc.co.uk is <CNAME>
dnsmasq[23736]: 23 172.16.2.1/53316 config blackhole666 is 6.6.6.6
#
# ip route get 6.6.6.6
local 6.6.6.6 dev lo table local src 6.6.6.6 uid 0 
    cache <local> 
# 

With a filter on blackhole666 it does indeed block www.bbc.co.uk and Google (cnameIgnoreRootDocument==false for the sake of testing).

Member

uBlock-user commented on 9 Jan

.. should still not happen.

That's the only way to properly address this issue and make filtering third-party servers disguising as first-party possible. This is not a question of whether we should do this or not, but rather finding the best possible way of addressing the ongoing situation at hand via uBO.

h1z1 commented on 14 Jan

With that change now landing where are requests for inclusions to be sent? uBO or somewhere upstream?

Member

gorhill commented on 14 Jan

requests for inclusions

What do you mean by "inclusions"?

h1z1 commented on 15 Jan

Sorry I was thinking of cnameIgnoreList: That another list would be needed to track cnames specifically. Guessing more filters will be added to the existing lists though?

Just to clarify I'm not entirely against filter lists, I do have strong reservations about doing this because it makes implementing those lists on a national level, much easier. See piratebay domain takedowns for example. There's still the cache poison side of it but we're at a bit of a stalemate for now.

Member

gorhill commented on 15 Jan

I was thinking of cnameIgnoreList

Not sure this setting will be kept for release, this was a setting I added to allow some control when the first draft of cname-related code was released in the dev build. I lean toward supporting a per-site switch (no UI though) to control whether cname should be revealed or not, as is done in uMatrix dev build.

Guessing more filters will be added to the existing lists though?

Not sure what you have in mind, the idea of uBO enforcing existing filters on un-aliased hostnames is specifically to remove the burden of filter list maintainers to have to worry about whether a hostname is aliased or not. For instance, f7ds.liberation.fr could be re-aliased to blah.liberation.fr and uBO would still catch the connection attempt to eulerian.net without filter list maintainers having to worry about the changed alias.
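
The point about re-aliasing can be illustrated with a small sketch. This is a deliberately simplified model and not how uBO's filtering engine works: `matchesDomain`, `isBlocked` and the `blockedDomains` list are hypothetical, and the canonical name is assumed to have been looked up already.

```javascript
// Simplified model: a request is blocked if either the hostname the page
// asked for or its un-aliased (canonical) name falls under a blocked domain.
// `matchesDomain`, `isBlocked` and `blockedDomains` are hypothetical.
function matchesDomain(hostname, domain) {
  return hostname === domain || hostname.endsWith('.' + domain);
}

function isBlocked(hostname, canonicalName, blockedDomains) {
  const candidates = canonicalName ? [hostname, canonicalName] : [hostname];
  return candidates.some((h) => blockedDomains.some((d) => matchesDomain(h, d)));
}

// One existing filter on the tracker's own domain is enough.
const blockedDomains = ['eulerian.net'];

const currentAlias = isBlocked('f7ds.liberation.fr', 'atc.eulerian.net', blockedDomains);
const rotatedAlias = isBlocked('blah.liberation.fr', 'atc.eulerian.net', blockedDomains);
const legitimate = isBlocked('statics.liberation.fr', null, blockedDomains);

console.log(currentAlias, rotatedAlias, legitimate); // true true false
```

Rotating the alias changes nothing in this model, because the match happens on the canonical name rather than on the first-party subdomain.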

h1z1 commented on 16 Jan

Not sure this setting will be kept for release

Good to know. What of the others?

The two scenarios I had are based on a user's existing security posture. Those are default block or allow, with a preference for default block and general defuse. Anything supporting existing policies would be preferred, which is why I'm entirely against it. I'd go as far as to suggest either another filter flag or abusing an existing one (like 3rdparty as you've done), because there will be one-offs either way (allow and deny), especially if those other options are not formalized yet. Otherwise defuse the fuck out of them like GA.

For instance, f7ds.liberation.fr could be re-aliased to blah.liberation.fr and uBO would still catch the connection attempt to eulerian.net without filter list maintainers having to worry about the changed alias.

Don't know what the relationship is between those entities, they could also rotate the target cname from liberation.fr itself. Or they could go the google route.

reallyuniquename commented on 17 Jan

edited

@gorhill
Does it mean uBlock will try to decloak absolutely every requested domain (be it through cache or not) without relying on any lists? I mean, how many platforms abuse that kind of tracking, a few dozen?

Is there a slight chance this feature will be ported to uBlock Legacy? Although some Firefox forks support webextensions they usually lack required webextensions DNS API.

hawkeye116477 commented on 23 Jan

edited

Although some Firefox forks support webextensions they usually lack required webextensions DNS API.
So is there any Firefox fork other than Waterfox that supports webextensions?
Anyway, I have good news: the DNS API will come in the next version of Waterfox Classic.

remusao commented on 23 Jan

Although some Firefox forks support webextensions they usually lack required webextensions DNS API.
So is there any Firefox fork other than Waterfox that supports webextensions?
Anyway, I have good news: the DNS API will come in the next version of Waterfox Classic.

Cliqz is based on Firefox and supports Webextensions :)

h1z1 commented on 2 Feb

Worth noting that more sites are now doing this, to the point that we're seeing garbage like multiple CNAMEs being abused, e.g.:

  • A? video-edge-ac9a6b.sjc02.hls.ttvnw.net. (55)
    CNAME spade.sci.twitch.tv., CNAME science-edge-external-prod-73889260.us-west-2.elb.amazonaws.com., A 35.165.58.36, A 52.26.104.14, A 44.228.74.162, A 52.34.148.174, A 35.166.240.77, A 52.37.175.109, A 52.34.160.100, A 52.38.76.16 (427)

Spade is blocked in many lists, while video-edge-* and science-edge-* rotate and are not. As mentioned elsewhere video-edge is the same convention twitch uses for real video chunks so blocking it isn't really an option.

At least in Firefox these do not show up in about:dns. The strange thing is they continue to show up in things that should be blocked, e.g. Slashdot:

<script>
if (!window.is_euro_union) {
(function (s,o,n,a,r,i,z,e) {s['StackSonarObject']=r;s[r]=s[r]||function(){
 (s[r].q=s[r].q||[]).push(arguments)},s[r].l=1*new Date();i=o.createElement(n),
 z=o.getElementsByTagName(n)[0];i.async=1;i.src=a;z.parentNode.insertBefore(i,z)
 })(window,document,'script','https://www.stack-sonar.com/ping.js','stackSonar');
 stackSonar('stack-connect', '66');
}
</script>

Scripts are blocked yet it still resolves that.

Hmm, what's the final resolution here? How did it get fixed?

ghost commented on 27 Feb

What prevents you from writing a simple proxy server that rewrites every URL from the tracking script? I'm telling you, there's always a way to bypass a system.

ghost commented on 27 Feb

The next step is to self-host the tracking script on your server and periodically change the domain. The tracking data will then be sent to the actual "tracking processor." I ain't surprised if that already exists.

uBlockOrigin

locked and limited conversation to collaborators

on 27 Feb
