John Fremlin's blog: Android app shenanigans in 2016

source link: http://john.freml.in/android-app-shenanigans-in-2016

Posted 2016-01-25 03:56:00 GMT

Discussing privacy and apps, my friend Jinyang told me about a study he'd worked on called Who Knows What About Me? A Survey of Behind the Scenes Personal Data Sharing to Third Parties by Mobile Apps. This made me curious about what my own phone was doing. Fortunately, on Android you can gain administrator access to your device (root) through semi-supported mechanisms, and then use standard Linux sysadmin tools to figure out what's going on. The excellent SSHelper by Paul Lutus allows one to log in conveniently via ssh. It was snowing here in NYC so I had plenty of time over the weekend to dig in.

First, I went through my Android Google Play Store app history and tried to install all the apps I'd ever used, around 400 in total. I ended up with only 181 installed apps in /data/app though, and 48 in /system/app, as the Play Store crashed a few times.

Then I had a look at what services were actively listening for network connections (by running netstat -l -p -W). These programs are waiting for external parties to connect to the phone in some way. That is great in the case of SSHelper, which I installed for exactly that purpose, but other programs are doing it without my consent and it's unclear for whose benefit.

Disabling information leak from Samsung SAP on port 8230. There was also a com.samsung.accessory.framework listening on port 8230. Turns out that this service is related to my Samsung watch, and if you connect to the port it'll give the model of my phone without authentication: XT1575;motorola;Moto X Pure;SWatch;SAP_... — given that the Samsung software running on the watch is written so sloppily that you sometimes have to reboot it to see the correct time, and the watch is set to connect via Bluetooth, I don't want to let anybody on the Internet have a go at vandalising my phone through this unnecessary service. Pretty easy to disable by running su iptables -A INPUT -p tcp --dport 8230 -m state --state NEW,ESTABLISHED -j DROP on the phone. This doesn't seem to affect the behaviour of the watch.
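To check for yourself whether a listening port hands out information without authentication, something like the following minimal sketch can be run from another machine on the same network. It assumes the service sends its banner unprompted as soon as a client connects, which is how the behaviour above reads; the function name read_banner and the example address are hypothetical.

```python
import socket


def read_banner(host, port, timeout=3.0, max_bytes=1024):
    """Connect to host:port and return whatever the service sends first.

    Assumption: the service speaks first. If it waits for the client
    to send something, this returns an empty string after the timeout.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        try:
            data = conn.recv(max_bytes)
        except socket.timeout:
            data = b""
    return data.decode("utf-8", errors="replace")


# Hypothetical usage, with the phone at 192.168.1.50 on the local network:
#   print(read_banner("192.168.1.50", 8230))
```

After adding the iptables rule, the same call should fail with a timeout rather than return a banner.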

Local Facebook HTTP servers. There are two servers running on the phone, from the Facebook main app and Messenger, on ports 38551 and 38194, both claiming to be GenericHttpServer. These are only accessible to apps on the phone. I won't comment more on these as I used to work at Facebook.

Local Android services. There are several local processes: the Android debugging daemon listening on port 5037, plus the Low Memory Killer Daemon, the Zygote app starting daemon and so on, listening on UN*X sockets.

To see traffic lists, I ran grep [0-9] /proc/uid_stat/*/* after a reboot to dump the traffic usage. The uids can be linked to apps via /data/system/packages.xml, which I did via a quick Python script. There are some uids shared between packages. Oddly enough, my LIFX light app seemed to be all over the Internet. Snapchat was using the most data but I have a fairly active account (@vii) that's open to non-friends so please message away. Another heavy app was S Health, especially annoying as I had turned off sync for it in settings. Also the uid shared by com.google.android.gsf, com.google.android.gms, com.google.android.backuptransport, com.google.android.gsf.login was very active. Looking at netstat -p -W showed com.google.android.gms.persistent in regular contact with Google IPs (1e100.net). I set up traffic dumps from mitmproxy which showed polling of Google servers, apparently about the location service, and checking login status on https://android.clients.google.com/auth.
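The uid-to-app linking can be sketched roughly as follows. This is not the original script, just a minimal reconstruction under some assumptions: that the grep output has the usual path:value shape (for example /proc/uid_stat/10089/tcp_rcv:12345), and that packages.xml contains package elements carrying a name plus either a userId or a sharedUserId attribute. All function names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def parse_uid_stat(grep_output):
    """Turn '/proc/uid_stat/<uid>/<counter>:<bytes>' lines into {uid: {counter: bytes}}."""
    usage = defaultdict(dict)
    for line in grep_output.splitlines():
        path, sep, value = line.strip().rpartition(":")
        if not sep:
            continue  # not a 'path:value' line
        parts = path.split("/")
        # Expected shape: ['', 'proc', 'uid_stat', '<uid>', '<counter>']
        if len(parts) != 5 or parts[2] != "uid_stat":
            continue
        usage[int(parts[3])][parts[4]] = int(value)
    return dict(usage)


def parse_packages(xml_text):
    """Map uid -> list of package names (assumed packages.xml layout)."""
    names = defaultdict(list)
    for pkg in ET.fromstring(xml_text).iter("package"):
        uid = pkg.get("userId") or pkg.get("sharedUserId")
        if uid is not None:
            names[int(uid)].append(pkg.get("name"))
    return dict(names)


def traffic_by_app(grep_output, xml_text):
    """Return (total_bytes, uid, package_names) rows, heaviest first."""
    packages = parse_packages(xml_text)
    rows = []
    for uid, counters in parse_uid_stat(grep_output).items():
        total = sum(counters.values())
        rows.append((total, uid, sorted(packages.get(uid, ["?"]))))
    return sorted(rows, reverse=True)
```

Because several packages can share one uid, each row carries a list of names rather than a single app; traffic on a shared uid cannot be split between its packages from these counters alone.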

Stop apps running in the background unless they benefit you. Many apps, even from fairly reputable companies (the Amazon Shopping app, the Bloomberg app, the Etsy app, etc.), wake up and start using the Internet in the background, which is very damaging to battery life. These apps are communicating for their own interests, not mine, as far as I can see. The general pattern is to send up as much as can be gleaned about your phone (for example, the Kindle app sends up tons of OpenGL information), which is great for developers trying to understand their app install base. It's easy and convenient to crack down on them with the Greenify app, which unfortunately is itself an app and does its own tracking (quis custodiet ipsos custodes?). However, from the command line the dumpsys power command shows the apps busy in the background or holding wakelocks, so you can do it by hand if you want.

The main contribution from the original paper that Jinyang co-authored was an analysis of the sorts of information that apps shared with their owners. It seems his methodology did not allow identifying which apps were responsible for the network traffic, and indeed this is theoretically hard because an app can ask another app for something, but it's at least possible to figure out the app that made the network call. This can actually be done quite robustly and unintrusively with Android and iptables, by giving each app (uid) a separate IP address: use ifconfig wlan0:$uid $uid_ip to create an IP address for the uid, and an iptables POSTROUTING SNAT --to-source $uid_ip rule to mark traffic as coming from that IP. Unfortunately, this was a little fiddly because I never mirrored the setup to IPv6 (just disabled IPv6 via /proc/sys/net/ipv6/conf/all/disable_ipv6).
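A rough sketch of how the per-uid commands could be generated is below. It only builds the command strings, to be reviewed and then run as root on the phone. Two things here are assumptions rather than details from the setup described above: the iptables owner match (-m owner --uid-owner) is one plausible way to select a uid's traffic in POSTROUTING, and the address scheme (a hypothetical free range starting at 10.0.100.0 on the wifi subnet, indexed from uid 10000) is purely illustrative.

```python
import ipaddress


def commands_for_uid(uid, base_ip="10.0.100.0", first_app_uid=10000, iface="wlan0"):
    """Build the shell commands that give one app uid its own source IP.

    base_ip and first_app_uid are illustrative assumptions: pick a range
    that is actually free on your wifi subnet.
    """
    ip = ipaddress.IPv4Address(base_ip) + (uid - first_app_uid)
    return [
        # Alias address on the wifi interface for this uid.
        f"ifconfig {iface}:{uid} {ip}",
        # Rewrite the source address of packets created by this uid
        # (assumed selector: the iptables owner match).
        f"iptables -t nat -A POSTROUTING -m owner --uid-owner {uid}"
        f" -j SNAT --to-source {ip}",
    ]


# Hypothetical usage: print the commands for two app uids.
for example_uid in (10089, 10300):
    for command in commands_for_uid(example_uid):
        print(command)
```

With one address per uid, a packet capture taken anywhere on the network path can be split by source IP to attribute traffic to the app that made the call.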

Looking at a few games, they would eat a surprising amount of traffic. For example, RopeFly used >50MB just starting up, asking androidads21.adcolony.com for assets, hitting a plethora of tracking feedback links for measurementapi.com and then downloading a ton of video ad content from cloudfront, which it didn't show me.

My investigation was done over the snow weekend in New York, and there's obviously a lot more to dig into here: to watch more apps over a longer time with the one-IP-per-app tracing, to use a mitmproxy-like tool with support for SPDY and HTTP/2, and to disentangle some obvious shenanigans (for example, Foursquare was using some sort of obfuscation for its logs).

Despite having been involved in mobile app development for years, I was very surprised at how battery and data unfriendly popular apps are. The scheduled polling and dumping of device state might be convenient for managing the operational aspects of an app, but they cost the install base battery life and mobile data. The tiny data caps even on unlimited lines in the US make the second a real issue, despite the low traffic cost to the people receiving the tracking data. After installing the apps, my phone heated up and my battery drained incredibly fast (almost as bad as the old days with an iPhone 5) but the battery tracking in the Android settings menu was very slow to assign blame to any culprit and hugely underestimated the overall impact they had.

Some ideas for our friends working on the Android platform (and of course, huge thanks to them for bringing Linux to our pockets):

— more aggressively attribute the battery cost for using mobile data connections and keeping connections open (seems to be accounted under non-app headings now);

— attribute the battery cost for apps that use wifi while not charging;

— all that's difficult: why not, by default, prevent apps from waking up in the background without the user's explicit consent? This should be a big permission with an easy toggle. There are a few apps that improve the user experience from this, like podcast downloaders (and that's great). Most apps don't. Until then, I guess we can install Greenify.

Let me know your tips, tricks and Android app advice! My phone is back to a reasonable temperature now — but what have I missed?
