

Notes on Differential Privacy
source link: https://thewebivore.com/notes-on-differential-privacy/

I recently saw an interesting demo at work about differential privacy, and thought I should write down some notes! (And yes, I’ve bugged the speaker to publish an article.) The concept reminded me of Mozilla’s Lean Data Practices, although I think they serve different purposes: Lean Data asks whether you should collect anything in the first place, while differential privacy gives you a technique for collecting data while respecting the user’s privacy.
What’s differential privacy?
So you want to collect data about your users in order to make better product decisions. But collecting data about your users can be creepy, and it’s Not Cool to be the way your users’ private data gets exposed. How can you deal with this? Differential privacy!
As you gather up the data, you add “noise” to it, so that it’s not quite what it was before.
The cool trick is that once you have large amounts of data, you’re still able to get valuable statistics out of it, without having stored the real, user-specific data.
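To make the "noise in, statistics out" idea concrete, here is a minimal sketch in Python using randomized response, a classic textbook technique. This is my own illustration, not the speaker's demo and not how the differential-privacy repo does it; the function names, the 50/50 coin probabilities, and the simulated 30% "yes" rate are all assumptions chosen for the example.

```python
import random


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Report the true answer half the time, otherwise a fair coin flip.

    Any single report is deniable: a "yes" may just be the coin.
    """
    if rng.random() < 0.5:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(reports: list[bool]) -> float:
    """Undo the noise in aggregate.

    Expected observed rate = 0.5 * true_rate + 0.25, so invert that.
    """
    observed = sum(reports) / len(reports)
    return (observed - 0.25) / 0.5


rng = random.Random(42)
# Hypothetical population: 30% of 100,000 users would truthfully answer "yes".
true_answers = [i < 30_000 for i in range(100_000)]
# Only the noisy reports are ever stored, never the true answers.
reports = [randomized_response(answer, rng) for answer in true_answers]
estimate = estimate_true_rate(reports)
print(f"estimated yes-rate: {estimate:.3f}")
```

With enough users the estimate lands close to the real rate, even though no individual stored answer can be trusted on its own.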
How might I implement it?
The differential-privacy repo has a few implementations that make it easier to implement this strategy, including a Go implementation.
Some notes on epsilon
I hope the speaker does publish their article, because rather than recreating their demo here, I’m going to write down my favorite note from it: epsilon.
Real systems have a privacy budget: each bit of data you collect uses up some of that budget. That’s because each additional piece of data makes it a bit more likely that you can correlate a piece of data with a particular user (thus compromising their privacy). That’s where the epsilon value comes in. Epsilon is the parameter in the differential privacy definition that lets you tune how much noise you add to the data: a smaller epsilon means more noise. More noise means more privacy but less accuracy, and vice versa.
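Here is a small Python sketch of that trade-off, using the Laplace mechanism on a simple count. It assumes the standard textbook setup (noise scale = sensitivity / epsilon); the helper names and the example count of 1,000 are hypothetical, and this is not the API of the dpagg package.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_count(true_count: int, epsilon: float, rng: random.Random,
                sensitivity: float = 1.0) -> float:
    """Return the count plus Laplace noise with scale sensitivity / epsilon.

    Sensitivity 1 assumes one user can change the count by at most 1.
    """
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(epsilon: float, trials: int = 20_000, seed: int = 0) -> float:
    """Average distance between the noisy count and the real count."""
    rng = random.Random(seed)
    true_count = 1_000  # hypothetical real value
    total = sum(abs(noisy_count(true_count, epsilon, rng) - true_count)
                for _ in range(trials))
    return total / trials


for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps}: mean abs error ~ {mean_abs_error(eps):.2f}")
```

Smaller epsilon gives larger average error (more privacy, less accuracy), and larger epsilon gives the reverse.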
A question I have from reading the type definition for some of the options you can pass in the dpagg package: what does the delta value mean here? If epsilon influences how much noise to apply, what is the delta? Is that how much difference we’re aiming for between the real and the more-privacy-aware data?
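For what it’s worth, the general textbook definition of (ε, δ)-differential privacy hints at an answer. This is the standard definition, not something taken from the demo or from the dpagg documentation. A randomized mechanism M satisfies it if, for any two datasets D and D' that differ in one user's data, and any set of outputs S:

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[M(D') \in S] + \delta
```

Read this way, delta is not a distance between the real and the noisy data. Loosely, it is a small amount of probability by which the e^ε bound is allowed to fail. With δ = 0 this reduces to "pure" ε-differential privacy.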
I want to try it!
There’s an example in the differential-privacy repo so you can walk through using that library, if you like.