
Protect your Deep Neural Network by Embedding Watermarks!

source link: https://towardsdatascience.com/protect-your-deep-neural-network-by-embedding-watermarks-ed4898ec4ad7?gi=713b1f0490d6

We use intellectual property (IP) protection watermarks on media content such as images and music. How about Deep Neural Networks (DNNs)?

What is a watermark?

A watermark acts like an identity attached to your media content. For example, when you create free content and upload it to a media platform, you might add a signature or a logo to it. This identifies the content as yours, so people who use it should credit or pay you.

We can apply the same idea to DNNs, since they keep improving every year and many companies are starting to use them in their businesses.

Why do we need to embed watermarks in a DNN?

Let's say you have invested a lot of resources (e.g. time, GPUs) in creating a powerful model, and you release it in your GitHub repository. Now some bad actors simply take your model and build a business on it without asking, as if that were granted. You find out they used your model, but you have no proof, because the model carries no watermark. Just like freely shared content online, not everyone gives credit :(

Remember how Samsung paid Apple for infringing on design patents? The same can apply when a company (e.g. company A) sues other companies that use a DNN released by (or even stolen from) company A without paying a licensing fee.

Next, we are going to talk about common approaches to protecting a DNN by embedding watermarks.

White-box approaches

There are many white-box approaches. I will talk about the simplest one; the others follow a similar process and differ only in how the watermark is embedded.

White-box Embedding

To protect a DNN model by embedding a watermark, the approach suggested in [1] uses a transformation matrix to perform the embedding.

During the training stage, the model is trained on the original classification task; however, it also has one more objective, which is to embed the watermark.

The authors first choose which layer of the DNN will carry the desired watermark (e.g. a string of binary data).


Then, the weights of the selected layer are flattened and multiplied by the transformation matrix to produce the desired number of bits of information, e.g. 64 bits.

[Figure: dot product between the flattened weights and the transformation matrix]

During training, the model is optimized with a loss function designed by the authors: the original task loss (i.e. classification) plus an embedding loss that forces the projected bits to match the 64-bit watermark. The transformation matrix acts as the secret key used later for extraction.
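Below is a minimal sketch of this idea in PyTorch, assuming a fixed random projection matrix and a 64-bit watermark. The names (`embedding_loss`, the toy conv-shaped weight tensor, the weight `lam`) are illustrative and not the exact code from [1].

```python
import torch
import torch.nn.functional as F

BITS = 64
torch.manual_seed(0)

# Secret 64-bit watermark b and a fixed random transformation (projection) matrix X.
watermark = torch.randint(0, 2, (BITS,)).float()

def make_projection(n_weights, bits=BITS):
    # One row per watermark bit; X stays fixed and acts as the secret key.
    return torch.randn(bits, n_weights)

def embedding_loss(layer_weight, X, b):
    # Flatten the chosen layer's weights, project them to one logit per bit,
    # and push sigmoid(logit_j) towards bit b_j with binary cross-entropy.
    logits = X @ layer_weight.flatten()
    return F.binary_cross_entropy_with_logits(logits, b)

# Toy stand-in for the chosen layer (e.g. a 64x64 3x3 convolution).
w = torch.nn.Parameter(0.01 * torch.randn(64, 64, 3, 3))
X = make_projection(w.numel())

# In a real training loop this term is simply added to the task loss:
#   loss = F.cross_entropy(model(x), y) + lam * embedding_loss(w, X, watermark)
print(float(embedding_loss(w, X, watermark)))
```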

White-box Detection

A big company might collect evidence from everywhere so that it can sue a suspected company that used its DNN models illegally. Once it has evidence, it needs a verification process: extract the watermark from the suspect DNN model and check whether that watermark is the company's.

Extraction basically repeats what was done during training: perform the dot product between the flattened weights and the transformation matrix again, and the watermark can be read off from the result.
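Sticking with the toy names from the sketch above, extraction reuses the same projection and simply thresholds each logit (sigmoid above 0.5 is equivalent to a positive logit):

```python
def extract_watermark(layer_weight, X):
    # Same dot product as during embedding, then threshold each bit.
    logits = X @ layer_weight.flatten()
    return (logits > 0).float()

# extracted = extract_watermark(suspect_layer_weight, X)
# bit_error_rate = (extracted != watermark).float().mean()
# A bit error rate close to 0 indicates the embedded watermark is present.
```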

However, this process is white-box verification, which means they need physical access to the model and its weights, which usually requires going through law enforcement.

Black-box approaches

After reading some research papers [2, 3, 4], I realized the black-box approach is similar across all of them. The example I will talk about is from [2].

Black-box Embedding

During the training stage, training is separated into two tasks:

  1. original classification task
  2. trigger set task

What is a trigger set task? It is a set of data that has been deliberately given wrong labels.

[Figure: example trigger-set data with deliberately wrong labels]

The wrongly labelled data acts as a kind of watermark: the objective is to make the model "memorize" the exact inputs and labels, and this memorization creates the watermark-embedding effect. Although it might affect the model's feature learning, there are alternative solutions in [3].

The wrongly labelled data are combined with the original dataset and trained with the original training objective (e.g. cross entropy).
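Here is a rough sketch of what mixing a trigger set into training could look like, assuming CIFAR-10-sized tensors and a cat-relabelled-as-dog trigger set; the sizes and class ids are made up for illustration and are not the exact setup of [2].

```python
import torch
import torch.nn.functional as F
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Placeholder original training data (standing in for e.g. CIFAR-10).
train_x = torch.randn(50_000, 3, 32, 32)
train_y = torch.randint(0, 10, (50_000,))

# Trigger set: a small batch of images with labels that are wrong on purpose,
# e.g. cat images (class 3) deliberately labelled as dog (class 5).
trigger_x = torch.randn(100, 3, 32, 32)   # in practice: real, kept-secret images
trigger_y = torch.full((100,), 5)

combined = ConcatDataset([TensorDataset(train_x, train_y),
                          TensorDataset(trigger_x, trigger_y)])
loader = DataLoader(combined, batch_size=128, shuffle=True)

# Ordinary training objective; the model memorizes the trigger labels as well:
# for x, y in loader:
#     loss = F.cross_entropy(model(x), y)
#     loss.backward(); optimizer.step()
```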

Black-box Detection

This way of embedding a watermark is actually better than the white-box approach in terms of verification, because you can simply submit the trigger set data as queries to the machine learning online service (e.g. the thief stole your model and launched a service similar to yours).

[Figure: a typical black-box verification]

After querying the ML online service through API calls, you get back predicted labels. If those predictions match your original wrong labels, you can confirm that the service is using your model, because it is practically impossible for an unrelated model to match (or nearly match) your trigger set labels. A model that was not stolen from yours should classify the cat image as a cat, not as a dog.
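A small sketch of that check follows; `query_service` is a hypothetical callable that wraps the remote API and returns predicted class ids, and the 90% threshold is an arbitrary illustrative choice.

```python
import torch

def verify_ownership(query_service, trigger_x, trigger_y, threshold=0.9):
    # Query the suspect service with the secret trigger images and measure how
    # often its predictions reproduce the wrong-on-purpose labels.
    preds = query_service(trigger_x)
    match_rate = (preds == trigger_y).float().mean().item()
    print(f"trigger-set match rate: {match_rate:.2%}")
    # A high match rate on deliberately mislabelled inputs is strong evidence
    # that the service runs (a copy of) the watermarked model.
    return match_rate >= threshold
```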

Conclusion

We need to protect our DNN models in case other people take the credit without paying us! We can use black-box verification to form an initial suspicion about the thief, and then perform white-box verification through law enforcement after reporting to the police. (Although I think this is really just a war between big companies :sweat_smile:)

I hope you now have a better understanding of embedding watermarks into DNNs. Thank you for reading!

References

[1] Embedding Watermarks into Deep Neural Networks. https://arxiv.org/abs/1701.04082

[2] Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring. https://arxiv.org/abs/1802.04633

[3] Protecting Intellectual Property of Deep Neural Networks with Watermarking. https://dl.acm.org/citation.cfm?id=3196550

[4] DeepSigns: A Generic Watermarking Framework for Protecting the Ownership of Deep Learning Models. https://arxiv.org/abs/1804.00750

