Privacy and Integrity Preserving Training Using Trusted Hardware
TL;DR
Proposes DarKnight, which trains DNNs using accelerators such as GPUs while a TEE (Intel SGX) guarantees input-data privacy and computation integrity
The TEE encodes the input data with a customized matrix-masking technique and performs the nonlinear computations (ReLU, MaxPool)
cipepser.icon So the masking happens inside the TEE? The "Blinding" in the figure below?
The GPUs perform the DNN's linear computations (convolution, matrix multiplication) on the encoded data
DarKnight can also detect any malicious activities of untrusted GPUs by its computation integrity feature. Furthermore, DarKnight can protect privacy and integrity even in the presence of a subset of colluding GPUs that try to extract information or sabotage the computation.
Architecture / Execution Steps
nrryuya.icon > The explanation below was simply easy to follow
https://gyazo.com/21cb98dcc22a27c1b8b9ba74edfc05a5
(1) A batch of training/inference input data set is encrypted by the client using mutually agreed keys with TEE and sent to the server.
(2) TEE decrypts the images and starts the encoding process.
(3) During the forward/backward pass of training, each layer requires linear and nonlinear operations. The linear operations are compute-intensive and will be offloaded to GPUs.
DarKnight’s encoding mechanism is used to seal the data before sending the data to GPU accelerators.
To seal the data, DarKnight uses the notion of a virtual batch, where K inputs and random noise are linearly combined to form K + 1 coded inputs
The size of the virtual batch is limited by the size of the TEE memory that is necessary to encode K images, typically 4-8 images at a time.
(4) The encoded data is offloaded to GPUs for linear operation.
Each GPU receives at most one coded input
(5) GPUs perform linear operations on different encoded data sets and return the results to TEE
(6) The TEE decodes the received computational outputs using DarKnight’s decoding strategy and then performs any non-linear operations within the TEE
(7) This process is repeated for both the forward pass and the backward propagation of each layer.
In a system with K0 GPUs and virtual batch size K, DarKnight can provide data privacy and computational integrity while tolerating up to M colluding GPUs, where K + M + 1 ≤ K0 (e.g., with K = 2 and M = 1, at least 4 GPUs are needed).
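Steps (3)-(6) above can be sketched in a few lines of NumPy. This is a minimal illustration only: the random invertible coefficient matrix `A`, the shapes, and the helper names are assumptions for the sketch, not DarKnight's actual coding scheme.

```python
import numpy as np

rng = np.random.default_rng(1)
K = 2                                               # virtual batch size
inputs = [rng.random((4, 4)) for _ in range(K)]     # K "images"
noise = rng.random((4, 4))                          # random noise added in the TEE
A = rng.random((K + 1, K + 1))                      # secret coefficients (TEE only)

def tee_encode(xs):
    # K inputs + noise -> K + 1 coded inputs (linear combinations)
    v = xs + [noise]
    return [sum(A[i, j] * v[j] for j in range(K + 1)) for i in range(K + 1)]

def gpu_linear(c, W):
    # Untrusted GPU sees only a coded input and runs the linear op
    return c @ W

def tee_decode_and_relu(outs):
    # Because the GPU op is linear, applying A^{-1} recovers the
    # per-input results; the nonlinear op (ReLU) stays inside the TEE.
    Ainv = np.linalg.inv(A)
    ys = [sum(Ainv[j, i] * outs[i] for i in range(K + 1)) for j in range(K)]
    return [np.maximum(y, 0.0) for y in ys]

W = rng.random((4, 4))                              # layer weights
coded = tee_encode(inputs)
activations = tee_decode_and_relu([gpu_linear(c, W) for c in coded])
assert all(np.allclose(a, np.maximum(x @ W, 0)) for a, x in zip(activations, inputs))
```

The GPUs only ever see the coded inputs; without the secret coefficients in `A`, an individual coded input reveals nothing directly about any single image.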
How is input-data privacy achieved when the computation is delegated to GPUs?
In short, noise is mixed into the input-data matrices (a technique known as matrix masking)
From K inputs (e.g., K images), K + 1 coded inputs (a virtual batch) are generated
K is limited by the TEE memory available for encoding; typically around 4-8
Related work
https://gyazo.com/84f96219533d517888e37c364a77681c
nrryuya.icon > Other work combining TEEs with CPUs
Graviton (OSDI'18) proposes an architecture that supports a TEE on the GPU in combination with a CPU TEE such as SGX (Trusted GPUs). It requires modifying the GPU firmware.
Threat Model
When GPUs receive data from the TEE, they may use known techniques to extract information about the original data or inject faults into the computation.
Moreover, a subset of colluding GPUs may try to extract information by collaborating with each other or inject faults to sabotage the training.
cipepser.icon So sabotaging the training is part of the threat?
nrryuya.icon > The model is that some fraction of the GPUs cannot be trusted
Claims to achieve perfect privacy (the amount of information an attacker can obtain is bounded at the level of single-precision arithmetic error)
https://gyazo.com/00f4b44e5c01218abf9e129559454011
cipepser.icon I don't know much about ML, but is single precision (rather than double) the norm when training on GPUs?
nrryuya.icon > I think the GPUs used are mostly single-precision ones in the first place
Encoding/Decoding
Key insight: the main DL operations such as convolution are bilinear (fixing one argument yields a linear map in the other), so the results for the K original inputs can be recovered from the linear computations on the K + 1 encoded inputs
nrryuya.icon > The paper for the most recent technique among those cited
Qian Yu, Songze Li, Netanel Raviv, Seyed Mohammadreza Mousavi Kalan, Mahdi Soltanolkotabi, Salman Avestimehr
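The bilinearity above can be checked numerically. Below is a generic "valid" 2D convolution (an illustrative implementation, not the paper's): with the kernel fixed, it is linear in the input, which is exactly what lets the TEE decode the GPU results.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d(x, k):
    # "Valid" 2D cross-correlation: each k-shaped window dotted with k
    windows = sliding_window_view(x, k.shape)   # shape (H', W', kh, kw)
    return np.einsum('ijkl,kl->ij', windows, k)

rng = np.random.default_rng(0)
x1, x2 = rng.random((8, 8)), rng.random((8, 8))
k = rng.random((3, 3))
a, b = 2.0, -0.5

# Linearity in the input (kernel fixed):
# conv(a*x1 + b*x2, k) == a*conv(x1, k) + b*conv(x2, k)
lhs = conv2d(a * x1 + b * x2, k)
rhs = a * conv2d(x1, k) + b * conv2d(x2, k)
assert np.allclose(lhs, rhs)
```

The same identity holds for matrix multiplication and fully connected layers, which is why all of these can be safely offloaded on coded inputs.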
Experiments
Setup
The DarKnight server consisted of an Intel Coffee Lake E-2174G 3.80GHz processor and Nvidia GeForce GTX 1080 Ti GPUs.
The server has 64 GB RAM and supports Intel Software Guard Extensions (SGX).
DarKnight’s training scheme and the related unique coding requirements are implemented as an SGX enclave thread where both the decoding and encoding are performed.
For the SGX implementation, we used the Intel Deep Neural Network Library (DNNL) to design the DNN layers, including the convolution layer, ReLU, and MaxPooling, and the Eigen library for the dense layer.
We used Keras 2.1.5, TensorFlow 1.8.0, and Python 3.6.
https://gyazo.com/fbba9b9b8d9e435afaea75ce7657a5af
Results show DarKnight relative to a baseline fully implemented on SGX, with K = 2 images encoded and offloaded to 3 GPUs.
Effect of Random Noise on Accuracy
https://gyazo.com/751a51696d3898fb505151b59fbc28f7
cipepser.icon Is the takeaway that accuracy does not change even with random noise added? And what is the MI upper bound?