Data Extraction and Model Stealing
Yu-Gang Jiang, Xingjun Ma, Zuxuan Wu

Recap: week 7
A Brief History of Backdoor Learning
Backdoor Attacks
Backdoor Defenses
Future Research

This Week
Data Extraction Attack & Defense
Model Stealing Attack
Future Research

Data Extraction Attack
Recovering training data by inverting the model
:8001/dss/imageClassify

Terminology
The following terms describe the same thing:
Data Extraction Attack
Data Stealing Attack
Training Data Extraction Attack
Model Memorization Attack
Model Inversion Attack

Security Threats
Example leak: "My social security number is 078-"
Personal Info Leakage
Sensitive Info Leakage
Threats to National Security
Illegal Data Trading
…

Memorization of DNNs
Evidence 1: DNNs learn different levels of representations.
Evidence 2: DNNs can memorize random labels and random pixels.
Settings compared: true labels, random labels, shuffled pixels, random pixels, Gaussian noise.
Zhang, Chiyuan, et al. "Understanding deep learning requires rethinking generalization." ICLR 2017.
Evidence 3: The success of GANs and diffusion models.
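
Intended vs. Unintended Memorization

Intended Memorization
Task-related
Statistics
Inputs and Labels
Figure: first-layer filters trained on normal CIFAR-10 vs. first-layer filters trained on randomly labeled CIFAR-10.
Arpit et al. "A closer look at memorization in deep networks." ICML, 2017.

Unintended Memorization
Task-irrelevant but memorized
Memorized even when the content appears only a few times: four occurrences are enough for it to be fully memorized.
Example: a natural-language translation model memorizes "My social security number is xxxx".
Carlini et al. "The secret sharer: Evaluating and testing unintended memorization in neural networks." USENIX Security, 2019.

Existing Data Stealing Attacks

Black-box Extraction
Testing and quantifying unintended memorization with "canaries"
Active testing: the canary in the coal mine
Canary formats: "The random number is ****", "My social security number is ****"
Actively inject canaries into the training data, then measure the "exposure" of the canary data in the language model.

Black-box Extraction
Against general-purpose language models:
Extracts large numbers of names, phone numbers, email addresses, social security numbers, etc.
Larger models memorize this kind of information more readily than smaller models.
Information can be memorized even if it appears in only one document.
Carlini, Nicholas, et al. "Extracting training data from large language models." USENIX Security, 2021.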
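
The "exposure" of an injected canary can be computed from its rank among all candidate secrets of the same format. The following is a minimal sketch, assuming the definition used in the cited Secret Sharer paper (exposure = log2 |R| - log2 rank, where R is the space of candidate secrets). The function name `exposure`, the made-up perplexity table, and the 1024-candidate space are illustrative assumptions and do not come from the slides.

```python
import math

def exposure(canary_perplexity, candidate_perplexities):
    """Exposure of a canary: log2(|R|) - log2(rank).

    candidate_perplexities holds the model's (log-)perplexity for every
    candidate secret in the randomness space R, canary included. The rank is
    1 + the number of candidates the model finds strictly more likely
    (lower perplexity) than the canary.
    """
    rank = 1 + sum(1 for p in candidate_perplexities if p < canary_perplexity)
    return math.log2(len(candidate_perplexities)) - math.log2(rank)

# Toy stand-in for a language model: 1024 candidate secrets with made-up
# perplexities. Candidate 7 plays the role of the injected canary.
toy_perplexities = [50.0 + (i % 97) for i in range(1024)]

# Hypothetical "after training" table in which the canary was memorized,
# so it has the lowest perplexity of all candidates.
memorized = list(toy_perplexities)
memorized[7] = 1.0

print("memorized canary:", exposure(memorized[7], memorized))
print("non-memorized canary:", exposure(toy_perplexities[7], toy_perplexities))
```

A canary ranked first among 1024 candidates reaches the maximum exposure of log2(1024) = 10 bits; a canary the model does not prefer over other candidates scores much lower.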

Training Data Extraction Attack
Carlini, Nicholas, et al. "Extracting training data from large language models." USENIX Security, 2021.

Definition of Memorization
Model knowledge extraction
k-eidetic memorization

Attack Steps
Step 1: generate a large amount of text.
Step 2: filter the generated text and verify which samples are memorized.

Experimental Results
604 "unintended" memorized samples
Memorized content that appears in only one document
The larger the model, the stronger the memorization
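
The filtering and verification step can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: candidates are assumed to be already generated (Step 1) and scored with a log-perplexity by the target model (the numbers below are made up), the compression-based score is one possible filtering signal chosen here for illustration, and `is_k_eidetic` assumes access to the training documents, which only an evaluator would have.

```python
import zlib

def zlib_bits(text):
    """Compressed size in bits: a model-free estimate of how much
    information the text carries."""
    return 8 * len(zlib.compress(text.encode("utf-8")))

def suspicion_score(text, log_perplexity):
    """Lower is more suspicious: the model finds the text easy to predict
    even though a generic compressor does not."""
    return log_perplexity / zlib_bits(text)

def rank_candidates(candidates):
    """Step 2 (filtering): sort (text, log_perplexity) pairs so that the
    most suspicious candidates come first."""
    return sorted(candidates, key=lambda c: suspicion_score(c[0], c[1]))

def is_k_eidetic(text, training_docs, k):
    """Verification: the text occurs in at least one and at most k
    training documents."""
    hits = sum(1 for doc in training_docs if text in doc)
    return 1 <= hits <= k

# Hypothetical generated candidates with made-up log-perplexities.
candidates = [
    ("the the the the the the the the", 1.2),
    ("John Doe, phone 555-0100, john@example.com", 1.5),
    ("colorless green ideas sleep furiously", 6.0),
]
ranked = rank_candidates(candidates)

# Hypothetical training corpus used only for verification.
training_docs = [
    "contact John Doe at 555-0100",
    "unrelated text",
    "call 555-0100 now",
]
print(is_k_eidetic("555-0100", training_docs, k=2))
```

The smaller k is, the more worrying the memorization: a string that is 1-eidetic memorized was learned from a single training document.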

Memorization of Diffusion Models
A joint study by the University of Maryland and New York University found that generative diffusion models memorize their original training data and leak it under specific text prompts.
Figure: generated images vs. original training images.
Somepalli, Gowthami, et al. "Diffusion art or digital forgery? Investigating data replication in diffusion models." CVPR, 2023.

Definition of Replication: We say that a generated image has replicated content if it contains an object (either in the foreground or background) that appears identically in a training image, neglecting minor variations in appearance that could result from data augmentation.

Create Synthetic and Real Datasets
Panels: Original, Segmix, Diagonal Outpainting, Patch Outpainting
Existing image retrieval datasets: Oxford, Paris, INSTRE, GPR1200

Train Image Retrieval Models

Similarity metric: inner product, token-wise inner product
Diffusion model: DDPM
Dataset: Celeb-A
Figure: the top-2 matches of diffusion models trained on 300, 3000, and 30000 images (the full set is 30000).
Results:
Green: copy
Blue: close but not an exact copy
Others: similar but not the same

Gen-train vs. train-train similarity score distribution
The less training data, the more copying.
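
The retrieval step behind the top-2 matches can be sketched as a nearest-neighbor search under an inner-product similarity. This is a minimal sketch with assumptions: features are taken as already extracted by some image retrieval model (not shown), they are L2-normalized so that the inner product equals cosine similarity, and the toy 3-dimensional vectors are made up. The token-wise variant named on the slide is not implemented here.

```python
import math

def normalize(v):
    """Scale a feature vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity(a, b):
    """Inner product of L2-normalized feature vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def top_k_matches(generated, train_features, k=2):
    """Return (train_index, score) for the k training features most
    similar to the feature of a generated image."""
    scored = [(similarity(generated, t), i) for i, t in enumerate(train_features)]
    scored.sort(reverse=True)
    return [(i, s) for s, i in scored[:k]]

# Made-up features standing in for retrieval-model embeddings.
train_features = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
generated_feature = [2.0, 0.1, 0.0]

matches = top_k_matches(generated_feature, train_features, k=2)
print(matches)
```

Comparing the score of the best match against a chosen threshold then separates likely copies from images that are merely similar; the threshold itself has to be calibrated on the dataset.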

Many close copies but no exact match (similarity score < 0.65)
Case study: ImageNet LDM
Most similar: theater curtain, peacock, and bananas
Least similar: sea lion, bee, and swing

Case study: Stable Diffusion
LAION Aesthetics v2 6+: 12M images
Randomly select 9000 images as sources and use their captions as prompts.

Case study: Stable Diffusion
Some keywords (those in red) are associated with certain fixed patterns.

Case study: Stable Diffusion
Style copying using the text prompt: <Name of the painting> by <name of the artist>

Memorization of Large Language Models (LLMs)
Shi, Weijia, et al. "Detecting Pretraining Data from Large Language Models." arXiv preprint arXiv:2310.16789 (2023).
Pretraining data detection
MIN-K% PROB
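
MIN-K% PROB scores a text by averaging the log-probabilities of its k% least likely tokens; text seen during pretraining tends to contain fewer very unlikely tokens, so its score is higher. A minimal sketch, assuming the per-token log-probabilities have already been obtained from the target model (the numbers and the threshold below are made up for illustration):

```python
def min_k_prob(token_log_probs, k=20.0):
    """Average log-probability of the k% lowest-probability tokens."""
    n = max(1, int(len(token_log_probs) * k / 100))
    lowest = sorted(token_log_probs)[:n]
    return sum(lowest) / n

def likely_in_pretraining(token_log_probs, threshold, k=20.0):
    """Flag the text as likely seen if its MIN-K% PROB score exceeds a
    threshold (the threshold is a tunable, hypothetical value here)."""
    return min_k_prob(token_log_probs, k) > threshold

# Made-up token log-probabilities for two 10-token texts.
seen_text = [-0.1, -0.2, -0.5, -0.3, -0.4, -0.05, -0.4, -0.15, -0.25, -0.6]
unseen_text = [-0.1, -0.2, -5.0, -0.3, -4.0, -0.05, -0.4, -0.15, -0.25, -3.0]

print(min_k_prob(seen_text), likely_in_pretraining(seen_text, threshold=-1.0))
print(min_k_prob(unseen_text), likely_in_pretraining(unseen_text, threshold=-1.0))
```

With k = 20 and ten tokens, the score is the mean of the two lowest log-probabilities, so a couple of surprising tokens are enough to pull the unseen text below the threshold.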

Detection on WIKIMIA
A dynamic benchmark: WIKIMIA

White-box Extraction
White-box extraction requires gradient information and is also called the Gradient Inversion Attack.
It targets training schemes that share gradients: distributed training, federated learning, parallel training, decentralized training.
Figure: two distributed training paradigms.

Categories of gradient inversion:
Iterative inversion: approximation
(Layer-wise) recursive inversion: back-derivation
Zhang et al. "A Survey on Gradient Inversion: Attacks, Defenses and Future Directions." IJCAI 2022.

White-box Extraction: Iterative Inversion
Iterative inversion: construct data whose gradient approaches the true gradient.
True gradient: assumed known.
Each iteration: one forward pass and two backward passes.
Gradient produced by the generated data: compared against the true gradient.
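
Iterative inversion can be illustrated on a toy model small enough to differentiate by hand. The sketch below is an assumption-laden stand-in, not a reproduction of any surveyed attack: the model is a single linear unit with squared loss, the attacker is assumed to know the weights `w` and the label `y`, and the two backward passes are written out analytically instead of using automatic differentiation. It only demonstrates that the distance between the fake gradient and the true gradient shrinks as the fake input is updated.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def model_gradient(w, x, y):
    """First backward pass: gradient of 0.5 * (w.x - y)^2 w.r.t. w."""
    r = dot(w, x) - y
    return [r * xi for xi in x]

def matching_loss(w, x_fake, y, true_grad):
    """Squared distance between the fake gradient and the true gradient."""
    g = model_gradient(w, x_fake, y)
    return sum((gi - ti) ** 2 for gi, ti in zip(g, true_grad))

def matching_loss_grad(w, x_fake, y, true_grad):
    """Second backward pass: derivative of matching_loss w.r.t. x_fake."""
    r = dot(w, x_fake) - y
    h = [r * xi - ti for xi, ti in zip(x_fake, true_grad)]
    hx = dot(h, x_fake)
    return [2 * (wj * hx + r * hj) for wj, hj in zip(w, h)]

def invert(w, y, true_grad, x_init, steps=500, lr=0.1):
    """Gradient descent on the fake input with a backtracking step size,
    so the matching loss never increases."""
    x = list(x_init)
    loss = matching_loss(w, x, y, true_grad)
    for _ in range(steps):
        grad = matching_loss_grad(w, x, y, true_grad)
        step = lr
        while step > 1e-12:
            cand = [xi - step * gi for xi, gi in zip(x, grad)]
            cand_loss = matching_loss(w, cand, y, true_grad)
            if cand_loss < loss:
                x, loss = cand, cand_loss
                break
            step /= 2
        else:
            break  # no improving step found
    return x, loss

# Hypothetical setup: the victim's private input is x_true.
w = [0.5, -1.0, 2.0]
x_true = [1.0, 2.0, -1.0]
y = 1.0
true_grad = model_gradient(w, x_true, y)  # what the attacker observes

x_init = [0.1, 0.1, 0.1]
initial_loss = matching_loss(w, x_init, y, true_grad)
x_rec, final_loss = invert(w, y, true_grad, x_init)
print(initial_loss, final_loss)
```

In this toy model more than one input can produce the same gradient, so a small matching loss does not by itself guarantee that `x_rec` equals `x_true`.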

Summary of existing work on iterative inversion

White-box Extraction: Recursive Inversion
Recursive inversion: derive the input layer by layer, working backward from the true gradient.
Key factors: image size (32x32), batch size (mostly 1), model size
True gradient: known.
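
For a single fully connected layer with a bias term and batch size 1, the back-derivation has a closed form: since dL/dW = (dL/dz) x^T and dL/db = dL/dz for z = W x + b, any row of the weight gradient divided by the matching bias gradient returns the input. The sketch below shows only this first-layer case under those assumptions; the function names and numbers are illustrative.

```python
def fc_layer_gradients(x, dL_dz):
    """Gradients of a fully connected layer z = W x + b for one sample:
    dL/dW = dL/dz * x^T and dL/db = dL/dz."""
    grad_W = [[dz * xi for xi in x] for dz in dL_dz]
    grad_b = list(dL_dz)
    return grad_W, grad_b

def recover_input(grad_W, grad_b, eps=1e-12):
    """Recover x from one row: x = grad_W[i] / grad_b[i] for any i whose
    bias gradient is non-zero."""
    for row, gb in zip(grad_W, grad_b):
        if abs(gb) > eps:
            return [g / gb for g in row]
    raise ValueError("all bias gradients are zero; input cannot be recovered")

# Hypothetical private input and upstream gradient.
x_private = [0.2, -1.5, 3.0]
upstream = [0.0, 0.7, -0.3]

grad_W, grad_b = fc_layer_gradients(x_private, upstream)
x_recovered = recover_input(grad_W, grad_b)
print(x_recovered)
```

With a batch larger than 1 the shared gradient is a sum over samples and this simple division no longer isolates a single input, which is consistent with the slide's note that the batch size is mostly 1.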

Summary of existing work on recursive inversion

Summary of existing work on white-box defenses

This Week
Data Extraction Attack & Defense
Model Stealing Attack
Future Research

AI Model Training Is Expensive
Example: BERT (Google), $1.6 million
Training large-scale, high-performance AI models consumes enormous resources:
Data resources
Compute resources
Human resources

Motivation for Model Stealing
Valuable AI models -> model stealing -> used for the attacker's own purposes
Huge commercial value
Preserve the model's performance as much as possible
Avoid being detected

Ways of Stealing a Model
Input/output queries
Model fine-tuning
Model pruning

Stealing Attacks
Stealing machine learning models via prediction APIs, USENIX Security, 2016;
Practical black-box attacks against machine learning, ASIACCS, 2017;
Knockoff nets: Stealing functionality of black-box models, CVPR, 2019;
Maze: Data-free model stealing attack using zeroth-order gradient estimation, CVPR, 2021;

Equation-Solving Attacks
Attack idea
Example
Table: number of queries and time needed to fully (100%) steal certain commercial models.
Tramèr, Florian, et al. "Stealing machine learning models via prediction APIs." USENIX Security, 2016.

Equation-Solving Attacks: Stealing Parameters
Attack algorithm:
The model has d parameters.
Query d+1 inputs to construct d+1 equations in the unknown parameters.
Main characteristics:
Targets traditional machine learning models: SVM, LR, DT
Can be solved exactly, but requires the model to return precise confidence scores
The stolen model may in turn leak training data (data inversion attack)
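
For logistic regression the equations are linear once the returned confidence p is passed through the logit: log(p / (1 - p)) = w.x + b. The sketch below is a hedged illustration with a hypothetical local "API"; here d is the input dimension, so the weights plus the bias give d+1 unknowns and d+1 queries suffice. The helper names and the tiny Gauss-Jordan solver are assumptions made for a self-contained example.

```python
import math

def make_target(w, b):
    """Hypothetical black-box API: logistic regression that returns an
    exact confidence score."""
    def predict_proba(x):
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        return 1.0 / (1.0 + math.exp(-z))
    return predict_proba

def solve_linear_system(A, y):
    """Gauss-Jordan elimination with partial pivoting (small square system)."""
    n = len(A)
    M = [row[:] + [yi] for row, yi in zip(A, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def steal_logistic_regression(api, queries):
    """Each query yields one linear equation: logit(p) = w.x + b."""
    A = [list(x) + [1.0] for x in queries]
    y = [math.log(p / (1.0 - p)) for p in (api(x) for x in queries)]
    solution = solve_linear_system(A, y)
    return solution[:-1], solution[-1]

# The attacker never sees these values directly.
api = make_target(w=[1.5, -2.0], b=0.25)

# d = 2 input features, so d + 1 = 3 queries.
queries = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
stolen_w, stolen_b = steal_logistic_regression(api, queries)
print(stolen_w, stolen_b)
```

The attack depends on the exact confidence values: if the API rounds its scores or returns only labels, the logit equations no longer hold exactly.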
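
Equation-Solving Attacks: Stealing Hyperparameters
Wang, Binghui, and Neil Zhenqiang Gong. "Stealing hyperparameters in machine learning." S&P, 2018.
Attack idea: once training has finished, the gradient of the training objective (loss plus regularization term) with respect to the model parameters should be 0.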
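
The zero-gradient condition turns the unknown hyperparameter into the solution of a linear equation. A minimal sketch for ridge regression, assuming the objective sum((x.w - y)^2) + lam * ||w||^2 and an attacker who knows the training data and the learned weights; the two-feature closed-form trainer and the toy data are illustrative assumptions.

```python
def train_ridge_2d(X, y, lam):
    """Victim side: closed-form ridge regression for two features,
    solving (X^T X + lam * I) w = X^T y with Cramer's rule."""
    a11 = sum(x[0] * x[0] for x in X) + lam
    a12 = sum(x[0] * x[1] for x in X)
    a22 = sum(x[1] * x[1] for x in X) + lam
    b1 = sum(x[0] * yi for x, yi in zip(X, y))
    b2 = sum(x[1] * yi for x, yi in zip(X, y))
    det = a11 * a22 - a12 * a12
    return [(b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]

def steal_lambda(X, y, w):
    """Attacker side: at the optimum X^T (X w - y) + lam * w = 0, so lam is
    the least-squares solution of that (overdetermined) linear system."""
    residuals = [sum(wj * xj for wj, xj in zip(w, x)) - yi for x, yi in zip(X, y)]
    loss_grad = [sum(x[j] * r for x, r in zip(X, residuals)) for j in range(len(w))]
    return -sum(wj * gj for wj, gj in zip(w, loss_grad)) / sum(wj * wj for wj in w)

# Toy training set; the victim's secret hyperparameter is 0.5.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
y = [3.1, 2.9, 7.2, 6.8]
secret_lambda = 0.5

w_learned = train_ridge_2d(X, y, secret_lambda)
print(steal_lambda(X, y, w_learned))
```

Because the model was trained to the exact optimum, the estimate matches the secret value up to floating-point error; a model stopped early would satisfy the zero-gradient condition only approximately.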

Substitute-Model-Based Attacks
Orekondy et al. "Knockoff nets: Stealing functionality of black-box models." CVPR, 2019.
Attack idea: while querying the target model, train a substitute model that mimics its behavior.

Knockoff Nets attack: the "knockoff network"
Attack pipeline:
Sample a large number of query samples
Train the substitute model on the target's outputs
Use reinforcement learning to learn how to select samples efficiently
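
The query-then-train loop can be shown with a deliberately small stand-in. This sketch is not the Knockoff Nets architecture: the target is a hypothetical linear classifier that returns only labels, the queries are a fixed grid instead of sampled images, the substitute is a perceptron, and the reinforcement-learning sample selection is omitted.

```python
def make_target():
    """Hypothetical black-box target: the attacker only sees the label."""
    def predict(x):
        return 1 if x[0] - x[1] + 0.05 > 0 else 0
    return predict

def train_substitute(queries, labels, max_epochs=5000):
    """Perceptron trained on (query, target-label) pairs."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in zip(queries, labels):
            pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
            if pred != label:
                sign = 1 if label == 1 else -1
                w = [w[0] + sign * x[0], w[1] + sign * x[1]]
                b += sign
                mistakes += 1
        if mistakes == 0:
            break  # the substitute reproduces every observed label
    return w, b

def fidelity(target, w, b, points):
    """Fraction of points on which the substitute agrees with the target."""
    agree = 0
    for x in points:
        pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
        agree += int(pred == target(x))
    return agree / len(points)

target = make_target()

# Step 1: choose query samples (a 21 x 21 grid over [-1, 1]^2).
queries = [[i / 10.0, j / 10.0] for i in range(-10, 11) for j in range(-10, 11)]
# Step 2: label them with the black-box target.
labels = [target(x) for x in queries]
# Step 3: train the substitute on the stolen labels.
w_sub, b_sub = train_substitute(queries, labels)

held_out = [[i / 10.0 + 0.03, j / 10.0 - 0.02] for i in range(-9, 10) for j in range(-9, 10)]
print(fidelity(target, w_sub, b_sub, queries), fidelity(target, w_sub, b_sub, held_out))
```

Agreement with the target on held-out points is the quantity the attacker cares about; with probability outputs instead of labels, each query would carry more information per call.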

Jagielski, Matthew, et al. "High accuracy and high fidelity extraction of neural networks." USENIX Security, 2020.

High-accuracy vs. high-fidelity extraction
Blue: target decision boundary
Orange: high-accuracy extraction
Green: high-fidelity extraction

Pipeline: query images -> target model (black box) -> model outputs used as labels to guide the training of the substitute model
Output types: probability outputs, class (label-only) outputs

Functionally Equivalent Extraction
Attack steps:
Find critical points at which ReLU = 0 for a given neuron
Probe on both sides of the critical point to determine the corresponding weights
Limitation: can only steal two-layer networks

Cryptanalytic Extraction
Carlini et al. "Cryptanalytic extraction of neural network models." Annual International Cryptology Conference, 2020.
Idea: the second derivative of ReLU is 0 everywhere except at the point where ReLU = 0, which finite differences can detect
Extracting 0-deep neural networks
Extracting 1-deep neural networks
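
The role of critical points and finite differences can be illustrated on a network with a single hidden ReLU neuron. The sketch below is a simplified assumption-based example, not the published attack: it locates the kink along one line by intersecting the two linear pieces, then uses second finite differences at that point to read off the ratio of the hidden weights' magnitudes. The network weights are hypothetical and only used inside the black box.

```python
def relu(z):
    return max(0.0, z)

def make_network():
    """Hypothetical black box: f(x) = v * relu(w.x + b) + c."""
    w, b, v, c = [3.0, -4.0], -1.5, 2.0, 0.7
    return lambda x: v * relu(w[0] * x[0] + w[1] * x[1] + b) + c

def find_critical_point(f, a, b, eps=1e-4):
    """Locate the kink of t -> f((t, 0)) on [a, b], assuming exactly one
    kink lies strictly inside the interval."""
    g = lambda t: f([t, 0.0])
    slope_left = (g(a + eps) - g(a)) / eps
    slope_right = (g(b) - g(b - eps)) / eps
    # Intersect the line through (a, g(a)) with the line through (b, g(b)).
    return (g(b) - g(a) + slope_left * a - slope_right * b) / (slope_left - slope_right)

def second_difference(f, x, direction, eps=1e-4):
    """f(x + eps*d) - 2 f(x) + f(x - eps*d): zero where f is locally linear,
    proportional to |w.d| at a critical point of the hidden neuron."""
    plus = [xi + eps * di for xi, di in zip(x, direction)]
    minus = [xi - eps * di for xi, di in zip(x, direction)]
    return f(plus) - 2 * f(x) + f(minus)

f = make_network()

t_star = find_critical_point(f, 0.0, 1.0)
x_star = [t_star, 0.0]

d1 = second_difference(f, x_star, [1.0, 0.0])
d2 = second_difference(f, x_star, [0.0, 1.0])
ratio = d1 / d2  # estimates |w[0]| / |w[1]|
print(t_star, ratio)
```

Away from the critical point the second difference vanishes, which is why the attack first has to find inputs where a neuron's pre-activation is exactly zero.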

Substitute-Model-Based Attacks
Yuan, Xiaoyong, et a