Adversarial Examples
Yu-Gang Jiang, Xingjun Ma, Zuxuan Wu

Recap: Week 2
1. Deep Neural Networks
2. Explainable Machine Learning

Principles and Methodologies
- Learning Dynamics
- The Learned Model
- Inference
- Generalization
- Robustness to Common Corruptions

This Week
1. Adversarial Examples
2. Adversarial Attacks
3. Adversarial Vulnerability Understanding

Machine Learning Is Everywhere
- Medicine and Biology
- Security and Defense
- Autonomous Vehicle
- IoT
- Financial System
- Media and Entertainment
- Critical Infrastructure

Beat Humans on Many Tasks: Speech Recognition
Baidu Deep Speech 2: End-to-end Deep Learning for English and Mandarin Speech Recognition
- English and Mandarin speech recognition
- Transition from English to Mandarin made simpler by end-to-end DL
- No feature engineering or Mandarin-specifics required
- More accurate than humans: error rate 3.7% vs. 4% for human tests
http://svail.github.io/mandarin/
/pdf/1512.02595.pdf

Outperform Humans on Many Tasks: Strategic Games
AlphaGo: First Computer Program to Beat a Human Go Professional
- Training DNNs: 3 weeks, 340 million training steps on 50 GPUs
- Play: asynchronous multi-threaded search; simulations on CPUs, policy and value DNNs in parallel on GPUs
- Single machine: 40 search threads, 48 CPUs, and 8 GPUs
- Distributed version: 40 search threads, 1202 CPUs, and 176 GPUs
- Outcome: beat both European and World Go champions in best-of-5 matches
/nature/journal/v529/n7587/full/nature16961.html

Outperform Humans on Many Tasks
- Large-scale Image Recognition
- DALL·E 2
- AlphaFold V2
- Large Language Model (LLM): ChatGPT
- Large Multimodal Model: GPT-4
  A multimodal conversational large model released by OpenAI in March 2023. It accepts image and text inputs and outputs text, and it exceeds ChatGPT in image-text understanding, computation, code generation, and many professional exams.
  Parameters: 1 trillion
  Base model: GPT-4
  Training data: multimodal data and more human-annotated data, added on top of what was used for GPT-3.5 and ChatGPT

Outperform Humans on Many Tasks: Image Recognition
GoogLeNet: /people/karpathy/ilsvrc/
- Labradoodle or fried chicken
- Sheepdog or mop
- Barn owl or apple
- Raw chicken or Donald Trump
- Parrot or guacamole

Vulnerabilities of DNNs
Dog, 82% confidence
Ostrich, 98% confidence

Adversarial Examples
Small perturbations can fool DNNs.
Szegedy C, Zaremba W, Sutskever I, et al. Intriguing properties of neural networks. ICLR 2014.
Goodfellow I J, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. ICLR 2015.

Adversarial Attack
DNN training: (equation not preserved in the source)
Adversarial attack: (equation not preserved in the source)
- Misclassification
- Test-time attack

Characteristics of Adversarial Examples
- Small
- Imperceptible
- Hidden
- Transferable
- Universal
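
The "DNN training" and "adversarial attack" formulas on the Adversarial Attack slide above did not survive extraction. A commonly used way to contrast the two is sketched below. It is an assumption, not the slide's own notation: the symbols $f_\theta$ (model), $\mathcal{L}$ (loss), $\mathcal{D}$ (data distribution), $\epsilon$ (perturbation budget), and the choice of an $\ell_p$ ball are all illustrative.

```latex
% Training: choose parameters \theta that minimize the expected loss on clean data.
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}} \left[ \mathcal{L}\big(f_\theta(x),\, y\big) \right]

% Attack (test time): with \theta fixed, choose a nearby input x' that maximizes the loss.
\max_{x' \,:\, \|x' - x\|_p \le \epsilon} \; \mathcal{L}\big(f_\theta(x'),\, y\big)
```

Read this way, training changes the parameters while the data stay fixed, whereas the attack changes the input while the parameters stay fixed. This is why it is called a test-time attack.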
Example Attacks
- Perturbations are small, imperceptible to human eyes.
- Adversarial examples are easy to generate and transfer across models.
Benign nevus (73% confidence) + adversarial noise = malignant nevus (89% confidence)
Ma et al., "Understanding Adversarial Attacks on Deep Learning Based Medical Image Analysis Systems", Pattern Recognition, 2021.

Example Attacks
Clean video frames: correct class
Adversarial video: wrong class
Jiang et al., "Black-box Adversarial Attacks on Video Recognition Models", ACM MM, 2019.

Example Attacks
Physical-world attacks against traffic signs
Science Museum, London: stop signs recognized as a Speed Limit 45 sign
Eykholt, Kevin, et al. "Robust physical-world attacks on deep learning visual classification." CVPR, 2018.

Example Attacks
3D-printed turtle recognized as a rifle from any angle
Athalye, Anish, et al. "Synthesizing robust adversarial examples." ICML, 2018.
Example Attacks
Adversarial patch makes people invisible to object detection (YOLO)
Brown, Tom B., et al. "Adversarial patch." arXiv preprint arXiv:1712.09665 (2017).

Example Attacks
Adversarial attack or new fashion?

Example Attacks
Adversarial t-shirt: one step closer to real-world attack
Xu, Kaidi, et al. "Adversarial t-shirt! Evading person detectors in a physical world." ECCV, 2020.

Example Attacks
Camouflage adversarial patterns into realistic styles
- Tree bark -> street sign
- People + Pikachu t-shirt -> dog
Duan et al. "Adversarial Camouflage: Hiding Physical-World Attacks With Natural Styles." CVPR, 2020.

Example Attacks
Night-scene adversarial attack with a laser pointer
Duan, Ranjie, et al. "Adversarial laser beam: Effective physical-world attack to DNNs in a blink." CVPR, 2021.

Example Attacks
Attacking both camera and lidar using adversarial objects
Cao, Yulong, et al. "Invisible for both camera and lidar: Security of multi-sensor fusion based perception in autonomous driving under physical-world attacks." S&P, 2021.

Example Attacks
Attacking speech/command recognition models
Carlini, Nicholas, and David Wagner. "Audio adversarial examples: Targeted attacks on speech-to-text." S&PW, 2018.
/code/audio_adversarial_examples/
Adversarial Music: Real world Audio Adversary against Wake-word Detection System
/watch?v=r4XXGDVs0f8

Example Attacks
Q&A Adversaries
Ribeiro et al. "Semantically equivalent adversarial rules for debugging NLP models." ACL, 2018.

Threats to AI Applications
- Transportation industry: trick autonomous vehicles into misinterpreting stop signs or speed limits
- Cybersecurity industry: bypass AI-based malware detection tools
- Medical industry: forge medical conditions
- Smart home industry: fool voice commands
- Financial industry: trick anomaly and fraud detection engines

Definition of Adversarial Examples
No standard community-accepted definition.
"Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake"
Goodfellow, Ian. "Defense against the dark arts: An overview of adversarial example security research and future research directions." arXiv:1806.04169 (2018).

Taxonomy of Attacks
- Attack timing: poisoning attack, evasion attack
- Attacker's goal: targeted attack, untargeted attack
- Attacker's knowledge: black-box, white-box, gray-box
- Universality: individual, universal

Attack Timing
- Evasion (causation) attack: test-time attack; changes the input example
- Poisoning attack: training-time attack; changes the classification boundary

Attacker's Goal
- Targeted attack: cause an input to be recognized as coming from a specific class (e.g., ostrich)
- Untargeted attack: cause an input to be recognized as any incorrect class (e.g., any class except dog)
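
The two goals differ only in which label drives the loss and in the direction of the step. The following is a minimal sketch, assuming a toy linear softmax classifier rather than a DNN. The helper names (`sign_step`, `loss_grad_wrt_input`), the three-class dog/ostrich/cat setup, and the step size are illustrative and do not come from the slides.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # shift logits for numerical stability
    return e / e.sum()

def predict(W, b, x):
    return int(np.argmax(W @ x + b))

def loss_grad_wrt_input(W, b, x, label):
    """Gradient of the cross-entropy loss w.r.t. x for logits = W @ x + b."""
    p = softmax(W @ x + b)
    p[label] -= 1.0  # dL/dlogits = softmax - onehot(label)
    return W.T @ p

def sign_step(W, b, x, eps, true_label, target_label=None):
    if target_label is None:
        # Untargeted: move away from the true class by increasing its loss.
        return x + eps * np.sign(loss_grad_wrt_input(W, b, x, true_label))
    # Targeted: move toward the chosen class by decreasing its loss.
    return x - eps * np.sign(loss_grad_wrt_input(W, b, x, target_label))

# Hypothetical 3-class toy model: 0 = dog, 1 = ostrich, 2 = cat.
W = np.eye(3)
b = np.zeros(3)
x = np.array([1.0, 0.8, 0.9])  # classified as dog by this toy model

x_untargeted = sign_step(W, b, x, eps=0.3, true_label=0)
x_targeted = sign_step(W, b, x, eps=0.3, true_label=0, target_label=1)
print(predict(W, b, x), predict(W, b, x_untargeted), predict(W, b, x_targeted))
```

In this toy setup the untargeted step only needs to leave the "dog" class, while the targeted step must land specifically on "ostrich", which is generally the harder goal.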
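Adversary's Knowledge
- White-box attack: the attacker has full access to the model, including model type, model architecture, and the values of parameters and trained weights
- Black-box attack: the attacker has no knowledge about the model under attack and relies on the transferability of adversarial examples
- Gray-box attack (semi-black-box attack): the attacker may know some hyperparameters, such as the model architecture

Universality
- Individual attack: generate a different perturbation for each clean input
- Universal attack: create a single universal perturbation for the whole dataset, which makes adversarial examples easier to deploy
Moosavi-Dezfooli, Seyed-Mohsen, et al. "Universal adversarial perturbations." CVPR 2017.

A Brief History of Adversarial Machine Learning
- 2013: Biggio et al. and Szegedy et al. discover adversarial examples
- 2014: Goodfellow et al. propose the fast single-step attack FGSM and adversarial training
- 2015: simple detection methods (PCA) and adversarial training methods
- 2016: the min-max optimization framework for adversarial training is proposed
- 2017: many adversarial example detection methods and attack methods (BIM, C&W); 10 detection methods broken
- 2018: physical-world attacks, upgraded detection methods, PGD attack and adversarial training; 9 defense methods broken
- 2019: TRADES and many other adversarial training methods; first Science article
- 2020: AutoAttack; Fast adversarial training
- 2021: adversarial training with larger models and more data; extension to other domains
- 2022: the problem remains unsolved; attacks keep multiplying and defense keeps getting harder
Biggio et al. "Evasion attacks against machine learning at test time."; Szegedy, Christian, et al. "Intriguing properties of neural networks."

White-box Attacks
- Single-step attack: Fast Gradient Sign Method (FGSM) (Goodfellow et al. 2014)
- Multi-step attacks: iterative methods (BIM, PGD) (Kurakin et al. 2016; Madry et al. 2018). Projected Gradient Descent (PGD) is the strongest first-order attack.
- Optimization-based attack: C&W attack (Carlini & Wagner 2017). The C&W attack was the strongest attack at the time.
- Ensemble attack: AutoAttack (Croce et al. 2020), the current strongest attack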
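
The single-step and multi-step attacks listed above can be sketched in a few lines. FGSM takes one step of size eps along the sign of the input gradient. PGD repeats a smaller step of size alpha and, after each step, projects the result back into the L-infinity ball of radius eps around the clean input. The code below is a minimal sketch under stated assumptions, not the cited authors' implementations: the model is a toy linear softmax classifier with an analytic gradient (a real attack would use a DNN and automatic differentiation), PGD starts from the clean input with no random start, and all names and hyperparameters are illustrative.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # shift logits for numerical stability
    return e / e.sum()

def cross_entropy(W, b, x, y):
    return float(-np.log(softmax(W @ x + b)[y]))

def input_gradient(W, b, x, y):
    """Gradient of the cross-entropy loss w.r.t. x for logits = W @ x + b."""
    p = softmax(W @ x + b)
    p[y] -= 1.0  # dL/dlogits = softmax - onehot(y)
    return W.T @ p

def fgsm(W, b, x, y, eps):
    """Single step: x' = x + eps * sign(grad_x L)."""
    return x + eps * np.sign(input_gradient(W, b, x, y))

def pgd(W, b, x, y, eps, alpha, steps):
    """Multi-step: small signed-gradient steps, each followed by a projection
    onto the L-infinity ball of radius eps around the clean input x."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(input_gradient(W, b, x_adv, y))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # projection step
    return x_adv

# Hypothetical toy setup: 3 classes, 20 input features.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 20))
b = np.zeros(3)
x = rng.normal(size=20)
y = int(np.argmax(W @ x + b))  # use the clean prediction as the label

x_fgsm = fgsm(W, b, x, y, eps=0.1)
x_pgd = pgd(W, b, x, y, eps=0.1, alpha=0.02, steps=10)
print("clean loss:", cross_entropy(W, b, x, y))
print("FGSM loss: ", cross_entropy(W, b, x_fgsm, y))
print("PGD loss:  ", cross_entropy(W, b, x_pgd, y))
```

Both attacks keep every coordinate within eps of the clean input. Only the way the budget is spent differs: FGSM spends it in one step, PGD in several smaller ones.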
Why Do Adversarial Examples Exist?

Non-linear Explanation
Viewing a DNN as a sequence of transformed spaces: 1st layer, 10th layer, 20th layer
High-dimensional non-linear explanation: non-linear transformations lead to the existence of small "pockets" in the deep space:
- Regions of low probability (not naturally occurring)
- Densely scattered regions
- Continuous regions
- Close to the normal data subspace
Szegedy C, Zaremba W, Sutskever I, et al. Intriguing properties of neural networks. ICLR 2014; Ma et al. Characterizing Adversarial Subspace Using Local Intrinsic Dimensionality. ICLR 2018.

Linear Explanation
Viewing a DNN as a stack of linear operations: (equation not preserved in the source)
Goodfellow I J, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. ICLR 2015.
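
The linear explanation can be checked numerically. For a single linear unit with weights w, perturbing the input x by eta changes the activation by w · eta. Choosing eta = eps * sign(w) keeps every coordinate of the perturbation at most eps in magnitude, yet shifts the activation by eps * ||w||_1, which grows with the input dimension. The sketch below illustrates that argument under stated assumptions: the symbols w, eta, and eps follow the usual presentation of the argument rather than the slide (whose equation is not preserved), and the Gaussian weights and the helper `activation_shift` are illustrative.

```python
import numpy as np

def activation_shift(w, eps):
    """Change in w @ x when x is perturbed by eta = eps * sign(w)."""
    eta = eps * np.sign(w)  # each coordinate changes by at most eps
    return float(w @ eta)   # equals eps * sum(|w_i|)

rng = np.random.default_rng(0)
eps = 0.01
shifts = {}
for n in (10, 1_000, 100_000):
    w = rng.normal(size=n)  # hypothetical weights of one linear unit
    shifts[n] = activation_shift(w, eps)
    print(n, shifts[n], eps * np.abs(w).sum())
```

The per-coordinate change is the same tiny eps in every case, but the shift in the activation scales roughly linearly with the dimension n, so many imperceptible changes can add up to a large change in the output.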
Vulnerability Increases with Intrinsic Dimensionality
Amsaleg et al. The Vulnerability of Learning to Adversarial Perturbation Increases with Intrinsic Dimensionality. WIFS, 2017.
Figure: Y-axis: the minimum adversarial noise required to subvert a KNN classifier; X-axis: LID values; red curve: theoretical bound.
CIFAR-10, ImageNet

Insufficient Training Data

Size of the training dataset | Accuracy on its own test dataset (%) | Accuracy on the test dataset with 4×10^4 points (%) | Accuracy on the boundary dataset (%)
80                           | 100                                  | 92.7                                                | 60.8
800                          | 99.0                                 | 97.4                                                | 74.9
8000                         | 99.5                                 | 99.6                                                | 94.1
80000                        | 99.9                                 | 99.9                                                | 98.9

Size of the training dataset | Accuracy on its own test dataset (%) | Accuracy on the test dataset with 4×10^4 points (%) | Accuracy on the boundary dataset (%)
80                           | 100                                  | 96.3                                                | 70.1
800                          | 99.8                                 | 99.0                                                | 85.7
8000                         | 99.9                                 | 99.8                                                | 97.3
80000                        | 99.98                                | 99.98                                               | 99.5

Unnecessary Features
Wang et al. "A theoretical framework for robustness of (deep) classifiers against adversarial examples." arXiv:1612.00334 (2016).

Unnecessary Features
Adversarial samples can be far away fr