
Understanding Deep Learning in One Day
Hung-yi Lee

Outline
Lecture I: Introduction of Deep Learning
Lecture II: Variants of Neural Network
Lecture III: Beyond Supervised Learning

Lecture I: Introduction of Deep Learning

Outline
- Introduction of Deep Learning
- "Hello World" for Deep Learning
- Tips for Deep Learning

Machine Learning ≈ Looking for a Function
- Speech Recognition: f(audio) = "How are you"
- Image Recognition: f(image) = "Cat"
- Playing Go: f(board position) = "5-5" (next move)
- Dialogue System: f("Hello", what the user said) = "Hi" (system response)

Framework
A model is a set of functions f1, f2, .... For image recognition we want f(image) = "cat".
Goodness of function f: the training data (function inputs and their labeled outputs, e.g. images labeled "monkey", "cat", "dog") tells us which function is better. This is supervised learning.
Pick the "best" function f*, then use it on new inputs: f*(image) = "cat". Finding f* is training; using it is testing.

Three Steps for Deep Learning
Step 1: define a set of functions
Step 2: goodness of function
Step 3: pick the best function

Neural Network
A neuron is a simple function: multiply each input by a weight, add a bias to get z, and pass z through an activation function such as the sigmoid.
Example: inputs (1, -1), weights (1, -2), bias 1 give z = 1·1 + (-1)·(-2) + 1 = 4, and sigmoid(4) ≈ 0.98.
Different connections lead to different network structures.
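A minimal sketch of this single-neuron computation in Python, reusing the numbers from the example above:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def neuron(x, w, b):
    z = np.dot(w, x) + b      # weighted sum of inputs plus bias
    return sigmoid(z)         # activation function

x = np.array([1.0, -1.0])     # inputs
w = np.array([1.0, -2.0])     # weights
b = 1.0                       # bias
print(neuron(x, w, b))        # z = 4, sigmoid(4) ≈ 0.98
```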

Fully Connect Feedforward Network
The neurons have different values of weights and biases. Example: with input vector (1, -1), the first hidden layer outputs (0.98, 0.12), the second (0.86, 0.11), and the output layer gives (0.62, 0.83). With input (0, 0), the same network outputs (0.73, 0.5), then (0.72, 0.12), then (0.51, 0.85).
This is a function: input a vector, output a vector. Given a network structure, we have defined a function set.
Structure: Input Layer → Hidden Layers (Layer 1, Layer 2, ..., Layer L) → Output Layer (y1, y2, ..., yM). "Deep" means many hidden layers.

Why Deep? Universality Theorem
Any continuous function f can be realized by a network with one hidden layer (given enough hidden neurons). Reference for the reason: /chap4.html
So why a "deep" neural network rather than a "fat" one?
- Logic circuits consist of gates. Two layers of logic gates can represent any Boolean function, but using multiple layers of logic gates to build some functions is much simpler: fewer gates are needed.
- Neural networks consist of neurons. A single-hidden-layer network can represent any continuous function, but using multiple layers of neurons to represent some functions is much simpler: fewer parameters, and perhaps less data.

More reason: Why Deep? An analogy from ImageNet results:
- AlexNet (2012): 8 layers, 16.4% error
- VGG (2014): 19 layers, 7.3% error
- GoogleNet (2014): 22 layers, 6.7% error
- Residual Net (2015): 152 layers, 3.57% error (taller than Taipei 101's 101 stories; it needs a special structure)

Output Layer
With an ordinary layer as output, the network output can be any value, which may not be easy to interpret. Use a softmax layer as the output layer instead. Example: inputs 3, 1, -3 give e^3 ≈ 20, e^1 ≈ 2.7, e^(-3) ≈ 0.05, and after normalizing, the outputs are 0.88, 0.12, and ≈ 0.
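A minimal sketch of the softmax computation, reproducing the numbers above:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

z = np.array([3.0, 1.0, -3.0])
print(np.exp(z))              # ≈ [20.1, 2.7, 0.05]
print(softmax(z))             # ≈ [0.88, 0.12, 0.00]
```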

Example Application: Handwriting Digit Recognition
Input: a 16x16 = 256-dimensional vector (ink → 1, no ink → 0).
Output: y1, y2, ..., y10, where each dimension represents the confidence that the digit is "1", "2", ..., "0". If y2 has the maximum value, the image is "2".
What is needed is a function with a 256-dim input vector and a 10-dim output vector. A neural network (Input Layer → Hidden Layers → Output Layer) is a function set containing the candidates for handwriting digit recognition. You need to decide the network structure to let a good function be in your function set.

FAQ
Q: How many layers? How many neurons for each layer? A: Trial and error, plus intuition.
Q: Can we design the network structure? A: Yes, e.g. the Convolutional Neural Network (CNN) in the next lecture.
Q: Can the structure be automatically determined? A: Yes, but not widely studied yet. Highway Networks and Residual Networks add a gate controller that can copy the input of a layer straight to its output, so a Highway Network automatically determines the layers needed. (Deep Residual Learning for Image Recognition, /abs/1512.03385; Training Very Deep Networks)

Three Steps for Deep Learning (recap)
Step 1: define a set of functions. Step 2: goodness of function. Step 3: pick the best function.

Training Data
Prepare training data: images and their labels, e.g. "5", "0", "4", "1", "3", "1", "2", "9". The learning target is defined on the training data: for an image of "1", y1 should have the maximum value; for an image of "2", y2 should have the maximum value.

Loss
Given a set of parameters, the loss of one example measures the distance between the network output (after softmax) and the target, e.g. target (1, 0, ..., 0) for "1". The loss can be square error or cross entropy between the network output and the target; the output should be as close as possible to the target.

Total Loss
For all training data x1, x2, ..., xR with network outputs y1, ..., yR, the total loss is L = Σ_{r=1}^{R} C^r, the sum of the losses over all R examples. A good function should make the loss on all examples as small as possible: find the function in the function set that minimizes the total loss L.

How to pick the best function?
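A minimal sketch of the two loss choices and the total loss over a set of examples (the outputs here are placeholders, not from a trained network):

```python
import numpy as np

def square_error(y, t):
    return np.sum((y - t) ** 2)

def cross_entropy(y, t):
    return -np.sum(t * np.log(y + 1e-12))   # small epsilon avoids log(0)

# Placeholder network outputs for R = 2 examples and one-hot targets.
ys = np.array([[0.7, 0.2, 0.1],
               [0.1, 0.8, 0.1]])
ts = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0]])

total_loss = sum(cross_entropy(y, t) for y, t in zip(ys, ts))
print(total_loss)
```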

Enumerate all possible parameter values? Impossible: e.g. in speech recognition a network may have 8 layers with 1000 neurons per layer, so there are already 10^6 weights between two layers, and millions of parameters in total. Instead, use Gradient Descent.

Gradient Descent (one parameter w shown; every parameter is updated the same way)
- Pick an initial value for w: random, or RBM pre-training. Random initialization is usually good enough.
- Compute ∂L/∂w. If it is positive, decrease w; if it is negative, increase w.
- Update w ← w − η ∂L/∂w, where η is called the "learning rate".
- Repeat, until the update is little.

In two dimensions, picture the loss surface colored by the value of the total loss L: randomly pick a starting point and repeatedly move in the direction of the negative gradient. Hopefully, we would reach a minimum.

Local Minima
Plotting the total loss against the value of a network parameter w: gradient descent is very slow at a plateau, can get stuck at a saddle point, and can get stuck at a local minimum. Gradient descent never guarantees the global minimum; different initial points can reach different minima and therefore give different results.

This is the "learning" of machines in deep learning... People imagine something fancier, but even AlphaGo uses this approach. I hope you are not too disappointed :p

Backpropagation
Backpropagation is an efficient way to compute ∂L/∂w for all the parameters. Toolkits compute it for you.
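A minimal numpy sketch of the update rule on a toy loss L(w) = (w - 3)^2, whose gradient is known in closed form (the toy loss is an illustration, not from the slides):

```python
import numpy as np

eta = 0.1                      # learning rate
w = np.random.randn()          # pick an initial value for w

def grad(w):                   # dL/dw for the toy loss L(w) = (w - 3)^2
    return 2 * (w - 3)

for step in range(100):        # repeat
    update = -eta * grad(w)
    w += update
    if abs(update) < 1e-6:     # stop when the update is little
        break

print(w)                       # ≈ 3, the minimum of the toy loss
```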

For example, libdnn (developed by NTU student 周伯威).

Three Steps for Deep Learning (recap): define a set of functions, goodness of function, pick the best function. Deep learning is so simple...

If you want to find a function, and you have lots of function input/output (?) as training data, you can use deep learning. For example, you can do:
- Image recognition: Network(image) = "monkey" / "cat" / "dog".
- Spam filtering: Network(e-mail) = 1 (Yes) or 0 (No); e.g. "free" in the e-mail → 1 (Yes), "Talk" in the e-mail → 0 (No).
- Document classification: Network(document) = politics / sports / finance; e.g. "president" in the document → politics, "stock" in the document → finance.

Outline: Introduction of Deep Learning → "Hello World" for Deep Learning → Tips for Deep Learning

Keras
- TensorFlow and Theano are very flexible, but need some effort to learn.
- Keras is an interface of TensorFlow or Theano: easy to learn and use, and it still has some flexibility. You can modify it if you can write TensorFlow or Theano.
- François Chollet is the author of Keras. He currently works for Google as a deep learning engineer and researcher. Keras means "horn" in Greek.
- Documentation: http://keras.io/ Examples: /fchollet/keras/tree/master/examples
(Tips on using Keras; thanks to NTU student 沈昇勳 for the figures.)

Example Application: handwriting digit recognition is the "Hello world" of deep learning. MNIST data: 28x28 images; Keras provides a dataset-loading function: http://keras.io/datasets/

Keras: build the network (28x28 input, two hidden layers of 500 neurons each, softmax output y1, ..., y10), then:
Step 3.1: Configuration. Choose the loss, the optimizer, and the learning rate (e.g. 0.1).
Step 3.2: Find the optimal network parameters by training on the data.
Training data: images as a numpy array of shape (number of training examples, 28x28 = 784); labels (digits) as a numpy array of shape (number of training examples, 10).
How to use the neural network (testing): case 1, evaluate accuracy on labeled data; case 2, predict outputs for new inputs.
Save and load models: http://keras.io/getting-started/faq/#how-can-i-save-a-keras-model
Using GPU to speed up training:
Way 1: THEANO_FLAGS=device=gpu0 python YourCode.py
Way 2 (in your code): import os; os.environ["THEANO_FLAGS"] = "device=gpu0"
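Putting the three steps together, a minimal sketch with the modern tf.keras API (the slides use the older Keras 1.x syntax; the 500-500-10 structure, learning rate 0.1, batch size 100, and 20 epochs follow the slides, while the sigmoid activations, SGD optimizer, and file name are assumptions for the sketch):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Load MNIST and flatten the 28x28 images to 784-dim vectors.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255
x_test = x_test.reshape(-1, 784).astype("float32") / 255
y_train = keras.utils.to_categorical(y_train, 10)   # one-hot labels
y_test = keras.utils.to_categorical(y_test, 10)

# Step 1: define a set of functions (the network structure).
model = keras.Sequential([
    layers.Dense(500, activation="sigmoid", input_shape=(784,)),
    layers.Dense(500, activation="sigmoid"),
    layers.Dense(10, activation="softmax"),
])

# Step 2 and Step 3.1: goodness of function (loss) and configuration.
model.compile(loss="categorical_crossentropy",
              optimizer=keras.optimizers.SGD(learning_rate=0.1),
              metrics=["accuracy"])

# Step 3.2: find the optimal network parameters.
model.fit(x_train, y_train, batch_size=100, epochs=20)

# Case 1: evaluate on labeled data. Case 2: predict for new inputs.
print(model.evaluate(x_test, y_test))
print(model.predict(x_test[:1]).argmax())

# Save and load (see the Keras FAQ linked above).
model.save("mnist_model.h5")
reloaded = keras.models.load_model("mnist_model.h5")
```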

Recipe of Deep Learning
After the three steps, ask two questions in order. Do you get good results on the training data? If NO, go back and fix the training. If YES, do you get good results on the testing data? If NO, that is overfitting. But do not always blame overfitting: in the Residual Net paper (/abs/1512.03385), the deeper network is worse on the testing data and also on the training data, so it is not overfitting; it is simply not well trained. Different approaches target different problems, e.g. dropout is for good results on the testing data.
For good results on the training data: choosing proper loss, mini-batch, new activation function, adaptive learning rate, momentum.

Choosing Proper Loss
With target "1" encoded as (1, 0, ..., 0), should the loss be square error Σ(yi - ŷi)² or cross entropy -Σ ŷi ln yi? Which one is better? Plotting the total loss over two parameters w1, w2: cross entropy keeps a visible slope far from the minimum, while square error is nearly flat there, so training with square error easily stalls. When using a softmax output layer, choose cross entropy. (/proceedings/papers/v9/glorot10a/glorot10a.pdf; several alternatives: https://keras.io/objectives/)
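A small numpy sketch of why this happens: at a confidently wrong prediction, the square-error gradient through the softmax nearly vanishes, while the cross-entropy gradient stays large (the logits here are illustrative):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

z = np.array([-5.0, 5.0, 0.0])        # confidently wrong: target is class 0
t = np.array([1.0, 0.0, 0.0])
y = softmax(z)

J = np.diag(y) - np.outer(y, y)       # Jacobian dy/dz of the softmax
grad_ce = y - t                       # d(cross entropy)/dz
grad_se = J @ (2 * (y - t))           # d(square error)/dz

print(grad_ce)   # first component ≈ -1: a strong learning signal
print(grad_se)   # all components near 0: learning stalls
```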

Mini-batch
Randomly initialize the network parameters. Pick the 1st mini-batch (e.g. x1, x31, ...), compute its loss, and update the parameters once. Pick the 2nd mini-batch (e.g. x2, x16, ...) and update the parameters once again. Continue until all mini-batches have been picked: that is one epoch. Then repeat the whole process, e.g. 100 examples in a mini-batch, repeated for 20 epochs. Note that we do not really minimize the total loss: each update only looks at the loss of one batch.

Original gradient descent updates after seeing all examples, so its trajectory on the loss surface is smooth; with mini-batch, if there are 20 batches, we update 20 times in one epoch, and the trajectory is unstable (the colors represent the total loss). But mini-batch is faster: in one epoch, full gradient descent gives one update while mini-batch gives many updates for roughly the same computation. (This is not always true with parallel computing; the two can have the same speed when the dataset is not super large.) In practice, mini-batch also has better performance!

Shuffle the training examples for each epoch, so the mini-batches of epoch 1 differ from those of epoch 2. Don't worry: this is the default of Keras.
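A minimal sketch of the loop itself, with shuffling per epoch (update_once is a hypothetical stand-in for one gradient update on a batch):

```python
import numpy as np

def train(x, y, update_once, batch_size=100, epochs=20):
    n = len(x)
    for epoch in range(epochs):
        order = np.random.permutation(n)        # shuffle each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            update_once(x[idx], y[idx])         # one update per mini-batch
```

In Keras this is simply model.fit(x_train, y_train, batch_size=100, epochs=20), with shuffling on by default.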

New Activation Function
It is hard to get the power of deep: deeper usually does not imply better results, even on the training data. The culprit is the vanishing gradient problem. With sigmoid activations, the layers near the input have smaller gradients, learn very slowly, and are still almost random, while the layers near the output have larger gradients, learn very fast, and have already converged, based on the nearly random earlier layers!?
An intuitive way to compute the derivatives: perturb a weight in an early layer and watch the output. Each sigmoid turns a large input change into a small output change, so after many layers the effect on the loss, and hence the derivative, is tiny.
In 2006, people used RBM pre-training. In 2015, people use ReLU.

ReLU
Rectified Linear Unit (ReLU): a = z when z > 0, and a = 0 when z ≤ 0. Reasons: 1. fast to compute; 2. biological reason; 3. it behaves like an infinite number of sigmoids with different biases; 4. it addresses the vanishing gradient problem. [Xavier Glorot, AISTATS'11] [Andrew L. Maas, ICML'13] [Kaiming He, arXiv'15]
With ReLU, neurons whose output is zero can be removed from the graph, leaving a thinner linear network whose active paths do not have smaller gradients.
ReLU variants: e.g. Leaky ReLU and Parametric ReLU, where a = αz for z < 0 and α is also learned by gradient descent.

Maxout
Maxout is a learnable activation function [Ian J. Goodfellow, ICML'13]: group several linear units and output the max of each group. ReLU is a special case of Maxout, and you can have more than 2 elements in a group. The activation function in a maxout network can be any piecewise linear convex function, where the number of pieces depends on how many elements are in a group (2 elements give 2 pieces, 3 elements give 3 pieces).
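A minimal numpy sketch of a maxout unit, showing how ReLU falls out as the special case where the second element of each group is fixed at zero (the grouping convention and weights are illustrative):

```python
import numpy as np

def maxout(x, W, b, group_size=2):
    """Maxout: compute linear units, group them, take the max per group."""
    z = x @ W + b                      # linear outputs, one per unit
    z = z.reshape(-1, group_size)      # consecutive units form a group
    return z.max(axis=1)               # one output per group

x = np.array([1.0, -2.0])
W = np.array([[0.5, 0.0],
              [1.0, 0.0]])             # second unit of the group fixed at 0
b = np.array([0.1, 0.0])
print(maxout(x, W, b))                 # == max(w·x + b, 0) == ReLU(w·x + b)
```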

Adaptive Learning Rate
Set the learning rate η carefully. If the learning rate is too large, the total loss may not decrease after each update. If the learning rate is too small, training will be too slow.

Adagrad
Give each parameter its own learning rate. Original gradient descent: w ← w − η g. Adagrad: w ← w − (η / √(Σ_{i=0}^{t} (g^i)²)) · g^t, where the denominator is the summation of the squares of the previous derivatives and η is a constant.
Example: a parameter with derivatives g0 = 0.1, g1 = 0.2, ... gets learning rate η / √(0.1² + 0.2² + ...); a parameter with derivatives g0 = 20.0, g1 = 10.0, ... gets learning rate η / √(20² + 10² + ...).
Observations: 1. the learning rate becomes smaller and smaller for all parameters; 2. smaller derivatives give a larger learning rate, and vice versa. Why? A parameter with small derivatives sits on a gently sloped direction and needs larger steps, while a parameter with large derivatives sits on a steep direction and needs smaller steps.

Not the whole story...
Adagrad [John Duchi, JMLR'11], RMSprop, Adadelta [Matthew D. Zeiler, arXiv'12], "No more pesky learning rates" [Tom Schaul, arXiv'12], AdaSecant [Caglar Gulcehre, arXiv'14], Adam [Diederik P. Kingma, ICLR'15], Nadam (/proj2015/054_report.pdf)
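A minimal numpy sketch of the Adagrad update on a toy quadratic loss (the loss is illustrative, and eps is an implementation detail to avoid division by zero, not something from the slides):

```python
import numpy as np

eta, eps = 0.1, 1e-8
w = np.array([1.0, 1.0])            # two parameters
g_sq_sum = np.zeros(2)              # running sum of squared derivatives

def grad(w):                        # gradient of the toy loss w1^2 + 9*w2^2
    return np.array([2 * w[0], 18 * w[1]])

for step in range(1000):
    g = grad(w)
    g_sq_sum += g ** 2                       # accumulate (g^i)^2
    w -= eta / np.sqrt(g_sq_sum + eps) * g   # per-parameter learning rate

print(w)                            # approaches the minimum at (0, 0)
```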

Momentum
It is hard to find the optimal network parameters: plotting the total loss against the value of a network parameter w, plain gradient descent is very slow at a plateau, gets stuck at a saddle point, and gets stuck at a local minimum; at all of these, ∂L/∂w ≈ 0 so the updates stop.
In the physical world, a ball rolling down the loss surface has momentum: it does not stop immediately where the slope is zero. How about putting this phenomenon into gradient descent? Movement = negative of η ∂L/∂w + momentum (a fraction of the previous movement). This still does not guarantee reaching the global minimum, but it gives some hope...
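A minimal numpy sketch of the momentum update (the toy gradient and the coefficients are illustrative):

```python
import numpy as np

eta, lam = 0.01, 0.9           # learning rate and momentum coefficient
w = 5.0
movement = 0.0

def grad(w):                   # gradient of a toy loss (w - 3)^2
    return 2 * (w - 3)

for step in range(200):
    # movement = negative of eta * dL/dw + a fraction of the last movement
    movement = lam * movement - eta * grad(w)
    w += movement

print(w)                       # ≈ 3; momentum carries w across flat regions
```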

The real movement at each step is therefore the accumulated effect of all the previous gradients. Adam is roughly RMSProp (an advanced Adagrad) + Momentum.

Good Results on Testing Data?
If the results on the training data are good but the testing results are not, try: early stopping, regularization, dropout, or changing the network structure. The panacea for overfitting is more training data, or creating more training data (?): e.g. in handwriting recognition, shift the original training images by 15° to create new training examples.

Dropout
Training: each time before updating the parameters, each neuron has p% chance to drop out. The structure of the network is changed: we train a thinner network. For each mini-batch, we resample the dropout neurons.
Testing: no dropout. If the dropout rate at training is p%, all the weights are multiplied by (1 - p)%.
Intuitive reason: training with dropout is like practicing with weights tied to your legs; at testing time you take the weights off and become very strong. Why multiply the weights by (1 - p)% at testing? Assume the dropout rate is 50%: during training, about half of each neuron's inputs are missing, so at testing, with no dropout, the weights from training would produce activations about twice as large; multiplying by (1 - p)% compensates.
Dropout is a kind of ensemble. In an ensemble we train a bunch of networks with different structures on different sets (set 1, set 2, set 3, set 4) and average their outputs y1, y2, y3, y4 on the testing data. Training with dropout uses each mini-batch to train one thinned network (minibatch 1, 2, 3, 4, ...), with parameters shared across them; with M neurons there are 2^M possible networks. Testing with all weights multiplied by (1 - p)% approximates averaging the outputs of all these networks.
More references for dropout: [Nitish Srivastava, JMLR'14] [Pierre Baldi, NIPS'13] [Geoffrey E. Hinton, arXiv'12]. Dropout works better with Maxout [Ian J. Goodfellow, ICML'13]. Dropconnect [Li Wan, ICML'13]: dropout deletes neurons, dropconnect deletes the connections between neurons. Annealed dropout [S.J. Rennie, SLT'14]: the dropout rate decreases over epochs. Standout [J. Ba, NIPS'13]: each neuron has its own dropout rate.
Demo: add dropout after each 500-neuron hidden layer before the softmax output, e.g. model.add(dropout(0.8)).
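A minimal sketch of the demo network in modern tf.keras (the 0.8 rate follows the slides; the ReLU activations are an assumption; note that Keras handles the test-time rescaling internally, so you do not multiply the weights by (1 - p)% yourself):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(500, activation="relu", input_shape=(784,)),
    layers.Dropout(0.8),                 # each neuron has 80% chance to drop
    layers.Dense(500, activation="relu"),
    layers.Dropout(0.8),
    layers.Dense(10, activation="softmax"),
])
```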

If the testing results are still bad, also reconsider the network structure: CNN is a very good example (next lecture)!

Concluding Remarks
Recipe of deep learning: three steps (define a set of functions, goodness of function, pick the best function), then check the results on the training data and on the testing data, and apply the corresponding tips.

Lecture II: Variants of Neural Networks
Outline: Convolutional Neural Network (CNN), widely used in image processing, and Recurrent Neural Network (RNN), a neural network with memory.

Why CNN for Image?
Can the network be simplified by considering the properties of images? Images are represented as pixels; the first layer learns the most basic classifiers, the second layer uses the first layer as modules to build classifiers, and so on [Zeiler, M.D., ECCV 2014].
Property 1: some patterns are much smaller than the whole image. A neuron does not have to see the whole image to discover the pattern; e.g. a "beak" detector can connect to a small region, with fewer parameters.
Property 2: the same patterns appear in different regions. An "upper-left beak" detector and a "middle beak" detector do almost the same thing, so they can use the same set of parameters.
Property 3: subsampling the pixels will not change the object. A subsampled bird image still shows the bird, so we can subsample pixels to make the image smaller, giving the network fewer parameters to process.

The Whole CNN
Input → Convolution → Max Pooling → Convolution → Max Pooling (the convolution/pooling pair can repeat many times) → Flatten → Fully Connected Feedforward Network → output, e.g. "cat" vs "dog". Convolution exploits Properties 1 and 2; Max Pooling exploits Property 3. Learning is nothing special, just gradient descent; compared with before, we only modified the network structure and the input format (vector → 3-D tensor).

CNN - Convolution
Take a 6x6 binary image and a set of 3x3 filters, e.g. Filter 1 = [[1,-1,-1],[-1,1,-1],[-1,-1,1]] and Filter 2 = [[-1,1,-1],[-1,1,-1],[-1,1,-1]]. The filter values are the network parameters to be learned; each filter detects a small 3x3 pattern (Property 1). Slide the filter over the image with stride 1 (with stride 2 you skip every other position; we set stride = 1 below), computing the inner product at each position. Filter 1 over the example image gives the 4x4 map [[3,-1,-3,-1],[-3,1,0,-3],[-3,-3,0,1],[3,-2,-2,-1]]; the value 3 appears wherever the diagonal pattern matches, in two different regions (Property 2). Doing the same process for every filter produces one 4x4 image per filter, together called the Feature Map.

CNN - Zero Padding
Pad the border of the image with zeros if you want the output to stay 6x6.

CNN - Colorful Image
A color image is three stacked 6x6 matrices (RGB), and each filter is correspondingly a stack of three 3x3 matrices.

Convolution vs. Fully Connected
Convolution is a fully connected layer with some weights removed: each output connects to only 9 inputs rather than all 36 (fewer parameters!), and the outputs at different positions share the same 9 weights (even fewer parameters!).

CNN - Max Pooling
Group the convolution outputs (e.g. into 2x2 blocks) and keep the maximum of each group. After one convolution + max pooling, the 6x6 image becomes a 2x2 image per filter; each filter is a channel. The result is a new but smaller image, and the convolution/pooling pair can repeat many times.

Flatten
Flatten the final image into a vector and feed it to the fully connected feedforward network, trained with targets such as "monkey" = (1, 0, 0, ...).

CNN in Keras
Convolution: 25 filters of size 3x3 with input_shape = (1, 28, 28) (1 channel for black/white, 3 for RGB; 28x28 pixels) → output 25x26x26 → Max Pooling (2x2) → 25x13x13 → Convolution: 50 filters of 3x3 → 50x11x11 → Max Pooling (2x2) → 50x5x5. How many parameters for each filter? 9 in the first convolution layer, 225 in the second (each filter now spans 25 input channels). Then Flatten → a 1250-dim vector (50x5x5) → fully connected feedforward network → output. (Live demo.)
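A minimal sketch of this CNN in modern tf.keras (the slides use Theano-era channels-first shapes like (1, 28, 28); current Keras defaults to channels-last, so the sketch uses (28, 28, 1); the final dense layer is an assumption):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(25, (3, 3), input_shape=(28, 28, 1)),  # 25 3x3 filters
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(50, (3, 3)),                           # 50 3x3 filters
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),                                    # 50*5*5 = 1250 values
    layers.Dense(10, activation="softmax"),
])
model.summary()   # shows the 26x26 -> 13x13 -> 11x11 -> 5x5 shapes
```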

What does CNN learn?
The output of the k-th filter in the second convolution layer is an 11x11 matrix of activations (e.g. entries 3, -1, -1, -3, 1, -3, 3, -2, -1, ...). Define the degree of activation of the k-th filter as a^k = Σ_{i=1}^{11} Σ_{j=1}^{11} a^k_{ij}, and find the input x* = arg max_x a^k by gradient ascent. For each filter this yields a repeating, texture-like pattern that the filter responds to.
Doing the same for the neurons after flattening, and even for the output neurons y for digits 0 to 8: can we see digits in the images that maximize each digit output? No. The optimized images look nothing like digits (see "Deep Neural Networks are Easily Fooled").
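A minimal TensorFlow sketch of this gradient ascent, assuming `model` is the CNN defined above (the layer index, step size, and step count are illustrative):

```python
import tensorflow as tf

# Take the output of the second convolution layer of the model above.
extractor = tf.keras.Model(model.inputs, model.layers[2].output)

k = 0                                           # which filter to visualize
x = tf.Variable(tf.random.uniform((1, 28, 28, 1)))

for _ in range(100):
    with tf.GradientTape() as tape:
        a_k = tf.reduce_sum(extractor(x)[..., k])   # degree of activation
    x.assign_add(0.1 * tape.gradient(a_k, x))       # gradient ascent step
```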

What the class outputs respond to is the overall pixel values, not digit shapes.

Deep Dream
Given a photo, the machine adds what it sees: feed the photo to a CNN, exaggerate the activations (make the positive ones more positive and the negative ones more negative), then modify the image to match, so the CNN exaggerates what it sees.

Deep Style
Given a photo, make its style like a famous painting: one CNN captures the content of the photo, another captures the style of the painting, and a new image is found whose content matches the former and whose style matches the latter. (A Neural Algorithm of Artistic Style, /abs/1508.06576)

More Application: Playing Go
Network input: the board as a 19x19 vector (black: 1, white: -1, none: 0). Output: the next move among the 19x19 positions. A fully connected feedforward network can be used, but a CNN on the 19x19 matrix (treated as an image) performs much better. Training: use records of previous plays, e.g. a game where Black plays "5之五", White plays "天元" (the center point), Black plays "五之5", ...; train CNN(current board) with a target that is 1 at the recorded next move ("天元", then "五之5", ...) and 0 elsewhere.
Why CNN for playing Go? Some patterns are much smaller than the whole board, and the same patterns appear in different regions (AlphaGo uses 5x5 filters for its first layer). But what about subsampling? AlphaGo does not use max pooling: removing rows and columns of a Go board does change the position. How to explain this? Design the network structure to fit the properties of the problem.

Variants of Neural Networks: Recurrent Neural Network (RNN)
An RNN is a neural network with memory. Example application: slot filling in a ticket booking system. For "I would like to arrive Taipei on November 2nd.", the system fills the slots Destination: Taipei and time of arrival: November 2nd.
Input: a word, where each word is represented as a vector. Can we solve slot filling with a feedforward network? With 1-of-N encoding, each dimension corresponds to a word in the lexicon.
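A minimal sketch of 1-of-N encoding (the toy lexicon is illustrative):

```python
import numpy as np

lexicon = ["arrive", "Taipei", "on", "November", "2nd"]   # toy lexicon

def one_of_n(word):
    vec = np.zeros(len(lexicon))      # one dimension per word in the lexicon
    vec[lexicon.index(word)] = 1.0
    return vec

print(one_of_n("Taipei"))             # [0. 1. 0. 0. 0.]
```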
