arXiv [cs.LG] 13 Jan 2023

1 INTRODUCTION

Deep supervised learning algorithms have achieved satisfactory performance in fields such as computer vision (CV) and natural language processing (NLP). Generally, supervised learning algorithms need large numbers of labeled examples to obtain good performance. Models trained on large-scale databases such as ImageNet are widely utilized as pretrained models and then fine-tuned for other tasks (Table 1) for two main reasons. First, the parameters learned on large-scale databases provide a good starting point, so networks trained on other tasks can converge more quickly. Second, a network trained on a large-scale database has already learned the relevant hierarchical characteristics, which can help lessen the overfitting problem during the training processes of other tasks, especially when the training instances of the other tasks are few or the training labels are limited.

Unfortunately, in many real data mining and machine learning applications, although many unlabeled training instances can be found, usually only a limited number of labeled training instances are available. Labeled examples are often expensive, difficult, or time-consuming to obtain since they require the efforts of experienced human annotators. For instance, in web user profile analysis, it is easy to collect many web user profiles, but labeling the non-profitable or profitable users in these data requires inspection, judgment, and even time-consuming tracing tasks to be performed by experienced human assessors, which is very expensive. As another case, in the medical field, unlabeled examples can be easily obtained from routine medical examinations, but making diagnoses for so many examples in a case-by-case manner imposes a heavy burden on medical experts. For example, to perform breast cancer diagnosis, radiologists must assign labels to every focus in a large number of easily obtained high-resolution mammograms; this process is often very inefficient and time-consuming. Furthermore, supervised learning methods suffer from spurious correlations and generalization errors, and they are vulnerable to adversarial attacks.

To alleviate the two aforementioned limitations of supervised learning, many machine learning paradigms have been proposed, such as active learning, semi-supervised learning, and self-supervised learning (SSL). This paper focuses on SSL. SSL algorithms have been proposed to learn good features from a large number of unlabeled instances without using any human annotations.

The general pipeline of SSL is shown in Fig. 1. During the self-supervised pretraining phase, a predefined pretext task is designed for a deep learning algorithm to solve, and the pseudo labels for the pretext task are automatically generated based on certain attributes of the input data. Then, the deep learning algorithm is trained to solve the pretext task. After the self-supervised pretraining process is completed, the learned model can be further transferred to downstream tasks (especially when only a relatively small number of examples are available) as a pretrained model to improve performance and overcome overfitting issues.

[Fig. 1: The general pipeline of SSL: pretext tasks are solved on unlabeled data during self-supervised pretraining, and the learned model then initializes (is transferred to) downstream tasks.]

TABLE 1: Contrast between supervised and self-supervised pretraining and fine-tuning.

    Pretraining tasks                                      ->  Downstream tasks
    Supervised (labeled data):
      image categorization                                 ->  detection/segmentation/pose estimation/depth estimation, etc.
      video action categorization                          ->  action recognition/object tracking, etc.
    Self-supervised (unlabeled data):
      Image: rotation, jigsaw, etc.                        ->  detection/segmentation/pose estimation/depth estimation, etc.
      Video: the order of frames, playing direction, etc.  ->  action recognition/object tracking, etc.
      NLP: masked language modeling                        ->  question answering/textual entailment recognition/natural language inference, etc.
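To make the two-stage pipeline of Fig. 1 concrete, the following is a minimal PyTorch-style sketch, not taken from any of the surveyed papers: the ResNet-50 backbone, the 4-way pretext head, and all hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torchvision

# Illustrative backbone: any encoder works; ResNet-50 is assumed here.
encoder = torchvision.models.resnet50(weights=None)
encoder.fc = nn.Identity()            # expose the 2048-d backbone features

pretext_head = nn.Linear(2048, 4)     # e.g., a 4-way rotation-prediction head
optimizer = torch.optim.SGD(
    list(encoder.parameters()) + list(pretext_head.parameters()), lr=0.05)

def pretrain_step(images: torch.Tensor, pseudo_labels: torch.Tensor) -> float:
    """One pretext-task step: the pseudo labels come from the data itself
    (e.g., which rotation was applied), not from human annotators."""
    loss = nn.functional.cross_entropy(
        pretext_head(encoder(images)), pseudo_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# After pretraining, the encoder initializes a downstream model and is
# fine-tuned with a new head on the (possibly small) labeled dataset.
downstream_head = nn.Linear(2048, 10)  # 10 downstream classes, assumed
```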
Because no human annotations are required to generate pseudo labels during self-supervised training, one main merit of SSL algorithms is that they can make the most of large-scale unlabeled data. Trained with these pseudo labels, self-supervised algorithms have achieved promising results, and the performance gap between self-supervised and supervised algorithms in downstream tasks has decreased. Asano et al. [1] showed that, even on only a single image, SSL can surprisingly produce low-level characteristics that generalize well.

SSL [2]-[19] has recently attracted increasing attention. Yann LeCun, one of the recipients of the ACM A.M. Turing Award, gave a keynote talk at the Eighth International Conference on Learning Representations (ICLR 2020) titled "The future is self-supervised". Yann LeCun and Yoshua Bengio, who both received the Turing Award, said that SSL is key to human-level intelligence [20]. According to Google Scholar, a large number of papers related to SSL have already been published; in 2021 alone, SSL papers appeared at a rate of approximately 52 papers every day, or more than two papers per hour (Fig. 2). To prevent researchers from becoming lost in so many SSL papers and to collate the latest research findings, we attempt to provide a timely survey of this topic.

[Fig. 2: Google Scholar search results for "self-supervised learning". The vertical and horizontal axes denote the number of SSL publications and the year, respectively.]

Differences From Previous Work: Reviews of SSL are available for specific applications such as recommender systems [21], graphs [22], [23], sequential transfer learning [24], videos [25], adversarial pretraining of self-supervised deep networks [26], and visual feature learning [27]. Liu et al. [18] mainly covered papers written before 2020, and their work did not contain the latest progress. Jaiswal et al. [28] focused on contrastive learning (CL). SSL research breakthroughs in CV have been achieved in recent years. In this work, we therefore mainly include SSL research derived from the CV community in recent years, especially classic and influential research results. The objectives of this review are to explain what SSL is, its categories and subcategories, how it differs from and relates to other machine learning paradigms, and its theoretical underpinnings. We present an up-to-date and comprehensive review of the frontiers of visual SSL and divide visual SSL into three parts: context-based, contrastive, and generative SSL, in the hope of sorting out the trends for researchers.

The remainder of this paper is organized as follows. Subsequent sections introduce SSL from the perspectives of its algorithms, theory, applications, three main trends, and open questions; the structure is summarized in Table 2.

TABLE 2: Structure of this paper.
- ALGORITHMS: What is SSL?; Pretext tasks: context-based methods, CL, generative algorithms; Combinations with other learning paradigms: generative adversarial networks (GANs), semi-supervised learning, multi-instance learning, multi-view/multi-modal(ality) learning, test time training.
- THEORY: Generative algorithms: maximum likelihood estimation (MLE), the original GANs, InfoGAN's disentangling ability, denoising autoencoder (DAE); Contrastive: connection to other unsupervised learning algorithms, connection to supervised learning, connection to metric learning, understanding the contrastive loss based on alignment and uniformity, the relationship between the contrastive loss and mutual information, complete collapse and dimensional collapse.
- APPLICATIONS: Image processing and computer vision; natural language processing (NLP); other fields.
- MAIN TRENDS: Theoretical analysis of SSL; automatic design of an optimal pretext task; a unified SSL paradigm for multiple modalities.
- OPEN QUESTIONS: Can SSL benefit from almost unlimited data? What is its relationship with multi-modality learning? Which SSL algorithm is the best / which should I use? Do unlabeled data always help?

2 ALGORITHMS

In this section, we first introduce what SSL is. Then, we introduce the pretext tasks of SSL and its combinations with other learning paradigms.

[Fig. 3: Supervised learning, unsupervised learning, and SSL: supervised learning uses human-annotated labels, whereas SSL derives labels from co-occurring inputs. The image is reproduced from [30].]
2.1 What is SSL?

Before diving into SSL, we first introduce the concept of unsupervised learning. In unsupervised learning [29], the training data are composed of a set of input vectors x without any corresponding target values. Representative unsupervised learning algorithms include clustering and density estimation.

SSL was possibly first introduced in [30] (Fig. 3), which exploited the structure of natural environments as perceived through different modalities. For instance, seeing a cow and hearing "mooing" are events that often occur together. Thus, although the sight of a cow does not mean that a cow label should be ascribed, it does co-occur with an example of a "moo". The key is to process the cow image to obtain a self-supervised label for the network so that it can process the "moo" sound, and vice versa.

Since then, the machine learning community has further developed self-supervised learning. In SSL, output labels can be 'intrinsically' generated from the input data examples by exposing the relations between parts of the data or different views of the data; that is, the output labels are generated from the data examples themselves. From this definition, an autoencoder (AE) may be seen as one kind of SSL algorithm in which the output labels are the data themselves. AEs have been widely used in many areas, such as dimensionality reduction and anomaly detection.

In Yann LeCun's keynote talk at ICLR 2020, SSL was described as equal to filling in the blanks (reconstruction), and he gave several forms of SSL (Fig. 4), as follows.
1) Predict any part of the input from any other part.
2) Predict the future from the past.
3) Predict the invisible from the visible.
4) Predict any occluded, masked, or corrupted part from all available parts.
In SSL, a part of the input is unknown, and the goal is to predict that part.

Jing et al. [27] further extended the meaning of SSL as follows: if a method does not involve any human-annotated labels, the method falls into SSL. In this way, SSL is equal to unsupervised learning, and generative adversarial networks (GANs) [31] therefore belong to SSL.

An important concept in the field of SSL is the idea of pretext (also known as surrogate or proxy) tasks. The term "pretext" means that the task being solved is not of true interest but is solved only for the genuine purpose of providing a promising pretrained model. Common pretext tasks include rotation prediction and instance discrimination. To realize different pretext tasks, different loss functions are introduced. As pretext tasks are the most important concept in SSL, we introduce them first.

[Fig. 4: SSL predicts an unknown part of the input (in time or space) from the known part. This figure is reproduced from Yann LeCun's keynote talk at ICLR 2020. The red part is known, and the other part is unknown.]

2.2 Pretext tasks

In this section, we summarize the pretext tasks of SSL. A popular SSL solution is to propose a pretext task for networks to solve, and the networks are trained by learning the objective functions of these pretext tasks. Pretext tasks have two common characteristics. First, features need to be learned by deep learning methods to solve the pretext tasks. Second, supervisory signals are generated from the data themselves (self-supervision). Existing methods generally utilize three types of pretext tasks: context-based methods, CL, and generative algorithms. Here, generative algorithms generally mean masked image modeling (MIM) methods.

2.2.1 Context-based methods

Context-based methods are usually based on the contextual relationships among the given examples, such as their spatial structures and local and global consistency. We now use rotation as a simple example to demonstrate the concept of context-based pretext tasks and then gradually introduce other tasks.

Rotation: The rotation paradigm learns image representations by training neural networks (NNs) to recognize the geometric transformations applied to the original image. For each original image (see "0° rotation" in Fig. 5), Gidaris et al. [33] created three rotated images with 90°, 180°, and 270° rotations. Each image belonged to one of four classes, 0°, 90°, 180°, or 270° rotation, which were the output labels generated from the images themselves. More specifically, there is a set of K = 4 discrete geometric transformations G = \{g(\cdot \mid y)\}_{y=1}^{K}, where g(\cdot \mid y) is the operator that applies a geometric transformation with a label of y to image X to produce the transformed image X^y = g(X \mid y).

[Fig. 5: Rotation. For each original image ("0° rotation"), Gidaris et al. [33] created three rotated images: 90°, 180°, and 270° rotations.]

Gidaris et al. used a deep convolutional NN (CNN) F(·) to predict rotation; this is a four-class categorization task. The CNN F(·) takes an input image X^{y*} (where y* is unknown to F(·)) and produces a probability distribution over all probable geometric transformations:

F(X^{y^*} \mid \theta) = \{ F^y(X^{y^*} \mid \theta) \}_{y=1}^{K},  (1)

where F^y(X^{y^*} \mid \theta) is the predicted probability for the geometric transformation with a label of y and θ denotes the learnable parameters of F(·).

Intuitively, a good CNN should be able to correctly categorize the K = 4 classes of natural images. Thus, given a set of N training instances D = \{X_i\}_{i=1}^{N}, the self-supervised training objective of F(·) is

\min_{\theta} \frac{1}{N} \sum_{i=1}^{N} \mathrm{loss}(X_i, \theta),  (2)

where the loss function loss(·) is

\mathrm{loss}(X_i, \theta) = -\frac{1}{K} \sum_{y=1}^{K} \log F^y(g(X_i \mid y) \mid \theta).  (3)

In [34], the relative rotation angle was constrained to be within the range [−30°, 30°], and the rotations were binned into discrete classes.
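Because the rotation task is so compact, it can be sketched end to end in a few lines. The following is a hedged PyTorch sketch of Eqs. (1)-(3); the helper names are ours, and `backbone` stands for any CNN F(·) emitting K = 4 logits. It illustrates the idea and is not the released code of [33].

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def rotate_batch(x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Create the K = 4 transformed copies g(X|y) and their pseudo labels y."""
    views = [torch.rot90(x, k, dims=(2, 3)) for k in range(4)]  # 0/90/180/270 deg
    labels = torch.arange(4, device=x.device).repeat_interleave(x.size(0))
    return torch.cat(views, dim=0), labels

def rotation_loss(backbone: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Eq. (3) averaged over the batch: cross-entropy on 4 rotation classes."""
    views, labels = rotate_batch(x)
    logits = backbone(views)                # F(X^y | theta), shape (4N, 4)
    return F.cross_entropy(logits, labels)  # -(1/K) sum_y log F^y(g(X|y)|theta)
```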
Colorization: Colorization was first proposed in [35], and [35]-[38] showed that colorization can be a powerful pretext task for SSL. Color prediction has the advantageous characteristic that the training data can be totally free: the L channel of any color image can be used as the input of an NN system, and the corresponding ab channels in the CIE Lab color space serve as the self-supervised targets. Given an input lightness channel X \in \mathbb{R}^{H \times W \times 1}, the objective is to predict the ab color channels Y \in \mathbb{R}^{H \times W \times 2}, where H and W are the height and width, respectively. We use Y and \hat{Y} to denote the ground truth and the predicted value, respectively. A natural objective function minimizes the Frobenius norm between Y and \hat{Y}:

L = \| Y - \hat{Y} \|_F^2.  (4)

[35] used a multinomial cross-entropy loss rather than (4) to better handle the multimodal ambiguity of colorization. Given the L channel of a grayscale image, the trained network predicts its ab color channels; the L channel and the predicted ab channels can then be concatenated to make the original grayscale image colorful.

Jigsaw: The jigsaw approach uses jigsaw puzzles as proxy tasks, relying on the intuition that a network accomplishes the proxy tasks by understanding the contextual information contained in the examples. More specifically, it breaks up pictures into discrete patches, randomly changes their positions, and tries to recover the original order. [39] studied the effect of scaling two self-supervised methods (jigsaw [40]-[43] and colorization) along three dimensions: data size, model capacity, and problem complexity. The results [39] showed that transfer performance increases log-linearly with the data size; the representation quality also improves with higher-capacity models and increased problem complexity. Works closely related to [40] include [44], [45]. The pretext task of [46], [47] was a conditional motion propagation problem. Noroozi et al. [48] enforced an additional constraint on the feature representation process: the sum of the feature representations of all image patches should be approximately equal to the feature representation of the whole image.

Many pretext tasks lead to representations that are covariant with image transformations. [49] argued that semantic representations should be invariant under such transformations and developed a pretext-invariant representation learning (PIRL) approach that learns invariant representations based on pretext tasks.

2.2.2 CL

Following simple instance discrimination tasks [50]-[52], many CL-based SSL methods have emerged, such as momentum contrast (MoCo) v1 [53], MoCo v2 [54], MoCo v3 [55], a simple framework for CL of visual representations (SimCLR) v1 [56], and SimCLR v2 [57]. Classic algorithms such as MoCo have pushed the performance of self-supervised pretraining to a level comparable to that of supervised learning, making SSL relevant for large-scale applications for the first time.

Early CL approaches were constructed based on the idea of negative examples. With the development of CL, a number of CL methods that do not use negative examples have emerged. They follow different ideas, such as self-distillation and feature decorrelation, but they all obey the idea of positive example consistency. We describe the different available CL methods below.

CL methods based on negative examples: Negative example-based CL follows a similar pretext task: instance discrimination. The basic idea is to make positive examples close to each other and negative examples far from each other in the latent space. The exact way in which positive and negative examples are constructed varies according to the given modality and other factors, which can include spatial and temporal consistency in video understanding or the co-occurrence between modalities in multi-modal learning.

MoCo: He et al. [53] viewed CL as a dictionary look-up task. Consider an encoded query q and several encoded keys, and assume that a single key (denoted as k_+) in the dictionary matches q. A contrastive loss [58] is a function whose value is low if q is similar to its positive key k_+ and dissimilar to all other (negative) keys. With similarity measured by the dot product, one contrastive loss function form called InfoNCE [59] was considered in MoCo v1 [53]:

L_q = -\log \frac{\exp(q \cdot k_+ / \tau)}{\sum_{i=0}^{K} \exp(q \cdot k_i / \tau)},  (5)

where τ denotes the temperature hyperparameter and the sum is calculated over one positive example and K negative examples. InfoNCE was derived from noise contrastive estimation (NCE) [60], whose objective is

L_q = -\log \frac{\exp(q \cdot k_+ / \tau)}{\exp(q \cdot k_+ / \tau) + \exp(q \cdot k_- / \tau)},  (6)

where q is similar to a positive example k_+ and dissimilar to a negative example k_-.
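In practice, Eq. (5) reduces to a softmax cross-entropy whose target is the position of the positive key. The hedged PyTorch sketch below illustrates this reduction; the tensor shapes and the `queue` of negative keys mimic MoCo's dictionary, but the names and defaults are our assumptions, not the official implementation.

```python
import torch
import torch.nn.functional as F

def info_nce(q: torch.Tensor, k_pos: torch.Tensor,
             queue: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    """Eq. (5): q is (N, C) queries, k_pos is (N, C) positive keys, and queue
    is (K, C) negative keys, assumed already L2-normalized (q and k_pos are
    re-normalized here for safety)."""
    q, k_pos = F.normalize(q, dim=1), F.normalize(k_pos, dim=1)
    l_pos = torch.einsum("nc,nc->n", q, k_pos).unsqueeze(1)  # (N, 1): q . k_+
    l_neg = torch.einsum("nc,kc->nk", q, queue)              # (N, K): q . k_i
    logits = torch.cat([l_pos, l_neg], dim=1) / tau
    # The positive key sits at column 0, so Eq. (5) is a cross-entropy loss.
    targets = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, targets)
```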
Based on MoCo v1 [53] and SimCLR v1 [56], MoCo v2 [54] uses a multilayer perceptron (MLP) projection head and more data augmentations.

SimCLR: SimCLR v1 [56] samples a minibatch of N instances and defines a contrastive prediction task on pairs of augmented instances derived from the minibatch, producing 2N instances. SimCLR v1 does not explicitly sample negative instances. Instead, given a positive pair, SimCLR v1 treats the other 2(N−1) augmented instances in the minibatch as negative instances. Let \mathrm{sim}(u, v) = u^{\top} v / (\|u\| \|v\|) be the cosine similarity between two instances u and v. Then, the loss function of SimCLR v1 for a positive pair of instances (i, j) is

\ell_{i,j} = -\log \frac{\exp(\mathrm{sim}(z_i, z_j) / \tau)}{\sum_{k=1}^{2N} \mathbb{1}_{[k \neq i]} \exp(\mathrm{sim}(z_i, z_k) / \tau)},  (7)

where \mathbb{1}_{[k \neq i]} \in \{0, 1\} is an indicator function equal to 1 iff k ≠ i and τ is the temperature hyperparameter. The final loss is computed over all positive pairs, both (i, j) and (j, i), in the minibatch.

Both MoCo and SimCLR require data augmentation techniques such as cropping, resizing, and color distortion. Other augmentation methods are available [61]. For example, [62] estimated the foreground saliency levels in images and created augmentations by copying and pasting the image foregrounds onto different backgrounds, such as homogeneous grayscale images with random grayscale levels, texture images, and ImageNet images. However, why augmentation helps and how we can perform more effective augmentations are still unclear and require further study.
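Since sim(·, ·) is a cosine similarity, Eq. (7) can be computed for a whole minibatch as a cross-entropy over a masked 2N × 2N similarity matrix. The sketch below (PyTorch) assumes a view-ordering convention of ours, namely that rows i and i + N of `z` are the two augmentations of the same image; it is illustrative, not SimCLR's released code.

```python
import torch
import torch.nn.functional as F

def nt_xent(z: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """Eq. (7): z is (2N, C); rows i and i + N are views of the same image."""
    n = z.size(0) // 2
    z = F.normalize(z, dim=1)              # dot products become cosine sims
    sim = z @ z.t() / tau                  # (2N, 2N) similarity matrix
    sim.fill_diagonal_(float("-inf"))      # implements the 1_[k != i] indicator
    # The positive for row i is row (i + N) mod 2N; averaging the resulting
    # cross-entropy covers both (i, j) and (j, i) pairs.
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```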
CL methods based on self-distillation: Bootstrap your own latent (BYOL) [63] is a representative self-distillation algorithm. BYOL was proposed for self-supervised image representation learning without using negative pairs. It uses two NNs, called the online and target networks. Similar to MoCo [53], BYOL updates the target network with a slow-moving average of the online network.

Siamese networks such as SimCLR, BYOL, and SwAV [64] have become common structures in various recently developed models for self-supervised visual representation learning. These models maximize the similarity between two augmentations of one image subject to certain conditions that prevent collapsing solutions. [65] proposed simple Siamese (SimSiam) networks that can learn useful representations without using negative sample pairs, large batches, or momentum encoders. For each data point x, we have two randomly augmented views x_1 and x_2. An encoder f and an MLP prediction head h are used to process the two views. Denoting the two outputs by p_1 = h(f(x_1)) and z_2 = f(x_2), [65] minimized their negative cosine similarity:

D(p_1, z_2) = -\frac{p_1}{\|p_1\|_2} \cdot \frac{z_2}{\|z_2\|_2},  (8)

where \|\cdot\|_2 is the \ell_2-norm. Similar to [63], [65] defined a symmetrized loss as

L = \frac{1}{2} D(p_1, z_2) + \frac{1}{2} D(p_2, z_1),  (9)

where this loss is defined for a single example x and the total loss is the average over all examples. More importantly, [65] used a stop-gradient (stopgrad) operation that revises (8) as follows:

D(p_1, \mathrm{stopgrad}(z_2)),  (10)

which means that z_2 is treated as a constant. Analogously, (9) is revised as

L = \frac{1}{2} D(p_1, \mathrm{stopgrad}(z_2)) + \frac{1}{2} D(p_2, \mathrm{stopgrad}(z_1)).  (11)

The architectures of SimCLR, BYOL, SwAV, and SimSiam are shown in Fig. 6. Since BYOL and SimSiam do not use negative examples, whether they belong to CL is debatable; we regard them as belonging to CL in this paper.

[Fig. 6: Architectures of SimCLR, BYOL, SwAV, and SimSiam, showing their encoders, predictors, momentum encoders, weight sharing, and gradient flow. The figure is reproduced from [65].]

CL methods based on feature decorrelation: Feature decorrelation aims to learn decorrelated features.

Barlow twins: Barlow twins [67] were proposed with a novel loss function that makes the embedding vectors of distorted versions of an example similar while minimizing the redundancy between the components of these vectors. More specifically, similar to other SSL methods [53], [56], Barlow twins produce two distorted views for all images of a batch X sampled from a database, yielding batches of embeddings Z^A and Z^B, respectively. The objective function of Barlow twins is

L_{BT} = \sum_i (1 - C_{ii})^2 + \lambda \sum_i \sum_{j \neq i} C_{ij}^2,  (12)

where λ is a positive constant trading off the two terms and C is the cross-correlation matrix computed between the outputs of two equivalent networks along the batch dimension:

C_{ij} = \frac{\sum_b z^A_{b,i} z^B_{b,j}}{\sqrt{\sum_b (z^A_{b,i})^2} \sqrt{\sum_b (z^B_{b,j})^2}},  (13)

where b is the batch example index and i, j are the vector dimension indices of the network outputs. C is a square matrix with a size equal to the dimensionality of the network output.

VICReg: Similar to Barlow twins [67], variance-invariance-covariance regularization (VICReg) operates on two batches of embeddings; Barlow twins consider a cross-correlation matrix, while VICReg considers variance, invariance, and covariance. Let d, n, and z_j denote the dimensionality of the vectors in Z^A, the batch size, and the vector composed of the values at dimension j over all examples of Z^A, respectively. The variance regularization term v of VICReg is defined as a hinge loss function on the standard deviation of the embeddings along the batch dimension:

v(Z^A) = \frac{1}{d} \sum_{j=1}^{d} \max(0, \gamma - S(z_j, \varepsilon)),  (14)

where S is the regularized standard deviation, defined as

S(y, \varepsilon) = \sqrt{\mathrm{Var}(y) + \varepsilon},  (15)

where γ is a constant target value for the standard deviation (set to 1 in the experiments) and ε is a small scalar for preventing numerical instabilities. This criterion encourages the standard deviation along the batch dimension to be at least γ for every dimension, preventing collapse in cases where all data are mapped to the same vector. The invariance criterion s of VICReg between Z^A and Z^B is defined as the mean-squared Euclidean distance between each pair of vectors.
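As a worked illustration of the feature-decorrelation family discussed above, the sketch below implements the Barlow twins objective of Eqs. (12)-(13) in PyTorch. Standardizing the embeddings along the batch dimension turns Eq. (13) into a plain matrix product; the function name and the default λ are our assumptions rather than the reference implementation.

```python
import torch

def barlow_twins_loss(z_a: torch.Tensor, z_b: torch.Tensor,
                      lam: float = 5e-3) -> torch.Tensor:
    """Eqs. (12)-(13): z_a and z_b are the (n, d) embedding batches Z^A, Z^B."""
    n, d = z_a.shape
    # Standardize along the batch dimension so that the cross-correlation of
    # Eq. (13) reduces to (1/n) * Z^A.T @ Z^B.
    z_a = (z_a - z_a.mean(0)) / z_a.std(0)
    z_b = (z_b - z_b.mean(0)) / z_b.std(0)
    c = z_a.t() @ z_b / n                           # (d, d) cross-correlation C
    on_diag = (1.0 - torch.diagonal(c)).pow(2).sum()
    off_diag = c.pow(2).sum() - torch.diagonal(c).pow(2).sum()
    return on_diag + lam * off_diag                 # Eq. (12)
```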