




Clustering

Overview
- Partitioning Methods
  - K-Means
  - Sequential Leader
- Model Based Methods
- Density Based Methods
- Hierarchical Methods

What is cluster analysis?
- Finding groups of objects
  - Objects similar to each other are in the same group.
  - Objects are different from those in other groups.
- Unsupervised Learning
  - No labels
  - Data driven

Clusters
- Inter-Cluster
- Intra-Cluster

Applications of Clustering
- Marketing: finding groups of customers with similar behaviours.
- Biology: finding groups of animals or plants with similar features.
- Bioinformatics: clustering microarray data, genes and sequences.
- Earthquake Studies: clustering observed earthquake epicenters to identify dangerous zones.
- WWW: clustering weblog data to discover groups of similar access patterns.
- Social Networks: discovering groups of individuals with close friendships internally.

Earthquakes

Image Segmentation

The Big Picture

Requirements
- Scalability
- Ability to deal with different types of attributes
- Ability to discover clusters with arbitrary shape
- Minimum requirements for domain knowledge
- Ability to deal with noise and outliers
- Insensitivity to order of input records
- Incorporation of user-defined constraints
- Interpretability and usability

Practical Considerations
- Scaling matters!
- Normalization or not?

Evaluation

Silhouette
- A method of interpretation and validation of clusters of data.
- A succinct graphical representation of how well each data point lies within its cluster compared to other clusters.
- a(i): average dissimilarity of i with all other points in the same cluster
- b(i): the lowest average dissimilarity of i to any other cluster
- The two are combined as s(i) = (b(i) - a(i)) / max(a(i), b(i)), which lies between -1 and 1.

K-Means
- Determine the value of K.
- Choose K cluster centres randomly.
- Each data point is assigned to its closest centroid.
- Use the mean of each cluster to update each centroid.
- Repeat until no assignment changes.
- Return the K centroids.

Reference: J. MacQueen (1967): "Some Methods for Classification and Analysis of Multivariate Observations", Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281-297.

Comments on K-Means
- Pros
  - Simple and works well for regular disjoint clusters.
  - Converges relatively fast.
  - Relatively efficient and scalable: O(t·k·n), where t is the number of iterations, k the number of centroids and n the number of data points.
- Cons
  - Need to specify the value of K in advance. This is difficult; domain knowledge may help.
  - May converge to local optima. In practice, try different initial centroids.
  - May be sensitive to noisy data and outliers. Mean of data points…
  - Not suitable for clusters of non-convex shapes.

The Influence of Initial Centroids

Sequential Leader Clustering
- A very efficient clustering algorithm.
  - No iteration
  - A single pass of the data
- No need to specify K in advance.
- Choose a cluster threshold value.
- For every new data point:
  - Compute the distance between the new data point and every cluster's centre.
  - If the minimum distance is smaller than the chosen threshold, assign the new data point to the corresponding cluster and re-compute the cluster centre.
  - Otherwise, create a new cluster with the new data point as its centre.
- Clustering results may be influenced by the sequence of data points.

Gaussian Mixture

Clustering by Mixture Models

K-Means Revisited
- model parameters
- latent parameters
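The K-Means steps listed earlier can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `k_means`, the `max_iter` and `seed` parameters, the use of squared Euclidean distance, and the choice of K distinct data points as initial centres are all assumptions made for the example. Reading the centroids as the model parameters and the assignments as the latent parameters is likewise an interpretation of the slide, not something it states.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of tuples) into k groups.

    Returns (centroids, assignments), where assignments[i] is the index
    of the centroid that points[i] belongs to.
    """
    rng = random.Random(seed)
    # Choose K cluster centres randomly (assumption: K distinct data points).
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iter):
        # Assign each data point to its closest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Stop once no assignment changes.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Update each centroid to the mean of its cluster.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignments


# Two well-separated groups of one-dimensional points (toy data).
data = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
centroids, labels = k_means(data, k=2)
print(sorted(centroids), labels)
```

Because the result depends on the initial centres, a practical run would repeat this with several seeds and keep the best result, in line with the advice to try different initial centroids.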
Expectation Maximization

EM: Gaussian Mixture

Density Based Methods
- Generate clusters of arbitrary shapes.
- Robust against noise.
- No K value required in advance.
- Somewhat similar to human vision.

DBSCAN
- Density-Based Spatial Clustering of Applications with Noise
- Density: number of points within a specified radius
- Core Point: points with high density
- Border Point: points with low density but in the neighbourhood of a core point
- Noise Point: neither a core point nor a border point
- Relations between two points p and q:
  - directly density reachable
  - density reachable
  - density connected (through a third point o)
- A cluster is defined as the maximal set of density connected points.
- Start from a randomly selected unseen point P.
- If P is a core point, build a cluster by gradually adding all points that are density reachable from the current point set.
- Noise points are discarded (unlabelled).

Hierarchical Clustering
- Produce a set of nested tree-like clusters.
  - Can be visualized as a dendrogram.
- Clustering is obtained by cutting at the desired level.
- No need to specify K in advance.
- May correspond to meaningful taxonomies.

Agglomerative Methods
- Bottom-up Method
  - Assign each data point to its own cluster.
  - Calculate the proximity matrix.
  - Merge the pair of closest clusters.
  - Repeat until only a single cluster remains.
- How to calculate the distance between clusters?
  - Single Link: minimum distance between points of the two clusters
  - Complete Link: maximum distance between points of the two clusters

Example
      BA   FI   MI   NA   RM   TO
BA     0  662  877  255  412  996
FI   662    0  295  468  268  400
MI   877  295    0  754  564  138
NA   255  468  754    0  219  869
RM   412  268  564  219    0  669
TO   996  400  138  869  669    0

Single Link

Example

        BA   FI  MI/TO   NA   RM
BA       0  662    877  255  412
FI     662    0    295  468  268
MI/TO  877  295      0  754  564
NA     255  468    754    0  219
RM     412  268    564  219
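The single-link merge shown in the two tables can be checked with a short script. This is a sketch under assumptions: the function name `single_link_merges`, the `frozenset`-keyed distance dictionary, and the `A/B` naming of merged clusters are illustrative choices, not part of the original example. Only the six-by-six distances are taken from the first table.

```python
def single_link_merges(labels, dist):
    """Repeatedly merge the two closest clusters under single link.

    labels: list of cluster names.
    dist: dict mapping frozenset({a, b}) to the distance between a and b.
    Returns the merges in order as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = list(labels)
    d = dict(dist)  # work on a copy so the caller's table is untouched
    merges = []
    while len(clusters) > 1:
        # Find the closest pair among the current clusters.
        a, b = min(
            ((x, y) for i, x in enumerate(clusters) for y in clusters[i + 1:]),
            key=lambda pair: d[frozenset(pair)],
        )
        merges.append((a, b, d[frozenset((a, b))]))
        merged = a + "/" + b
        # Single link: distance to the merged cluster is the smaller of the two.
        for c in clusters:
            if c not in (a, b):
                d[frozenset((merged, c))] = min(
                    d[frozenset((a, c))], d[frozenset((b, c))]
                )
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
    return merges


CITIES = ["BA", "FI", "MI", "NA", "RM", "TO"]
MATRIX = [
    [0, 662, 877, 255, 412, 996],
    [662, 0, 295, 468, 268, 400],
    [877, 295, 0, 754, 564, 138],
    [255, 468, 754, 0, 219, 869],
    [412, 268, 564, 219, 0, 669],
    [996, 400, 138, 869, 669, 0],
]
dist = {
    frozenset((CITIES[i], CITIES[j])): MATRIX[i][j]
    for i in range(len(CITIES))
    for j in range(i + 1, len(CITIES))
}

merges = single_link_merges(CITIES, dist)

# Distances from the merged MI/TO cluster to the remaining cities.
row = {
    c: min(dist[frozenset(("MI", c))], dist[frozenset(("TO", c))])
    for c in ["BA", "FI", "NA", "RM"]
}
print(merges[0], row)
```

The first merge joins MI and TO at distance 138, the smallest off-diagonal entry, and `row` reproduces the MI/TO row of the second table (877, 295, 754, 564). Replacing `min` with `max` in the update step would give complete link instead.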