




Chapter 19 Clustering Analysis

Content
Similarity coefficient
Hierarchical clustering analysis
Dynamic clustering analysis
Ordered sample clustering analysis

Discriminant Analysis: given individuals known with certainty to come from two or more populations, it is a method for deriving a discriminant model that will allocate further individuals to the correct population.
Clustering Analysis: a statistical method for grouping objects whose categories are not known in advance. It is used when there are no a priori hypotheses and the aim is to find the most appropriate grouping with the help of mathematical statistics and the information collected. It has become the method of first choice for exploring large volumes of genetic information.
Both are multivariate statistical methods for studying classification.
Clustering analysis is a method of exploratory statistical analysis. It can be divided into two major types according to its aim. Let m denote the number of variables (i.e. indexes) and n the number of cases (i.e. samples):
(1) R-type clustering: also called index clustering. It groups the m indexes, aiming to lower the dimension of the indexes and to choose typical ones.
(2) Q-type clustering: also called sample clustering. It groups the n samples to find what they have in common.
The most important thing for both R-type clustering and Q-type clustering is the definition of similarity, that is, how to quantify similarity. The first step of clustering is to define a measure of similarity between two indexes or two samples: the similarity coefficient.

§1 Similarity coefficient
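1. Similarity coefficient of R-type clustering
Suppose there are m variables: X1, X2, …, Xm. R-type clustering usually uses the absolute value of the simple correlation coefficient to define the similarity coefficient between variables. Writing X_{ki} for the value of variable i in case k:
$$ r_{ij} = \frac{\left| \sum_{k=1}^{n} (X_{ki} - \bar{X}_i)(X_{kj} - \bar{X}_j) \right|}{\sqrt{\sum_{k=1}^{n} (X_{ki} - \bar{X}_i)^2 \, \sum_{k=1}^{n} (X_{kj} - \bar{X}_j)^2}} $$
The larger the absolute value, the more similar the two variables. Similarly, the Spearman rank correlation coefficient can be used to define the similarity coefficient of non-normal variables. When the variables are all qualitative, it is best to use the contingency coefficient.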
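As an illustration of the R-type similarity coefficient, the sketch below computes the absolute simple correlation for every pair of variables using only the Python standard library. The function names `pearson` and `similarity_matrix` and the three data columns are hypothetical and chosen for this example; they are not part of the chapter.

```python
import math

def pearson(x, y):
    """Simple (Pearson) correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def similarity_matrix(columns):
    """R-type similarity: |r_ij| for every pair of variables (one list per variable)."""
    m = len(columns)
    return [[abs(pearson(columns[i], columns[j])) for j in range(m)]
            for i in range(m)]

# Hypothetical data: three variables measured on five cases.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 4.0, 6.0, 8.0, 10.0]   # rises with x1
x3 = [5.0, 4.0, 3.0, 2.0, 1.0]    # falls as x1 rises
sim = similarity_matrix([x1, x2, x3])
for row in sim:
    print([round(v, 3) for v in row])
```

Because the absolute value is taken, a strongly negative relation (x1 and x3 here) counts as just as similar as a strongly positive one.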
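2. Similarity coefficients commonly used in Q-type clustering
Suppose the n cases are regarded as n points in an m-dimensional space. The distance between two points can then be used to define the similarity coefficient: the smaller the distance, the more similar the two samples. Write X_{ik} for the value of variable k in sample i.

(1) Euclidean distance
$$ d_{ij} = \sqrt{\sum_{k=1}^{m} (X_{ik} - X_{jk})^2} \tag{19-3} $$

(2) Manhattan distance
$$ d_{ij} = \sum_{k=1}^{m} \left| X_{ik} - X_{jk} \right| \tag{19-4} $$

(3) Minkowski distance
$$ d_{ij} = \left( \sum_{k=1}^{m} \left| X_{ik} - X_{jk} \right|^{q} \right)^{1/q} \tag{19-5} $$

The absolute (Manhattan) distance is the Minkowski distance with q = 1, and the Euclidean distance is the Minkowski distance with q = 2. Euclidean distance is intuitive and simple to compute, but it ignores the correlations among variables. That is why Mahalanobis distance was introduced.

(4) Mahalanobis distance: let S denote the sample covariance matrix of the m variables. The distance is worked out as follows:
$$ d_{ij} = \mathbf{X}' S^{-1} \mathbf{X}, \qquad \mathbf{X} = (X_{i1} - X_{j1}, \ldots, X_{im} - X_{jm})' $$
When S is a unit matrix, the Mahalanobis distance equals the square of the Euclidean distance.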
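The four distances can be tried out with a few lines of standard-library Python. This is only a sketch: the function names are hypothetical, the two sample points are made up, and `mahalanobis_sq` assumes the inverse covariance matrix is already available rather than estimating it from data.

```python
def minkowski(x, y, q):
    """Minkowski distance of order q between two samples."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)

def manhattan(x, y):
    """Absolute distance: Minkowski distance with q = 1."""
    return minkowski(x, y, 1)

def euclidean(x, y):
    """Euclidean distance: Minkowski distance with q = 2."""
    return minkowski(x, y, 2)

def mahalanobis_sq(x, y, s_inv):
    """Quadratic form (x - y)' S^{-1} (x - y); s_inv is the inverse covariance matrix."""
    d = [a - b for a, b in zip(x, y)]
    m = len(d)
    return sum(d[i] * s_inv[i][j] * d[j] for i in range(m) for j in range(m))

# Hypothetical samples with two variables each.
a, b = [0.0, 0.0], [3.0, 4.0]
identity = [[1.0, 0.0], [0.0, 1.0]]

print(manhattan(a, b))                  # 7.0
print(euclidean(a, b))                  # 5.0
print(mahalanobis_sq(a, b, identity))   # 25.0
```

With the identity matrix in place of the inverse covariance, the quadratic form returns 25, the square of the Euclidean distance 5.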
All four distances apply to quantitative variables; qualitative and ordinal variables must be quantified before use.

§2 Hierarchical Clustering Analysis
Hierarchical clustering analysis is the most commonly used method for grouping similar samples or variables. The process is as follows:
1) At the beginning, each sample (or variable) is regarded as a single cluster, that is, each cluster contains only one sample (or variable). Then work out the similarity coefficient matrix among the clusters. The matrix is made up of the similarity coefficients between samples (or variables) and is symmetric.
2) The two clusters with the maximum similarity coefficient (minimum distance or maximum correlation coefficient) are merged into a new cluster. Compute the similarity coefficients between the new cluster and the other clusters. Repeat step 2 until all the samples (or variables) are merged into one cluster.

The calculation of similarity coefficient between clusters
Each step of hierarchical clustering has to calculate the similarity coefficients among clusters. When each of the two clusters contains only one sample or variable, the similarity coefficient between them equals that of the two samples or the two variables, computed as in §1.
When a cluster contains more than one sample or variable, many methods can be used to compute the similarity coefficient. Five of them are listed below. Let G_p and G_q denote the two clusters, containing n_p and n_q samples (or variables) respectively.
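1. The maximum similarity coefficient method
If there are n_p and n_q samples (or variables) in clusters G_p and G_q respectively, there are altogether n_p n_q similarity coefficients between the two clusters, but only the maximum is taken as the similarity coefficient of the two clusters:
$$ D_{pq} = \min_{i \in G_p,\, j \in G_q} d_{ij} \ \text{(sample clustering)}, \qquad r_{pq} = \max_{i \in G_p,\, j \in G_q} r_{ij} \ \text{(index clustering)} $$
Attention: the minimum distance also means the maximum similarity coefficient.

2. The minimum similarity coefficient method
The similarity coefficient between clusters is calculated as follows:
$$ D_{pq} = \max_{i \in G_p,\, j \in G_q} d_{ij} \ \text{(sample clustering)}, \qquad r_{pq} = \min_{i \in G_p,\, j \in G_q} r_{ij} \ \text{(index clustering)} \tag{19-8} $$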
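The two methods differ only in which of the n_p n_q pairwise values is kept. A minimal sketch for sample clustering, where similarity is expressed as distance; the function name `cluster_distance`, the method labels and the distance matrix are hypothetical:

```python
def cluster_distance(dist, p, q, method):
    """Distance between clusters p and q, given as lists of sample indices.

    method="max_similarity": keep the smallest pairwise distance (nearest pair).
    method="min_similarity": keep the largest pairwise distance (farthest pair).
    """
    pairwise = [dist[i][j] for i in p for j in q]
    if method == "max_similarity":
        return min(pairwise)
    if method == "min_similarity":
        return max(pairwise)
    raise ValueError("unknown method: " + method)

# Hypothetical symmetric distance matrix for four samples.
dist = [
    [0.0, 1.0, 4.0, 6.0],
    [1.0, 0.0, 3.0, 5.0],
    [4.0, 3.0, 0.0, 2.0],
    [6.0, 5.0, 2.0, 0.0],
]
gp, gq = [0, 1], [2, 3]
print(cluster_distance(dist, gp, gq, "max_similarity"))  # 3.0
print(cluster_distance(dist, gp, gq, "min_similarity"))  # 6.0
```

For index clustering the same idea applies with the roles reversed: the maximum similarity coefficient method keeps the largest correlation and the minimum similarity coefficient method keeps the smallest.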
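3. The center of gravity method (only used in sample clustering)
Let \bar{X}_p and \bar{X}_q denote the centers of gravity of G_p and G_q; their components are the means of each index within the cluster. The similarity coefficient between the clusters is the distance between the two centers:
$$ D_{pq} = d_{\bar{X}_p \bar{X}_q} $$

4. Cluster equilibration method (only used in sample clustering)
Work out the average squared distance over all pairs of samples taken one from each cluster:
$$ D_{pq}^2 = \frac{1}{n_p n_q} \sum_{i \in G_p} \sum_{j \in G_q} d_{ij}^2 $$
Cluster equilibration is one of the better methods in hierarchical clustering, because it makes full use of the information on the individuals within each cluster.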
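The difference between the center of gravity method and the cluster equilibration method shows up even on a tiny example. The sketch below assumes squared Euclidean distance between samples; the function names and the two clusters are hypothetical.

```python
def sq_euclid(x, y):
    """Squared Euclidean distance between two samples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def centroid(cluster):
    """Center of gravity: the mean of every variable within the cluster."""
    m = len(cluster[0])
    return [sum(s[k] for s in cluster) / len(cluster) for k in range(m)]

def centroid_sq_distance(gp, gq):
    """Center of gravity method: squared distance between the two cluster means."""
    return sq_euclid(centroid(gp), centroid(gq))

def average_sq_distance(gp, gq):
    """Cluster equilibration: mean squared distance over all n_p * n_q cross pairs."""
    total = sum(sq_euclid(x, y) for x in gp for y in gq)
    return total / (len(gp) * len(gq))

# Hypothetical clusters of two samples each.
gp = [(0.0, 0.0), (2.0, 0.0)]
gq = [(4.0, 0.0), (6.0, 0.0)]
print(centroid_sq_distance(gp, gq))   # 16.0
print(average_sq_distance(gp, gq))    # 18.0
```

The averaged value is larger because it also reflects how spread out the samples are inside each cluster, which the centers of gravity alone do not show.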
5. Sum of squares of deviations method
Also called the Ward method; only for sample clustering. It borrows the basic idea of analysis of variance: a rational classification makes the sum of squares of deviations within clusters small and that between clusters large. Suppose the samples have been classified into g clusters, including G_p and G_q. The sum of squares of deviations of the n_p samples in cluster G_p is:
$$ L_p = \sum_{i=1}^{n_p} \sum_{j=1}^{m} (X_{ij} - \bar{X}_j)^2 $$
(\bar{X}_j is the mean of X_j within G_p). The merged sum of squares of deviations of all the g clusters is L^{g} = \sum_{p} L_p. If G_p and G_q are merged, there will be g-1 clusters.
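A small numerical sketch of these quantities: the within-cluster sum of squares, the merged sum over all clusters, and the change in the merged sum when two clusters are joined. The function names and the three clusters are hypothetical.

```python
def within_ss(cluster):
    """Sum of squared deviations of the samples in one cluster from its mean."""
    n, m = len(cluster), len(cluster[0])
    means = [sum(s[j] for s in cluster) / n for j in range(m)]
    return sum((s[j] - means[j]) ** 2 for s in cluster for j in range(m))

def merged_ss(clusters):
    """Merged sum of squares of deviations over all clusters."""
    return sum(within_ss(c) for c in clusters)

def merge_increment(clusters, p, q):
    """Increase in the merged sum of squares if clusters p and q are joined."""
    joined = clusters[p] + clusters[q]
    rest = [c for k, c in enumerate(clusters) if k not in (p, q)]
    return merged_ss(rest + [joined]) - merged_ss(clusters)

# Hypothetical clusters: two pairs and one singleton.
clusters = [
    [(0.0, 0.0), (2.0, 0.0)],
    [(4.0, 0.0), (6.0, 0.0)],
    [(0.0, 9.0)],
]
print(merged_ss(clusters))              # 4.0
print(merge_increment(clusters, 0, 1))  # 16.0
```

A singleton contributes nothing to the merged sum, so the total is zero when every sample is its own cluster and it can only grow as clusters are joined.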
The increment of the merged sum of squares of deviations caused by the merge, D_{pq}^2 = L^{g-1} - L^{g} (the merged sum after the merge minus the merged sum before it), is defined as the squared distance between the two clusters. Obviously, when the n samples each form a single cluster, the merged sum of squares of deviations is 0.

Sample 19-1 Four variables were measured on 3454 adult women: height (X1), length of legs (X2), waistline (X3) and chest circumference (X4). The correlation matrix of the four variables has been worked out. Try to use hierarchical clustering to cluster the 4 indexes.
This is a case of R-type (index) clustering. We choose the simple correlation coefficient as the similarity coefficient, and use the maximum similarity coefficient method to calculate the similarity coefficients among clusters.
The clustering procedure is as follows:
(1) Each index is regarded as a single cluster: G1={X1}, G2={X2}, G3={X3}, G4={X4}. There are altogether 4 clusters.
(2) Merge the two clusters with the maximum similarity coefficient into a new cluster. In this case, we merge G1 and G2 (similarity coefficient 0.852) as G5={X1, X2}. Calculate the similarity coefficients among G5, G3 and G4 to form their similarity matrix.
(3) Merge G3 and G4 as G6={G3, G4}, because this time the similarity coefficient between G3 and G4 is the largest (0.732). Compute the similarity coefficient between G6 and G5.
(4) Lastly, G5 and G6 are merged into one cluster G7={G5, G6}, which in fact includes all the original indexes.
Draw the hierarchical dendrogram (picture 19-1) according to the process of clustering. As the picture indicates, it is best to classify the indexes into two clusters: {X1, X2} and {X3, X4}. That is, the length indexes form one cluster and the circumference indexes the other.
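The merge sequence of Sample 19-1 can be replayed with a short routine for index clustering under the maximum similarity coefficient method. Only the two coefficients quoted in the steps (0.852 for X1 and X2, 0.732 for X3 and X4) come from the example. The other four off-diagonal entries in the matrix below are hypothetical placeholders, chosen only to be smaller than those two, and the function name `cluster_variables` is likewise an assumption.

```python
def cluster_variables(r, names):
    """Agglomerative index clustering with the maximum similarity coefficient method.

    r is a symmetric correlation matrix; returns the merges in the order performed,
    each as (members of first cluster, members of second cluster, similarity).
    """
    clusters = [[i] for i in range(len(names))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Similarity between clusters: the largest |r_ij| over cross pairs.
                sim = max(abs(r[i][j]) for i in clusters[a] for j in clusters[b])
                if best is None or sim > best[0]:
                    best = (sim, a, b)
        sim, a, b = best
        merges.append(([names[i] for i in clusters[a]],
                       [names[i] for i in clusters[b]], sim))
        joined = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [joined]
    return merges

names = ["X1", "X2", "X3", "X4"]
# 0.852 and 0.732 are quoted in the example; every other off-diagonal
# value is a HYPOTHETICAL placeholder, not the published coefficient.
r = [
    [1.000, 0.852, 0.100, 0.230],
    [0.852, 1.000, 0.060, 0.170],
    [0.100, 0.060, 1.000, 0.732],
    [0.230, 0.170, 0.732, 1.000],
]
merges = cluster_variables(r, names)
for first, second, sim in merges:
    print(first, "+", second, "at similarity", sim)
```

With these inputs the routine joins X1 with X2, then X3 with X4, and finally the two pairs, which is the order described in steps (2) to (4). The similarity at the last merge depends entirely on the placeholder values.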
Picture 19-1 Hierarchical dendrogram of the 4 indexes (leaf labels: height, length of legs, waistline, chest circumference)

Sample 19-2 Table 19-1 lists the mean energy expenditure and sugar expenditure of six athletes in four athletic items. In order to provide appropriate dietary standards for improving performance, cluster the athletic items using hierarchical clustering.
Table 19-1 Measured values of 4 athletic items

Athletic item                   Energy expenditure X1    Sugar expenditure X2   Standardized X1   Standardized X2
                                (joule/(minute·m²))      (%)
Weight loading crouching (G1)   27.892                   61.42                   1.315             0.688
Pull-up (G2)                    23.475                   56.83                   0.174             0.088
Push-ups (G3)                   18.924                   45.13                  -1.001            -1.441
Sit-up (G4)                     20.913                   61.25                  -0.488             0.665
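The last two columns of Table 19-1 can be reproduced from the raw columns, assuming they are z-scores computed with the sample standard deviation (divisor n-1). A minimal sketch with the standard `statistics` module; the helper name `standardize` is hypothetical.

```python
from statistics import mean, stdev

# Raw columns of Table 19-1, in the order G1, G2, G3, G4.
energy = [27.892, 23.475, 18.924, 20.913]
sugar = [61.42, 56.83, 45.13, 61.25]

def standardize(values):
    """Subtract the sample mean and divide by the sample standard deviation."""
    centre, spread = mean(values), stdev(values)
    return [(v - centre) / spread for v in values]

z_energy = [round(z, 3) for z in standardize(energy)]
z_sugar = [round(z, 3) for z in standardize(sugar)]
print(z_energy)   # [1.315, 0.174, -1.001, -0.488]
print(z_sugar)    # [0.688, 0.088, -1.441, 0.665]
```

The rounded results match the standardized columns of the table, which supports the n-1 divisor assumption.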
We choose Minkowski distance in this sample, and use the minimum similarity coefficient method to calculate distances among clusters. To reduce the effect of the variables' units of measurement, the variables should be standardized before analysis:
$$ X_i' = \frac{X_i - \bar{X}_i}{S_i} $$
where \bar{X}_i and S_i are the sample mean and standard deviation of X_i. The transformed data are listed in table 19-1.

The clustering process:
(1) Compute the similarity coefficient matrix (i.e. distance matrix) of the 4 samples. The distance between weight loading crouching and pull-ups is worked out using formula (19-3); the distance between weight loading crouching and push-ups is computed likewise, and so on until the distance matrix is complete.
(2) The distance between G2 and G4 is the minimum, so G2 and G4 are merged into a new cluster G5={G2, G4}. Compute the distances between G5 and the other clusters using the minimum similarity coefficient method according to formula (19-8), which gives the distance matrix of G1, G3 and G5.
(3) Merge G1 and G5 into a new cluster G6={G1, G5}. Compute the distance between G6 and G3.
(4) Lastly, merge G3 and G6 into G7={G3, G6}. All the items have now been merged into one large cluster.
According to the process of clustering, draw the hierarchy dendrogram (chart 19-2). Judging from the dendrogram and from subject-matter knowledge, the items should be sorted into two clusters: {G1, G2, G4} and {G3}. Physical energy expenditure in weight loading crouching, pull-ups and sit-ups is much higher, so improved dietary standards may be required for those items during training.
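The whole of Sample 19-2 can be replayed on the standardized values of Table 19-1. The sketch below assumes Euclidean distance (the Minkowski distance with q = 2) between items and takes the distance between clusters as that of their farthest pair, which is the minimum similarity coefficient method. The function names are hypothetical, and the distances it prints are computed here rather than copied from the example.

```python
import math

# Standardized (X1, X2) values from Table 19-1.
items = {
    "G1": (1.315, 0.688),     # weight loading crouching
    "G2": (0.174, 0.088),     # pull-up
    "G3": (-1.001, -1.441),   # push-ups
    "G4": (-0.488, 0.665),    # sit-up
}

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def farthest_pair_distance(p, q):
    """Minimum similarity coefficient method: largest distance over cross pairs."""
    return max(euclidean(items[i], items[j]) for i in p for j in q)

def agglomerate():
    """Merge the two closest clusters repeatedly; return (members, distance) per merge."""
    clusters = [(name,) for name in items]
    history = []
    while len(clusters) > 1:
        d, a, b = min(
            (farthest_pair_distance(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        joined = tuple(sorted(clusters[a] + clusters[b]))
        history.append((joined, round(d, 3)))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [joined]
    return history

history = agglomerate()
for members, d in history:
    print(members, d)
# ('G2', 'G4') 0.878
# ('G1', 'G2', 'G4') 1.803
# ('G1', 'G2', 'G3', 'G4') 3.146
```

Under these assumptions the merge order agrees with steps (2) to (4): G2 with G4 first, then G1, and G3 last, which is why cutting the tree into two clusters leaves {G1, G2, G4} and {G3}.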
Analysis of clustering examples
Different definitions of the similarity coefficient between samples and between clusters lead to different clustering results. Subject-matter expertise, as well as the clustering method, is important for interpreting a clustering analysis.
Sample 19-3 Twenty-seven petroleum pitch workers and pyro-furnace men were surveyed about their age, length of service and smoking. In addition, serum P21, serum P53, peripheral blood lymphocyte SCE (sister chromatid exchange), the number of chromosomal aberrations and the number of cells with chromosomal aberrations were measured in these workers (table 19-3). (P21 multiple = P21 detection value / mean P21 of the control group.) Sort the 27 workers using a suitable hierarchical clustering method.
Table 19-3 Results of bio-marker detection and clustering analysis of petroleum pitch workers and pyro-furnace men

No.  Age  Length of  Smoking           Serum  P21       P53   SCE    Number of    Number of cells   Result of
          service    (cigarettes/day)  P21    multiple               chromosome   with chromosome   clustering
                                                                     aberrations  aberrations
1    …    …          …                 …      …         …     …      …            …                 …
2    35   12         20                3510   2.76      1.43  6.84   3            3                 1
3    52   25         20                2784   2.19      0.54  4.11   3            3                 1
4    32   7          20                2451   1.93      0.47  11.45  9            6                 1
5    38   ?          ?                 3247   2.56      0.80  11.68  5            5                 1
6    51   31         30                3710   2.92      0.37  11.60  2            2                 1
7    40   9          10                3194   2.51      0.40  11.40  5            5                 1
8    34   17         20                4658   3.67      0.46  11.35  3            3                 1
9    50   29         0                 5019   3.95      0.47  13.45  10           8                 1
10   42   20         20                7482   5.89      0.12  13.11  0            0                 2
11   57   30         15                3800   2.99      0.19  10.76  2            2                 1
12   36   15         20                2478   1.95      0.25  10.00  0            0                 1
13   37   ?          ?                 3827   3.01      0.82  10.50  4            4                 1
14   52   ?          ?                 2984   2.35      0.16  11.15  3            3                 1
15   52   32         10                3749   2.95      0.72  11.45  11           10                1
16   42   27         30                4941   3.89      0.73  13.80  7            6                 1
17   44   27         20                3948   3.11      0.33  13.65  16           14                1
18   40   ?          ?                 3360   2.64      0.37  11.40  0            0                 1
19   38   ?          ?                 2936   2.31      0.69  11.40  1            1                 1
20   44   27         20                6851   5.39      0.99  12.28  7            6                 2
21   43   27         0                 3926   3.09      0.47  11.95  0            0                 1
22   26   10         3                 4381   3.45      0.52  11.80  7            5                 1
23   37   18         20                7142   5.62      0.85  11.81  5            5                 2
24   28   9          20                2612   2.06      0.37  11.65  1            1                 1
25   25   9          30                2638   2.08      0.78  12.25  1            1                 1
26   34   14         20                4322   3.40      0.41  15.00  5            5                 1
27   50   32         20                2862   2.25      0.69  8.80   2            2                 1

Row 1 is truncated in the source; only the fragment "…68 0.35 8.11 4 4" survives. A "?" marks rows where the digits for length of service and smoking run together and cannot be separated with certainty: 220 in row 5, 120 in row 13, 320 in row 14, and 215 in rows 18 and 19.

This example applies the minimum similarity coefficient method based on Euclidean distance, the cluster equilibration method and the sum of squares of deviations method to cluster the data. The results are shown in chart 19-3, chart 19-4 and chart 19-5. All the variables were standardized before analysis.
Chart 19-3 Hierarchy dendrogram of the 27 petroleum pitch workers and pyro-furnace men using the minimum similarity coefficient method
Chart 19-4 Hierarchy dendrogram of the 27 petroleum pitch workers and pyro-furnace men using the cluster equilibration method
Chart 19-5 Hierarchy dendrogram of the 27 petroleum pitch workers and pyro-furnace men using the sum of squares of deviations method
The outcomes of the three kinds of clustering are not the same, which shows that different methods perform differently. The differences become more distinct as the number of variables grows, so it is better to select informative variables before clustering analysis, such as P21 and P53 in this example. More information can be obtained by reading the clustering charts.
Judging by subject-matter expertise, the outcome of the cluster equilibration method is the more reasonable one. Its classification is entered in the last column of table 19-3. Workers {10, 20, 23} are classified as one class and the others as another. Researchers find that workers {10, 20, 23} are at high risk of cancer. In the chart of the sum of squares of deviations method, workers {10, 20, 23, 8, 16, 26} are clustered together, which suggests that workers 8, 16 and 26 may be at high risk too.

Dynamic clustering
If there are too many samples to classify, hierarchical clustering analysis needs a lot of space to store the similarity coefficient matrix and is quite inefficient. What is more, a sample cannot be moved once it has been classified. Because of these shortcomings, statisticians put forward dynamic clustering, which overcomes the inefficiency and adjusts the classification as clustering proceeds.
The principle of dynamic clustering analysis is: first, select several representative samples, called cohesion points, as the core of each class; second, assign the other samples to the classes; then adjust the core of each class until the classification is reasonable. The most common method of dynamic clustering is k-means, which is quite efficient and simple in principle. It gives results even when the number of samples is large. However, the number of classes must be known before the analysis. In some circumstances expertise tells us this number, but in other cases it does not.

Ordinal Clustering Methods
The clustering methods mentioned so far are for unordered samples. But there is another kind of data, such as growth and development data at different ages, or incidence rates in different years and locations. These data are ordered in time or space, so they are called ordinal data. The order has to be taken into consideration when classifying and must not be destroyed; methods that respect it are called ordinal clustering methods.

Attentions
1. Clustering analysis is used to explore data. The interpretation of its outcomes must be combined with subject-matter expertise. Try different clustering methods to obtain reasonable outcomes.
2. Pre-process the variables: get rid of useless variables that vary little and those with too many missing values. Generally speaking, a standard transformation or a range transformation is needed to eliminate the effect of units of measurement and of the coefficient of variation.
3. A reasonable classification leads to distinct differences between classes and only minute differences within a class. After classifying, we can apply analysis of variance in the case of a single variable, or multivariate analysis of variance in the case of multiple variables, to check the statistical differences between classes.
4. Fuzzy clustering analysis, neural network clustering analysis and other specialised methods for exploring genetic data are not introduced here; please look up the related information on the internet.
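The k-means procedure described under dynamic clustering (choose cohesion points, assign samples, adjust the cores, repeat) can be sketched in plain Python. This is an illustrative sketch, not a tuned implementation: the function names, the six sample points and the choice of the first two samples as cohesion points are all assumptions made for the example.

```python
def sq_dist(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def mean_point(points):
    m = len(points[0])
    return tuple(sum(p[k] for p in points) / len(points) for k in range(m))

def kmeans(samples, centres, max_iter=100):
    """Assign each sample to its nearest core, then move each core to the mean of
    its class; stop when the assignment no longer changes."""
    centres = list(centres)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centres)), key=lambda c: sq_dist(s, centres[c]))
            for s in samples
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(len(centres)):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            if members:                      # keep the old core if a class is empty
                centres[c] = mean_point(members)
    return labels, centres

# Hypothetical samples forming two well-separated groups.
samples = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
           (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
# Deliberately poor cohesion points: both are taken from the first group.
labels, centres = kmeans(samples, centres=[samples[0], samples[1]])
print(labels)    # [0, 0, 0, 1, 1, 1]
print(centres)
```

Even though both starting cores lie in the same group, the adjustment step pulls the second core towards the distant samples, and the second sample is reassigned on the next pass. Hierarchical clustering cannot make that kind of correction once a sample has been merged.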