




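Cloud Computing and Cloud Data Management
Jiaheng Lu (陸嘉恒), Renmin University of China
"Advanced Data Management" Frontier Workshop
2023/12/6

Outline
- Overview of cloud computing
- Google cloud computing techniques: GFS, Bigtable and MapReduce
- Yahoo cloud computing techniques and Hadoop
- Challenges of cloud data management

Renmin University's new course "Distributed Systems and Cloud Computing"
- Overview of distributed systems
- Survey of distributed cloud computing technologies
- Distributed cloud computing platforms
- Distributed cloud computing application development

Part I: Overview of Distributed Systems
- Chapter 1: Introduction to distributed systems
- Chapter 2: Client-server architecture
- Chapter 3: Distributed objects
- Chapter 4: Common Object Request Broker Architecture (CORBA)

Part II: Survey of Cloud Computing
- Chapter 5: Introduction to cloud computing
- Chapter 6: Cloud services
- Chapter 7: Comparison of cloud-related technologies
  - 7.1 Grid computing and cloud computing
  - 7.2 Utility computing and cloud computing
  - 7.3 Parallel and distributed computing and cloud computing
  - 7.4 Cluster computing and cloud computing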

Part III: Cloud Computing Platforms
- Chapter 8: The three core technologies of the Google cloud platform
- Chapter 9: Technologies of the Yahoo cloud platform
- Chapter 10: Technologies of the Aneka cloud platform
- Chapter 11: Technologies of the Greenplum cloud platform
- Chapter 12: Technologies of the Amazon Dynamo cloud platform

Part IV: Cloud Computing Platform Development
- Chapter 13: Development on Hadoop
- Chapter 14: Development on HBase
- Chapter 15: Development on Google Apps
- Chapter 16: Development on MS Azure
- Chapter 17: Development on Amazon EC2

Cloud computing

Why do we use cloud computing?
- Case 1: write a file and save it
  - If the computer goes down, the file is lost
  - Files are always stored in the cloud and never lost

Why do we use cloud computing?
- Case 2:
  - Use IE: download, install, use
  - Use QQ: download, install, use
  - Use C++: download, install, use
  - ……
  - Get the service from the cloud

What is cloud and cloud computing?
- Cloud: resources or services on demand over the Internet, with the scale and reliability of a data center.

What is cloud and cloud computing?
- Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.
- Users need not have knowledge of, expertise in, or control over the technology infrastructure in the "cloud" that supports them.

Characteristics of cloud computing
- Virtual: software, databases, Web servers, operating systems, storage and networking as virtual servers.
- On demand: add and subtract processors, memory, network bandwidth, storage.

Types of cloud service
- IaaS: Infrastructure as a Service
- PaaS: Platform as a Service
- SaaS: Software as a Service

SaaS
- Software delivery model
- No hardware or software to manage
- Service delivered through a browser
- Customers use the service on demand
- Instant scalability

SaaS: Examples
- Your current CRM package is not managing the load, or you simply don’t want to host it in-house. Use a SaaS provider such as S
- Your email is hosted on an Exchange server in your office and it is very slow. Outsource this using Hosted Exchange.

PaaS
- Platform delivery model
- Platforms are built upon infrastructure, which is expensive
- Estimating demand is not a science!
- Platform management is not fun!

PaaS: Examples
- You need to host a large file (5 Mb) on your website and make it available to 35,000 users for only two months. Use CloudFront from Amazon.
- You want to start storage services on your network for a large number of files and you do not have the storage capacity. Use Amazon S3.

IaaS
- Computer infrastructure delivery model
- A platform virtualization environment
- Computing resources, such as storing and processing capacity.
- Virtualization taken a step further

IaaS: Examples
- You want to run a batch job but you don’t have the infrastructure necessary to run it in a timely manner. Use Amazon EC2.
- You want to host a website, but only for a few days. Use Flexiscale.

Cloud computing and other computing techniques

The 21st Century Vision of Computing
- Leonard Kleinrock, one of the chief scientists of the original Advanced Research Projects Agency Network (ARPANET) project which seeded the Internet, said: “As of now, computer networks are still in their infancy, but as they grow up and become sophisticated, we will probably see the spread of ‘computer utilities’ which, like present electric and telephone utilities, will service individual homes and offices across the country.”

The 21st Century Vision of Computing
- Sun Microsystems co-founder Bill Joy also indicated: “It would take time until these markets to mature to generate this kind of value. Predicting now which companies will capture the value is impossible. Many of them have not even been created yet.”

Definitions: Cloud, Grid, Cluster, Utility
- Utility computing is the packaging of computing resources, such as computation and storage, as a metered service similar to a traditional public utility.
- A computer cluster is a group of linked computers, working together closely so that in many respects they form a single computer.
- Grid computing is the application of several computers to a single problem at the same time, usually to a scientific or technical problem that requires a great number of computer processing cycles or access to large amounts of data.
- Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.

Grid Computing & Cloud Computing
- They share a lot in common: intention, architecture and technology
- They differ in: programming model, business model, compute model, applications, and virtualization.

Grid Computing & Cloud Computing
- The problems are mostly the same:
  - manage large facilities;
  - define methods by which consumers discover, request and use resources provided by the central facilities;
  - implement the often highly parallel computations that execute on those resources.

Grid Computing & Cloud Computing: Virtualization
- Grid: does not rely on virtualization as much as clouds do; each individual organization maintains full control of its resources
- Cloud: virtualization is an indispensable ingredient for almost every cloud

Any questions or comments?

Outline
- Overview of cloud computing
- Google cloud computing techniques: GFS, Bigtable and MapReduce
- Yahoo cloud computing techniques and Hadoop
- Challenges of cloud data management

Google Cloud computing techniques

The Google File System (GFS)
- A scalable distributed file system for large distributed data-intensive applications
- Multiple GFS clusters are currently deployed. The largest ones have:
  - 1000+ storage nodes
  - 300+ terabytes of disk storage
  - heavy access by hundreds of clients on distinct machines

Introduction
- Shares many of the same goals as previous distributed file systems: performance, scalability, reliability, etc.
- GFS design has been driven by four key observations of Google application workloads and technological environment

Intro: Observations 1
- 1. Component failures are the norm
  - constant monitoring, error detection, fault tolerance and automatic recovery are integral to the system
- 2. Huge files (by traditional standards)
  - Multi-GB files are common
  - I/O operations and block sizes must be revisited

Intro: Observations 2
- 3. Most files are mutated by appending new data
  - This is the focus of performance optimization and atomicity guarantees
- 4. Co-designing the applications and APIs benefits the overall system by increasing flexibility

The Design
- Cluster consists of a single master and multiple chunkservers and is accessed by multiple clients

The Master
- Maintains all file system metadata: namespace, access control info, file-to-chunk mappings, chunk (including replica) locations, etc.
- Periodically communicates with chunkservers in HeartBeat messages to give instructions and check state

The Master
- Helps make sophisticated chunk placement and replication decisions, using global knowledge
- For reading and writing, client contacts Master to get chunk locations, then deals directly with chunkservers
  - Master is not a bottleneck for reads/writes

Chunkservers
- Files are broken into chunks. Each chunk has an immutable, globally unique 64-bit chunk handle.
  - handle is assigned by the master at chunk creation
- Chunk size is 64 MB
- Each chunk is replicated on 3 (default) servers

Clients
- Linked to apps using the file system API.
- Communicate with master and chunkservers for reading and writing
  - Master interactions only for metadata
  - Chunkserver interactions for data
- Only cache metadata information
  - Data is too large to cache.

Chunk Locations
- Master does not keep a persistent record of locations of chunks and replicas.
- Polls chunkservers for this at startup, and when chunkservers join or leave.
- Stays up to date by controlling placement of new chunks and through HeartBeat messages (when monitoring chunkservers)

Operation Log
- Record of all critical metadata changes
- Stored on Master and replicated on other machines
- Defines order of concurrent operations
- Also used to recover the file system state

System Interactions:
Leases and Mutation Order
- Leases maintain a mutation order across all chunk replicas
- Master grants a lease to a replica, called the primary
- The primary chooses the serial mutation order, and all replicas follow this order
- Minimizes management overhead for the Master

Atomic Record Append
- Client specifies the data to write; GFS chooses and returns the offset it writes to and appends the data to each replica at least once
- Heavily used by Google’s distributed applications.
  - No need for a distributed lock manager
  - GFS chooses the offset, not the client

Atomic Record Append: How?
- Follows similar control flow as mutations
- Primary tells secondary replicas to append at the same offset as the primary
- If the append fails at any replica, it is retried by the client.
  - So replicas of the same chunk may contain different data, including duplicates, whole or in part, of the same record

Atomic Record Append: How?
- GFS does not guarantee that all replicas are bitwise identical.
  - Only guarantees that data is written at least once in an atomic unit.
  - Data must be written at the same offset for all chunk replicas for success to be reported.

Detecting Stale Replicas
- Master has a chunk version number to distinguish up-to-date and stale replicas
- Increases version when granting a lease
- If a replica is not available, its version is not increased
- Master detects stale replicas when chunkservers report chunks and versions
- Removes stale replicas during garbage collection

Garbage collection
- When a client deletes a file, master logs it like other changes and changes the filename to a hidden file.
- Master removes files hidden for longer than 3 days when scanning file system namespace
  - metadata is also erased
- During HeartBeat messages, each chunkserver sends the master a subset of its chunks, and the master tells it which files have no metadata.
  - Chunkserver removes these files on its own

Fault Tolerance:
High Availability
- Fast recovery
  - Master and chunkservers can restart in seconds
- Chunk replication
- Master replication
  - “shadow” masters provide read-only access when primary master is down
  - mutations not done until recorded on all master replicas

Fault Tolerance:
Data Integrity
- Chunkservers use checksums to detect corrupt data
  - Since replicas are not bitwise identical, chunkservers maintain their own checksums
- For reads, chunkserver verifies checksum before sending chunk
- Update checksums during writes

Introduction to
MapReduce

MapReduce: Insight
- “Consider the problem of counting the number of occurrences of each word in a large collection of documents”
- How would you do it in parallel?

MapReduce Programming Model
- Inspired by the map and reduce operations commonly used in functional programming languages like Lisp.
- Users implement an interface of two primary methods:
  - 1. Map: (key1, val1) → (key2, val2)
  - 2. Reduce: (key2, [val2]) → [val3]
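
The two signatures above can be made concrete with a small sequential driver. This is only a sketch under stated assumptions: `run_mapreduce` and the sample mapper and reducer are hypothetical names chosen for illustration, and everything runs in one process rather than on a cluster.

```python
from collections import defaultdict

def run_mapreduce(inputs, mapper, reducer):
    """Sequential stand-in for a MapReduce run (hypothetical helper).

    inputs:  iterable of (key1, val1) pairs
    mapper:  (key1, val1) -> iterable of (key2, val2)
    reducer: (key2, [val2]) -> list of val3
    """
    # Map phase: collect every intermediate value under its key2.
    intermediate = defaultdict(list)
    for key1, val1 in inputs:
        for key2, val2 in mapper(key1, val1):
            intermediate[key2].append(val2)
    # Reduce phase: one reducer call per distinct key2.
    return {key2: list(reducer(key2, vals))
            for key2, vals in sorted(intermediate.items())}

# Illustrative job: group names by their length.
people = [("ann", 3), ("bob", 3), ("dave", 4)]
by_length = run_mapreduce(
    people,
    mapper=lambda name, length: [(length, name)],
    reducer=lambda length, names: sorted(names),
)
```

Here `by_length` maps each length to the sorted names of that length; the grouping between the two phases is the part a real framework performs across machines.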

Map operation
- Map, a pure function written by the user, takes an input key/value pair and produces a set of intermediate key/value pairs.
  - e.g. (doc-id, doc-content)
- Drawing an analogy to SQL, map can be visualized as the group-by clause of an aggregate query.
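
To illustrate the SQL analogy, the sketch below uses a hypothetical `orders` data set: the map function picks the key that a query would name in its GROUP BY clause, and the framework (imitated here with `itertools.groupby`) collects the values per key. The record layout and the function name `map_order` are assumptions made for this example.

```python
from itertools import groupby
from operator import itemgetter

def map_order(order_id, order):
    # Emit (customer, amount): "customer" plays the role of the GROUP BY column.
    yield order["customer"], order["amount"]

# Hypothetical input records keyed by order id.
orders = [
    (1, {"customer": "alice", "amount": 30}),
    (2, {"customer": "bob", "amount": 15}),
    (3, {"customer": "alice", "amount": 20}),
]

# Run the map function over every input pair.
pairs = [kv for order_id, order in orders for kv in map_order(order_id, order)]

# Imitate the framework's grouping step: sort by key, then group.
pairs.sort(key=itemgetter(0))
grouped = {key: [value for _, value in group]
           for key, group in groupby(pairs, key=itemgetter(0))}
```

After this step `grouped` holds one list of amounts per customer, which is the input shape a reducer receives.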

Reduce operation
- On completion of the map phase, all the intermediate values for a given output key are combined together into a list and given to a reducer.
- Can be visualized as an aggregate function (e.g., average) that is computed over all the rows with the same group-by attribute.

Pseudo-code
    map(String input_key, String input_value):
        // input_key: document name
        // input_value: document contents
        for each word w in input_value:
            EmitIntermediate(w, "1");

    reduce(String output_key, Iterator intermediate_values):
        // output_key: a word
        // intermediate_values: a list of counts
        int result = 0;
        for each v in intermediate_values:
            result += ParseInt(v);
        Emit(AsString(result));

MapReduce: Execution overview

MapReduce: Example
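
As a runnable counterpart to the word-count pseudo-code, here is a minimal single-process sketch in Python. The function names and the three sample documents are assumptions for illustration; splitting on whitespace stands in for real tokenization.

```python
from collections import defaultdict

def map_fn(doc_name, contents):
    # Emit (word, 1) for every word in the document.
    for word in contents.split():
        yield word, 1

def reduce_fn(word, counts):
    # Sum all partial counts for one word.
    return sum(counts)

def word_count(documents):
    # Map phase, with intermediate values grouped by word.
    intermediate = defaultdict(list)
    for name, contents in documents.items():
        for word, count in map_fn(name, contents):
            intermediate[word].append(count)
    # Reduce phase: one call per distinct word.
    return {word: reduce_fn(word, counts)
            for word, counts in intermediate.items()}

docs = {
    "d1": "the quick brown fox",
    "d2": "the lazy dog",
    "d3": "the fox",
}
counts = word_count(docs)
```

With these sample documents, `counts` records three occurrences of "the" and two of "fox".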

MapReduce in Parallel: Example
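
The parallel version can be sketched by running several map tasks and several reduce tasks concurrently, with a partition function deciding which reduce task receives each word. This is an illustration only: a thread pool stands in for worker machines, `R`, `partition`, `map_task` and `reduce_task` are hypothetical names, and the CRC32-based partitioning is an assumed choice.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

R = 2  # number of reduce partitions (assumed for the example)

def partition(word):
    # Deterministic assignment of a word to one of R reduce tasks.
    return zlib.crc32(word.encode("utf-8")) % R

def map_task(split):
    # split: list of (doc_name, contents); output: R buckets of (word, 1).
    buckets = [[] for _ in range(R)]
    for _name, contents in split:
        for word in contents.split():
            buckets[partition(word)].append((word, 1))
    return buckets

def reduce_task(pairs):
    # Sum the counts of every word routed to this partition.
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

def parallel_word_count(splits):
    with ThreadPoolExecutor() as pool:
        map_outputs = list(pool.map(map_task, splits))
        # Shuffle: reduce task r reads bucket r of every map output.
        reduce_inputs = [[pair for out in map_outputs for pair in out[r]]
                         for r in range(R)]
        reduce_outputs = list(pool.map(reduce_task, reduce_inputs))
    result = {}
    for part in reduce_outputs:
        result.update(part)
    return result

splits = [[("d1", "a b a")], [("d2", "b c")]]
totals = parallel_word_count(splits)
```

Because every occurrence of a word lands in the same partition, each reduce task can finish its words without talking to the others.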

MapReduce: Fault Tolerance
- Handled via re-execution of tasks.
  - Task completion committed through master
- What happens if a Mapper fails?
  - Re-execute completed + in-progress map tasks
- What happens if a Reducer fails?
  - Re-execute in-progress reduce tasks
- What happens if the Master fails?
  - Potential trouble!!

MapReduce: Walkthrough of One More Application

MapReduce: PageRank
- PageRank models the behavior of a “random surfer”.
- C(t) is the out-degree of t, d is the damping factor, and (1-d) accounts for the random jump.
- The “random surfer” keeps clicking on successive links at random, not taking content into consideration.
- Each page distributes its PageRank equally among all pages it links to.
- The damping factor models the surfer “getting bored” and typing an arbitrary URL.

PageRank: Key Insights
- Effects at each iteration are local: iteration i+1 depends only on iteration i
- At iteration i, PageRank for individual nodes can be computed independently

PageRank using MapReduce
- Use a sparse matrix representation (M)
- Map each row of M to a list of PageRank “credit” to assign to out-link neighbours.
- These prestige scores are reduced to a single PageRank value for a page by aggregating over them.

PageRank using MapReduce
- Map: distribute PageRank “credit” to link targets
- Reduce: gather up PageRank “credit” from multiple sources to compute new PageRank value
- Iterate until convergence
- Source of image: Lin 2008
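
The map and reduce steps described above can be sketched as follows. The update rule used here, PR(x) = (1 - d)/N + d * sum of PR(t)/C(t) over pages t linking to x, is an assumed normalized form, since the slide's own formula is not reproduced in the text. The function names, d = 0.85, and the tiny three-page graph are illustrative, and the sketch assumes every page has at least one out-link and every link target is itself a key of the graph.

```python
from collections import defaultdict

def pagerank_map(url, value):
    rank, out_links = value
    # Pass the link structure along so the reducer can re-emit it.
    yield url, ("links", out_links)
    # Distribute this page's rank equally among its link targets.
    for target in out_links:
        yield target, ("credit", rank / len(out_links))

def pagerank_reduce(url, values, d, n):
    out_links, credit = [], 0.0
    for kind, payload in values:
        if kind == "links":
            out_links = payload
        else:
            credit += payload
    # Assumed update rule: random jump term plus damped incoming credit.
    return url, ((1 - d) / n + d * credit, out_links)

def pagerank(graph, d=0.85, tol=1e-6, max_iter=100):
    n = len(graph)
    state = {url: (1.0 / n, links) for url, links in graph.items()}
    for _ in range(max_iter):
        intermediate = defaultdict(list)
        for url, value in state.items():
            for key, val in pagerank_map(url, value):
                intermediate[key].append(val)
        new_state = dict(pagerank_reduce(u, vals, d, n)
                         for u, vals in intermediate.items())
        # Convergence check done outside the map/reduce steps.
        delta = sum(abs(new_state[u][0] - state[u][0]) for u in state)
        state = new_state
        if delta < tol:
            break
    return {url: rank for url, (rank, _links) in state.items()}

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
```

Each loop iteration corresponds to one MapReduce job; only the convergence test needs a global view of the ranks.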

Phase 1: Process HTML
- Map task takes (URL, content) pairs and maps them to (URL, (PRinit, list-of-urls))
  - PRinit is the “seed” PageRank for URL
  - list-of-urls contains all pages pointed to by URL
- Reduce task is just the identity function
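
A minimal sketch of this phase, using Python's standard `html.parser` and `urllib.parse` modules to pull link targets out of a page. The class and function names, the seed value of 1.0, and the sample page are assumptions for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))

def phase1_map(url, content, pr_init=1.0):
    # (URL, content) -> (URL, (PRinit, list-of-urls)); pr_init is an assumed seed.
    parser = LinkExtractor(url)
    parser.feed(content)
    yield url, (pr_init, parser.links)

def phase1_reduce(url, values):
    # Identity reduce: pass every value through unchanged.
    for value in values:
        yield url, value

page = '<a href="/b.html">b</a> <a href="http://example.org/c">c</a>'
records = list(phase1_map("http://example.com/a.html", page))
```

The single record produced for the sample page pairs its URL with the seed rank and the two resolved link targets.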

Phase 2: PageRank Distribution
- Reduce task gets (URL, url_list) and many (URL, val) values
  - Sum vals and fix up with d to get new PR
  - Emit (URL, (new_rank, url_list))
- Check for convergence using non-parallel component

MapReduce: Some More Apps
- Distributed Grep.
- Count of URL Access Frequency.
- Clustering (K-means)
- Graph Algorithms.
- Indexing Systems
- MapReduce Programs In Google Source Tree

MapReduce: Extensions and similar apps
- PIG (Yahoo)
- Hadoop (Apache)
- DryadLinq (Microsoft)

Large Scale Systems Architecture using MapReduce
- User App
- MapReduce
- Distributed File Systems (GFS)

BigTable: A Distributed Storage System for Structured Data

Introduction
- BigTable is a distributed storage system for managing structured data.
- Designed to scale to a very large size
  - Petabytes of data across thousands of servers
- Used for many Google projects
  - Web indexing, Personalized Search, Google Earth, Google Analytics, Google Finance, …
- Flexible, high-performance solution for all of Google’s products

Motivation
- Lots of (semi-)structured data at Google
  - URLs: contents, crawl metadata, links, anchors, pagerank, …
  - Per-user data: user preference settings, recent queries/search results, …
  - Geographic locations: physical entities (shops, restaurants, etc.), roads, satellite image data, user annotations, …
- Scale is large
  - Billions of URLs, many versions/page (~20K/version)
  - Hundreds of millions of users, thousands of q/sec
  - 100TB+ of satellite image data

Why not just use commercial DB?
- Scale is too large for most commercial databases
- Even if it weren’t, cost would be very high
  - Building internally means system can be applied across many projects for low incremental cost
- Low-level storage optimizations help performance significantly
  - Much harder to do when running on top of a database layer

Goals
- Want asynchronous processes to be continuously updating different pieces of data
  - Want access to most current data at any time
- Need to support:
  - Very high read/write rates (millions of ops per second)
  - Efficient scans over all or interesting subsets of data
  - Efficient joins of large one-to-one and one-to-many datasets
- Often want to examine data changes over time
  - E.g. contents of a web page over multiple crawls

BigTable
- Distributed multi-level map
- Fault-tolerant, persistent
- Scalable
  - Thousands of servers
  - Terabytes of in-memory data
  - Petabyte of disk-based data
  - Millions of reads/writes per second, efficient scans
- Self-managing
  - Servers can be added/removed dynamically
  - Servers adjust to load imbalance

Building Blocks
- Building blocks:
  - Google File System (GFS): raw storage
  - Scheduler: schedules jobs onto machines
  - Lock service: distributed lock manager
  - MapReduce: simplified large-scale data processing
- BigTable uses of building blocks:
  - GFS: stores persistent data (SSTable file format for storage of data)
  - Scheduler: schedules jobs involved in BigTable serving
  - Lock service: master election, location bootstrapping
  - MapReduce: often used to read/write BigTable data

Basic Data Model
- A BigTable is a sparse, distributed, persistent multi-dimensional sorted map
  - (row, column, timestamp) -> cell contents
- Good match for most Google applications

WebTable Example
- Want to keep a copy of a large collection of web pages and related information
- Use URLs as row keys
- Various aspects of web page as column names
- Store contents of web pages in the contents: column under the timestamps when they were fetched.

Rows
- Name is an arbitrary string
  - Access to data in a row is atomic
  - Row creation is implicit upon storing data
- Rows ordered lexicographically
  - Rows close together lexicographically are usually on one or a small number of machines

Rows (cont.)
- Reads of short row ranges are efficient and typically require communication with a small number of machines.
- Can exploit this property by selecting row keys so they get good locality for data access.
- Example: ,,, VS edu.gatech.math, edu.gatech.phys, edu.uga.math, edu.uga.phys

Columns
- Columns have two-level name structure:
  - family:optional_qualifier
- Column family
  - Unit of access control
  - Has associated type information
- Qualifier gives unbounded columns
  - Additional levels of indexing, if desired

Timestamps
- Used to store different versions of data in a cell
  - New writes default to current time, but timestamps for writes can also be set explicitly by clients
- Lookup options:
  - “Return most recent K values”
  - “Return all values in timestamp range (or all values)”
- Column families can be marked w/ attributes:
  - “Only retain most recent K values in a cell”
  - “Keep values until they are older than K seconds”

Implementation: Three Major Components
- Library linked into every client
- One master server
  - Responsible for:
    - Assigning tablets to tablet servers
    - Detecting addition and expiration of tablet servers
    - Balancing tablet-server load
    - Garbage collection
- Many tablet servers
  - Tablet servers handle read and write requests to their tablets
  - Split tablets that have grown too large

Implementation (cont.)
- Client data doesn’t move through master server. Clients communicate directly with tablet servers for reads and writes.
- Most clients never communicate with the master server, leaving it lightly loaded in practice.

Tablets
- Large tables broken into tablets at row boundaries
  - Tablet holds contiguous range of rows
    - Clients can often choose row keys to achieve locality
  - Aim for ~100MB to 200MB of data per tablet
- Serving machine responsible for ~100 tablets
  - Fast recovery:
    - 100 machines each pick up 1 tablet for failed machine
  - Fine-grained load balancing:
    - Migrate tablets away from overloaded machine
    - Master makes load-balancing decisions

Tablet Location
- Since tablets move around from server to server, given a row, how do clients find the right machine?
  - Need to find tablet whose row range covers the target row

Tablet Assignment
- Each tablet is assigned to one tablet server at a time.
- Master server keeps track of the set of live tablet servers and current assignments of tablets to servers. Also keeps track of unassigned tablets.
- When a tablet is unassigned, master assigns the tablet to a tablet server with sufficient room.

API
- Metadata operations
  - Create/delete tables, column families, change metadata
- Writes (atomic)
  - Set(): write cells in a row
  - DeleteCells(): delete cells in a row
  - DeleteRow(): delete all cells in a row
- Reads
  - Scanner: read arbitrary cells in a bigtable
    - Each row read is atomic
    - Can restrict returned rows to a particular range
    - Can ask for just data from 1 row, all rows, etc.
    - Can ask for all columns, just certain column families, or specific columns

Refinements: Compression
- Many opportunities for compression
  - Similar values in the same row/column at different timestamps
  - Similar values in different columns
  - Similar values across adjacent rows
- Two-pass custom compression scheme
  - First pass: compress long common strings across a large window
  - Second pass: look for repetitions in small window
- Speed emphasized, but good space reduction (10-to-1)

Refinements: Bloom Filters
- Read operation has to read from disk when desired SSTable isn’t in memory
- Reduce number of accesses by specifying a Bloom filter.
  - Allows us to ask if an SSTable might contain data for a specified row/column pair.
  - Small amount of memory for Bloom filters drastically reduces the number of disk seeks for read operations
  - Use implies that most lookups for non-existent rows or columns do not need to touch disk

Outline
- Overview of cloud computing
- Google cloud computing techniques: GFS, Bigtable and MapReduce
- Yahoo cloud computing techniques and Hadoop
- Challenges of cloud data management

Yahoo! Cloud computing

Search Results of the Future
- babycenter, epicurious, LinkedIn, webmd, Gawker, New York Times

What’s in the Horizontal Cloud?
- Simple Web Service APIs
- Horizontal Cloud Services
  - Edge Content Services, e.g., YCS, YCPI
  - Provisioning & Virtualization, e.g., EC2
  - Batch Storage & Processing, e.g., Hadoop & Pig
  - Operational Storage, e.g., S3, MObStor, Sherpa
  - Other Services: Messaging, Workflow, virtual DBs & Webserving
- Shared Infrastructure
  - ID & Account Management
  - Monitoring & QoS
  - Metering, Billing, Accounting
  - Security
- Common Approaches to QA, Production Engineering, Performance Engineering, Datacenter Management, and Optimization

Yahoo! Cloud Stack
- EDGE: Horizontal Cloud Services: YCS, YCPI, Brooklyn, …
- BATCH: Horizontal Cloud Services: Hadoop, …
- STORAGE: Horizontal Cloud Services: Sherpa, MOBStor, …
- APP: Horizontal Cloud Services: VM/OS, …
- WEB: Horizontal Cloud Services: VM/OS, yApache
- Across layers: Provisioning (Self-serve), Monitoring/Metering/Security
- Other labels: Data Highway, Serving Grid, PHP, App Engine

Web Data Management
- Large data analysis (Hadoop)
  - Scan-oriented workloads
  - Focus on sequential disk I/O
  - $ per CPU cycle
- Structured record storage (PNUTS/Sherpa)
  - CRUD
  - Point lookups and short scans
  - Index-organized table and random I/Os
  - $ per latency
- Blob storage (SAN/NAS)
  - Object retrieval and streaming
  - Scalable file storage
  - $ per GB

The World Has Changed
- Web serving applications need:
  - Scalability! Preferably elastic
  - Flexible schemas
  - Geographic distribution
  - High availability
  - Reliable storage
- Web serving applications can do without:
  - Complicated queries
  - Strong transactions

PNUTS/SHERPA: To Help You Scale Your Mountains of Data

Yahoo! Serving Storage Problem
- Small records: 100KB or less
- Structured records: lots of fields, evolving
- Extreme data scale: tens of TB
- Extreme request scale: tens of thousands of requests/sec
- Low latency globally: 20+ datacenters worldwide
- High availability: outages cost $millions
- Variable usage patterns: as applications and users change

The PNUTS/Sherpa Solution
- The next generation global-scale record store
  - Record-orientation: routing, data storage optimized for low-latency record access
  - Scale out: add machines to scale throughput (while keeping latency low)
  - Asynchrony: pub-sub replication to far-flung datacenters to mask propagation delay
  - Consistency model: reduce complexity of asynchrony for the application programmer
  - Cloud deployment model: hosted, managed service to reduce app time-to-market and enable on-demand scale and elasticity

What is PNUTS/Sherpa?
- Parallel database
- Geographic replication
- Structured, flexible schema
- Hosted, managed infrastructure
- Example table, shown replicated in several copies:
    A  42342  E
    B  42521  W
    C  66354  W
    D  12352  E
    E  75656  C
    F  15677  E

    CREATE TABLE Parts (
        ID VARCHAR,
        StockNumber INT,
        Status VARCHAR
        …
    )

What Will It Become?
- The same replicated Parts table as above:
    CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHA