




More Pipeline

Basic RISC Pipelining
Basic idea: Each instruction spends 1 clock cycle in each of the 5 execution stages. During 1 clock cycle, the pipeline can process (in different stages) 5 different instructions.

Simple RISC Datapath
[Figure: datapath across the IF, ID, EX, MEM, and WB stages; labels include Program Counter, Next PC, Inst. Reg., Load fr. Mem., Data]

Description of Pipe Stages

Hazards

The hazards of pipelining
Pipeline hazards prevent the next instruction from executing during its designated clock cycle. There are 3 classes of hazards:
- Structural hazards: arise from resource conflicts; the HW cannot support all possible combinations of instructions.
- Data hazards: occur when a given instruction depends on data from an instruction ahead of it in the pipeline.
- Control hazards: result from branches and other instructions that change the flow of the program (i.e. change the PC).

How do we deal with hazards?
Often, the pipeline must be stalled. Stalling the pipeline usually lets some instruction(s) in the pipeline proceed while another/others wait for data, a resource, etc.

Stalls and performance
Stalls impede the progress of a pipeline and result in deviation from 1 instruction executing per clock cycle. Pipelining can be viewed as a way to decrease CPI or the clock cycle time per instruction. Let's see what effect stalls have on CPI...
CPI pipelined = Ideal CPI + Pipeline stall cycles per instruction
              = 1 + Pipeline stall cycles per instruction
Ignoring overhead and assuming stages are balanced:
[Equation not preserved in the source]

Even more pipeline performance issues!
This results in:
[Equation not preserved in the source]
Which leads to:
Speedup = Pipeline depth / (1 + Pipeline stall cycles per instruction)
If there are no stalls, the ideal-case speedup equals the number of pipeline stages.

1. Structural hazards
Most common instances of structural hazards:
- When a functional unit is not fully pipelined
- When some resource is not duplicated enough
One way to avoid structural hazards is to duplicate resources. Pipeline stalls are the result of hazards, and CPI increases from the usual "1".

An example of a structural hazard
[Figure: pipeline diagram over time for Load and Instructions 1 to 4, each passing through Mem, Reg, ALU, DM, Reg]
What's the problem here? The processor has a combined instruction + data memory with only 1 read port.

How is it resolved?
[Figure: the same diagram for Load, Instruction 1, Instruction 2, Stall, Instruction 3, with a bubble in each stage of the stalled slot]
The pipeline is generally stalled by inserting a "bubble" or NOP.

Or alternatively...
Clock number
Inst. #     1     2     3     4     5     6     7     8     9     10
LOAD        IF    ID    EX    MEM   WB
Inst. i+1         IF    ID    EX    MEM   WB
Inst. i+2               IF    ID    EX    MEM   WB
Inst. i+3                     stall IF    ID    EX    MEM   WB
Inst. i+4                                 IF    ID    EX    MEM   WB
Inst. i+5                                       IF    ID    EX    MEM
Inst. i+6                                             IF    ID    EX
The LOAD instruction "steals" an instruction fetch cycle, which causes the pipeline to stall. Thus, no instruction completes on clock cycle 8.

Remember the common case!
In some cases it may be better to allow structural hazards than to eliminate them. These are situations a computer architect might have to consider:
- Is pipelining functional units or duplicating them costly in terms of HW?
- Does the structural hazard occur often?
- What's the common case?

2. Data hazards
Why do they exist? Pipelining changes the order of read/write accesses to operands. The order differs from the order seen by sequentially executing instructions on an unpipelined machine.
Consider this example:
    ADD R1, R2, R3
    SUB R4, R1, R5
    AND R6, R1, R7
    OR  R8, R1, R9
    XOR R10, R1, R11
All instructions after ADD use the result of ADD. ADD writes the register in WB, but SUB needs it in ID. This is a data hazard.

Illustrating a data hazard
[Figure: pipeline diagram over time for ADD R1,R2,R3; SUB R4,R1,R5; AND R6,R1,R7; OR R8,R1,R9; XOR R10,R1,R11]
The ADD instruction causes a hazard in the next 3 instructions because the register is not written until after those 3 read it.

Data hazard specifics
There are actually 3 different kinds of data hazards!
- Read After Write (RAW)
- Write After Write (WAW)
- Write After Read (WAR)
Assume that hazards involve instructions i and j, where i is always issued before j. Thus, i will always be further along in the pipeline than j. With an in-order issue/in-order completion machine, we're not as concerned with WAW and WAR.

Three Types of Data Hazards
Let i be an earlier instruction, j a later one.
- RAW (read after write): j tries to read a value before i writes it.
- WAW (write after write): i and j write to the same place, but in the wrong order. Condition: only occurs if more than 1 pipeline stage can write (in-order).
- WAR (write after read): j writes a new value to a location before i has read the old one. Condition: only occurs if writes can happen before reads in the pipeline (in-order).

Read after write (RAW) hazards
With a RAW hazard, instruction j tries to read a source operand before instruction i writes it. Thus, j would incorrectly receive an old or incorrect value.
Graphically: ... j i ...
- Instruction j is a read instruction issued after i.
- Instruction i is a write instruction issued before j.
Example:
    i: ADD R1, R2, R3
    j: SUB R4, R1, R6
Stalling or forwarding can be used to resolve this hazard.

Forwarding
The problem can actually be solved relatively easily with forwarding. In this example, the result of the ADD instruction is not really needed until after ADD actually produces it. Can we move the result from the EX/MEM register to the beginning of the ALU (where SUB needs it)?
Generally speaking: forwarding occurs when a result is passed directly to the functional unit that requires it. The result goes from the output of one unit to the input of another.

When can we forward?
[Figure: pipeline diagram over time for ADD R1,R2,R3; SUB R4,R1,R5; AND R6,R1,R7; OR R8,R1,R9; XOR R10,R1,R11]
- SUB gets info. from the EX/MEM pipe register.
- AND gets info. from the MEM/WB pipe register.
- OR gets info. by forwarding from the register file.
Rule of thumb: if the line goes "forward" you can do forwarding. If it's drawn backward, it's physically impossible.

Data Hazard Detection

Hazard Detection Logic
Example: detecting whether an instruction that has just been fetched needs to be stalled because of a preceding load.

Forwarding Situations in DLX

HW Change for Forwarding
[Figure: datapath with two Muxes feeding the ALU, Zero?, Data memory, and the ID/EX, EX/MEM, MEM/WB pipe registers]

Forwarding: It doesn't always work
[Figure: pipeline diagram over time for LW R1,0(R2); SUB R4,R1,R5; AND R6,R1,R7; OR R8,R1,R9]
A load has a latency that forwarding can't solve. The pipeline must stall until the hazard is cleared (starting with the instruction that wants to use the data, until the source produces it).

The solution
[Figure: the same sequence with bubbles inserted after LW R1,0(R2)]
Insertion of the bubble causes the # of cycles to complete this sequence to grow by 1.

Data hazards and the compiler
The compiler should be able to help eliminate some stalls caused by data hazards, e.g. the compiler could avoid generating a LOAD instruction that is immediately followed by an instruction that uses the LOAD's destination register. The technique is called "pipeline/instruction scheduling".

A simple Example
A clever compiler can often reschedule instructions to avoid a stall. A simple example:
Original code:
    lw  r2, 0(r4)
    add r1, r2, r3    Note: Stall happens here!
    lw  r5, 4(r4)
Transformed code:
    lw  r2, 0(r4)
    lw  r5, 4(r4)
    add r1, r2, r3    No stall needed!
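
The effect of the rescheduling above can be checked mechanically. The following is only a minimal sketch, assuming the classic 5-stage pipeline with forwarding, where a load immediately followed by a consumer of its destination register costs exactly one stall cycle. The tuple encoding of instructions and the name `count_load_use_stalls` are hypothetical and do not come from the slides.

```python
# Each instruction is (opcode, destination register, source registers).
# This encoding is an assumption made for the sketch.
ORIGINAL = [
    ("lw", "r2", ["r4"]),
    ("add", "r1", ["r2", "r3"]),
    ("lw", "r5", ["r4"]),
]
TRANSFORMED = [
    ("lw", "r2", ["r4"]),
    ("lw", "r5", ["r4"]),
    ("add", "r1", ["r2", "r3"]),
]


def count_load_use_stalls(program):
    """Count one-cycle load-use stalls between adjacent instructions.

    Assumes forwarding removes every other RAW stall, so only a load
    directly followed by a reader of its destination register stalls.
    """
    stalls = 0
    for producer, consumer in zip(program, program[1:]):
        opcode, dest, _ = producer
        _, _, sources = consumer
        if opcode == "lw" and dest in sources:
            stalls += 1
    return stalls


if __name__ == "__main__":
    print("original stalls:", count_load_use_stalls(ORIGINAL))
    print("transformed stalls:", count_load_use_stalls(TRANSFORMED))
```

Under these assumptions the original order incurs one stall and the transformed order none, which matches the notes next to the two listings.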
Simple RISC Pipeline Stall Statistics
[Figure: % of loads that cause a stall]

Write after write (WAW) hazards
With a WAW hazard, instruction j tries to write an operand before instruction i writes it. The writes are performed in the wrong order, leaving the value written by the earlier instruction.
Graphically: ... j i ...
- Instruction j is a write instruction issued after i.
- Instruction i is a write instruction issued before j.
Example:
    i: DIV F1, F2, F3
    j: SUB F1, F4, F6

Write after read (WAR) hazards
With a WAR hazard, instruction j tries to write an operand before instruction i reads it. Instruction i would incorrectly receive a newer value of its operand; instead of getting the old value, it could receive some newer, undesired value.
Graphically: ... j i ...
- Instruction j is a write instruction issued after i.
- Instruction i is a read instruction issued before j.
Example:
    i: DIV F7, F1, F3
    j: SUB F1, F4, F6

3. Control (Branch) Hazards
Suppose the new PC value is not computed until the MEM stage. Then we must stall 3 clocks after every branch!

Branch Hazards
We need to consider hazards involving branches. Example:
    40: beq $1, $3, 28
    44: and $12, $2, $5
    48: or  $13, $6, $2
    52: add $14, $2, $2
    72: lw  $4, 50($7)

Pipeline impact on branch
How do we deal with this?
- Always stall
- Assume branch-not-taken
- Branch delay slots

Assume branch not taken
On average, branches are taken half the time.
- If the branch is not taken: continue normal processing.
- Else, if the branch is taken: need to flush the improper instructions from the pipeline.
This cuts the overall time for branch processing in half.

Assume branch not taken
Case 1: not taken. Execution proceeds normally; no penalty.

Assume branch not taken
Case 2: taken branch. Bubbles are injected into 3 stages during cycle 5.

Sum: Branch Penalty Impact
Assume 16% of all instructions are branches:
- 4% unconditional branches: 3 cycle penalty
- 12% conditional: 50% taken, 3 cycle penalty
For a sequence of N instructions (assume N is large):
- N cycles to initiate each
- 3 * 0.04 * N delays due to unconditional branches
- 0.5 * 3 * 0.12 * N delays due to conditional taken branches
- Also, an extra 4 cycles for the pipeline to empty
Total: 1.3 * N + 4 total cycles (or a CPI of 1.3 cycles/instruction)
30% performance hit!!! (Bad thing)

Branch delay slot
Delay slot: find one instruction that will be executed no matter which way the branch goes. Branches always execute the next 1 or 2 instructions. An instruction so executed is said to be in the delay slot.
    branch instruction
    Delay slot instruction 1
    Delay slot instruction 2
    ...
    Delay slot instruction n
    branch target if taken
Branch delay slot of length n

Scheduling Delayed Branch
[Figure: three ways to fill the delay slot (from before, from target, from fall through), illustrated with ADD R1,R2,R3; SUB R4,R5,R6; OR R7,R8,R9 and branches of the form "if R2 = 0 then" / "if R1 = 0 then"]

Scheduling Delayed Branch
Where to get instructions to fill the branch delay slot?
- From before the branch instruction: always valuable
- From the target address: only valuable when the branch is taken
- From fall through: only valuable when the branch is not taken

Fast Branch Resolution
The performance penalty could be more than 30%:
- Deeper pipelines; some code is very branch heavy
Fast branch resolution:
- Adder in ID for PC + immediate targets
- Only works for simple conditions (compare to 0)
- Comparing two register values could be too slow

New Pipeline Logic

Example
Assume the following MIPS instruction mix:
Type         Frequency
Arith/Logic  40%
Load         30%, of which 25% are followed immediately by an instruction using the loaded value
Store        10%
Branch       20%, of which 45% are taken
What is the resulting CPI for the pipelined MIPS with forwarding and branch address calculation in the ID stage when using a branch not-taken scheme?
CPI = Ideal CPI + Pipeline stall clock cycles per instruction
    = 1 + stalls by loads + stalls by branches
    = 1 + .3 x .25 x 1 + .2 x .45 x 1
    = 1 + .075 + .09
    = 1.165

Exceptions

Types of Exceptions (Interrupts, Faults)
- I/O device request, timer event
- Invoking OS services from a user program
- Tracing (single-stepping) through a program
- Breakpoints
- Integer arithmetic overflow, divide by zero
- FP arithmetic anomaly (overflow, underflow, etc.)
- Page fault (page not in physical memory)
- Misaligned memory access
- Memory-protection violation (accessing memory not allocated to the process)
- Illegal (undefined or unimplemented) instruction
- Hardware malfunction
- Power-related interrupt (e.g. battery low, power failure)
- ...

Exception Characterization 1
Synchronous vs. asynchronous: is the event synchronized with program execution?
- Synchronous: the event occurs at the same place every time
- Asynchronous: caused by devices external to CPU & memory, also HW malfunctions
User requested vs. coerced: is the event caused intentionally by the user program?
- Requested: the user task asks for it
- Coerced: HW event not under control of the user program

Exception Characterization 2
User maskable (can be disabled) or not: can the event be disabled?
- Maskable: an event that can be disabled by the user task
Within instructions or between instructions: does the event prevent the instruction from completing?
- Within: during execution of the task, hard to handle, usually synchronous since the instruction is the trigger
Resume vs. terminate: does the program continue from where it left off after the exception is handled, or does it stop?
- Terminating: execution always stops after the interrupt

Restartable Exceptions
Requirements:
- The exception may occur within an instruction.
- The program must continue after the exception is handled.
Example: virtual memory page fault.
Difficult because the pipeline state must be saved.
One approach, for easy cases:
1. Force a trap inst. into the pipeline on the next IF.
2. Clear the pipeline behind the faulting instruction.
3. The exception handler saves the PC of the faulting instr.

Precise vs. Imprecise Handling
Machines may support either or both modes of exception handling.
Precise exception handling:
- Correctly implements all possible combinations of exceptions in all circumstances.
- May be a requirement for some systems/applications.
- May be 10x slower!
- Easier for integer than floating-point.
- Useful for debugging code.
Imprecise exception handling:
- Only correctly implements the most common cases.
- Software may avoid some exceptions.
- Only statistical guarantees of correctness, through testing.

Exceptions in DLX pipeline
Instruction Fetch & Memory stages:
- Page fault on instruction/data fetch
- Misaligned memory access
- Memory-protection violation
Instruction Decode stage:
- Undefined/illegal opcode
Execution stage:
- Arithmetic exception
Write-Back stage:
- None!

Out-of-Order Exceptions
Consider the following code sequence:
LW    IF  ID  EX  MEM WB
ADD       IF  ID  EX  MEM WB
The ADD may cause an exception during IF, before LW causes an exception during MEM! Can't restart the PC on the ADD!
Solution:
- Note the exception in a status vector, carried along.
- Disable writes for that instruction.
- Resolve all exceptions at a late stage (e.g. WB).

Pipelining Complications
Complex addressing modes and instructions:
- Autoincrement address modes cause a register change during instruction execution. Interrupts? Need to restore register state.
- Adds WAR and WAW hazards since writes are no longer in the last stage.
Floating point: long execution time; out-of-order completion.

Stopping and Starting Execution
The most difficult exception occurrences have 2 properties:
- They occur within instructions
- They must be restartable
The pipeline must be shut down safely and the state must be saved for correct restarting. Restarting is usually done by saving the PC of the instruction at which to start. Branches and delayed branches require special treatment. Precise exceptions allow instructions just before the exception to be completed, while restarting instructions after the exception.

Multi-cycle Operations

Multi-cycle Operations for FP

Pipelined Multiple-Issue FPU

Out-of-order completion
Notice that instructions may complete out-of-order:
MULTD  IF  ID  M1  M2  M3  M4  M5  M6  M7  ME  WB
ADDD       IF  ID  A1  A2  A3  A4  ME  WB
LD             IF  ID  EX  ME  WB
SD                 IF  ID  EX  ME  WB

Typical FP Code Seq.: RAW Stalls
Clock cycle number (st = stall)
Instruction     1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17
L.D F4,0(R2)    IF ID EX ME WB
MUL.D F0,F4,F6     IF ID st M1 M2 M3 M4 M5 M6 M7 ME WB
ADD.D F2,F0,F8        IF st ID st st st st st st A1 A2 A3 A4 ME WB
S.D F2,0(R2)             IF st st st st st st ID EX st st st ME

Structural hazards

Sum: multiple-cycles problems
- Raises the possibility of WAW hazards, and structural hazards in the MEM & WB stages.
- Structural hazards may occur especially often with a non-pipelined DIV unit.
- Out-of-order completion impacts exception handling.

Appendix: The MIPS R4000 Pipeline

The MIPS R4300 Pipeline
- Manufactured by NEC
- 64-bit processor implements the MIPS64 ISA
- Used in embedded applications: Nintendo-64 game processor, network router, ...
- Multiple EX stages for the floating-point pipeline
- Out-of-order completion, precise exceptions
NEC VR4122: Integer data
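
The out-of-order completion shown in the MULTD / ADDD / LD / SD diagram earlier can be reproduced with a little arithmetic. This is a rough sketch, assuming one instruction is fetched per clock cycle with no stalls, and that the execute phase takes 7 cycles for MULTD (M1 to M7), 4 for ADDD (A1 to A4), and 1 for LD and SD (EX), as drawn on that slide. `EX_CYCLES` and `writeback_cycle` are names invented for this illustration.

```python
# Execute-stage lengths read off the out-of-order completion diagram.
EX_CYCLES = {"MULTD": 7, "ADDD": 4, "LD": 1, "SD": 1}


def writeback_cycle(issue_index, opcode):
    """Clock cycle in which the instruction reaches WB.

    Assumes instruction number issue_index (0-based) is fetched in
    cycle issue_index + 1 and never stalls.
    """
    fetch = issue_index + 1
    decode = fetch + 1
    last_execute = decode + EX_CYCLES[opcode]
    memory = last_execute + 1
    return memory + 1


def completion_order(program):
    """Return the opcodes sorted by the cycle in which they complete."""
    finished = [(writeback_cycle(i, op), op) for i, op in enumerate(program)]
    return [op for _, op in sorted(finished)]


if __name__ == "__main__":
    program = ["MULTD", "ADDD", "LD", "SD"]
    for i, op in enumerate(program):
        print(op, "reaches WB in cycle", writeback_cycle(i, op))
    print("completion order:", completion_order(program))
```

With these assumptions the program is issued as MULTD, ADDD, LD, SD but completes as LD, SD, ADDD, MULTD, which is why multi-cycle operations raise WAW hazards and complicate exception handling.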