Computer Architecture: DLX Processor Programming Lab Report
Computer System Architecture Lab
Lab 3: DLX Processor Programming

I. Purpose

Learn to program in DLX assembly language and analyze the hazards that arise in the pipeline.

II. Equipment and environment

DLX assembly language environment.

III. Content and requirements

Write a piece of assembly code that performs one-dimensional vector addition and prints the result. Observe the data, control, and structural hazards that appear in the program. (Note: a one-dimensional array represents a one-dimensional vector.)

IV. Code listing and notes

* Vector addition design *

1. Source code

                .data
    VectorLength:   .word 16
    Vector1:        .word 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16
    Vector2:        .word 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16
    Printf1:        .asciiz "Vector="
    Printf2:        .asciiz "%f"
                    .align 2
    PrintPrompt:    .word Printf1
    PrintPar:       .word Printf2
    Result:         .space 8              ; 8 bytes: room for the double written by sd

                    .text
    main:
                    addi r14,r0,PrintPrompt
                    trap 5                ; print the prompt
                    lw r20,VectorLength   ; r20 = number of elements left
                    addi r2,r0,0          ; r2 = byte offset into the vectors
    Loop:
                    ld f10,Vector1(r2)    ; current element of Vector1
                    ld f12,Vector2(r2)    ; current element of Vector2
                    cvti2d f0,f10         ; integer -> double
                    cvti2d f2,f12
                    addd f4,f2,f0         ; f4 = sum of the two elements
    Finish:         ;*** Finish, write result into stdout
                    sd Result,f4
                    addi r14,r0,PrintPar
                    trap 5
                    addi r2,r2,4          ; next element (4 bytes per word)
                    subi r20,r20,1
                    bnez r20,Loop
                    ;*** End
                    trap 1

2. Notes

    VectorLength: .word 16
    Vector1: .word 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16
    Vector2: .word 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16
        Declare the vector length and the two vectors.

    Result: .space 8
        Reserve space for the value to be printed.

    Loop: ld f10,Vector1(r2)
          ld f12,Vector2(r2)
        Read the vector elements inside the loop body.

    addd f4,f2,f0
        The addition.

    trap 5
        System trap that prints the result.

3. Run results

[Screenshot: WinDLX DLX-Standard-I/O window. The program prints "Vector=" followed by 2.000000, 4.000000, 6.000000, ..., 32.000000.]
[Screenshot: WinDLX Clock Cycle Diagram window.]
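As a cross-check on the expected output, the element-wise addition performed by the loop can be modeled in a few lines of Python. This is only a reference model under stated assumptions, not DLX code: the names `vector_add`, `vector1`, `vector2`, and `output` are hypothetical, and the formatting assumes the prompt string is "Vector=" and each value is printed with a bare "%f".

```python
# Reference model of the DLX loop (hypothetical helper names):
# each integer element is converted to a double (cvti2d), the pair is
# added (addd), and the sum is formatted with "%f" (trap 5).
vector1 = list(range(1, 17))  # .word 1,2,...,16
vector2 = list(range(1, 17))  # .word 1,2,...,16


def vector_add(a, b):
    """Return the element-wise sum of two equal-length integer vectors as floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [float(x) + float(y) for x, y in zip(a, b)]


result = vector_add(vector1, vector2)

# Assumed output format: prompt followed by the values with no separator.
output = "Vector=" + "".join("%f" % value for value in result)
print(output)
```

The sums run from 2.0 to 32.0 in steps of 2, which agrees with the values visible in the I/O window.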

V. Hazard analysis results

1. Data, control, and structural hazards

Data hazards. When the current instruction reaches EX and needs its operand registers, the results of the preceding instructions have not yet been written back (WB) to those registers. This produces a data hazard.

Structural hazards. Because each loop iteration performs only a single addition, no structural hazard occurs.

Control hazards. WinDLX assumes that a branch is not taken, so immediately after bnez it fetches the next sequential instruction, trap.

[Clock cycle diagram: bnez r20,Loop, followed by trap (IF aborted), followed by ld f10,Vector1(r2) with an R-Stall.]

2. Effect of the floating-point units

                                   1 fadd EX stage     2 fadd EX stages
    Cycles executed                283                 283
    Instructions through ID        181                 181
    Instructions in pipeline       2                   2
    Memory size                    32768 bytes         32768 bytes
    fadd required cycles           2                   2
    fmul EX stages / cycles        1 / 5               1 / 5
    fdiv EX stages / cycles        1 / 19              1 / 19
    Forwarding                     enabled             enabled
    RAW stalls                     32 (11.31%)         32 (11.31%)
      LD stalls                    0 (0.00% of RAW)    0 (0.00% of RAW)
      Branch/Jump stalls           16 (50.00% of RAW)  16 (50.00% of RAW)
      Floating point stalls        16 (50.00% of RAW)  16 (50.00% of RAW)
    WAW stalls                     0 (0.00%)           0 (0.00%)
    Structural stalls              0 (0.00%)           0 (0.00%)

Increasing the number of floating-point adders from 1 to 2 does not improve performance. The program never produces a structural hazard on the floating-point adder, so adding adders cannot help.

3. Effect of the forwarding unit

                                   Forwarding enabled  Forwarding disabled
    Cycles executed                283                 381
    Instructions through ID        181                 181
    RAW stalls                     32 (11.31%)         130 (34.12%)
      Floating point stalls        16 (50.00% of RAW)
    WAW stalls                     0 (0.00%)           0 (0.00%)
    Structural stalls              0 (0.00%)           0 (0.00%)
    Control stalls                 15 (5.30%)          15 (3.94%)
    Trap stalls                    54 (19.08%)         54 (14.17%)
    Total stalls                   101 (35.69%)        199 (52.23%)

With forwarding disabled, the run time increases from 283 to 381 cycles.

Forwarding therefore gives this program a speedup of 381/283 ≈ 1.35.

4. Taken and not-taken branches

[Clock cycle diagram: bnez r20,Loop, followed by trap, followed by ld f10,Vector1(r2).]

Within a control hazard, a taken branch and a not-taken branch behave differently. WinDLX assumes the branch is not taken, so immediately after bnez it fetches the next sequential instruction, trap. When the branch turns out to be taken, all work done on trap is discarded and execution continues at the branch target, ld. When the branch is not taken, the trap that was already fetched simply continues through the pipeline.

* Double-precision floating-point vector sum design *

1. Code and notes

                .data
    ; Two one-dimensional vectors of double-precision values to be added; vector length is 20
    a:              .double 1.9,2.1,3.7,4.6,5.5,6.4,7.3,8.2,9.1,10.1,11.2,12.3,13.4,14.5,15.6,16.7,17.8,18.9,19.0,20.2
    b:              .double 1.1,2.2,3.3,4.4,5.5,6.6,7.7,8.8,9.8,10.9,11.8,12.7,13.6,14.5,15.4,16.3,17.2,18.1,19.9,20.8
    PrintfFormat:   .asciiz "The result is\n\n%f\t%f\t%f\t%f\t%f\t%f\t%f\t%f\t%f\t%f\t%f\t%f\t%f\t%f\t%f\t%f\t%f\t%f\t%f\t%f\t\n\n"
                    .align 2
    PrintfPar:      .word PrintfFormat
    r:              .space 200            ; r holds the result vector

                    .text
                    .global main
    main:
                    addi r1,r0,0          ; r1 counts the additions done so far
                    addui r4,r0,8         ; r4 = constant 8
    loop:           ; compute the vector sum element by element
                    subi r2,r1,20         ; when r1 = 20, jump to finish
                    beqz r2,finish
                    multu r3,r1,r4        ; r3 = offset of the current element from the vector base (8 bytes per element)
                    ld f0,a(r3)           ; load element r1 of a
                    ld f2,b(r3)           ; load element r1 of b
                    addd f4,f0,f2         ; sum goes into f4
                    sd r(r3),f4           ; store the sum into the result vector r
                    addi r1,r1,1          ; next element
                    j loop
    finish:         ; print the result vector
                    addi r14,r0,PrintfPar
                    trap 5

2. Data, control, and structural hazards in the program

Control hazards:

    beqz r2,finish        ; conditional branch
    j loop                ; unconditional jump
    trap 5                ; call trap number 5

Structural hazards: none.

Data hazards (all are RAW hazards):

    addi r1,r0,0      ->  subi r2,r1,20       (r1)
    subi r2,r1,20     ->  beqz r2,finish      (r2)
    multu r3,r1,r4    ->  ld f0,a(r3)         (r3)
    ld f2,b(r3)       ->  addd f4,f0,f2       (f2)
    addd f4,f0,f2     ->  sd r(r3),f4         (f4)

3. Effect of adding floating-point units on performance

Two runs were compared: fadd, fmul, and fdiv with 1 EX stage each, and with 5 EX stages each. In both runs the required cycles are 4 per unit, the memory size is 32768 bytes, and forwarding is disabled. Both runs report identical statistics:

    Cycles executed                497
    Instructions through ID        189
    Instructions in pipeline       5
    RAW stalls                     283 (56.94% of all cycles)
    WAW stalls                     0 (0.00%)
    Structural stalls              0 (0.00%)
    Control stalls                 21 (4.22%)
    Trap stalls                    3 (0.60%)
    Total stalls                   307 (61.77%)
    Conditional branches           21 (11.11% of all instructions)
      taken                        1 (4.76% of conditional branches)
      not taken                    20 (95.24% of conditional branches)
    Load/store instructions        60 (31.75% of all instructions)
      loads                        40 (66.67% of load/store instructions)
      stores                       20 (33.33% of load/store instructions)
    Floating point stage instr.    40 (21.16% of all instructions)
      additions                    20 (50.00%)
      multiplications              20 (50.00%)
      divisions                    0 (0.00%)
    Traps                          1 (0.53% of all instructions)

Summary: after changing the number of floating-point units several times and re-running, the statistics were identical every time. Adding floating-point units gives this routine no performance improvement, mainly because it contains no back-to-back floating-point add, multiply, or divide instructions that would compete for the same unit.

4. Effect of adding the forwarding unit on performance

Effect of the forwarding unit on individual instructions:

[Pipeline diagrams, with and without the forwarding unit, for the sequence addi r1,r0,0x0 / addui r4,r0,0x8 / subi r2,r1,0x14 / beqz r2,finish, showing the R-Stall cycles in each case.]

Comparison of the statistics:

                                   Without forwarding   With forwarding
    Cycles executed                497                  375
    Instructions through ID        189                  189
    Instructions in pipeline       5                    5
    Memory size                    32768 bytes          32768 bytes
    fadd EX stages / cycles        1 / 4                1 / (not legible)
    fmul EX stages / cycles        1 / 4                1 / 4
    fdiv EX stages / cycles        1 / 4                1 / 4
    RAW stalls                     283 (56.94%)         161 (42.93%)
      LD stalls                                         20 (12.42% of RAW)
      Branch/Jump stalls                                21 (13.04% of RAW)
      Floating point stalls                             120 (74.53% of RAW)
    WAW stalls                     0 (0.00%)            0 (0.00%)
    Structural stalls              0 (0.00%)            0 (0.00%)
    Control stalls                 21 (4.22%)           21 (5.60%)
    Trap stalls                    3 (0.60%)            3 (0.80%)
    Total stalls                   307 (61.77%)         185 (49.33%)
    Conditional branches           21 (11.11%)          21 (11.11%)
      taken                        1 (4.76%)            1 (4.76%)
      not taken                    20 (95.24%)          20 (95.24%)
    Load/store instructions        60 (31.75%)          60 (31.75%)
      loads                        40 (66.67%)          40 (66.67%)
      stores                       20 (33.33%)          20 (33.33%)
    Floating point stage instr.    40 (21.16%)          40 (21.16%)
      additions                    20 (50.00%)          20 (50.00%)
      multiplications              20 (50.00%)          20 (50.00%)
      divisions                    0 (0.00%)            0 (0.00%)
    Traps                          1 (0.53%)            1 (0.53%)

After the forwarding unit is added, the pipeline speedup is …
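The speedup value is cut off in the source. As a hedged cross-check, the cycle count of the forwarding-enabled run can be recovered from the reported stall counts and their percentages (185 total stalls at 49.33%, 21 control stalls at 5.60%), and the speedup then follows from the 497 cycles of the run without forwarding. The sketch below does that arithmetic. The helper names `cycles_from_stalls` and `speedup` are hypothetical, and the result is an estimate derived from the statistics, not a figure stated by the report.

```python
def cycles_from_stalls(stall_count, percent_of_cycles):
    """Recover the total cycle count from a stall count and its share of all cycles."""
    return round(stall_count / (percent_of_cycles / 100.0))


def speedup(cycles_without, cycles_with):
    """Speedup of the faster run relative to the slower one."""
    return cycles_without / cycles_with


# Vector addition program: cycle counts are stated directly in the report.
speedup_vector_add = speedup(381, 283)

# Double-precision program: estimate the forwarding-enabled cycle count
# from two independent statistics and require that they agree.
estimate_from_total = cycles_from_stalls(185, 49.33)
estimate_from_control = cycles_from_stalls(21, 5.60)
if estimate_from_total != estimate_from_control:
    raise ValueError("stall statistics are inconsistent")

speedup_double_sum = speedup(497, estimate_from_total)

print("vector addition speedup: %.2f" % speedup_vector_add)
print("estimated cycles with forwarding: %d" % estimate_from_total)
print("double-precision sum speedup (estimate): %.2f" % speedup_double_sum)
```

Both statistics give the same cycle estimate, which is a useful sanity check before trusting the derived speedup.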
