




Parallel Programming Course: Lab Report

Lab 1: Intel Multicore Compiler and Intel Parallel Studio XE

Name: 葛書衡
School: School of Software
Student ID: 1153730109
Course instructor: 張偉哲
Lab supervisor: (blank)
Location: computer lab, third floor, School of Software
Date: 2017.4.12
Scores (blank in the source): lab session performance (attendance and participation), procedure and results, lab report, total

I. Purpose and requirements
(Analyze the basic objectives of this lab and summarize how you achieved them.)

Basic objectives:
1. Master the six-step optimization process.
2. Learn to optimize code with compiler options.
3. Learn to tune performance with auto-vectorization targeted at different CPUs.
4. Learn the three steps for adding parallelism.
5. Use Cilk Plus to add parallelism.
6. Use OpenMP to add parallelism.

How they were achieved: by following the lab manual and the material covered in lectures, and carrying out the exercises on the lab machines.

II. Lab content
(List the work you completed during the lab.)

Lab 1.1: Generating high-quality code with the Intel compiler
Step 1: Build the application without optimization
Step 2: Use general optimizations
Step 3: Use processor-specific optimizations
Step 4: Add interprocedural optimization
Step 5: Profile-guided optimization
Step 6: Tune auto-vectorization

Lab 1.2: Parallel Studio XE quick start
1. Use Cilk Plus
2. Three steps to add parallelism:
Step 1: Analyze the serial program
Step 2: Implement parallelism with Cilk Plus
Step 3: Debugging and error checking

III. Results

Building without optimization, then with general optimizations

Each executable prints six timed runs. Every legible run reports Total=6798.680541 and Check Sum = 160160000; only the elapsed times differ, so they are listed per build.

/Od (C:\test\intel.noopt.exe)
Elapsed (s): 5.279257, 5.277541, 5.252626, 5.247493, 5.247025, 5.247576

/O1 (C:\test\intel.O1.exe)
Elapsed (s): 0.966629, 0.978060, 0.972101, 0.972663, 0.990346, [illegible]

/O2 (C:\test\intel.O2.exe)
Elapsed (s): 0.370606, 0.367711, 0.369070, [illegible], 0.368370, 0.369067

/O3 (C:\test\intel.O3.exe)
Elapsed (s): 0.354772, 0.355271, [illegible], 0.353975, 0.354375, 0.353231

/Ox (C:\test\intel.Ox.exe)
Elapsed (s): 0.367272, [illegible], 0.366461, 0.365486, [illegible], 0.366016

Processor-specific optimization

(1) Build and run the application without a processor-specific option, adding /Qvec- to CFLAGS to turn off auto-vectorization.
[Console screenshot: executable name and timings illegible in the source.]

SSE (intel.SSE.exe)
Elapsed (s): 0.392221, 0.381096, 0.384797, 0.383872, 0.386879, 0.386884

SSE2 (C:\test\intel.SSE2.exe)
Elapsed (s): 0.393706, 0.378600, 0.378578, 0.377979, 0.381179, 0.377378

SSE3 (C:\test\intel.SSE3.exe)
Elapsed (s): 0.382814, 0.378148, 0.376223, 0.377690, 0.374411, 0.380942

SSE4.1 (C:\test\intel.SSE4.1.exe)
Elapsed (s): 0.384171, 0.375404, 0.377257, 0.378104, 0.382731, 0.374886

SSE4.2 (C:\test\intel.SSE4.2.exe)
Elapsed (s): 0.382833, 0.373490, 0.373207, 0.373814, 0.374750, 0.372234

Rebuild the application with the /QaxAVX option.

AVX (C:\test\intel.axAVX.exe)
Elapsed (s): 0.379616, 0.377457, 0.381858, 0.380582, 0.379117, 0.383025

Adding interprocedural optimization

Build and run the application with the /Qipo option, add the highest level of auto-vectorization supported on the platform in use (SSE2, SSE3, SSSE3, ...), and record the run times.
/Qipo (C:\test\intel.Qipo.exe)
Elapsed (s): 0.265214, 0.254297, 0.251030, 0.254059, 0.255383, 0.254185

/Qipo with SSE2 (C:\test\intel.QipoSEE2.exe)
Elapsed (s): 0.282410, 0.253400, 0.254489, 0.255311, 0.258279, 0.256907

/Qipo with SSE3 (C:\test\intel.QipoSEE3.exe)
Elapsed (s): 0.255557, 0.252098, 0.251505, 0.251814, 0.250546, 0.252690

/Qipo with SSE4.2 (C:\test\intel.QipoSEE4.2.exe)
Elapsed (s): 0.317439, 0.257017, 0.251601, 0.250948, 0.251120, 0.254430

As before, every run reports Total=6798.680541 and Check Sum = 160160000.

Profile-guided optimization

(1) Enable PGO, run intel.pgo.gen.exe, and record the results.

C:\test\intel.pgo.gen.exe
Elapsed (s): 2.475525, 2.483680, 2.495488, 2.486739, 2.469373, 2.450192

[Screenshot: file listing of the build directory after the instrumented run. It shows a newly generated .dyn profile file (3 KB, dated 2018/4/12 20:11) alongside intel.pgo.gen.exe, the Makefile, the sources addy.c, chapter4.c, chapter4.h, series.c, work.c and wtime.c, and their .optrpt reports.]

Rebuild the application, telling the compiler to use the profile data just generated, then run intel.pgo.exe.

C:\test\intel.pgo.exe
Elapsed (s): 1.050985, 1.034428, 1.031015, 1.027866, 1.020331, 1.019246

Tuning auto-vectorization

(1) Compile test.cpp in code1.2 and have the auto-vectorizer produce a report. Contents of test.optrpt:

    Begin optimization report for: [function name illegible; the parameters are float * pointers]
    Report from: Vector optimizations [vec]
    LOOP BEGIN at C:\test\code1.2\test.cpp(3,2)
       remark #15344: loop was not vectorized: vector dependence prevents vectorization. First ...
       remark #15346: vector dependence: assumed FLOW dependence between line 5 and line 5
    LOOP END
    LOOP BEGIN at C:\test\code1.2\test.cpp([position illegible])
    LOOP END

(2) Use GAP.

    C:\test\code1.2>icl /c test.cpp /Qguide /c
    Intel C++ Intel 64 Compiler XE for applications running on Intel 64, Version [illegible] Build 20140726
    Copyright 1985-2014 Intel Corporation. All rights reserved.
    test.cpp
    GAP REPORT LOG OPENED ON Thu Apr 12 20:18:07 2018
    remark #30761: Add -Qparallel option if you want the compiler to generate recommendations for improving auto-parallelization.
    C:\test\code1.2\test.cpp(3): remark #30536: Add -Qno-alias-args option for better type-based disambiguation analysis by the compiler, if appropriate. This will improve optimizations such as vectorization for the loop at line 3. [VERIFY] Make sure that the semantics of this option is obeyed for the entire compilation. [ALTERNATIVE] Another way to get the same effect is to add the "restrict" keyword to each pointer-typed formal parameter of the routine. This allows optimizations such as vectorization to be applied to the loop at line 3. [VERIFY] Make sure that semantics of the "restrict" pointer qualifier is satisfied: in the routine, all data accessed through the pointer must not be accessed through any other pointer.
    Number of advice-messages emitted for this compilation session: 1.
    END OF GAP REPORT LOG

(3) Following GAP's advice, use the /Qno-alias-args command-line option to help the compiler vectorize the loop; compile the code and request a report.

(4) Compile and run all the code under code1.2, and compare the vectorized and non-vectorized performance.
[Screenshot: test.optrpt open in Notepad. The text is largely illegible; the recoverable fragments correspond to the "loop was not vectorized: vector dependence prevents vectorization" remark.]

    C:\test\code1.2>icl /c test.cpp /Qguide /c /Qno-alias-args
    Intel C++ Intel 64 Compiler XE for applications running on Intel 64, Version [illegible] Build 20140726
    Copyright 1985-2014 Intel Corporation. All rights reserved.
    test.cpp
    GAP REPORT LOG OPENED ON Thu Apr 12 20:20:01 2018
    remark #30761: Add -Qparallel option if you want the compiler to generate recommendations for improving auto-parallelization.
    Number of advice-messages emitted for this compilation session: 0.
    END OF GAP REPORT LOG

[Screenshot: test.optrpt open in Notepad after this build. The text is largely illegible; the recoverable fragments show a LOOP BEGIN entry for C:\test\code1.2\test.cpp and a vector-dependence remark.]

    icl /o fff.exe main.cpp test.cpp /Qvec-report:2
    Intel C++ Intel 64 Compiler XE for applications running on Intel 64, Version [illegible] Build 20140726
    Copyright 1985-2014 Intel Corporation. All rights reserved.
    icl: command line remark #10010: option '/Qvec-report:2' is deprecated and will be removed in a future release. See '/help deprecated'
    icl: remark #10397: optimization reports are generated in *.optrpt files in the output location
    main.cpp
    test.cpp
    Microsoft Incremental Linker Version 10.00.30319.01
    Copyright Microsoft Corporation. All rights reserved.
    -out:fff.exe
    main.obj
    test.obj

    C:\test\code1.2>fff.exe
    Time Elapsed 22.096068 Secs when N = 5000000, LOOP WAS VECTORIZED

5. Three steps to add parallelism

Step 1: Analyze the serial program
[Screenshot: analysis tool window; only the title fragment "XE 2015" is legible.]

Step 2: Implement parallelism with Cilk Plus

Step 3: Debugging and error checking

    #include <stdio.h>                 // printf
    #include <windows.h>               // DWORD, timeGetTime
    #include <cilk/reducer_opadd.h>    // cilk::reducer_opadd

    const long int VERYBIG = 1000;

    int main(void)
    {
        int i;
        long int k, sum;
        double sumx, sumy, total;
        DWORD starttime, elapsedtime;

        // Output a start message
        printf("Cilk Plus Parallel Timings for %d iterations\n\n", VERYBIG);

        // repeat experiment several times
        for (i = 0; i < /* bound illegible in the source */; i++)
        {
            // get starting time
            starttime = timeGetTime();

            // reset check sum & running total
            // sum = 0;
            // total = 0.0;
            // The plain accumulators are replaced by reducers so that
            // parallel iterations can update them without a data race.
            cilk::reducer_opadd<long int> sum(0);
            cilk::reducer_opadd<double> total(0.0);

            // Work loop, do some work by looping VERYBIG times
            cilk_for (int j = 0; j < VERYBIG; j++)
            {
                // increment check sum
                sum += 1;

                // Calculate first ...