New simulator project started on July 1, 2013: Alpha full-system reference, gem5 guide

Gem5 Guide
Wang Hui
Sino-German Joint Software Institution

arch-node1: 35
arch-node2: 36
arch-node3: 37
arch-node4: 38
whjsi412
wget
wget

Gem5 Guide Outline
- What is Gem5?
- Build & Run Gem5 Simulator
- Gem5 Basics
- Run your code under SE mode
- Run SPLASH2 Benchmark under SE mode
- Run your code under FS mode
- Run SPLASH2 Benchmark under FS mode
- Inside the Gem5
- Modify to satisfy your needs
- Summary

What is Gem5
- The combination of M5 and GEMS into a new simulator
- Google Scholar statistics
  - M5 (IEEE Micro, CAECW): 440 citations
  - GEMS (CAN): 588 citations
- Best aspects of both glued together
  - M5: CPU models, ISAs, I/O devices, infrastructure
  - GEMS (essentially Ruby): cache coherence protocols, interconnect models

Main Goals
- Flexibility
  - Multiple CPU models across the speed vs. accuracy spectrum
  - Two execution modes: System-call Emulation & Full-system
  - Two memory system models: Classic & Ruby
  - Once you learn it, you can apply it to a wide range of investigations
- Availability
  - For both academic and corporate researchers
  - No dependence on proprietary code
  - BSD license
- Collaboration
  - Combined effort of many with different specialties
  - Active community leveraging collaborative technologies

Key Features
- Pervasive object-oriented design
  - Provides modularity, flexibility
  - Significantly leverages inheritance, e.g. SimObject
- Python integration
  - Powerful front-end interface
  - Provides initialization, configuration, & simulation control
- Domain-Specific Languages
  - ISA DSL: defines ISA semantics
  - Cache Coherence DSL (a.k.a. SLICC): defines coherence logic
- Standard interfaces: Ports and MessageBuffers

Capabilities
- Execution modes: System-call Emulation (SE) & Full-System (FS)
- ISAs: Alpha, ARM, MIPS, Power, SPARC, x86
- CPU models: AtomicSimple, TimingSimple, InOrder, and O3
- Cache coherence protocols: broadcast-based, directories, etc.
- Interconnection networks: Simple & Garnet (Princeton, MIT)
- Devices: NICs, IDE controller, etc.
- Multiple systems: communicate over TCP/IP

To us
- Python and C++ with an event queue and a bunch of APIs
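To make the "event queue" from the "To us" slide concrete, here is a minimal sketch of an event-driven simulation loop in plain Python. It is not gem5 code: the EventQueue class, its schedule() and run() methods, and the toy CPU callback are hypothetical names chosen for illustration. It only shows the general idea that simulated time jumps from one scheduled event to the next.

```python
import heapq
import itertools


class EventQueue:
    """Toy event queue (hypothetical, not the gem5 API)."""

    def __init__(self):
        self._heap = []                  # entries: (tick, seq, callback)
        self._seq = itertools.count()    # tie-breaker keeps FIFO order per tick
        self.cur_tick = 0

    def schedule(self, tick, callback):
        if tick < self.cur_tick:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(self._heap, (tick, next(self._seq), callback))

    def run(self, max_tick):
        """Process events in tick order until the queue drains or max_tick is passed."""
        while self._heap and self._heap[0][0] <= max_tick:
            tick, _, callback = heapq.heappop(self._heap)
            self.cur_tick = tick
            callback(self)


def make_cpu(period, n_cycles, log):
    """Return a callback that reschedules itself every `period` ticks."""
    state = {"cycles": 0}

    def tick(eq):
        state["cycles"] += 1
        log.append(eq.cur_tick)
        if state["cycles"] < n_cycles:   # a finished (idle) CPU stops rescheduling
            eq.schedule(eq.cur_tick + period, tick)

    return tick


log = []
eq = EventQueue()
eq.schedule(0, make_cpu(period=500, n_cycles=4, log=log))
eq.run(max_tick=10_000)
print(log)   # [0, 500, 1000, 1500]
```

The sequence counter is a design choice for the sketch: it keeps events scheduled for the same tick in insertion order and avoids comparing callbacks inside the heap.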

Start with a simple example
- Suppose we want to run a hello world program
- and suppose we have installed the packages and tools that gem5 depends on
  - g++, python, scons, swig, zlib, m4, mercurial
  - Ubuntu Server:
      sudo apt-get install mercurial scons swig python-dev g++ build-essential texinfo
- First we need to download the GEM5 Simulator source code
  - Mercurial: hg clone -stable
- Then we need to compile the GEM5 Simulator

Dependence
- Tools
  - GCC/G++ 3.4.6+
    - Most frequently tested with 4.2-4.5
  - Python 2.4+
  - SCons 0.98.1+
    - We generally test versions 0.98.5 and 1.2.0
  - SWIG 1.3.31+
- Other materials:
  - wget

Start with a simple example
- Compile targets: build/{config}/{binary}
- configs
  - By convention, usually {ISA}_{MODE}
    - ALPHA_SE (Alpha syscall emulation)
    - ALPHA_FS (Alpha full system)
  - Other ISAs: ARM, MIPS, POWER, SPARC, X86
  - Sometimes followed by a Ruby protocol: ALPHA_SE_MOESI_hammer
  - You can define your own configs
- binary
  - gem5.debug: debug build, symbols, tracing, assert
  - gem5.opt: optimized build, symbols, tracing, assert
  - gem5.fast: optimized build, no debugging, no symbols, no tracing, no assertions
  - gem5.prof: gem5.fast + profiling support

Start with a simple example
- So let's try this command to compile the Gem5 Simulator:
    scons build/ALPHA_SE/gem5.opt
- and run the simulator:
    ./build/ALPHA_SE/gem5.opt configs/example/se.py
- Notes: if there are errors, first check that the packages GEM5 depends on are installed

Question on the simple example
- What does the output mean?
- Where is the hello world executable file?
- What is configs/example/se.py? How does it work?
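The target naming convention from the "Compile Targets" slide can be written down as a small helper. This is only an illustrative sketch: build_target() is a hypothetical function, not part of gem5 or SCons, and it merely assembles path strings of the form used in this guide. The "prof" variant is assumed to be the profiling build named on the slide.

```python
def build_target(isa, mode, binary="opt", protocol=None):
    """Assemble a build path such as build/ALPHA_SE/gem5.opt (illustrative only)."""
    if binary not in ("debug", "opt", "fast", "prof"):
        raise ValueError("unknown binary variant: %s" % binary)
    config = "%s_%s" % (isa.upper(), mode.upper())
    if protocol:                      # optional Ruby protocol suffix
        config += "_" + protocol
    return "build/%s/gem5.%s" % (config, binary)


print(build_target("alpha", "se"))                            # build/ALPHA_SE/gem5.opt
print(build_target("alpha", "fs", binary="debug"))            # build/ALPHA_FS/gem5.debug
print(build_target("alpha", "se", protocol="MOESI_hammer"))   # build/ALPHA_SE_MOESI_hammer/gem5.opt
```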

How does se.py work?

Summary on se.py: Modes
- gem5 has two fundamental modes
- Full system (FS)
  - For booting operating systems
  - Models bare hardware, including devices
  - Interrupts, exceptions, privileged instructions, fault handlers
- Syscall emulation (SE)
  - For running individual applications, or a set of applications on MP/SMT
  - Models user-visible ISA plus common system calls
  - System calls emulated, typically by calling the host OS
  - Simplified address translation model, no scheduling
- Selected via compile-time option
  - Vast majority of code is unchanged, though

Summary on se.py: Objects
- Everything you care about is an object (C++/Python)
  - Derived from SimObject base class
  - Common code for creation, configuration parameters, naming, checkpointing, etc.
- Uniform method-based APIs for object types
  - CPUs, caches, memory, etc.
  - Compatibility across implementations
    - Functional vs. detailed CPU
    - Conventional vs. indirect-index cache
- Easy replication: cores, multiple systems, ...

Summary on se.py: Events
- Standard event queue timing model
  - Global logical time in "ticks"
  - No fixed relation to real time
  - Normally picoseconds in our examples
- Objects schedule their own events
  - Flexibility for detail vs. performance trade-offs
  - E.g., a CPU typically schedules an event at regular intervals
    - Every cycle or every n picoseconds
  - Won't schedule itself if stalled/idle
- So now you know how an event-driven simulator works: the simulator just fetches events from the EQ (Event Queue). All events are generated by objects, and handling an event produces new events that are inserted into the EQ.

Summary on se.py: Ports
- Method for connecting MemObjects together
- Each MemObject subclass has its own Port subclass(es)
  - Specialized to forward packets to appropriate methods of the MemObject subclass
- Each pair of MemObjects is connected via a pair of Ports ("peers")
- Function pairs pass packets across ports
  - sendTiming() on one port calls recvTiming() on its peer
- Result: class-specific handling with arbitrary connections and only a single virtual function call

Summary on se.py: Access Mode
- Three access modes: Functional, Atomic, Timing
- Selected by choosing the function on the initial Port:
  - sendFunctional(), sendAtomic(), sendTiming()
- Functional mode:
  - Just "make it happen"
  - Used for loading binaries, debugging, etc.
  - Accesses happen instantaneously, updating data everywhere in the hierarchy
  - If devices contain queues of packets, they must be scanned and updated as well
- Atomic mode:
  - Requests complete before sendAtomic() returns
  - Models state changes (cache fills, coherence, etc.)
  - Returns approximate latency without contention or queuing delay
  - Used for fast simulation, fast forwarding, or warming caches
- Timing mode:
  - Models all timing/queuing in the memory system
  - Split transaction
    - sendTiming() just initiates the send of the request to the target
    - Target later calls sendTiming() to send the response packet
- Atomic and Timing accesses cannot coexist in a system

Summary on se.py: m5out/*
- config.ini: the simulated system
- stats.txt: simulation statistics
  - You can generate the statistics you need by adding some code; check the GEM5 Tutorial for details

How to Debug?
- Tracing
- Using gdb to debug gem5
- Python Debugging

Tracing
- printf() is a nice debugging tool
  - Keep good printfs for tracing
  - Lots of debug output is a very good thing
- Example flags: Fetch, Decode, Ethernet, Exec, TLB, DMA, Bus, Cache, Loader, O3CPUAll, etc.
- Print out all flags with the --debug-help option
- src/base/trace.*

Enabling Tracing
- Selecting flags:
    --debug-flags=Cache,Bus
    --debug-flags=Exec,-ExecTicks
- Selecting destination:
    --trace-file=my_trace.out
    --trace-file=my_trace.out.gz
- Selecting start:
    --trace-start=3000000
- Example:
    ./build/ALPHA_SE/gem5.opt --debug-flags=MemoryAccess --trace-start=3000000 configs/example/se.py

Adding Debugging
- Print statements put in source code
  - You are encouraged to add them to your models or contribute ones you find particularly useful
- Macros remove them for gem5.fast or gem5.prof binaries
  - So you must be using gem5.debug or gem5.opt to get any output
- Adding an extra tracing statement:
    DPRINTF(Flag, "normal printf %s\n", "arguments");
- Adding a new debug flag (in a SConscript):
    DebugFlag('MyFlag')

Using GDB with Gem5
- Several gem5 functions are designed to be called from GDB:
  - schedBreakCycle() (also available with --debug-break)
  - setDebugFlag()/clearDebugFlag()
  - dumpDebugStatus()
  - eventqDump()
  - SimObject::find()
  - takeCheckpoint()

Using GDB with Gem5 (example session)
    wh@arch-node1:~/gem5-stable$ gdb --args ./build/ALPHA_SE/gem5.opt configs/example/se.py
    GNU gdb (Ubuntu/Linaro 7.2-1ubuntu11) 7.2
    (gdb) b main
    Breakpoint 1 at 0x4087e0: file build/ALPHA_SE/sim/main.cc, line 41.
    (gdb) run
    Starting program: /home/wh/gem5-stable/build/ALPHA_SE/gem5.opt configs/example/se.py
    Thread debugging using libthread_db enabled
    Breakpoint 1, main (argc=2, argv=0x7fffffffe688) at build/ALPHA_SE/sim/main.cc:41
    41
    (gdb) call schedBreakCycle(1000000)
    warn: need to stop all queues
    (gdb) continue
    Continuing.
    gem5 Simulator System.
    gem5 is copyrighted software; use the --copyright option for details.
    gem5 compiled Aug 29 2011 22:41:08
    gem5 started Aug 29 2011 22:47:08
    gem5 executing on arch-node1
    command line: /home/wh/gem5-stable/build/ALPHA_SE/gem5.opt configs/example/se.py
    Global frequency set at 1000000000000 ticks per second
    0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
    **** REAL SIMULATION ****
    info: Entering event queue @ 0. Starting simulation...
    info: Increasing stack size by one page.
    Program received signal SIGTRAP, Trace/breakpoint trap.
    0x00007ffff638dfe7 in kill () from /lib/x86_64-linux-gnu/libc.so.6
    (gdb) p _curTick
    $1 = 1000000
    (gdb) print SimObject::find("system.cpu")
    $2 = (SimObject *) 0x16aa980
    (gdb) print (BaseCPU*)SimObject::find("system.cpu")
    $3 = (BaseCPU *) 0x16aa980
    (gdb) p $3->instCnt
    $4 = 94699
    (gdb) continue
    Continuing.
    Hello world!
    hack: be nice to actually delete the event here
    Exiting @ tick 3252000 because target called exit()
    Program exited normally.

Python Debugging
- It is possible to drop into the Python interpreter (-i flag)
  - This currently happens after the script file is run
  - If you want to do this before objects are instantiated, remove them from the script
- It is possible to drop into the Python debugger (--pdb flag)
  - Occurs just before your script is invoked
  - Lets you use the debugger to debug your script code
- Code that enables this stuff is in src/python/m5/main.py
  - At the bottom of the main function
  - You can copy the mechanism directly into your scripts if it is in the wrong place for your needs:
      import pdb
      pdb.set_trace()

More
- How to configure your architecture
- A small sample: mytest.py, abc.py
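The port mechanism described on the "Summary on se.py" Ports and Access Mode slides can be illustrated with a toy model in plain Python. None of this is the real gem5 API: Port, bind(), ToyMemory, ToyCPU and the latency value are hypothetical, and the timing path is collapsed into immediate calls instead of going through an event queue. The sketch only shows how a call on one port is forwarded to the owner of its peer, and how atomic and timing requests differ in what they return.

```python
class Port:
    """One end of a connection; forwards packets to the owner of its peer."""

    def __init__(self, owner):
        self.owner = owner
        self.peer = None

    def bind(self, other):
        self.peer, other.peer = other, self

    def send_atomic(self, pkt):
        # Atomic: the request completes before the call returns a latency.
        return self.peer.owner.recv_atomic(pkt)

    def send_timing(self, pkt):
        # Timing: only hands the packet over; the response arrives separately.
        self.peer.owner.recv_timing(pkt)


class ToyMemory:
    LATENCY = 30  # ticks, arbitrary value for the example

    def __init__(self):
        self.port = Port(self)
        self.data = {}

    def _access(self, pkt):
        if pkt["cmd"] == "write":
            self.data[pkt["addr"]] = pkt["value"]
        else:
            pkt["value"] = self.data.get(pkt["addr"], 0)

    def recv_atomic(self, pkt):
        self._access(pkt)
        return self.LATENCY

    def recv_timing(self, pkt):
        self._access(pkt)
        pkt["is_response"] = True
        self.port.send_timing(pkt)   # split transaction: response goes back over the peer


class ToyCPU:
    def __init__(self):
        self.port = Port(self)
        self.responses = []

    def recv_timing(self, pkt):
        self.responses.append(pkt)


cpu, mem = ToyCPU(), ToyMemory()
cpu.port.bind(mem.port)

latency = cpu.port.send_atomic({"cmd": "write", "addr": 0x10, "value": 7})
cpu.port.send_timing({"cmd": "read", "addr": 0x10})
print(latency, cpu.responses[0]["value"])   # 30 7
```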

Cross Compiler
- The first tool you need to prepare
- Check the Gem5 Status Matrix: ALPHA is the best supported architecture
- I have compiled an Alpha cross compiler, so you can copy it and use it as you wish
- How to use? Append this command to ~/.bashrc:
    export PATH=/bin:/alphaev67-unknown-linux-gnu/bin:$PATH

Run your code under SE mode
- Compile your code with the static flag, using the cross compiler
- Use configs/example/se.py -c to run your own code
- Results:
    alphaev67-unknown-linux-gnu-gcc -o sum sum.c -static -O2
    ./build/ALPHA_SE/gem5.opt configs/example/se.py -c /PATH/TO/sum

Run your code under SE mode
- Multiprogrammed workloads
- Two hello workloads, then modify se.py to smt.py
- Results:
    ./build/ALPHA_SE/gem5.opt configs/example/smt.py -n 2

Run SPLASH2 under SE mode
- Get SPLASH2 Benchmark from
- Run:
    ./build/ALPHA_SE/gem5.opt configs/splash2/run.py --rootdir=/home/wh/benchmark/splash2/codes -n 1 -b FFT

What is FS mode
- Load a Linux kernel
- How to compile your kernel image?

Full System related files
- configs/common/SysPaths.py: where the disk image is
- configs/common/FSConfig.py: pal, kernel
- configs/common/Benchmarks.py: disk image name
- m5term:
    cd util/term
    make
    sudo make install

Run your code under FS mode
- Preparation: put your code into the image
    sudo mount -o loop,offset=32256 linux-latest.img /mnt
    sudo mkdir -p /mnt/benchmark/mybench
    sudo cp sum /mnt/benchmark/mybench
    sudo umount /mnt
- Run
    scons build/ALPHA_FS/gem5.opt
    ./build/ALPHA_FS/gem5.opt configs/example/fs.py
    m5term 3456
    ./sum
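The offset=32256 value in the mount command is most likely the byte offset of the first partition inside the disk image. Assuming the common legacy layout of 512-byte sectors with the first partition starting at sector 63 (an assumption about linux-latest.img, not something this guide states), the number can be reproduced as follows:

```python
SECTOR_SIZE = 512     # bytes per sector (assumed)
FIRST_SECTOR = 63     # start sector of the first partition (assumed legacy layout)

offset = SECTOR_SIZE * FIRST_SECTOR
print(offset)         # 32256
```

If an image uses a different layout, the start sector reported by a partition tool such as fdisk -l would take the place of FIRST_SECTOR.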

Run SPLASH2 under FS mode
- Preparation: put your code into the image
    sudo mount -o loop,offset=32256 linux-latest.img /mnt
    sudo mkdir -p /mnt/benchmark/mybench
    sudo cp FFT /mnt/benchmark/mybench
    sudo umount /mnt
- Run
    scons build/ALPHA_FS/gem5.opt
    ./build/ALPHA_FS/gem5.opt configs/example/fs.py
    m5term 3456
    ./FFT -t

Run SPLASH2 under FS mode
- A more convenient way?
- Run
    vi configs/common/Benchmarks.py
    + 'fft': [SysConfig('fft.rcS', '512MB')],
    vi configs/boot/fft.rcS
    + #!/bin/sh
    + cd /benchmark/mybench
    + echo "Running FFT now"
    + ./FFT -t -p1
    + /sbin/m5 exit
    scons build/ALPHA_FS/gem5.opt
    ./build/ALPHA_FS/gem5.opt configs/example/fs.py -n 1 -b fft
    cat m5out/system.terminal

Inside Gem5: Source Code Tree Organization
- configs: sample m5 scripts
- src/arch: architecture definition & ISA-specific components
- src/base: general data structures/facilities
- src/python: Python config code
- src/cpu, src/mem, src/dev: specific models
- src/sim: simulator base functionality
- system: platform specific code (palcode, firmware, bios, etc.) packaged separately
- test: regression tests
- util: utility programs

CPU Models Overview
- Supported CPU Models
  - AtomicSimpleCPU
  - TimingSimpleCPU
  - InOrderCPU
  - O3CPU
- CPU Model Internals
  - Parameters
  - Time Buffers
  - Key Interfaces

Supported CPU Models
- Simple CPUs
  - Model a single-thread, 1 CPI machine
  - Two types: AtomicSimpleCPU and TimingSimpleCPU
  - Common uses:
    - Fast, functional simulation: 2.9 million and 1.2 million instructions per second on the "twolf" benchmark
    - Warming up caches
    - Studies that do not require detailed CPU modeling
- Detailed CPUs
  - Parameterizable pipeline models w/ SMT support
  - Two types: InOrderCPU and O3CPU
  - "Execute in Execute", detailed modeling
  - Slower than SimpleCPUs: 200K instructions per second on the "twolf" benchmark
  - Models the timing for each pipeline stage
  - Forces both timing and execution of simulation to be accurate
  - Important for coherence, I/O, multiprocessor studies, etc.
- src/cpu/*.hh,cc

Inside Gem5: CPU Model

Inside Gem5: Memory Model
- General Memory System
  - Ports
  - Packets
  - Requests
  - Atomic/Timing/Functional accesses
- Two memory system models
  - Classic
  - Ruby
- Check for details

Ruby Memory Model
- Flexible Memory System
  - Rich configuration - Just run it
    - Simulate combinations of caches, coherence, interconnect, etc.
  - Rapid prototyping - Just create it
    - Domain-Specific Language (SLICC) for coherence protocols
    - Modular components
- Detailed statistics
  - e.g., request size/type distribution, state transition frequencies, etc.
- Detailed component simulation
  - Network (fixed/flexible pipeline and simple)
  - Caches (pluggable replacement policies)
  - Memory (DDR2)
- Can build many different memory systems
  - CMPs, SMPs, SCMPs
  - 1/2/3 level caches
  - Pt2Pt/Torus/Mesh topologies
  - MESI/MOESI coherence
- Each component is individually configurable
  - Build heterogeneous cache architectures (new)
  - Adjust cache sizes, bandwidth, link latencies, etc.
- Example: 8 core CMP, 2-Level, MESI protocol, 32K L1s, 8MB 8-banked L2s, crossbar interconnect
    scons build/ALPHA_FS/gem5.opt PROTOCOL=MESI_CMP_directory RUBY=T
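As a rough illustration of the speed vs. accuracy trade-off, the simulation rates quoted on the "Supported CPU Models" slide can be turned into wall-clock estimates. Mapping 2.9 million and 1.2 million instructions per second to AtomicSimpleCPU and TimingSimpleCPU respectively is an assumption based on the order in which the slide lists them; the 200K figure is the one the slide gives for the detailed CPUs as a group; and the one-billion-instruction workload is a made-up example. Real rates depend on the host machine and the benchmark.

```python
RATES = {                         # simulated instructions per host second
    "AtomicSimpleCPU": 2.9e6,     # assumed mapping
    "TimingSimpleCPU": 1.2e6,     # assumed mapping
    "Detailed (InOrder/O3)": 200e3,
}


def estimate_seconds(instructions, rate):
    """Host seconds needed to simulate `instructions` at `rate` inst/s."""
    return instructions / rate


workload = 1_000_000_000          # hypothetical workload size
for model, rate in RATES.items():
    secs = estimate_seconds(workload, rate)
    print("%-22s %8.0f s (%.1f min)" % (model, secs, secs / 60))
```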
