Memory Management in Linux
Anand Sivasubramaniam

Two Parts
- Architecture-independent memory model: should be flexible and portable enough across platforms.
- Implementation for a specific architecture.

Architecture Independent Memory Model
- The process virtual address space is divided into pages.
- Page size is given by the PAGE_SIZE macro in asm/page.h (4K for x86 and 8K for Alpha).
- The pages are divided among 4 segments: User Code, User Data, Kernel Code, Kernel Data.
- In user mode, only User Code and User Data can be accessed.
- In kernel mode, access to User Data is also needed.

- put_user(), get_user(), memcpy_tofs(), memcpy_fromfs() allow the kernel to access user data (defined in asm/segment.h).
- Registers cs and ds point to the code and data segments of the current mode.
- fs points to the data segment of the calling process in kernel mode.
- get_ds(), get_fs(), and set_fs() are defined in asm/segment.h.

- Segment + Offset = 4 GB linear address (32 bits).
- Of this, user space = 3 GB (defined by the TASK_SIZE macro) and kernel space = 1 GB.
- A linear address is converted to a physical address using 3 levels:
    Index into Page Dir. | Index into Page Middle Dir. | Index into Page Table | Page Offset

Page Dir. and Middle Dir. Access Functions (in asm/page.h and asm/pgtable.h)
- Structures pgd_t and pmd_t define an entry of these tables.
- pgd_alloc()/pgd_free() allocate and free a page for the page directory.
- pmd_alloc(), pmd_alloc_kernel() / pmd_free(), pmd_free_kernel() allocate and free a page middle directory in user and kernel segments.
- pgd_set(), pgd_clear() / pmd_set(), pmd_clear() set and clear an entry of their tables.
- pgd_present()/pmd_present() check for presence of what the entries are pointing to.
- pgd_page()/pmd_page() return the base address of the page to which the entry is pointing.

Page Table Entry (pte_t)
- Attributes:
    - Presence (is the page present in the VAS?)
    - Read, Write and Execute
    - Accessed? (age)
    - Dirty
- Macros of type pgprot_t:
    - PAGE_NONE (invalid)
    - PAGE_SHARED (read-write)
    - PAGE_COPY / PAGE_READONLY (read only, used by copy-on-write)
    - PAGE_KERNEL (accessible only by kernel)

Page Table Functions
- mk_pte(), pte_clear(), set_pte()
- pte_mkclean(), pte_mkdirty(), pte_mkread(), ...
- pte_none() (checks whether the entry is set)
- pte_page() (returns the address of the page)
- pte_dirty(), pte_present(), pte_young(), pte_read(), pte_write()

Process Address Space (not to scale!)
[Figure labels: Kernel, 0xC0000000, File name, Environment, Arguments, Stack, bss, _end, __bss_start, Data, _edata, _etext, Code, Header, 0x84000000, Shared Libs]

Address Space Descriptor
- mm_struct is referenced from the process descriptor (in linux/sched.h).
- On forking it is duplicated, unless CLONE_VM is specified, in which case the descriptor is shared and count is incremented.

    struct mm_struct {
        int count;                              /* no. of processes sharing this descriptor */
        pgd_t *pgd;                             /* page directory ptr */
        unsigned long start_code, end_code;
        unsigned long start_data, end_data;
        unsigned long start_brk, brk;
        unsigned long start_stack;
        unsigned long arg_start, arg_end, env_start, env_end;
        unsigned long rss;                      /* no. of pages resident in memory */
        unsigned long total_vm;                 /* total # of bytes in this address space */
        unsigned long locked_vm;                /* # of bytes locked in memory */
        unsigned long def_flags;                /* status to use when mem regions are created */
        struct vm_area_struct *mmap;            /* ptr to first region desc. */
        struct vm_area_struct *mmap_avl;        /* faster search of region desc. */
    };

Region Descriptors
- Why even allocate all of the VAS? Allocate only on demand.
- Use a region descriptor for each allocated region of the VAS.
- Map allocated but unused regions to the same physical page to save space.

    struct vm_area_struct {
        struct mm_struct *vm_mm;                /* descriptor of VAS */
        unsigned long vm_start, vm_end;         /* of this region */
        pgprot_t vm_page_prot;                  /* protection attributes for this region */
        short vm_avl_height;
        struct vm_area_struct *vm_avl_left;
        struct vm_area_struct *vm_avl_right;    /* right hand child */
        struct vm_area_struct *vm_next_share, *vm_prev_share;   /* doubly linked */
        struct vm_operations_struct *vm_ops;
        struct inode *vm_inode;                 /* of file mapped, or NULL = "anonymous mapping" */
        unsigned long vm_offset;                /* offset in file/device */
    };

- If vm_inode is NULL (anonymous mapping), all PTEs for this region point to the same page. If the process writes to any of these pages, the faulting mechanism creates a new physical page (copy-on-write). This is used by the brk() system call.
- Operations specific to this region (including fault handling) are specified in vm_operations_struct. Hence, different regions can have different functions.

    struct vm_operations_struct {
        void (*open)(struct vm_area_struct *);
        void (*close)(struct vm_area_struct *);
        void (*unmap)();
        void (*protect)();
        void (*sync)();
        unsigned long (*nopage)(struct vm_area_struct *, unsigned long address,
                                unsigned long page, int write_access);
        void (*swapout)(struct vm_area_struct *, unsigned long, pte_t *);
        pte_t (*swapin)(struct vm_area_struct *, unsigned long, unsigned long);
    };

Traditional mmap()
    int do_mmap(struct file *, unsigned long addr, unsigned long len,
                unsigned long prot, unsigned long flags, unsigned long off);
- Creates a new memory region.
- Creates the required PTEs.
- Sets the PTEs to fault later.
- The handler (nopage) will either copy-on-write for an anonymous mapping, or bring in the required page of the file.

How is brk() implemented?
- Check whether to allocate (deny if there is not enough physical memory, the request exceeds the process's VA limits, or it crosses the stack).
- Then call do_mmap() for an anonymous mapping between the old and new values of brk (in the process table).
- Return the new brk value.

Kernel Segment
- On a system call, CS points to the kernel segment.
- DS and ES are set to the kernel segment as well.
- Next, FS is set to the user data segment.
- put_user() and get_user() can then access user space if needed.
- The address parameters to these functions cannot exceed 0xC0000000.
- Violation of this should result in a trap, as should any write to a read-only page (the 386 does not trap such writes in kernel mode, while the problem does not exist on the 486/Pentium). Hence, verify_area() is typically called before performing such operations.
- Physical and virtual addresses are the same except for those allocated using vmalloc().
- The kernel segment is shared across processes (not switched!).

Memory Allocation for Kernel Segment
- Static:
    memory_start = console_init(memory_start, memory_end);
  Typically done for drivers to reserve areas, and for some other kernel components.
- Dynamic:
    void *kmalloc(size, priority), void kfree(void *)      /* in mm/kmalloc.c */
    void *vmalloc(size), void vfree(void *)                /* in mm/vmalloc.c */
- kmalloc() is used for physically contiguous pages, while vmalloc() does not necessarily allocate physically contiguous pages.
- Memory allocated is not initialized (and is not paged out).

kmalloc() data structures
[Figure labels: sizes, bh (six entries), NULL, page_descriptor, size_descriptor]

vmalloc()
- Allocates virtually contiguous pages, but they do not need to be physically contiguous.
- Uses __get_free_page() to allocate physical frames.
- Once all the required physical frames are found, the virtual addresses are created (and mappings set) in an unused part of the virtual address space.
- The virtual address search (for unused parts) on x86 begins at the next address after physical memory on an 8 MB boundary.
- One (virtual) page is left free after each allocation for cushioning.

vmalloc vs kmalloc
- Contiguous vs non-contiguous physical memory.
- kmalloc is faster but less flexible.
- vmalloc involves __get_free_page() and may need to block to find a free physical page.
- DMA requires contiguous physical memory.

Paging
- All kernel segment pages are locked in memory (no swapping).
- User pages can be paged out to:
    - a complete block device
    - fixed-length files in a file system
- The first 4096 bytes are a bitmap indicating that space for that page is available for paging.
- At byte 4086, the string "SWAP-SPACE" is stored.
- Hence, max swap of 4086*8-1 = 32687 pages = 130748 KB per device or file.
- MAX_SWAPFILES specifies the number of swap files or devices.
- A swap device is more efficient than a swap file.

- Inform the kernel of swap space using
    int sys_swapon(char *swapfile, int swapflags);
- Creates an entry in the swap_info table.

    struct swap_info_struct {
        unsigned int flags;
        kdev_t swap_device;
        struct inode *swap_file;
        unsigned char *swap_map;        /* ptr to table, with 1 byte for each page to indicate
                                           how many processes are referring to this page */
        unsigned char *swap_lockmap;    /* ptr to bitmap, bit indicating lock */
        int lowest_bit, highest_bit;    /* to calculate maximum page number */
        unsigned long max;              /* highest_bit + 1 */
        int prio;                       /* priority for this swap space */
        int cluster_nr, cluster_next;   /* to cluster pages on storage device */
        int next;
    };

- For each physical frame (mm.h):

    typedef struct page {
        struct page *prev, *next;               /* doubly linked */
        struct inode *inode;
        unsigned long offset;                   /* where to swap */
        struct page *prev_hash, *next_hash;     /* in hash list of pages in page cache */
        atomic_t count;                         /* number of users of this page */
        unsigned dirty:16, age:8;
        struct buffer_head *buffers;            /* if it is part of a block buffer */
        unsigned long map_nr;                   /* frame # */
        struct wait_queue *wait;                /* tasks waiting for page to be unlocked */
        unsigned flags;
    } mem_map_t;

Finding a Physical Page
    unsigned long __get_free_pages(int priority, unsigned long order, int dma)      /* in mm/page_alloc.c */
- priority =
    - GFP_BUFFER (free page returned only if available in physical memory)
    - GFP_ATOMIC (return page if possible, do not interrupt current process)
    - GFP_USER (current process can be interrupted)
    - GFP_KERNEL (kernel can be interrupted)
    - GFP_NOBUFFER (do not attempt to reduce buffer cache)
- order says give me 2^order pages (max is 128 KB).
- dma specifies that it is for DMA purposes.
- First tries to find a free frame using the buddy system. The free_area table keeps the appropriate data structures.
- If you cannot find a free page:

    int try_to_free_page(int priority, int dma, int wait)
    {
        static int state = 0;
        int i = 6;
        int stop;

        stop = 3;
        if (wait)
            stop = 0;
        switch (state) {
            do {
            case 0:
                if (shrink_mmap(i, dma))
                    return 1;
                state = 1;
            case 1:
                if (shm_swap(i, dma))
                    return 1;
                state = 2;
            default:
                if (swap_out(i, dma, wait))
                    return 1;
                state = 0;
                i--;
            } while ((i - stop) >= 0);
        }
        return 0;
    }

- shrink_mmap() tries to discard pages in the page cache or buffer cache that have only one user currently, and have not been referenced since the last cycle. The number of examined pages depends on priority.
- shm_swap() tries pages allocated for shared memory.
- swap_out():
    - Uses swap_cnt to determine how many pages to swap out for the current process before moving on to the next.
    - Always starts where it left off last time (Clock algorithm).
    - Uses the swap_out_process() function, which then calls try_to_swap_out() for each possible page present in memory (and not locked).
    - try_to_swap_out() checks the age attribute in the mem_map data structure, and the page is selected if this is 0.
    - The VM area's swapout() operation is called.
    - Write back if the page is dirty.
    - Invalidate the page table entry.

- The kswapd kernel thread running in the background is activated each time the number of free pages falls below a critical level. This thread calls the try_to_free_page() function.
- A block of memory is released using free_pages(). When the number of users reaches 0, the frames are entered in free_area.

Page Fault
- The error code is written onto the stack, and the VA is stored in register CR2.
- do_page_fault(struct pt_regs *regs, unsigned long error_code) is now called.
- If the faulting address is in the kernel segment, alarm messages are printed ...
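
The split of a 32-bit linear address into page directory index, page middle directory index, page table index and page offset can be sketched in C. This is only an illustration, not kernel code: it assumes the x86 case with 4K pages, a 10-bit directory index, a 10-bit table index, and a middle directory folded away to zero bits. The *_BITS macros, struct linear_parts, and split_linear() are hypothetical names introduced for this sketch.

```c
#include <stdint.h>

/* Assumed x86-style field widths (an assumption of this sketch). */
#define PAGE_SHIFT   12                         /* 4K pages: 12 offset bits */
#define PTE_BITS     10                         /* index into page table */
#define PMD_BITS     0                          /* middle directory folded away */
#define PGD_BITS     10                         /* index into page directory */
#define PMD_SHIFT    (PAGE_SHIFT + PTE_BITS)
#define PGDIR_SHIFT  (PMD_SHIFT + PMD_BITS)

struct linear_parts {
    uint32_t pgd_index;   /* index into page directory */
    uint32_t pmd_index;   /* index into page middle directory */
    uint32_t pte_index;   /* index into page table */
    uint32_t offset;      /* offset within the page */
};

/* Low-bit mask of the given width; a width of 0 yields 0. */
static uint32_t mask(unsigned bits)
{
    return bits ? ((1u << bits) - 1u) : 0u;
}

/* Break a linear address into its four fields. */
struct linear_parts split_linear(uint32_t addr)
{
    struct linear_parts p;
    p.offset    = addr & mask(PAGE_SHIFT);
    p.pte_index = (addr >> PAGE_SHIFT) & mask(PTE_BITS);
    p.pmd_index = (addr >> PMD_SHIFT) & mask(PMD_BITS);
    p.pgd_index = (addr >> PGDIR_SHIFT) & mask(PGD_BITS);
    return p;
}
```

Under these assumed widths, the start of kernel space, 0xC0000000, falls at page directory index 768 with all other fields zero, which matches the 3 GB user / 1 GB kernel division (768 of 1024 directory entries cover user space).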
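
The per-device swap limit in the Paging section follows from the header layout: one 4096-byte page whose last 10 bytes hold the signature, leaving 4086 bytes of bitmap with one bit per page. The small C sketch below works through that arithmetic. The macro and function names are hypothetical, and the reason for the "minus one" (that one bit corresponds to the header page itself) is an assumption of this sketch rather than something the slides state.

```c
#define SWAP_HEADER_BYTES  4096UL   /* size of the header page */
#define SWAP_SIG_BYTES     10UL     /* length of the signature string */
#define SWAP_PAGE_KB       4UL      /* 4K pages, as on x86 */

/* Number of usable swap pages per device or file. */
unsigned long max_swap_pages(void)
{
    unsigned long bitmap_bytes = SWAP_HEADER_BYTES - SWAP_SIG_BYTES;  /* 4086 */
    /* One bit per page; one page is subtracted, assumed here to be
     * the header page itself. */
    return bitmap_bytes * 8UL - 1UL;
}

/* The same limit expressed in kilobytes. */
unsigned long max_swap_kb(void)
{
    return max_swap_pages() * SWAP_PAGE_KB;
}
```

With these constants, max_swap_pages() evaluates to 4086*8-1 = 32687 and max_swap_kb() to 32687*4 = 130748.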
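
The control flow of try_to_free_page() is unusual: a static state variable remembers which reclaim pass to resume with, and the switch jumps into the middle of a do/while loop. The following user-space sketch imitates only that control flow. The three kernel passes (shrink_mmap, shm_swap, swap_out) are replaced by a hypothetical stub, fake_pass(), the counters are invented for illustration, the priority and dma arguments are dropped, and the starting state of 0 is an assumption of this sketch.

```c
/* Hypothetical instrumentation: how often each pass ran, and which
 * pass (0, 1 or 2) pretends to free a page; -1 means none does. */
static int calls[3];
static int succeed_at = -1;

/* Stand-in for shrink_mmap (0), shm_swap (1) and swap_out (2). */
static int fake_pass(int which)
{
    calls[which]++;
    return which == succeed_at;
}

/* Remembered across calls, so the next call resumes with the pass
 * that follows the last one that failed. */
static int state = 0;

int try_to_free_page_sketch(int wait)
{
    int i = 6;
    int stop = wait ? 0 : 3;    /* try less hard if the caller cannot wait */

    switch (state) {
        do {
    case 0:
            if (fake_pass(0))   /* shrink_mmap */
                return 1;
            state = 1;
            /* fall through */
    case 1:
            if (fake_pass(1))   /* shm_swap */
                return 1;
            state = 2;
            /* fall through */
    default:
            if (fake_pass(2))   /* swap_out */
                return 1;
            state = 0;
            i--;
        } while ((i - stop) >= 0);
    }
    return 0;
}
```

When every pass fails, the loop body runs for i = 6 down to stop, so each pass is tried 7 times with wait set and 4 times without. When a pass succeeds, the function returns immediately and state still names that pass, so the next call starts there instead of at shrink_mmap.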