The TLB (Translation Lookaside Buffer) miss services have been concealed from operating systems, but some new RISC architectures manage the TLB in software. Since software-managed TLBs provide flexibility to an operating system in page translation, they are considered an important factor in the design of microprocessors for open system environments. However, software-managed TLBs suffer from a larger miss penalty than hardware-managed TLBs, since they incur more context-switching overhead. This paper introduces a new technique for reducing the miss penalty of software-managed TLBs by prefetching necessary TLB entries before they are used. This technique is not inherently limited to specific applications. The key to this scheme is to perform prefetch operations that update the TLB entries before their first accesses, so that TLB misses can be avoided. Using trace-driven simulation and a quantitative analysis, the proposed scheme is evaluated in terms of the miss rate and the total miss penalty. Our results show that the proposed scheme reduces the TLB miss rate by 6% to 77%, depending on TLB characteristics and page sizes. In addition, it is found that reducing the miss rate by the prefetching scheme reduces the total miss penalty and bus traffic in software-managed TLBs.

Most Prolog machines have been based on specialized architectures. Our goal is to start with a general-purpose architecture and determine a minimal set of extensions for high-performance Prolog execution. We have developed both the architecture and optimizing compiler simultaneously, drawing on results of previous implementations. We find that most Prolog-specific operations can be done satisfactorily in software; however, there is a crucial set of features that the architecture must support to achieve the best Prolog performance. In this paper, the costs and benefits of special architectural features and instructions are analyzed. In addition, we study the relationship between the strength of compiler optimization and the benefit of specialized hardware. We demonstrate that our base architecture can be extended to include explicit support for Prolog with a modest increase in chip area (13%), and yet attain a significant performance benefit (60-70%). Experiments using optimized code that approximates the output of future optimizing compilers indicate that special hardware support can still provide a performance benefit of 30-35%. The microprocessor described here, the VLSI-BAM, has been fabricated and incorporated into a working test system.

It is well known that software maintenance and evolution are expensive activities, both in terms of invested time and money. Reverse engineering activities help obtain abstractions and views of a target system that should help engineers maintain, evolve and eventually re-engineer it. Two important tasks pursued by reverse engineering are design pattern detection and software architecture reconstruction. Their main objectives are to identify the design patterns used in the implementation of a system and to generate views at different levels of abstraction, which let practitioners focus on the overall architecture of the system without worrying about the programming details of its implementation. In this context we propose an Eclipse plug-in called MARPLE (Metrics and Architecture Reconstruction Plug-in for Eclipse), which supports both the detection of design patterns and software architecture reconstruction activities through the use of basic elements and metrics that are mechanically extracted from the source code. The development of this platform is mainly based on the exploitation of the Eclipse framework and plug-ins as well as of different Java libraries for data access and graph management and visualization. In this
paper we focus our attention on the design pattern detection process.

Access to sufficient resources is a barrier to scientific progress for many researchers facing large computational problems. Gaining access to large-scale resources (i.e., university-wide or federally supported computer centers) can be difficult, given their limited availability, particular architectures, and request/review/approval cycles. Simultaneously, researchers often find themselves with access to workstations and older clusters overlooked by their owners in favor of newer hardware. Software to tie these resources into a coherent Grid, however, has been problematic. Here, we describe our experiences building a Grid computing system to conduct a large-scale simulation study using "borrowed" computing resources distributed over a wide area. Using standard software components, we have produced a Grid computing system capable of coupling several hundred processors spanning multiple continents and administrative domains. We believe that this system fills an important niche between a closely coupled local system and a heavyweight, highly customized wide area system.

Article Outline
1. Introduction
2. Scientific context
3. Implementation
   3.1. System constraints
   3.2. General design of the grid system
   3.3. System requirements
      3.3.1. Operating system
      3.3.2. Client
      3.3.3. Server
      3.3.4. Account
   3.4. System processes
      3.4.1. user level processes
      3.4.2. grid_client processes
      3.4.3. project processes: executed once per invocation by grid_client process
   3.5. Basic features
      3.5.1. Client-server communications
      3.5.2. Authentication
      3.5.3. Architecture specific binaries
      3.5.4. Client-side security
      3.5.5. Server-side security
      3.5.6. System monitoring
      3.5.7. Error handling
4. Performance considerations
5. Future work
   5.1. Secure communications
   5.2. SQL transaction support
   5.3. A little language
   5.4. Validity checking
6. Conclusions
Acknowledgements
References
Vitae

This paper describes the architecture of the first implementation of the In-VIGO grid-computing system. The architecture is designed to support computational tools for engineering and science research In Virtual Information Grid Organizations (as opposed to in vivo or in vitro experimental research). A novel aspect of In-VIGO is the extensive use of virtualization technology, emerging standards for grid-computing and other Internet middleware. In the context of In-VIGO, virtualization denotes the ability of resources to support multiplexing, manifolding and polymorphism (i.e. to simultaneously appear as multiple resources with possibly different functionalities). Virtualization technologies are available or emerging for all the resources needed to construct virtual grids, which would ideally inherit the above-mentioned properties. In particular, these technologies enable the creation of dynamic pools of virtual resources that can be aggregated on demand for application-specific, user-specific grid-computing. This change in paradigm from building grids out of physical resources to constructing virtual grids has many advantages but also requires new thinking on how to architect, manage and optimize the necessary middleware. This paper reviews the motivation for the In-VIGO approach, discusses the technologies used, describes an early architecture for In-VIGO that
represents a first step towards the end goal of building virtual information grids, and reports on first experiences with the In-VIGO software under development.

Article Outline
1. Introduction
2. The In-VIGO concept
3. Virtualization in In-VIGO
   3.1. Virtual data and the virtual file system
   3.2. Virtual machines
   3.3. Virtual applications
   3.4. Virtual networks
   3.5. Virtual user interfaces
4. The architecture of In-VIGO
   4.1. The virtual application
   4.2. The virtual file system
   4.3. The resource manager
   4.4. The user interface manager
   4.5. The global information system
   4.6. The user manager
5. Implementation
6. Conclusions
Acknowledgements
References
Vitae

Leveraging cost matrix structure for hardware implementation of stereo disparity computation using dynamic programming
Original Research Article
Computer Vision and Image Understanding

Article Outline
1. Introduction
2. Related works
   2.1. Design pattern detection
   2.2. Software architecture reconstruction
   2.3. Concluding remarks
3. An overview on MARPLE
   3.1. The information detector engine module
   3.2. The Joiner module
   3.3. The classifier module
   3.4. The software architecture reconstruction module
   3.5. Distributed MARPLE
4. Experimental results for DPD
   4.1. Results for the information detector engine module
   4.2. Results for the Joiner module
   4.3. Results for the classifier module
      4.3.1. Comparison with other tools
      4.3.2. Results on other design patterns
   4.4. Results for the SAR module
5. Conclusions and future works
Acknowledgements
References

A tool for design pattern detection and software architecture reconstruction
Original Research Article
Information Sciences

A package of Linux scripts for the parallelization of Monte Carlo simulations
Original Research Article
Computer Physics Communications

Despite the fact that fast computers are nowadays available at low cost, there are many situations where obtaining a reasonably low statistical uncertainty in a Monte Carlo (MC) simulation involves a prohibitively large amount of time. This limitation can be overcome by having recourse to parallel computing. Most tools designed to facilitate this approach require modification of the source code and the installation of additional software, which may be inconvenient for some users. We present a set of tools, named clonEasy, that implement a parallelization scheme of a MC simulation that is free from these drawbacks. In clonEasy, which is designed to run under Linux, a set of "clone" CPUs is governed by a "master" computer by taking advantage of the capabilities of the Secure Shell (ssh) protocol. Any Linux computer on the Internet that can be ssh-accessed by the user can be used as a clone. A key ingredient for the parallel calculation to be reliable is the availability of an independent string of random numbers for each CPU. Many generators, such as RANLUX, RANECU or the Mersenne Twister, can readily produce these strings by initializing them appropriately and, hence, they are suitable to be used with clonEasy. This work was primarily motivated by the need
to find a straightforward way to parallelize PENELOPE, a code for MC simulation of radiation transport that (in its current 2005 version) employs the generator RANECU, which uses a combination of two multiplicative linear congruential generators (MLCGs). Thus, this paper is focused on this class of generators and, in particular, we briefly present an extension of RANECU that increases its period, and we introduce seedsMLCG, a tool that provides the information necessary to initialize disjoint sequences of an MLCG to feed different CPUs. This program, in combination with clonEasy, allows PENELOPE to be run in parallel easily, without requiring specific libraries or significant alterations of the sequential code.

Program summary 1
Title of program: clonEasy
Catalogue identifier: ADYD_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADYD_v1_0
Program obtainable from: CPC Program Library, Queen's University of Belfast, Northern Ireland
Computer for which the program is designed and others in which it is operable: Any computer with a Unix style shell (bash), support for the Secure Shell protocol and a FORTRAN compiler
Operating systems under which the program has been tested: Linux (RedHat 8.0, SuSe 8.1, Debian Woody 3.1)
Compilers: GNU FORTRAN g77 (Linux); g95 (Linux); Intel Fortran Compiler 7.1 (Linux)
Programming language used: Linux shell (bash) script, FORTRAN 77
No. of bits in a word: 32
No. of lines in distributed program, including test data, etc.: 1916
No. of bytes in distributed program, including test data, etc.: 18202
Distribution format: tar.gz
Nature of the physical problem: There are many situations where a Monte Carlo simulation involves a huge amount of CPU time. The parallelization of such calculations is a simple way of obtaining a relatively low statistical uncertainty using a reasonable amount of time.
Method of solution: The presented collection of Linux scripts and auxiliary FORTRAN programs implement Secure Shell-based communication between a "master" computer and a set of "clones". The aim of this communication is to execute a code that performs a Monte Carlo simulation on all the clones simultaneously. The code is the same on every clone, but each clone is fed with a different set of random seeds. Hence, clonEasy effectively permits the parallelization of the calculation.
Restrictions on the complexity of the program: clonEasy can only be used with programs that produce statistically independent results using the same code, but with a different sequence of random numbers. Users must choose the initialization values for the random number generator on each computer and combine the output from the different executions. A FORTRAN program to combine the final results is also provided.
Typical running time: The execution time of each script largely depends on the number of computers that are used, the actions that are to be performed and, to a lesser extent, on the network connection bandwidth.
Unusual features of the program: Any computer on the Internet with a Secure Shell client/server program installed can be used as a node of a virtual computer cluster for parallel calculations with the sequential source code. The simplicity of the parallelization scheme makes the use of this package a straightforward task, which does not require installing any additional libraries.

Program summary 2
Title of program: seedsMLCG
Catalogue identifier: ADYE_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADYE_v1_0
Program obtainable from: CPC Program Library, Queen's University of Belfast, Northern Ireland
Computer for which the program is designed and others in which it is operable: Any computer with a FORTRAN compiler
Operating systems under which the program has been tested: Linux (RedHat 8.0, SuSe 8.1, Debian Woody 3.1), MS Windows (2000, XP)
Compilers: GNU FORTRAN g77 (Linux and Windows); g95 (Linux); Intel Fortran Compiler 7.1 (Linux); Compaq Visual Fortran 6.1 (Windows)
Programming language used: FORTRAN 77
No. of bits in a word: 32
Memory required to execute with typical data: 500 kilobytes
No. of lines in distributed program, including test data, etc.: 492
No. of bytes in distributed program, including test data, etc.: 5582
Distribution format: tar.gz
Nature of the physical problem: Statistically independent results from different runs of a Monte Carlo code can be obtained using uncorrelated sequences of random numbers on each execution. Multiplicative linear congruential generators (MLCGs), or other generators that are based on them such as RANECU, can be adapted to produce these sequences.
Method of solution: For a given MLCG, the presented program calculates initialization values that produce disjoint, consecutive sequences of pseudo-random numbers. The calculated values start the generator at distant positions of the random number cycle and can be used, for instance, in a parallel simulation. The values are found using the formula S_{i+J} = (a^J S_i) mod m, which gives the random value that will be generated after J iterations of the MLCG.
Restrictions on the complexity of the program: The 32-bit length restriction for the integer variables in standard FORTRAN 77 limits the separation between the produced seeds to less than 2^31 when the distance J is expressed as an integer value. The program allows the user to input the distance as a power of 10 for the purpose of efficiently splitting the sequence of generators with a very long period.
Typical running time: The execution time depends on the parameters of the MLCG used and the distance between the generated seeds. The generation of 10^6 seeds separated by 10^12 units in the sequential cycle, for one of the MLCGs found in the RANECU generator, takes 3 s on a 2.4 GHz Intel Pentium 4
using the g77 compiler.

Article Outline
1. Introduction
2. Pseudo-random number generators and parallel simulations
   2.1. RANECU
   2.2. Parallel execution with an MLCG
3. Description of the programs
   3.1. clonEasy, a simple parallelization package
   3.2. seedsMLCG, sequence splitting of an MLCG
4. Conclusion
Acknowledgements
Appendix A. Appendix
   A.1. Test run output for clonEasy
   A.2. Test run output for seedsMLCG
References

Static analysis of real-time component-based systems configurations
Original Research Article
Science of Computer Programming

Background
Semantic interoperability is a basic challenge to be met for new generations of distributed, communicating and co-operating health information systems (HIS) enabling shared care and e-Health. Analysis, design, implementation and maintenance of such systems and intrinsic architectures have to follow a unified development methodology.

Methods
The Generic Component Model (GCM) is used as a framework for modeling any system to evaluate and harmonize state of the art
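
Returning to the seedsMLCG program summary, its sequence-splitting step can be sketched in Python. This is a minimal sketch and not the authors' FORTRAN 77 program. It assumes the standard MLCG recurrence S_{i+1} = (a S_i) mod m, from which S_{i+J} = (a^J S_i) mod m follows. The constants are those of the first MLCG in L'Ecuyer's combined generator, used here only for illustration, and all function names are hypothetical.

```python
# Illustrative constants (assumption): first MLCG of L'Ecuyer's combined generator.
A = 40014
M = 2147483563


def mlcg_next(seed: int, a: int = A, m: int = M) -> int:
    """Advance the generator by one step: S_{i+1} = (a * S_i) mod m."""
    return (a * seed) % m


def mlcg_jump(seed: int, j: int, a: int = A, m: int = M) -> int:
    """Return the state reached after j steps without iterating j times.

    Uses S_{i+j} = (a^j mod m) * S_i mod m; pow(a, j, m) performs the
    modular exponentiation in O(log j) multiplications.
    """
    return (pow(a, j, m) * seed) % m


def disjoint_seeds(seed0: int, distance: int, count: int,
                   a: int = A, m: int = M) -> list[int]:
    """Return `count` seeds, each `distance` steps after the previous one."""
    step = pow(a, distance, m)  # multiplier that jumps `distance` steps at once
    seeds = [seed0]
    for _ in range(count - 1):
        seeds.append((step * seeds[-1]) % m)
    return seeds


if __name__ == "__main__":
    # One starting seed per hypothetical clone, 10**12 steps apart.
    for clone, seed in enumerate(disjoint_seeds(1, 10**12, 4)):
        print(f"clone {clone}: seed {seed}")
```

Python integers have arbitrary precision, so the 32-bit limit on the distance J noted in the program summary does not arise in this sketch. Each returned seed would be handed to a different clone so that the clones consume non-overlapping stretches of the same cycle, provided no clone draws more than `distance` numbers.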