Jim Gray's NTclusters Research Agenda

Jim Gray, Gray M
October 1995

My Research Agenda

My research project at Microsoft is to:
· Take a census of the many cluster efforts both inside and outside Microsoft.
· Identify a specific project that we in San Francisco can execute.
· Work with the NTclusters group in Redmond to bring key clustering features to NT.

This document is a manifesto for the project.

Why Clusters?

Two apparently different demands, WorkGroups and SuperServers, drive the need for commodity clusters.

WorkGroups: Each time a workgroup grows, it needs to add computing, storage, and network resources to support the new members. The least expensive way to buy computing is to buy a desktop. Each desktop wants to be able to see the printers, files, and network ports of the others in the workgroup. The workgroup administrator wants to be able to manage the resources, security, and versions of the entire workgroup from a single workstation.

SuperServer: Processors, memories, disks, and networks are getting faster and cheaper at astonishing rates. Still, some problems are so large that they exceed the performance of any single processor, memory, disk, or network link. There is a great desire to build high-capacity servers from an array of commodity components. This array must be as easy to program and administer as a single computer.

WorkGroup and SuperServer clusters are more alike than different. WorkGroup clusters tend to be more heterogeneous, but this heterogeneity greatly increases management costs. Automatic coarse-grained parallelism differentiates SuperServers, but the other problems of administration, growth, transparency, security, and availability are the same for both worlds. WorkGroup parallelism is inherent - it comes from many users. Some SuperServers have this same inherent parallelism. For example, a print, file, database, or transaction server receives many small and independent requests that can be serviced in parallel. Often, however, a SuperServer is asked to perform just one large task (data mining, utility, ...), which means that parallelism must be programmed into the application. One lesson we have learned is that only sophisticated software houses will explicitly write parallel programs. Everyone else expects the parallel execution and IO to be automatic.

One approach to building a SuperServer is to tightly couple a few devices together to provide more performance. Common examples are a 4-way Intel P6 symmetric multiprocessor (SMP), a 5-way disk array (RAID), or two 64 kb/s lines (ISDN). This approach is valid, but there are limits to its scalability:
· The need for proprietary hardware to interconnect components.
· The need for proprietary software to exploit components.
· Hardware bottlenecks.
· Software bottlenecks.
· Fault containment within a large array of hardware and software components.

Despite these limitations, it is now common to build tightly coupled arrays with:
· up to 10 processors
· up to 10 GB of RAM
· up to 100 disks
· up to 10 high-speed communications lines (T3 or ATM)

Much beyond these limits, the bottlenecks and fault-containment problems make tightly coupled SuperServers unwieldy. There are plans and prototypes for Non-Uniform Memory Architecture (NUMA) machines supporting hundreds of processors. These machines may be programmed as an SMP, or software can map them as a shared-nothing multicomputer with very limited interconnect. If these NUMAs can be built with commodity hardware, then they may well be the platform of choice for SuperServers. Thus far, I have not seen a credible plan for a commodity NUMA (one that scales from 1 node to 1,000 nodes using commodity parts). NUMAs are unlikely to form the basis of a WorkGroup solution. This almost automatically rules them out of the commodity category.

Several companies have marketed successful SuperServers built as clusters. Each cluster node is a free-standing computer with its own resources and operating system. Each is a unit of service and availability. Each node owns and serves (provides services for) some disks, tapes, network links, database fragments, or other resources. The clusters can grow by adding processors, disks, and networking. The cluster appears to be a single server computer. Tandem, Teradata, VMScluster, Apollo Domain, IBM (Sysplex and SP2), Intel Paragon, and TMC CM5 are examples of cluster architectures. VMScluster and the Apollo Domain are frequently used as workgroup systems as well as SuperServers.

These clusters were built by hardware vendors in a non-portable way, requiring both specialized hardware and software. With the advent of commodity high-performance workstations and interconnects, the time has arrived to build cluster technology on a portable and commodity software base for both workgroups and SuperServers. It is paradoxical that little cluster technology has emerged from either the UNIX or NetWare communities. Certainly NetWare has some cluster concepts (the registry, device transparency, single logon, ...), but it is far from a complete solution (see below for a list of requirements). Clustering for the various UNIX products (AIX, HPUX, Solaris, OSF/1, ...) is even more primitive.

Long before I joined Microsoft I viewed NT as the natural vehicle for a commodity cluster operating system. NT is modern, portable, and has both workgroup and server features. NT already has some cluster features in the domain concept (one user logon), the performance monitor (which can monitor all nodes of a domain), and the redirector (which maps local system calls to remote procedure calls and so gives some location transparency). Some projects layered above NT will provide management (Starfighter SQL Server management) and load balancing (Viper transaction monitoring). The Tiger video server is an example of a special-purpose cluster application.

AT&T, DEC, Intel, Microsoft, Tandem, Sequent, and others are augmenting NT to have cluster features. In addition, applications like Informix Version 8 will soon provide automatic data layout among NT nodes and automatic parallel execution against the cluster.

Commodity Cluster Requirements

The main requirements for a commodity cluster are summarized here. One could write a volume on each of these topics.

Commodity Components: The most common cluster has one node, the next most common has two nodes, and so on - Zipf's law applies. The cluster should be built from commodity hardware and software components. You cannot buy ATM from RadioShack today, but you can buy 100 Mb Ethernet, and ATM is only a few years off. You can certainly buy NT and even SQL Server or Oracle. Any workgroup cluster design must scale from one node to 1,000 nodes with the same hardware and software base. It must have scaleup and scaledown. There are huge economies in having the same design work for both small workgroups and SuperServers. This is really a price-performance requirement.

Modular Growth: The cluster's capacity can grow by a factor of a thousand by adding small components. As each component arrives, some storage, computation, and services automatically migrate to it.

Availability: Fault tolerance is the flip side of modular growth. When a component fails, the services it was providing migrate to other cluster nodes. Ideally, this migration is instant and automatic. Certain failure modes (for example, power failure, fire, flood, and insurrection) are best dealt with by replicating services at remote sites. Remote replication is not part of the core cluster design.

Location Transparency: The cluster's resources should all appear to be local to each cluster node. Boundaries among cluster nodes should be transparent to programs and users. All printers, storage, applications, and network interfaces should appear to be at the one node that the user or program currently occupies. This is sometimes called a single-system image. Transparency allows reorganizing data or resources by moving them from busy nodes to new nodes without disturbing users or applications.

Manageability: Ideally, each hardware and software module would be plug-n-play. Even so, the administrator must set policies stating who is allowed to do what, when periodic tasks should be performed, and what the system performance goals are. Short of this fully automated cluster mechanism, the system should automate much of the design, deployment, operations, diagnosis, tuning, and reorganization of the cluster. A large cluster should be as easy to manage as a single node.

Security: A cluster appears to be one computer. It has one administrator, one security policy, and one authenticator. A process authenticated by one node of the cluster is considered authenticated by all other nodes.

Performance: Communication among cluster nodes must be fast and efficient. It must be possible to read at disk or ATM speed (100 MB/s) from anywhere to anywhere. This requires good software and hardware IO architecture. The hardware and software cannot have any bottleneck or centralized resource. Any centralized resource, even one with 0.1% utilization per node, will be saturated at a thousand nodes. The simple test of performance is that a thousand-node system should run a thousand-times larger problem in the same time (scaleup) or run a fixed-size problem a thousand times faster (speedup).

Automatic Parallelism: Architects have been building parallel computers since the 1960s. Virtually all these systems have been useless because they were difficult to program. Users expect parallel execution to be automated. Print, mail, file, and application servers, TP monitors, relational databases, and other search engines have all been commercial successes by hiding concurrent execution from the application programmer. Clusters must provide tools that automate most parallel programming tasks.

How Does a Cluster Differ From a Distributed System?

It may seem that a cluster is just a kind of distributed computing system, like the World Wide Web or a network of Solaris systems or ... Certainly, a cluster is a simplified kind of distributed system. But the differences are substantial enough to make cluster algorithms significantly different.

Homogeneity: A cluster is a homogeneous system: it has one security po
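
The Modular Growth and Availability requirements stated earlier describe the same mechanism run in two directions: resources migrate to a node when it arrives and away from it when it fails. The toy model below is only a sketch of that idea under simplifying assumptions. The class `ToyCluster`, its methods, and the "balance by resource count" policy are hypothetical illustrations, not anything specified in this document or provided by NT.

```python
class ToyCluster:
    """Hypothetical model: every resource is owned and served by exactly one node."""

    def __init__(self) -> None:
        self.nodes: list[str] = []
        self.owner: dict[str, str] = {}  # resource name -> owning node

    def load(self, node: str) -> int:
        """Number of resources currently served by `node`."""
        return sum(1 for n in self.owner.values() if n == node)

    def _least_loaded(self) -> str:
        # Raises ValueError if the cluster has no nodes left.
        return min(self.nodes, key=self.load)

    def add_resource(self, resource: str) -> None:
        self.owner[resource] = self._least_loaded()

    def add_node(self, node: str) -> None:
        """Modular growth: the newcomer takes resources from the busiest nodes."""
        self.nodes.append(node)
        while True:
            busiest = max(self.nodes, key=self.load)
            if self.load(busiest) - self.load(node) <= 1:
                break  # balanced to within one resource
            resource = next(r for r, n in self.owner.items() if n == busiest)
            self.owner[resource] = node

    def fail_node(self, node: str) -> None:
        """Availability: resources of the failed node migrate to the survivors."""
        self.nodes.remove(node)
        for resource, n in list(self.owner.items()):
            if n == node:
                self.owner[resource] = self._least_loaded()


cluster = ToyCluster()
cluster.add_node("node1")
for i in range(6):
    cluster.add_resource(f"disk{i}")
cluster.add_node("node2")   # half of node1's disks migrate to node2
cluster.add_node("node3")   # each node now serves two disks
cluster.fail_node("node1")  # node1's disks are split between the survivors
for name in cluster.nodes:
    print(name, cluster.load(name))
```

A real cluster would also have to move the data or reattach the devices, and would weigh resources by demand instead of counting them; the sketch only shows that growth and failover can share one rebalancing rule.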

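
The Performance requirement stated earlier rests on simple arithmetic, which the following sketch spells out. It assumes that the demand each node places on a shared resource simply adds up, and it uses the usual ratio definitions of speedup and scaleup. The function names are illustrative and do not come from this document.

```python
def shared_resource_utilization(per_node_fraction: float, nodes: int) -> float:
    """Total utilization of one centralized resource (1.0 means saturated),
    assuming each node contributes the same additive load."""
    return per_node_fraction * nodes


def speedup(time_on_one_node: float, time_on_n_nodes: float) -> float:
    """Same problem, bigger system. Linear speedup on N nodes equals N."""
    return time_on_one_node / time_on_n_nodes


def scaleup(time_small_problem_one_node: float,
            time_big_problem_n_nodes: float) -> float:
    """Problem grows with the system. Linear scaleup equals 1.0."""
    return time_small_problem_one_node / time_big_problem_n_nodes


# A resource that each node keeps 0.1% busy is fully used at 1,000 nodes.
print(shared_resource_utilization(0.001, 1000))

# The "simple test": a 1,000-node system runs a fixed problem 1,000x faster ...
print(speedup(time_on_one_node=1000.0, time_on_n_nodes=1.0))

# ... or a 1,000x larger problem in the same elapsed time.
print(scaleup(time_small_problem_one_node=60.0, time_big_problem_n_nodes=60.0))
```

Under the additive assumption, the point of the requirement is that no per-node share of a central resource is small enough to ignore: whatever the fraction, some node count exhausts it.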