
OPERATING SYSTEMS
Chapter 18: Distributed Coordination
Jerry Breecher

Topics:
- Event Ordering
- Mutual Exclusion
- Atomicity
- Concurrency Control
- Deadlock Handling
- Election Algorithms
- Reaching Agreement

DEFINITIONS
Tightly coupled systems: same clock, usually shared memory; communication is via this shared memory. Multiprocessors.
Loosely coupled systems: different clocks; communication is over links. Distributed systems.

EVENT ORDERING
Happened-before vs. concurrent. Here A -> B means A occurred before B and thus could have caused B. Of the events shown on the next page, which are happened-before and which are concurrent?

Ordering is easy if the systems share a common clock (i.e., in a centralized system). With no common clock, each process keeps a logical clock. This logical clock can be simply a counter; it may have no relation to real time. The clock is adjusted when a message arrives carrying a timestamp at least as large as the local time: for a message sent at event A and received at event B, we require LC(A) < LC(B), so if the message arrives with LC(A) >= LC(B), we set LC(B) = LC(A) + 1.
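The logical-clock update rule above can be sketched in Java. This is a minimal sketch; the class and method names are illustrative, not from the slides, and receive() uses the common max-based formulation of the rule.

```java
// A minimal Lamport logical clock. The counter has no relation to real time.
class LamportClock {
    private long time = 0;

    // A local event: simply advance the counter.
    long tick() { return ++time; }

    // On receiving a message stamped senderTime, jump past the sender's clock:
    // LC(B) = max(LC(A), LC(B)) + 1, so a send always precedes its receive.
    long receive(long senderTime) {
        time = Math.max(time, senderTime) + 1;
        return time;
    }

    long current() { return time; }
}
```

With this rule, the timestamp of a message's send event is always strictly smaller than the timestamp of its receive event, which is exactly the happened-before requirement above.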

[Figure: three process timelines P, Q, and R, each with events 0 through 4, plotted against time.]

MUTUAL EXCLUSION / SYNCHRONIZATION
Here's the problem we have to solve. Suppose two processes each want to gain two locks. For a single-system approach, we learned that deadlock can be avoided by assuring that the locks are always acquired in the same order:

    P1:            P2:
    wait(Q);       wait(Q);
    wait(S);       wait(S);
    ...            ...
    signal(S);     signal(Q);
    signal(Q);     signal(S);

To make this work, we need to maintain strict happened-before relationships.

USING DISTRIBUTED SEMAPHORES
With only a single machine, a processor can provide mutual exclusion, but it's much harder to do in a distributed system. The network may not be fully connected, so communication may have to go through an intermediary machine. Concerns center around:
1. Efficiency/performance.
2. How to re-coordinate if something breaks.
Techniques we will discuss:
1. Centralized
2. Fully distributed
3. Distributed with tokens (with rings, and without rings)
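The ordered-acquisition discipline shown above can be sketched with two Java ReentrantLocks. The lock names Q and S follow the slide; the helper method is illustrative, not from the slides.

```java
import java.util.concurrent.locks.ReentrantLock;

// Every thread acquires Q before S, mirroring the wait(Q); wait(S) ordering
// above, so the circular wait needed for deadlock can never form.
class OrderedLocks {
    static final ReentrantLock Q = new ReentrantLock();
    static final ReentrantLock S = new ReentrantLock();

    // Both P1 and P2 would call this; the global order replaces any
    // per-process preference for acquisition order.
    static int criticalSection(int value) {
        Q.lock();              // wait(Q)
        try {
            S.lock();          // wait(S)
            try {
                return value + 1;   // the protected work
            } finally {
                S.unlock();    // signal(S)
            }
        } finally {
            Q.unlock();        // signal(Q)
        }
    }
}
```

Because no thread can hold S while waiting for Q, the hold-and-wait cycle between the two locks is impossible.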

CENTRALIZED APPROACH
Choose one processor as the coordinator, which handles all requests. A process that wants to enter its critical section sends a request message to the coordinator. On getting a request, the coordinator doesn't answer until the critical section is free (has been released by whoever was holding it). On getting a release, the coordinator answers the next outstanding request. If the coordinator dies, elect a new one, which recreates the request list by polling all systems to find out what resource each thinks it holds. Requires three messages per critical-section entry: request, reply, release. The method is free from starvation.

CENTRALIZED APPROACH - IMPLEMENTATION
So how would you do this? Let's sketch out a way of doing a centralized lock.

Here's one approach that works for threads only (code is Java):

    private void GetThreadLock() {
        try {
            sem.acquire();
        } catch (InterruptedException e) { }
    }   // End of GetThreadLock

    private void ReleaseThreadLock() {
        sem.release();
    }   // End of ReleaseThreadLock

Here's one approach that works for remote processes (code is Java). The client side contacts the coordinator:

    private void GetRemoteLock() {
        try {
            Socket sock = new Socket(ServerIP, ServerPort);   // create socket
            // ... (the request/reply exchange is elided on the slide)
        } catch (Throwable e) { }
    }   // End of GetRemoteLock

And on the coordinator's side (note that wait() and notifyAll() must be called on the monitor object, here whit):

    public void requestServer() {
        request = netIn.readLine();            // Get request from client
        synchronized (whit) {                  // Java has a monitor, which we now enter
            boolean lockAchieved = false;
            if (request.equals("LockRequest")) {
                while (!lockAchieved) {        // when waking from a wait(), we need to recheck
                    if (getLocker() == -1) {                 // Lock is not held
                        netOut.println("Granted");
                        setLocker(threadID);                 // Show who owns the lock
                        lockAchieved = true;
                    } else if (getLocker() == threadID) {    // lock already held by us = BAD
                        lockAchieved = true;
                    } else {                                 // lock held by someone else
                        try { whit.wait(); } catch (InterruptedException e) { }  // Suspend ourselves
                    }
                }   // End of while !lockAchieved
            }   // End of LockRequest
            if (request.equals("UnLockRequest")) {
                if (getLocker() != threadID) {   // lock not held by us = BAD
                    System.out.println("trying to unlock a lock held by someone else");
                } else {
                    netOut.println("Granted");
                    netOut.flush();
                    setLocker(-1);
                    whit.notifyAll();   // wake up anyone else needing the lock
                }
            }   // end of unlock request
        }   // end synchronized monitor
    }

FULLY DISTRIBUTED APPROACH
This approach is due to Lamport. The general mechanism: a process Pi sends a request (with its ID and a timestamp) to all other processes. When a process Pj receives such a request, it may reply immediately or it may defer sending a reply back. When responses have been received from all processes, Pi can enter its critical section. When Pi exits its critical section, it sends reply messages for all the requests it deferred.
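The requester-side bookkeeping just described can be sketched as follows: Pi may enter its critical section only once every other process has replied. The class and method names are an illustrative sketch, not from the slides.

```java
import java.util.HashSet;
import java.util.Set;

// Tracks which processes Pi is still waiting to hear from after
// broadcasting its timestamped request.
class RequestState {
    private final Set<Integer> awaitingReply = new HashSet<>();

    // On broadcast: Pi (process `self` of n processes) awaits all others.
    RequestState(int n, int self) {
        for (int p = 0; p < n; p++)
            if (p != self) awaitingReply.add(p);
    }

    // Record one reply; returns true once replies from all other processes
    // are in, i.e. Pi may now enter its critical section.
    boolean recordReply(int from) {
        awaitingReply.remove(from);
        return awaitingReply.isEmpty();
    }
}
```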

The general rules for a process Pj that receives a request from Pi:
- If Pj is in its critical section, defer (hold off) the response to Pi.
- If Pj is not in its critical section and doesn't want to get in, reply immediately to Pi.
- If Pj wants to enter its critical section but has not yet entered it, it compares its own request timestamp TSj with the timestamp TSi from Pi. If TSj > TSi, then Pi asked first, so Pj replies immediately; otherwise the reply is deferred until after Pj finishes its critical section.

The fully distributed approach assures:
- Mutual exclusion.
- Freedom from deadlock.
- Freedom from starvation, since entry to the critical section is scheduled according to the timestamp ordering, which serves processes in first-come, first-served order.
- 2 x (n - 1) messages are needed for each entry. This is the minimum number of required messages per critical-section entry when processes act independently and concurrently.

Problems with the method:
- Need to know the identity of everyone in the system.
- Fails if anyone dies, so the state of all processes must be continually monitored.
- Processes are always coming and going, so it's hard to keep this membership data current.

TOKEN PASSING APPROACH
Tokens with rings. Whoever holds the token can use the critical section; when done, pass the token on. Processes must be logically connected in a ring; it need not be a physical ring. Properties:
- No starvation if the ring is unidirectional.
- Many messages are passed per section entered if few users want to get in; only one message per entry if everyone wants in.
- Works if you can detect loss of the token and regenerate it via an election or other means.
- If a process is lost, a new logical ring must be generated.

Tokens without rings (Chandy). A process can send the token to any other process. Each process maintains an ordered list of requests for the critical section. A process requiring entry broadcasts a message with its ID and a new count (the current logical time). When done using the token, the holder stores into it the time-of-request for the request just finished. If a process is holding the token and is not in its critical section, it sends the token to the sender of the first message received (if the time maintained in the token is later than that of a request in the list, the request is old and can be discarded). If there is no request, it hangs on to the token.
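The ring variant above can be illustrated with a small deterministic simulation: the token visits sites in ring order, and a site enters its critical section only while holding the token. The simulation API is an illustrative sketch, not from the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// One full circulation of the token around a unidirectional logical ring.
class TokenRing {
    // Starting at site `start`, the token moves one hop per step; a site in
    // `wantCs` uses its critical section when the token arrives, then passes
    // the token on. Returns the sites served, in the order served.
    static List<Integer> circulate(int ringSize, int start, Set<Integer> wantCs) {
        List<Integer> entered = new ArrayList<>();
        for (int i = 0; i < ringSize; i++) {
            int holder = (start + i) % ringSize;
            if (wantCs.contains(holder)) entered.add(holder);
        }
        return entered;
    }
}
```

Because the token travels in one direction only, every waiting site is reached within one circulation, which is why the ring variant is starvation-free.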

ATOMICITY
Atomicity means that either ALL the operations associated with a program unit are executed to completion, or none are performed. Ensuring atomicity in a distributed system requires a transaction coordinator, which is responsible for:
- Starting the execution of the transaction.
- Breaking the transaction into a number of subtransactions, and distributing these subtransactions to the appropriate sites for execution.
- Coordinating the termination of the transaction, which may result in the transaction being committed at all sites or aborted at all sites.

TWO-PHASE COMMIT PROTOCOL (2PC)
For atomicity to be ensured, all the sites at which a transaction T executes must agree on the final outcome of the execution; 2PC is one way of doing this. Execution of the protocol is initiated by the coordinator after the last step of the transaction has been reached. When the protocol is initiated, the transaction may still be executing at some of the local sites. The protocol involves all the local sites at which the transaction executed. Let T be a transaction initiated at site Si, and let the transaction coordinator at Si be Ci.

Phase 1: Obtaining a decision.
- Ci adds a <prepare T> record to the log and sends a prepare T message to all sites.
- When a site receives a prepare T message, its transaction manager determines whether it can commit the transaction. If no: add a <no T> record to the log and respond to Ci with abort T. If yes: add a <ready T> record to the log, force all log records for T onto stable storage, and have the transaction manager send a ready T message to Ci.
- The coordinator collects the responses. If all respond ready, the decision is commit. If at least one response is abort, the decision is abort. If at least one participant fails to respond within the timeout period, the decision is abort.

Phase 2: Recording the decision in the database.
- The coordinator adds a decision record (<commit T> or <abort T>) to its log and forces the record onto stable storage. Once that record reaches stable storage, the decision is irrevocable, even if failures occur.
- The coordinator sends a message to each participant informing it of the decision (commit or abort). Participants take the appropriate action locally.

FAILURE HANDLING IN TWO-PHASE COMMIT
Failure of a participating site:
- The log contains a <commit T> record: the site executes redo(T).
- The log contains an <abort T> record: the site executes undo(T).
- The log contains a <ready T> record: consult Ci. If Ci is down, the site sends a query-status(T) message to the other sites.
- The log contains no control records concerning T: the site executes undo(T).
Failure of the coordinator Ci:
- If an active site contains a <commit T> record in its log, then T must be committed.
- If an active site contains an <abort T> record in its log, then T must be aborted.
- If some active site does not contain a <ready T> record in its log, then the failed coordinator Ci cannot have decided to commit T. Rather than wait for Ci to recover, it is preferable to abort T.
- If all active sites have a <ready T> record in their logs but no additional control records, we must wait for the coordinator to recover. This is the blocking problem: T is blocked pending the recovery of site Si.

CONCURRENCY CONTROL
We need to modify the centralized concurrency schemes to accommodate the distribution of transactions. A transaction manager coordinates the execution of transactions (or subtransactions) that access data at local sites. A local transaction executes only at its own site; a global transaction executes at several sites.
Locking protocols: we can use the two-phase locking protocol in a distributed environment by changing how the lock manager is implemented.
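The coordinator's phase-1 decision rule above (commit only on unanimous ready votes) can be sketched as a small function. The Vote enum is illustrative, with NO_RESPONSE standing in for a participant that timed out.

```java
import java.util.List;

// The 2PC phase-1 tally: any abort vote, or any missing vote, forces abort.
class TwoPhaseCommit {
    enum Vote { READY, ABORT, NO_RESPONSE }

    // Returns true (commit) only if every participant voted READY.
    static boolean decideCommit(List<Vote> votes) {
        for (Vote v : votes)
            if (v != Vote.READY) return false;  // one dissenter aborts everyone
        return true;
    }
}
```

Treating a timeout exactly like an abort vote is what keeps the decision safe: the coordinator never commits while any participant's state is unknown.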

Nonreplicated scheme: each site maintains a local lock manager that administers lock and unlock requests for the data items stored at that site. A simple implementation involves two message transfers for handling lock requests, and one message transfer for handling unlock requests. Deadlock handling is more complex.

Single-coordinator approach: a single lock manager resides at a single chosen site; all lock and unlock requests are made at that site.
- Simple implementation.
- Simple deadlock handling.
- Possibility of a bottleneck.
- Vulnerable to loss of the concurrency controller if the single site fails.

Multiple-coordinator approach: distributes the lock-manager function over several sites.
- Majority protocol: avoids the drawbacks of central control by replicating data in a decentralized manner. More complicated to implement; deadlock-handling algorithms must be modified, and it is possible for deadlock to occur while locking only one data item.
- Biased protocol: like the majority protocol, but requests for shared locks are prioritized over requests for exclusive locks. Less overhead on reads than in the majority protocol, but more overhead on writes; like the majority protocol, deadlock handling is complex.
- Primary copy: one of the sites at which a replica resides is designated the primary site, and a request to lock a data item is made at the primary site of that data item. Concurrency control for replicated data is then handled much as for unreplicated data. Simple implementation, but if the primary site fails, the data item is unavailable even though other sites may have a replica.

Timestamping. To generate unique timestamps in a distributed scheme:
A) Each site generates a unique local timestamp.
B) The globally unique timestamp is obtained by concatenating the unique local timestamp with the unique site identifier.
C) A logical clock defined within each site ensures the fair generation of timestamps.
Timestamp-ordering scheme: combine the centralized timestamp-based concurrency control scheme with the 2PC protocol to obtain a protocol that ensures serializability with no cascading rollbacks.
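The concatenation in step B can be sketched with bit fields: the local logical time goes in the high-order bits and the site identifier in the low-order bits, so timestamps are unique across sites and ordered primarily by local time. The 16-bit site field is an assumption made for illustration.

```java
// Globally unique timestamps built as (localTime, siteId) concatenation.
class GlobalTimestamp {
    static final int SITE_BITS = 16;   // assumes at most 2^16 sites

    static long make(long localTime, int siteId) {
        return (localTime << SITE_BITS) | siteId;
    }

    // Recover the components (useful for debugging and fairness checks).
    static long localTime(long global) { return global >>> SITE_BITS; }
    static int siteId(long global)     { return (int) (global & 0xFFFF); }
}
```

Putting local time in the high bits means comparisons order stamps by logical time first; the site id in the low bits only breaks ties, which is why step C's fair logical clocks matter.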

DEADLOCK HANDLING

DEADLOCK PREVENTION
To prevent deadlocks, we must defeat one of the four conditions (these should sound familiar!): mutual exclusion, hold and wait, no preemption, circular wait. Possible solutions include:
- Global resource ordering: all resources are given unique numbers, and a process can acquire them only in ascending order. Simple to implement and low cost, but requires knowing all the resources. Prevents a circular wait.
- Banker's algorithm with one process acting as the banker (which can become a bottleneck). A large number of messages is required, so the method is not very practical.
- Priorities based on a unique number for each process: has a problem with starvation.
- Priorities based on timestamps can be used to prevent circular waits. Each process is assigned a timestamp at its creation. Several variations are possible:
  * Non-preemptive: the requester waits for the resource if it is older than the current holder; otherwise it is rolled back, losing all its resources. The older a process gets, the longer it waits.
  * Preemptive: if the requester is older than the holder, the holder is preempted (rolled back); if the requester is younger, it waits. There are fewer rollbacks here. When P(i) is preempted by P(j), it restarts and, being younger, ends up waiting for P(j).
Processes keep their timestamps when rolled back (they are not reassigned); this prevents starvation, since a preempted process will soon be the oldest. The preemptive method has fewer rollbacks because, in the non-preemptive method, a young process can be rolled back a number of times before it finally gets the resource.

DEADLOCK DETECTION
The prevention techniques above can preempt a resource unnecessarily. Can we do rollback only when a deadlock is actually detected? Use wait-for graphs: recall that, with a single resource of each type, a cycle is a deadlock. Each site maintains a local wait-for graph whose nodes are the local or remote processes requesting LOCAL resources.
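The two timestamp-based prevention variants above (commonly called wait-die for the non-preemptive scheme and wound-wait for the preemptive one) can be sketched as decision functions. A smaller timestamp means an older process; the names here are illustrative.

```java
// Timestamp-based deadlock prevention: what happens when a requester asks
// for a resource currently held by another process.
class TimestampPrevention {
    enum Action { WAIT, ROLLBACK_REQUESTER, PREEMPT_HOLDER }

    // Non-preemptive (wait-die): an older requester waits; a younger one
    // is rolled back (keeping its timestamp, so it eventually grows oldest).
    static Action waitDie(long requesterTs, long holderTs) {
        return requesterTs < holderTs ? Action.WAIT : Action.ROLLBACK_REQUESTER;
    }

    // Preemptive (wound-wait): an older requester preempts the holder;
    // a younger requester waits.
    static Action woundWait(long requesterTs, long holderTs) {
        return requesterTs < holderTs ? Action.PREEMPT_HOLDER : Action.WAIT;
    }
}
```

In both schemes only the younger process of the pair ever loses, and since a rolled-back process keeps its timestamp, it cannot lose forever; that is the starvation argument from the slides.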

To show that no deadlock has occurred, show that the union of the local graphs has no cycle; if a cycle has formed in the union, there is a deadlock. SEE THE FIGURES ON THE NEXT PAGE.

[Figure 18.3: two local wait-for graphs. Figure 18.4: the global wait-for graph over processes P1-P5. Compare Figure 7.7, a resource-allocation graph and its wait-for graph.]

DEADLOCK DETECTION - CENTRALIZED
In this method, the union is maintained in one process. If a global (c
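Cycle detection on the union of the local wait-for graphs, as described above, can be sketched with a depth-first search. The adjacency-map representation is an illustrative choice, not from the slides.

```java
import java.util.*;

// Deadlock detection on a wait-for graph: an edge p -> q means process p
// is waiting for a resource held by process q; any cycle is a deadlock.
class WaitForGraph {
    static boolean hasCycle(Map<Integer, List<Integer>> graph) {
        Set<Integer> finished = new HashSet<>();
        Set<Integer> onStack = new HashSet<>();
        for (Integer p : graph.keySet())
            if (dfs(graph, p, finished, onStack)) return true;
        return false;
    }

    private static boolean dfs(Map<Integer, List<Integer>> graph, int p,
                               Set<Integer> finished, Set<Integer> onStack) {
        if (onStack.contains(p)) return true;    // back edge: a cycle
        if (finished.contains(p)) return false;  // already fully explored
        onStack.add(p);
        for (int q : graph.getOrDefault(p, List.of()))
            if (dfs(graph, q, finished, onStack)) return true;
        onStack.remove(p);
        finished.add(p);
        return false;
    }
}
```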
