![數(shù)據(jù)結(jié)構(gòu)英文教學(xué)課件:chapter10 Hashing_第1頁](http://file4.renrendoc.com/view/ffa34273eed9fe414c0512f5012b0f20/ffa34273eed9fe414c0512f5012b0f201.gif)
![數(shù)據(jù)結(jié)構(gòu)英文教學(xué)課件:chapter10 Hashing_第2頁](http://file4.renrendoc.com/view/ffa34273eed9fe414c0512f5012b0f20/ffa34273eed9fe414c0512f5012b0f202.gif)
![數(shù)據(jù)結(jié)構(gòu)英文教學(xué)課件:chapter10 Hashing_第3頁](http://file4.renrendoc.com/view/ffa34273eed9fe414c0512f5012b0f20/ffa34273eed9fe414c0512f5012b0f203.gif)
![數(shù)據(jù)結(jié)構(gòu)英文教學(xué)課件:chapter10 Hashing_第4頁](http://file4.renrendoc.com/view/ffa34273eed9fe414c0512f5012b0f20/ffa34273eed9fe414c0512f5012b0f204.gif)
![數(shù)據(jù)結(jié)構(gòu)英文教學(xué)課件:chapter10 Hashing_第5頁](http://file4.renrendoc.com/view/ffa34273eed9fe414c0512f5012b0f20/ffa34273eed9fe414c0512f5012b0f205.gif)
版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請進(jìn)行舉報或認(rèn)領(lǐng)
文檔簡介
1、Chapter 10HashingADT of Dictionary const int DefaultSize = 26;templateclass Dictionarypublic: Dictionary ( int size = DefaultSize ); int IsIn ( Name name ); Attribute *Find ( Name name ); void Insert ( Name name, Attribute attr ); void Remove ( Name name ); Hash tableSupport the following operations
2、Find InsertDelete. (deletions may be unnecessary in some applications)Unlike binary search tree, AVL tree and B+-tree, the following functions cannot be done:Minimum and maximumSuccessor and predecessorReport data within a given rangeList out the data in orderUnrealistic solution Each position (slot
3、) corresponds to a key in the universe of keysTk corresponds to an element with key kIf the set contains no element with key k, then Tk=NULLUnrealistic solutioninsert, delete and find all take O(1) (worst-case) timeProblem:The scheme wastes too much space if the universe is too large compared with t
4、he actual number of elements to be stored. E.g. student IDs are 8-digit integers, so the universe size is 108, but we only have about 4000 students in Software College.HashingUsually, m m, where m is the size of the hash tableDesign a good hash functionthat is fast to compute and can minimize the nu
5、mber of collisionsDesign a method to resolve the collisions when they occurHash FunctionThe division method h(k) = k mod me.g. m=12, k=100, h(k)=4 Requires only a single division operation (quite fast)Certain values of m should be avoidede.g. if m=2p, then h(k) is just the p lowest-order bits of k;
6、the hash function does not depend on all the bitsSimilarly, if the keys are decimal numbers, should not set m to be a power of 10Hash FunctionIts a good practice to set the table size m to be a prime numberGood values for m: primes not too close to exact powers of 2e.g. the hash table is to hold 200
7、0 numbers, and we dont mind an average of 3 numbers being hashed to the same entrychoose m=701Hash Function.Can the keys be strings?Most hash functions assume that the keys are natural numbersif keys are not natural numbers, a way must be found to interpret them as natural numbersHash Function.Metho
8、d 1Add up the ASCII values of the characters in the stringProblems:Different permutations of the same set of characters would have the same hash valueIf the table size is large, the keys are not distribute well. e.g. Suppose m=10007 and all the keys are eight or fewer characters long. Since ASCII va
9、lue a reasonably equitable distributionProblemEnglish is not randomOnly 28 percent of the table can actually be hashed to (assuming a table size of 10,007)Method 3computesinvolves all characters in the key and be expected to distribute wellCollision Handling: (1) Separate ChainingInstead of a hash t
10、able, we use a table of linked listkeep a linked list of keys that hash to the same valueh(K) = K mod 10Separate ChainingTo insert a key KCompute h(K) to determine which list to traverse If Th(K) contains a null pointer, initiatize this entry to point to a linked list that contains K alone. If Th(K)
11、 is a non-empty list, we add K at the beginning of this list.To delete a key Kcompute h(K), then search for K within the list at Th(K). Delete K if it is found.Separate Chaining Assume that we will be storing n keys. Then we should make m the next larger prime number. If the hash function works well
12、, the number of keys in each linked list will be a small constant.Therefore, we expect that each search, insertion, and deletion can be done in constant time.Disadvantage: Memory allocation in linked list manipulation will slow down the program. Advantage: deletion is easy.Collision Handling:(2) Ope
13、n AddressingOpen addressing: relocate the key K to be inserted if it collides with an existing key. That is, we store K at an entry different from Th(K). Two issues arisewhat is the relocation scheme?how to search for K later?Three common methods for resolving a collision in open addressingLinear pr
14、obingQuadratic probingDouble hashingOpen AddressingTo insert a key K, compute h0(K). If Th0(K) is empty, insert it there. If collision occurs, probe alternative cell h1(K), h2(K), . until an empty cell is found.hi(K) = (hash(K) + f(i) mod m, with f(0) = 0f: collision resolution strategyLinear Probin
15、gf(i) =icells are probed sequentially (with wraparound) hi(K) = (hash(K) + i) mod mInsertion:Let K be the new key to be inserted. We compute hash(K)For i = 0 to m-1compute L = ( hash(K) + I ) mod mTL is empty, then we put K there and stop. If we cannot find an empty entry to put K, it means that the
16、 table is full and we should report an error. Linear Probinghi(K) = (hash(K) + i) mod mE.g, inserting keys 89, 18, 49, 58, 69 with hash(K)=K mod 10To insert 58, probe T8, T9, T0, T1To insert 69, probe T9, T0, T1, T2 Primary ClusteringWe call a block of contiguously occupied table entries a clusterOn
17、 the average, when we insert a new key K, we may hit the middle of a cluster. Therefore, the time to insert K would be proportional to half the size of a cluster. That is, the larger the cluster, the slower the performance. Primary ClusteringLinear probing has the following disadvantages:Once h(K) f
18、alls into a cluster, this cluster will definitely grow in size by one. Thus, this may worsen the performance of insertion in the future.If two cluster are only separated by one entry, then inserting one key into a cluster can merge the two clusters together. Thus, the cluster size can increase drast
19、ically by a single insertion. This means that the performance of insertion can deteriorate drastically after a single insertion.Large clusters are easy targets for collisions.Quadratic Probingf(i) = i2hi(K) = ( hash(K) + i2 ) mod mE.g., inserting keys 89, 18, 49, 58, 69 with hash(K) = K mod 10To ins
20、ert 58, probe T8, T9, T(8+4) mod 10To insert 69, probe T9, T(9+1) mod 10, T(9+4) mod 10Quadratic ProbingTwo keys with different home positions will have different probe sequencese.g. m=101, h(k1)=30, h(k2)=29probe sequence for k1: 30,30+1, 30+4, 30+9probe sequence for k2: 29, 29+1, 29+4, 29+9If the
21、table size is prime, then a new key can always be inserted if the table is at least half empty Quadratic ProbingSecondary clusteringKeys that hash to the same home position will probe the same alternative cellsTo avoid secondary clustering, the probe sequence need to be a function of the original ke
22、y value, not the home positionDouble HashingTo alleviate the problem of clustering, the sequence of probes for a key should be independent of its primary position = use two hash functions: hash() and hash2()f(i) = i + hash2(K)E.g. hash2(K) = R - (K mod R), with R is a prime smaller than mDouble Hash
23、inghi(K) = ( hash(K) + f(i) ) mod m; hash(K) = K mod mf(i) = i + hash2(K); hash2(K) = R - (K mod R),Example: m=10, R = 7 and insert keys 89, 18, 49, 58, 69To insert 49, hash2(49)=7, 2nd probe is T(9+7) mod 10To insert 58, hash2(58)=5, 2nd probe is T(8+5) mod 10To insert 69, hash2(69)=1, 2nd probe is
24、 T(9+1) mod 10Choice of hash2()Hash2() must never evaluate to zeroFor any key K, hash2(K) must be relatively prime to the table size m. Otherwise, we will only be able to examine a fraction of the table entries. E.g.,if hash(K) = 0 and hash2(K) = m/2, then we can only examine the entries T0, Tm/2, a
25、nd nothing else!Choice of hash2()One solution is to make m prime, and choose R to be a prime smaller than m, and set hash2(K) = R (K mod R)Quadratic probing, however, does not require the use of a second hash function likely to be simpler and faster in practice例: 給定關(guān)鍵字序列為 19,14,23,1,68,20,84,27,55,1
26、1,10,79 散列函數(shù):H(key)=key%13 散列表空間地址為012, 試用線性探查法建立散列表0123456789101112191423111682084848484272727272755111079存儲的是數(shù)據(jù)元素1 891011765312421014 27 55 68 84 19 20 10 23 11 Deletion in open addressingActual deletion cannot be performed in open addressing hash tablesotherwise this will isolate records further
27、down the probe sequenceSolution: Add an extra bit to each table entry, and mark a deleted slot by storing a special value DELETED (tombstone)class HashTable /用線性探查法處理溢出時散列表類的定義public: enum KindOfEntry Active, Empty, Deleted ; HashTable ( ) : TableSize ( DefaultSize ) AllocateHt ( ); CurrentSize = 0;
28、 HashTable ( ) delete ht; const HashTable & operator = ( const HashTable & ht2 ); int Find ( const Type & x ); int Insert ( const Type & x ); int Remove ( const Type & x ); int IsIn ( const Type & x ) return ( i = Find (x) ) = 0 ? 1 : 0; void MakeEmpty ( );private: struct HashEntry Type Element; Kin
29、dOfEntry info; int operator= ( HashEntry &, HashEntry & ); int operator!= ( HashEntry &, HashEntry & ); HashEntry ( ) : info (Empty ) HashEntry (const Type & E, KindOfEntry i = Empty ) : Element (E), info (i) ; enum DefaultSize = 11 HashEntry *ht; int CurrentSize, TableSize; void AllocateHt ( ) ht =
30、 new HashEntryTableSize ; int FindPos ( const Type & x ) const; template int HashTable:Find ( const Type & x ) /線性探查法的搜索算法,函數(shù)返回找到位置。/若返回負(fù)數(shù)可能是空位,若為-TableSize則失敗。 int i = FindPos ( x ), j = i; while ( != Empty & htj.Element != x ) j = ( j + 1 ) % TableSize; if ( j = i ) return -TableSize; if ( = Active ) return j; else -j;template void HashTabe : MakeEmpty ( ) /置表中所有表項為空 for ( int i = 0; i TableSize; i+) = Empty; CurrentSize = 0;template const HashTable & HashTable :operator = ( const HashTab
溫馨提示
- 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
- 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
- 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
- 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
- 5. 人人文庫網(wǎng)僅提供信息存儲空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負(fù)責(zé)。
- 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請與我們聯(lián)系,我們立即糾正。
- 7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時也不承擔(dān)用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。
最新文檔
- 2025年罩極型電動機(jī)項目可行性研究報告
- 成都四川省人民醫(yī)院蒲江醫(yī)院·蒲江縣人民醫(yī)院編外護(hù)理人員招聘3人筆試歷年參考題庫附帶答案詳解
- 2025年熱彎爐項目可行性研究報告
- 2025年槳葉-微粉兩級干燥系統(tǒng)項目可行性研究報告
- 2025年旋轉(zhuǎn)式膜電位器項目可行性研究報告
- 2025年差動軸項目可行性研究報告
- 2025年噴氣織機(jī)邊撐項目可行性研究報告
- 2025年利巴韋林滴眼液項目可行性研究報告
- 2025至2031年中國3-丙二醇行業(yè)投資前景及策略咨詢研究報告
- 2025至2030年高純錳礦項目投資價值分析報告
- 《認(rèn)識人民幣》完整版
- 工程施工風(fēng)險研判報告及安全風(fēng)險管控防范應(yīng)對措施
- 科普作家協(xié)會會員
- ptmeg生產(chǎn)工藝技術(shù)
- 高中英語定語從句之哪吒-Attributive Clause 課件
- 仁愛版八年級英語下冊全冊教案
- 醫(yī)療安全不良事件警示教育課件
- 《幼兒園健康》課件
- 醫(yī)保物價培訓(xùn)課件
- 2024年國新國際投資有限公司招聘筆試參考題庫含答案解析
- 心肌梗死心律失常的機(jī)制和處置
評論
0/150
提交評論