IMPROVING THE EFFICIENCY OF BITMAP INDEXES FOR DATA WAREHOUSES

Graduation thesis, Computer Science and Technology

Keywords: data warehouse; bitmap index; compression; data reorganization

Abstract: Progress in computer technology has enabled the development of new kinds of very large database applications such as data warehouses and scientific databases. Unlike traditional database systems, these data sets are characterized by high dimensionality and are mostly read-only and append-only.

A data warehouse is a large repository of information containing data collected from different data sources over a long period of time. The knowledge workers of an organization (executives, managers, analysts) use the warehouse data for analytical purposes, to make better and faster decisions. Query performance in such an environment is a critical challenge, because most of the associated queries are complex and ad hoc in nature and require huge volumes of data to be processed. A promising way to speed up queries in data warehouses is the bitmap index. Because bitwise logical operations are well supported by computer hardware, bitmaps can be combined easily and efficiently to answer queries. However, as the database grows in size, the bitmap index grows as well. It is therefore imperative to compress the index in order to improve query performance. One of the most notable bitmap index compression schemes is the word-aligned hybrid code (WAH).

This thesis presents an efficient bitmap indexing technique that improves the performance of WAH for high-cardinality attributes in data warehousing applications. WAH compression is based on run-length encoding, and its performance depends on the presence of long runs of identical bits. For large data sets with high-cardinality attributes, the bitmap index of the indexed attribute (column) is in most cases sparse, and compression is therefore ineffective. To make bitmap compression efficient, it is necessary to reorder the data warehouse tuples so that higher compression rates can be achieved. To reorder the tuples of the base data, our approach simply sorts on the indexed attribute before generating the bitmap index. Our strategy is thus a preprocessing step applied before compression, solely to improve performance, and it does not affect the algorithms used for compression and query processing. Sorting on the indexed attribute ensures long runs of ones and zeros, which are desirable for any compression scheme based on run-length encoding, such as WAH. To further improve performance, we propose to bin and cluster the indexed attribute.

Our experiments, conducted on five data sets with table cardinalities varying from 100,000 to 2,000,000 records, showed that our strategy significantly improves the performance of the word-aligned hybrid code. The sizes of the bitmap indexes decreased considerably for various attribute cardinalities. Moreover, the execution time measured for both equality and range queries improved substantially.
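The abstract's point that bitmaps can be combined with bitwise operations to answer queries can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`build_bitmap_index`, `equality_query`, `range_query`, `rows_of`) and the sample column are hypothetical, and Python integers stand in for uncompressed bitmaps.

```python
from collections import defaultdict


def build_bitmap_index(column):
    """Return {value: bitmap}; bit i of a bitmap is 1 when column[i] == value."""
    index = defaultdict(int)
    for row, value in enumerate(column):
        index[value] |= 1 << row
    return dict(index)


def equality_query(index, value):
    """Bitmap of rows whose attribute equals `value` (0 if the value is absent)."""
    return index.get(value, 0)


def range_query(index, low, high):
    """Bitmap of rows with low <= attribute <= high, built by OR-ing bitmaps."""
    result = 0
    for value, bitmap in index.items():
        if low <= value <= high:
            result |= bitmap
    return result


def rows_of(bitmap):
    """Decode a bitmap into the list of row numbers whose bit is set."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


# Hypothetical indexed column with three distinct values.
column = [3, 1, 2, 3, 1, 2, 3]
index = build_bitmap_index(column)

eq_rows = rows_of(equality_query(index, 3))     # rows where value == 3
range_rows = rows_of(range_query(index, 1, 2))  # rows where 1 <= value <= 2
```

An equality query reads a single bitmap, while a range query ORs one bitmap per value in the range. This is why the number and size of the bitmaps, and hence their compression, matter for high-cardinality attributes.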

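The effect of sorting on a run-length-based scheme such as WAH can be sketched as follows. This is a simplified word counter written for illustration, not the WAH algorithm as specified or the thesis's code. It assumes 32-bit words with 31 payload bits, zero-pads the last group, and ignores the capacity limit of a fill word's counter. The names `bitmaps_for`, `wah_word_count`, and `index_size_in_words` are hypothetical, and the data is synthetic.

```python
import random

GROUP_BITS = 31  # payload bits per 32-bit word (assumed word size)


def bitmaps_for(column):
    """One uncompressed bitmap (list of 0/1) per distinct value in the column."""
    return {v: [1 if x == v else 0 for x in column] for v in set(column)}


def wah_word_count(bits):
    """Count the words a simplified WAH-style encoding of `bits` would need."""
    words = 0
    previous_fill = None  # bit value of the fill word being extended, if any
    for start in range(0, len(bits), GROUP_BITS):
        group = bits[start:start + GROUP_BITS]
        group += [0] * (GROUP_BITS - len(group))  # zero-pad the tail (simplification)
        total = sum(group)
        if total == 0 or total == GROUP_BITS:
            fill_bit = 1 if total else 0
            if fill_bit != previous_fill:
                words += 1  # start a new fill word
                previous_fill = fill_bit
            # otherwise the same fill continues and only its counter grows
        else:
            words += 1  # a mixed group is stored as one literal word
            previous_fill = None
    return words


def index_size_in_words(column):
    """Total word count over all bitmaps of the indexed column."""
    return sum(wah_word_count(b) for b in bitmaps_for(column).values())


# Synthetic column: 10,000 rows, up to 100 distinct values.
random.seed(0)
column = [random.randrange(100) for _ in range(10_000)]

unsorted_words = index_size_in_words(column)
sorted_words = index_size_in_words(sorted(column))
```

In this sketch only the column itself is sorted. In a real table the whole tuples would be reordered by the indexed attribute before the index is built. After sorting, each value occupies one contiguous block of rows, so each bitmap collapses to a few fill words plus at most a couple of literal words at the block boundaries.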
