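I. Translation of the foreign-language material

Java development 2.0: Sharding with Hibernate Shards
Horizontal scalability for relational databases

Andrew Glover, Author and developer, Beacon50

Summary: Sharding isn't for everyone, but it's one way that relational systems can meet the demands of big data. For some shops, sharding means being able to keep a trusted RDBMS in place without sacrificing data scalability or system performance. In this installment of the Java development 2.0 series, find out when sharding works, and when it doesn't, and then get your hands busy sharding a simple application capable of handling terabytes of data.

Date: 31 Aug 2010
Level: Intermediate
PDF: A4 and Letter (64KB | 15 pages)

When relational databases attempt to store terabytes of data in single tables, overall performance typically degrades. Indexing all that data is obviously expensive for reads, but also for writes. While NoSQL datastores are particularly suited to storing big data, NoSQL is a patently non-relational approach. For the developer who prefers the ACID-ity and solid structure of a relational database, or the project that requires it, sharding could be an exciting alternative.

Sharding, an offshoot of database partitioning, isn't a native database technique: it happens at the level of the application. Among various sharding implementations, Hibernate Shards is possibly the most popular in the world of Java technology. This nifty project lets you work more or less seamlessly with sharded datasets using POJOs that are mapped to a logical database. When you use Hibernate Shards, you don't have to specifically map your POJOs to shards; you map them as you would for any normal relational database in the Hibernate way. Hibernate Shards manages the low-level sharding work for you.

So far in this series, I've used a simple domain based on the analogy of races and runners to demonstrate various data storage technologies. This month, I'll use this familiar example to introduce a practical sharding strategy, then implement it in Hibernate Shards. Note that the brunt of the work related to sharding isn't necessarily related to Hibernate; in fact, coding for Hibernate Shards is the easy part. The real work is figuring out how and what you'll shard.

About this series

The Java development landscape has changed radically since Java technology first emerged. Thanks to mature open source frameworks and reliable for-rent deployment infrastructures, it's now possible to assemble, test, run, and maintain Java applications quickly and inexpensively. In this series, Andrew Glover explores the spectrum of technologies and tools that make this new Java development paradigm possible.

Sharding at a glance

Database partitioning is an inherently relational process of dividing a table's rows by some logical piece of data into smaller groups. If you were partitioning a gigantic table named foo based on timestamps, for instance, all the data before August 2010 would go in Partition A, while anything after that would be in Partition B. Partitioning has the effect of making reads and writes faster because they target smaller datasets in individual partitions.

Partitioning isn't always available (MySQL didn't support it until version 5.1), and the cost of doing it with a commercial system can be prohibitive. What's more, most partitioning implementations store data on the same physical machine, so you're still bound to the limits of your hardware. Partitioning also doesn't resolve the reliability, or lack thereof, of your hardware. Thus, various smart people started looking for new ways to scale.

Sharding is essentially partitioning at the database level: rather than dividing a table's rows by pieces of data, the database itself is split up (usually across different machines) by some logical data element. That is, rather than splitting up a table into smaller chunks, sharding splits up an entire database into smaller chunks.

The canonical example for sharding is based on dividing a large database storing worldwide customer data by region: Shard A for customers in the United States, Shard B for Asia, Shard C for Europe, and so on. The shards themselves would live on different machines, and each shard would hold all related data, such as customer preferences or order history.

The benefit of sharding (like partitioning) is that it compacts big data: individual tables are smaller in each shard, which allows for faster reads and writes, which increases performance. Sharding also conceivably improves reliability, because even if one shard unexpectedly fails, the others are still able to serve data. And because sharding is done at the application layer, you can do it for databases that don't support regular partitioning. The monetary cost is also potentially lower.

Primary keys

Sharding leverages multiple databases, all of which function autonomously, without awareness of their peers. As a result, if you rely on database sequences (such as for automatic primary key generation), it's likely that an identical primary key will show up across a set of databases. It's possible to coordinate sequences across a distributed database, but doing so increases system complexity. The safest way to prohibit duplicate primary keys is to have your application (which will be managing a sharded system anyway) generate keys.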
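As a rough illustration of the primary-key point above, the sketch below contrasts two independent counters (standing in for per-database sequences) with keys created by the application through java.util.UUID. The class and method names (AppGeneratedKeys, LocalSequence, newPrimaryKey) are hypothetical and are not part of Hibernate Shards.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

public class AppGeneratedKeys {

    /** Stand-in for a database sequence that knows nothing about its peers. */
    static final class LocalSequence {
        private long next = 1;

        long nextValue() {
            return next++;
        }
    }

    /** Hypothetical helper: the application, not the database, creates the key. */
    public static String newPrimaryKey() {
        return UUID.randomUUID().toString();
    }

    /** Counts keys from the second shard's sequence that already exist in the first shard. */
    public static int sequenceCollisions(int rowsPerShard) {
        LocalSequence shard0 = new LocalSequence();
        LocalSequence shard1 = new LocalSequence();
        Set<Long> seen = new HashSet<>();
        for (int i = 0; i < rowsPerShard; i++) {
            seen.add(shard0.nextValue());
        }
        int collisions = 0;
        for (int i = 0; i < rowsPerShard; i++) {
            if (!seen.add(shard1.nextValue())) {
                collisions++;
            }
        }
        return collisions;
    }

    /** Counts duplicate keys when every row gets an application-generated UUID. */
    public static int uuidCollisions(int totalRows) {
        Set<String> seen = new HashSet<>();
        int collisions = 0;
        for (int i = 0; i < totalRows; i++) {
            if (!seen.add(newPrimaryKey())) {
                collisions++;
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        System.out.println("sequence collisions: " + sequenceCollisions(100));
        System.out.println("UUID collisions: " + uuidCollisions(200));
    }
}
```

Every value produced by the second counter repeats one from the first, which is the duplicate-key situation the text warns about; random UUIDs avoid it without any coordination between databases.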
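Cross-shard queries

Most sharding implementations (including Hibernate Shards) don't permit cross-shard querying, which means you have to go to extra lengths if you want to leverage two sets of data from different shards. (Interestingly, Amazon's SimpleDB also prohibits cross-domain queries.) For instance, if you're storing United States customers in Shard 1, you also need to store all of their related data there. If you try to store that data in Shard 2, things will get complicated, and system performance will probably suffer. This situation is also related to the point made earlier: if you somehow end up needing to do cross-shard joins, you had better be managing keys in a way that eliminates the possibility of duplicates!

Clearly, you'll need to fully consider a sharding strategy before you set up your database. And once you've chosen a particular direction, you're more or less tied to it: it's hard to move data around after it's been sharded.

Avoid premature sharding

Sharding is best employed late in the game. Like premature optimization, sharding based on expected data growth could be a recipe for disaster. Successful sharding implementations are based on measurably understanding an application's data growth over time, and then extrapolating to the future. Once you've sharded your data, it can be extraordinarily hard to move around.

A strategy example

Because sharding binds you to a linear data model (that is, you can't easily join data in different shards), you should start with a clear picture of how your data will be logically organized per shard. This is usually easiest by focusing on the primary node of a domain. In the case of an e-commerce system, the primary node could be either an order or a customer. Thus, if you choose "customer" as the basis for your sharding strategy, then all data related to customers will be moved into the respective shards, though you'll still have to choose which shard to move that data to.

For customers, you could shard based on location (Europe, Asia, Africa, and so on), or you could shard based on something else. It's up to you. Your shard strategy should, however, incorporate some means of distributing data evenly among all of your shards. The whole idea of sharding is to break up big data sets into smaller ones; thus, if a particular e-commerce domain had a large set of European customers and relatively few in the United States, it probably wouldn't make sense to shard based on customer location.

Off to the races with sharding!

Getting back to the familiar example of my racing application, I can shard by race or by runner. In this case, I'm going to shard by race, because I see the domain being organized by runners who belong to races. So the race is the root of my domain. I'm also going to shard based on race distance, because my racing application holds myriad races of different lengths, along with myriad runners.

Note that in making these decisions, I have already accepted a trade-off: what if a runner participates in more than one race, each of them living in a different shard? Hibernate Shards (like most sharding implementations) doesn't support cross-shard joins. I'm going to have to live with this slight inconvenience and allow runners to live in multiple shards; that is, I will recreate each runner in the shards where his or her various races live.

To keep things simple, I'm going to create two shards: one for races less than 10 miles and another for anything greater than 10 miles.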
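Since the strategy discussion asks for data to be spread evenly, it may be worth counting how existing records would fall across the two shards before committing to the split. The following is a minimal sketch, assuming a hypothetical list of race distances in miles and assuming that anything greater than 10 miles goes to shard 1 and everything else to shard 0; ShardBalanceCheck and its methods are made-up names.

```java
import java.util.Arrays;
import java.util.List;

public class ShardBalanceCheck {

    /** Assumed rule: distances greater than 10 miles map to shard 1, all others to shard 0. */
    public static int shardFor(double distanceMiles) {
        return distanceMiles > 10.0 ? 1 : 0;
    }

    /** Returns how many of the given distances would land in each of the two shards. */
    public static int[] countPerShard(List<Double> distancesMiles) {
        int[] counts = new int[2];
        for (double distance : distancesMiles) {
            counts[shardFor(distance)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample of race distances, in miles.
        List<Double> sample = Arrays.asList(3.1, 6.2, 13.1, 26.2, 5.0, 10.0);
        int[] counts = countPerShard(sample);
        System.out.println("shard 0: " + counts[0] + ", shard 1: " + counts[1]);
    }
}
```

If one count dwarfs the other on real data, the chosen boundary is probably not a good basis for sharding, in the same way that customer location would be a poor choice for a mostly European customer base.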
Implementing Hibernate Shards

Hibernate Shards is made to work almost seamlessly with existing Hibernate projects. The only catch is that Hibernate Shards needs some specific information and behavior from you. Namely, it needs a shard-access strategy, a shard-selection strategy, and a shard-resolution strategy. These are interfaces you must implement, though in some cases you can use default ones. We'll look at each interface separately in the following sections.

ShardAccessStrategy

When a query is executed, Hibernate Shards needs a mechanism for determining which shard to hit first, second, and so on. Hibernate Shards doesn't necessarily figure out what a query is looking for (that's for the Hibernate Core and underlying database to do), but it does recognize that a query might need to execute against multiple shards before an answer is obtained. So, Hibernate Shards provides two logical implementations out of the box: one executes a query sequentially (one shard at a time) until an answer is returned; the other is a parallel-access strategy, which uses a threading model to hit all of the shards at once.

I'm going to keep things simple and use the sequential strategy, aptly named SequentialShardAccessStrategy. We'll configure it shortly.
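The difference between the two access styles can be sketched without Hibernate at all. In the example below each shard is modeled as an in-memory map, which is purely an assumption for illustration; ShardAccessSketch, findSequential and findParallel are hypothetical names and do not mirror the library's own classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ShardAccessSketch {

    /** Sequential access: ask one shard at a time and stop at the first answer. */
    public static Optional<String> findSequential(List<Map<String, String>> shards, String key) {
        for (Map<String, String> shard : shards) {
            String value = shard.get(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    /** Parallel access: submit the lookup to every shard at once, then collect the results. */
    public static Optional<String> findParallel(List<Map<String, String>> shards, String key) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, shards.size()));
        try {
            List<Callable<String>> lookups = new ArrayList<>();
            for (Map<String, String> shard : shards) {
                lookups.add(() -> shard.get(key));
            }
            for (Future<String> result : pool.invokeAll(lookups)) {
                String value = result.get();
                if (value != null) {
                    return Optional.of(value);
                }
            }
            return Optional.empty();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("shard lookup failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, String> shard0 = new HashMap<>();
        shard0.put("race-1", "5K fun run");
        Map<String, String> shard1 = new HashMap<>();
        shard1.put("race-2", "Marathon");
        List<Map<String, String>> shards = new ArrayList<>();
        shards.add(shard0);
        shards.add(shard1);

        System.out.println(findSequential(shards, "race-2").orElse("not found"));
        System.out.println(findParallel(shards, "race-2").orElse("not found"));
    }
}
```

The sequential version touches as few shards as it can but pays for each miss in turn, while the parallel version trades extra threads and connections for lower latency.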
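ShardSelectionStrategy

When a new object is created (that is, when a new Race or Runner is created via Hibernate), Hibernate Shards needs to know what shard the corresponding data should be written to. Accordingly, you must implement this interface and code the sharding logic. If you want a default implementation, there's one dubbed RoundRobinShardSelectionStrategy, which uses a round-robin strategy for putting data into shards.

For the racing application, I need to provide behavior that shards by race distance. Accordingly, I'll need to implement the ShardSelectionStrategy interface and provide some simple logic that shards based on a Race object's distance in the selectShardIdForNewObject method. (I'll show the Race object shortly.)

At run time, when a call is made to some save-like method on my domain objects, this interface's behavior is leveraged deep down in Hibernate's core.

Listing 1. A simple shard-selection strategy

    import org.hibernate.shards.ShardId;
    import org.hibernate.shards.strategy.selection.ShardSelectionStrategy;

    public class RacerShardSelectionStrategy implements ShardSelectionStrategy {

        public ShardId selectShardIdForNewObject(Object obj) {
            if (obj instanceof Race) {
                Race rce = (Race) obj;
                return this.determineShardId(rce.getDistance());
            } else if (obj instanceof Runner) {
                Runner runnr = (Runner) obj;
                if (runnr.getRaces().isEmpty()) {
                    throw new IllegalArgumentException("runners must have at least one race");
                } else {
                    double dist = 0.0;
                    // Only the first race found decides the runner's shard.
                    for (Race rce : runnr.getRaces()) {
                        dist = rce.getDistance();
                        break;
                    }
                    return this.determineShardId(dist);
                }
            } else {
                throw new IllegalArgumentException("a non-shardable object is being created");
            }
        }

        private ShardId determineShardId(double distance) {
            // Races longer than 10 miles go to shard 1; all others go to shard 0.
            if (distance > 10.0) {
                return new ShardId(1);
            } else {
                return new ShardId(0);
            }
        }
    }

As you can see in Listing 1, if the object being persisted is a Race, then its distance is determined and a shard is chosen accordingly. In this case, there are two shards, 0 and 1, where Shard 1 holds races longer than 10 miles and Shard 0 holds all others.

If a Runner or some other object is being persisted, things get a bit more involved. I've coded a logical rule with three stipulations:

1. A Runner can't exist without a corresponding Race.
2. If a Runner has been created with multiple Races, the Runner will be persisted in the shard for the first Race found. (This rule has negative implications for the future, by the way.)
3. If some other domain object is being saved, for now, an exception will be thrown.

With that, you can wipe the sweat from your brow, because most of the hard work is done. The logic I've captured might not be flexible enough as the racing application grows, but it will work for the purposes of this demonstration!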
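For contrast with the distance rule, the round-robin idea behind the default selection strategy mentioned earlier can be sketched in a few lines. This is not the library's RoundRobinShardSelectionStrategy; RoundRobinSketch is a hypothetical stand-alone class that only shows the rotation, assuming shards are numbered from 0.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinSketch {

    private final int shardCount;
    private final AtomicInteger counter = new AtomicInteger();

    public RoundRobinSketch(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("at least one shard is required");
        }
        this.shardCount = shardCount;
    }

    /** Returns 0, 1, ..., shardCount - 1 and then wraps around to 0 again. */
    public int nextShardId() {
        // floorMod keeps the result non-negative even if the counter overflows.
        return Math.floorMod(counter.getAndIncrement(), shardCount);
    }

    public static void main(String[] args) {
        RoundRobinSketch selector = new RoundRobinSketch(2);
        for (int i = 0; i < 4; i++) {
            System.out.println("new object " + i + " -> shard " + selector.nextShardId());
        }
    }
}
```

Rotation of this kind spreads new objects evenly but takes no account of race distance, which is why the racing application supplies its own rule.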
ShardResolutionStrategy

When searching for an object by its key, Hibernate Shards needs a way of determining which shard to hit first. You guide it using the ShardResolutionStrategy interface.

As I mentioned earlier, sharding forces you to be keenly aware of primary keys, as you will be managing them yourself. Luckily, Hibernate is already good at key or UUID generation. Consequently, Hibernate Shards provides an ID generator out of the box, dubbed ShardedUUIDGenerator, which has the smarts to embed shard ID information in the UUID itself.

If you end up using ShardedUUIDGenerator for key generation (as I will for this article), then you can also use the ShardResolutionStrategy implementation that Hibernate Shards provides out of the box, dubbed AllShardsShardResolutionStrategy, which can decide what shards to search based on a particular object's ID.

Having configured the three interfaces required for Hibernate Shards to work properly, we're ready for the second step in implementing the sharding example application. It's time to fire up Hibernate's SessionFactory.
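The value of a key that carries its own shard ID, as described for ShardedUUIDGenerator, is that a lookup by ID can be routed without searching. The sketch below illustrates only that idea using a made-up string format (the shard ID, a dash, then a random UUID). The format and the names ShardAwareKeySketch, newKey and shardIdOf are assumptions for illustration, not the encoding Hibernate Shards uses.

```java
import java.util.UUID;

public class ShardAwareKeySketch {

    /** Builds a key in the assumed format "<shardId>-<random UUID>". */
    public static String newKey(int shardId) {
        if (shardId < 0) {
            throw new IllegalArgumentException("shard ID must not be negative");
        }
        return shardId + "-" + UUID.randomUUID();
    }

    /** Reads the shard ID back out of a key created by newKey. */
    public static int shardIdOf(String key) {
        int dash = key.indexOf('-');
        if (dash <= 0) {
            throw new IllegalArgumentException("not a shard-aware key: " + key);
        }
        return Integer.parseInt(key.substring(0, dash));
    }

    public static void main(String[] args) {
        String key = newKey(1);
        System.out.println(key + " lives in shard " + shardIdOf(key));
    }
}
```

With keys like this, a resolution step only has to parse the ID to know which shard to ask, and the random part keeps keys unique across shards.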
Source information for the original text
1. Author of the original:
2. Title of the book or paper containing the original:
3. Source of the original (publisher or journal, publication date or issue number, pages covered by the translation):
Web address:

II. Original text

Java development 2.0: Sharding with Hibernate Shards
Horizontal scalability for relational databases

Andrew Glover, Author and developer, Beacon50

Summary: Sharding isn't for everyone, but it's one way that relational systems can meet the demands of big data. For some shops, sharding means being able to keep a trusted RDBMS in place without sacrificing data scalability or system performance. In this installment of the Java development 2.0 series, find out when sharding works, and when it doesn't, and then get your hands busy sharding a simple application capable of handling terabytes of data.

Date: 31 Aug 2010
Level: Intermediate
PDF: A4 and Letter (64KB | 15 pages)

When relational databases attempt to store terabytes of data in single tables, overall performance typically degrades. Indexing all that data is obviously expensive for reads, but also for writes. While NoSQL datastores are particularly suited to storing big data (think Google's Bigtable), NoSQL is a patently non-relational approach. For the developer who prefers the ACID-ity and solid structure of a relational database, or the project that requires it, sharding could be an exciting alternative.

Sharding, an offshoot of database partitioning, isn't a native database technique: it happens at the level of the application. Among various sharding implementations, Hibernate Shards is possibly the most popular in the world of Java technology. This nifty project lets you work more or less seamlessly with sharded datasets (I will explain the "more or less" part shortly) using POJOs that are mapped to a logical database. When you use Hibernate Shards, you don't have to specifically map your POJOs to shards; you map them as you would any normal relational database in the Hibernate way. Hibernate Shards manages the low-level sharding stuff for you.

So far in this series I've used a simple domain based on the analogy of races and runners to demonstrate various data storage technologies. This month, I'll use this familiar example to introduce a practical sharding strategy, then implement it in Hibernate Shards. Note that the brunt of the work related to sharding isn't necessarily related to Hibernate; in fact, coding for Hibernate Shards is the easy part. The real work is figuring out how and what you'll shard.

About this series

The Java development landscape has changed radically since Java technology first emerged. Thanks to mature open source frameworks and reliable for-rent deployment infrastructures, it's now possible to assemble, test, run, and maintain Java applications quickly and inexpensively. In this series, Andrew Glover explores the spectrum of technologies and tools that make this new Java development paradigm possible.

Sharding at a glance

Database partitioning is an inherently relational process of dividing a table's rows by some logical piece of data into smaller groups. If you were partitioning a gigantic table named foo based on timestamps, for instance, all the data for August 2010 would go in Partition A, while anything since then would be in Partition B, and so on. Partitioning has the effect of making reads and writes faster because they target smaller datasets in individual partitions.

Partitioning isn't always available (MySQL didn't support it until version 5.1), and the cost of doing it with a commercial system can be prohibitive. What's more, most partitioning implementations store data on the same physical machine, so you're still bound to the limits of your hardware. Partitioning also doesn't resolve the reliability, or lack thereof, of your hardware. Thus, various smart people started looking for new ways to scale.

Sharding is essentially partitioning at the database level: rather than divide a table's rows by pieces of data, the database itself is split up (usually across different machines) by some logical data element. That is, rather than splitting up a table into smaller chunks, sharding splits up an entire database into smaller chunks.

The canonical example for sharding is based on dividing a large database storing worldwide customer data by region: Shard A for customers in the United States, Shard B for Asia, Shard C for Europe, and so on. The shards themselves would live on different machines, and each shard would hold all related data, such as customer preferences or order history.

The benefit of sharding (like partitioning) is that it compacts big data: individual tables are smaller in each shard, which allows for faster reads and writes, which increases performance. Sharding also conceivably improves reliability, because even if one shard unexpectedly fails, others are still able to serve data. And because sharding is done at the application layer, you can do it for databases that don't support regular partitioning. The monetary cost is also potentially lower.

Sharding and strategy

Like most technologies, sharding does entail some trade-offs. Because sharding isn't a native database technique (that is, you must implement it in your application), you'll need to map out your sharding strategy before you begin. Both primary keys and cross-shard queries play a major role when sharding, mainly by defining what you can't do.

Primary keys

Sharding leverages multiple databases, all of which function autonomously, without awareness of their peers. As a result, if you rely on database sequences (such as for automatic primary key generation), it's likely that an identical primary key will show up across a set of databases. It's possible to coordinate sequences across a distributed database, but doing so increases system complexity. The safest way to prohibit duplicate primary keys is to have your application (which will be managing a sharded system anyway) generate keys.

Cross-shard queries

Most sharding implementations (including Hibernate Shards) don't permit cross-shard querying, which means you have to go to extra lengths if you want to leverage two sets of data from different shards. (Interestingly, Amazon's SimpleDB also prohibits cross-domain queries.) For instance, if you're storing United States customers in Shard 1, you also need to store all of their related data there. If you try to store that data in Shard 2, things will get complicated, and system performance will probably suffer. This situation is also related to the point made earlier: if you somehow end up needing to do cross-shard joins, you had better be managing keys in a way that eliminates the possibility of duplicates!
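To make "going to extra lengths" concrete, the sketch below performs a join inside the application because the two datasets live in different shards and no single database can combine them. The shards are modeled as in-memory maps, and every name here (CrossShardJoinSketch, joinInApplication, the sample customers and totals) is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CrossShardJoinSketch {

    /**
     * Joins customer names held in one shard with order totals held in another,
     * matching on customer ID. Customers without an order total are skipped.
     */
    public static List<String> joinInApplication(Map<String, String> customersById,
                                                 Map<String, Double> orderTotalsByCustomerId) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, String> customer : customersById.entrySet()) {
            Double total = orderTotalsByCustomerId.get(customer.getKey());
            if (total != null) {
                rows.add(customer.getValue() + ": " + total);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical contents of two different shards.
        Map<String, String> customersShard1 = new LinkedHashMap<>();
        customersShard1.put("c1", "Ann");
        customersShard1.put("c2", "Bob");
        Map<String, Double> ordersShard2 = new LinkedHashMap<>();
        ordersShard2.put("c1", 42.5);

        System.out.println(joinInApplication(customersShard1, ordersShard2));
    }
}
```

The match on customer ID is only trustworthy if IDs are unique across shards, which is the reason the text ties cross-shard joins back to key management.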
Clearly, you'll need to fully consider a sharding strategy before you set up your database. And once you've chosen a particular direction, you're more or less tied to it: it's hard to move data around after it's been sharded.

Avoid premature sharding

Sharding is best employed late in the game. Like premature optimization, sharding based on expected data growth could be a recipe for disaster. Successful sharding implementations are based on measurably understanding an application's data growth over time, and then extrapolating to the future. Once you've sharded your data it can be extraordinarily hard to move around.

A strategy example

Because sharding binds you to a linear data model (that is, you can't easily join data in different shards), you should start with a clear picture of how your data will be logically organized per shard. This is usually easiest by focusing on the primary node of a domain. In the case of an e-commerce system, the primary node could be either an order or a customer. Thus, if you choose "customer" as the basis for your sharding strategy, then all data related to customers will be moved into the respective shards, though you'll still have to choose to which shard to move that data.

For customers, you could shard based on location (Europe, Asia, Africa, etc.), or you could shard based on something else. It's up to you. Your shard strategy should, however, incorporate some means of distributing data evenly among all of your shards. The whole idea of sharding is to break up big data sets into smaller ones; thus, if a particular e-commerce domain had a large set of European customers and relatively few in the United States, it probably wouldn't make sense to shard based on customer location.

Off to the races with sharding!

Getting back to the familiar example of my racing application, I can shard by race or by runner. In this case, I'm going to shard by race, because I see the domain being organized by runners who belong to races. So the race is the root of my domain. I'm also going to shard based on race distance, because my racing application holds myriad races of different lengths, along with myriad runners.

Note that in making these decisions, I have already accepted a trade-off: what if a runner participates in more than one race, each of them living in different shards? Hibernate Shards (like most sharding implementations) doesn't support cross-shard joins. I'm going to have to live with this slight inconvenience and allow runners to live in multiple shards; that is, I will recreate each runner in the shards where his or her various races live.

To keep things simple, I'm going to create two shards: one for races less than 10 miles and another for anything greater than 10 miles.

Implementing Hibernate Shards

Hibernate Shards is made to work almost seamlessly with existing Hibernate projects. The only catch is that Hibernate Shards needs some specific information and behavior from you. Namely, it needs a shard-access strategy, a shard-selection strategy, and a shard-resolution strategy. These are interfaces you must implement, though in some cases you can use default ones. We'll look at each interface separately in the following sections.

ShardAccessStrategy

When a query is executed, Hibernate Shards needs a mechanism for determining which shard to hit first, second, and so on. Hibernate Shards doesn't necessarily figure out what a query is looking for (that's for the Hibernate Core and underlying database to do), but it does recognize that a query might need to execute against multiple shards before an answer is obtained. So, Hibernate Shards provides two logical implementations out of the box: one executes a query in a sequential mechanism (one at a time) against shards until an answer is returned, or until all of the shards have been queried. The other implementation is a parallel-access strategy, which uses a threading model to hit all of the shards at once.

I'm going to keep things simple and utilize the sequential strategy, aptly named SequentialShardAccessStrategy. We'll configure it shortly.

ShardSelectionStrategy

When a new object is created (that is, when a new Race or Runner is created via Hibernate), Hibernate Shards needs to know what shard the corresponding data should be written to. Accordingly, you must implement this interface and code the sharding logic. If you want a default implementation, there's one dubbed RoundRobinShardSelectionStrategy, which uses a round-robin strategy for putting data into shards.

For the racing application, I need to provide behavior that shards by race distance. Accordingly, I'll need to implement the ShardSelectionStrategy interface and provide some simple logic that shards based on a Race object's distance in the selectShardIdForNewObject method. (I'll show the Race object shortly.)

At run time, when a call is made to some save-like method on my domain objects, this interface's behavior is leveraged deep down in Hibernate's core.

Listing 1. A simple shard-selection strategy

    import org.hibernate.shards.ShardId;
    import org.hibernate.shards.strategy.selection.ShardSelectionStrategy;

    public class R