Big Data Analytics Storage Solution (toBD39.ppt)

From enterprise data to big data
  • Traditional approach: structured, analytical, logical. Systems of Record. Structured, repeatable, linear.
  • New approach: creative, holistic thought, intuition. Systems of Engagement. Unstructured, exploratory, dynamic.
  • Systems of Insight: enterprise integration and context accumulation.
  • Platforms: Data Warehouse; Hadoop and Streams.
  • Traditional sources: internal app data, transaction data, mainframe data, OLTP system data, ERP data.
  • New sources: web logs, social data, text data (emails), sensor data (images, RFID), multimedia.

The need for a new kind of infrastructure
  • Run business-critical applications in a reliable, secure environment.
  • Access and process massive volumes of data, both structured and unstructured.
  • Speed: respond in time to business opportunities that can appear at any moment, which calls for a flexible, real-time infrastructure.
  • The dynamics of SoR and SoE:
      • Improve flexibility and efficiency by optimizing workloads and resource deployment.
      • Improve IT economics by adopting new technologies, including those based on open standards.
  • System of Record (SoR), Systems of Engagement (SoE), Big Data & Analytics: the right decision, in the right place, at the right time.

A new architecture solution for big data analytics
  • IBM Big Data & Analytics Infrastructure: Data Zone, Application Zone.

Case: Smart Metering (smart electricity billing). Big data analytics applications can deliver real business value
  • Business areas: Smart Metering; Grid Operations; Field Service; Resource Planning; Customer Service / Customer Operations.
  • Achieve genuinely effective regulatory compliance.
  • Detect energy losses, electricity theft, and fraud promptly.
  • Improve customer satisfaction.
  • Forecast electricity usage more accurately.
  • Optimize grid operations and maintenance; reduce the number and duration of outages.

Case: strengthening Smart Metering with big data analytics
  • Regulatory compliance.
  • Highly available analytics, so that customer preferences are known at all times.
  • Terabyte-scale data needs across applications, served by a common virtualized storage platform.
  • Collect, store, and analyze data in real time, at up to 50,000 data points/sec.
  • Complex query processing over historical power-usage data.
  • Cleanse and validate data before it is loaded into the data warehouse; the data may come from many customers, billing systems, or outage-protection systems.
  • Relationship management: build and maintain a single view of the grid.
  • Navigate structured and unstructured data across the whole enterprise and discover value in it.
  • Analyze customer consumption to detect electricity theft, meter tampering, and similar behavior.
  • Predict which customers suit which time-of-use tariffs or demand/response services.
  • Price time-of-use tariffs in real time, or deliver timely demand/response services.

IBM Big Data & Analytics Reference Architecture
  • Big Data Platform Capabilities: Information Ingest; Real-time Analytics; Warehouse & Data Marts; Analytic Appliances.
  • All Data Sources; Advanced Analytics / New Insights; New/Enhanced Applications.
  • New Infrastructure Leverages Data Types: Data in Motion; Data at Rest; Data in Many Forms.
  • Information Ingestion and Operational Information: Stream Processing; Data Integration; Master Data.
  • Real-time Analytics: Video/Audio; Network/Sensor; Entity Analytics; Predictive.
  • Exploration, Integrated Warehouse, and Mart Zones: Discovery; Deep Reflection; Operational; Predictive.
  • Decision Management; BI and Predictive Analytics; Navigation and Discovery; Intelligence Analysis.
  • Information Governance, Security and Business Continuity.
  • Streams; Warehouse.

IBM Big Data Platform
  • Hadoop: InfoSphere BigInsights. Hadoop-based low-latency analytics for varied, high-volume data at rest.
  • Stream Computing: InfoSphere Streams. Low-latency analytics on streaming data (velocity, variety, and volume) for data in motion.
  • MPP Data Warehouse:
      • Netezza High Capacity Appliance: queryable archive for structured data.
      • Netezza 1000: BI plus custom analytics on structured data.
      • Smart Analytics System: operational analytics on structured data.
      • Informix Timeseries: time-structured analytics.
      • InfoSphere Warehouse: large-volume analytics on structured data.
  • Information Integration: InfoSphere Information Server. High-volume data integration and transformation.

What is Hadoop? IBM Value for Hadoop!
  • What: open-source software that distributes data computation across a whole cluster of ordinary commodity servers and storage.
  • Why: the traditional computing architecture scales up. It brings data to the CPU through faster SANs, large memory, and multi-level caches, which is comparatively expensive.
  • What: Hadoop partitions a large data set into small ones and distributes them across many ordinary servers. This is a scale-out model.
  • Why: scalable, flexible, cost-effective, fault-tolerant.
  • Components: MapReduce, HDFS.
  • HDFS spreads data across multiple storage nodes.
  • HDFS is designed on the assumption that storage nodes can fail, so it keeps three or more copies of each piece of data on different nodes and achieves reliability at the level of the whole system.
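The split, distribute, and replicate idea in the Hadoop bullets above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not HDFS code: the node names, the 128-byte block size, and the round-robin placement are all hypothetical choices made for illustration (real HDFS uses far larger blocks and its own placement policy).

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a data set into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list[str], replication: int = 3) -> dict[int, list[str]]:
    """Assign every block to `replication` distinct nodes, round-robin.

    Round-robin is a simplification for illustration; it is not HDFS's
    actual placement policy.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds the number of nodes")
    placement = {}
    for block_id in range(num_blocks):
        first = block_id % len(nodes)
        placement[block_id] = [nodes[(first + k) % len(nodes)] for k in range(replication)]
    return placement


def readable_blocks(placement: dict[int, list[str]], failed: set[str]) -> set[int]:
    """Return the blocks that still have at least one replica on a live node."""
    return {b for b, holders in placement.items() if any(n not in failed for n in holders)}


# Hypothetical toy cluster: 5 nodes, 1 KiB of data, 128-byte blocks.
nodes = ["node1", "node2", "node3", "node4", "node5"]
data = bytes(range(256)) * 4
blocks = split_into_blocks(data, block_size=128)
placement = place_replicas(len(blocks), nodes, replication=3)

# With 3 replicas on distinct nodes, any 2 node failures leave every block readable.
survivors = readable_blocks(placement, failed={"node2", "node3"})

# The price of replication: raw bytes stored versus unique bytes of data.
raw_bytes = sum(len(blocks[b]) * len(holders) for b, holders in placement.items())
usable_fraction = len(data) / raw_bytes
```

In this sketch every block survives the loss of two nodes, but the cluster stores three raw bytes for each byte of data, so `usable_fraction` is one third. Reliability comes from the software layer rather than from the individual disks, at the cost of raw capacity.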

Hadoop explained: MapReduce, HDFS, and the Hadoop stack. What does it look like?
  • An HDFS file system is a cluster of server nodes; each server serves blocks of data over the network using HDFS's own block protocol.
  • The HDFS NameNode is a potential single point of failure.
  • IBM removes that single point of failure while also improving file system performance: Elastic Storage - SNC (available as beta code).

Pain points of typical Hadoop storage
  • It is hard to choose the right HDFS components (software, servers, network, storage, and so on).
  • Moving from a test environment to production takes too much tuning and adjustment.
  • Ongoing operations are too heavy, for example constantly replacing failed components (disks especially), which makes the expected SLA very hard to guarantee.
  • CPU and storage decoupling: even when existing CPU and memory already meet the compute needs, installing more disks for capacity forces the purchase of more, unneeded CPU and memory.
  • Storage options available have clear gaps: local storage utilization is low (25%); every capacity expansion means adding servers; failed disks must be rebuilt; and the more servers there are, the higher the chance of failure and the worse the performance.

IBM Storage for Hadoop
  • Traditional Hadoop clusters use the servers' internal disks. That is acceptable for testing or research, but storage that runs the business should be enterprise storage.
  • The Hadoop cluster itself is responsible for data protection and replication.
      • Rebuilding (that is, copying) failed data sets onto other nodes seriously hurts CPU performance and cannot deliver enterprise-class RAS.
      • Replicating data has the same problem.
  • Scaling adds processors, network, and storage together, so resources are never fully used: there is no way to separate the three even when one of them has excess capacity (for example, needing more storage but having to add compute and network as well).
  • External storage separates the storage load from the Hadoop compute nodes and brings the benefits of enterprise storage. Sell the value of XIV, V7000, SVC, etc.
  • Users usually deploy with the Hadoop File System; adopting Elastic Storage brings many benefits.

Data acceleration
  • Experience the instant results that come from IBM FlashSystem.
  • Drive as much as 45X faster analytics results on certain workloads.

Data workload diversity and flexibility
  • XIV delivers predictable performance that scales linearly without hotspots, delivering insights from analytics faster with tuning-free data distribution.
  • Scale-out, parallel processing in Elastic Storage software, integrated with FlashSystem, dramatically accelerates the performance of analytics clusters.
  • Virtual Storage Center with SVC automatically optimizes data warehouse performance and cost across flash and disk.

Mainframe Data Environments
  • Integration with DB2 and specialty analytics "engines" leveraging DS8870 delivers a 4x reduction in batch times with the new High Performance Flash Enclosures.
  • High-speed encryption on every drive type secures data.

Data protection and retention
  • LTFS EE with tape reduces TCO by up to 90% compared with disk for long-term retention of data at rest, using a large open-format tape repository.
  • Reduce the amount of data to be stored by up to 25 times with ProtecTIER de-duplication.

12x faster: IBM FlashSystem increased SPLUNK and SAS application efficiency for business analytics.
20x improvement in actionable supply chain analytics; 4x reduction in batch times; virtualization for plug and play.
6x time savings: "GPFS allows us to move the metadata from the disk to the FlashSystem online. Once we did that, the backups were reduced down to about an hour."
2 hrs becomes 2 minutes: failover time is cut dramatically.

Mapping Characteristics to IBM Storage Products
  • Storage infrastructure requirements: apply to all five use cases.
  • Optimized Multi-Temperature Warehouse (optimized tiered storage):
      • All flash: FlashSystem.
      • Hybrid: DS8000 EasyTier; XIV + SSD Caching; Storwize EasyTier.
      • FlashSystem Solution (VSC + FlashSystem).
      • PureSystems: PureFlex (XIV or Storwize w/EasyTier); PureData for Transactions (Storwize); PureData for Analytics (Netezza).

IBM Smarter Storage is designed to support big data analytics with an efficient, optimized data infrastructure
  • Categories: Enterprise Offerings; Midrange & Entry; Tier 0 Acceleration.
  • Smarter Storage: XIV; DS8000; Storwize family; FlashSystem family.
  • Integrated Systems: zEnterprise Solutions for Analytics with DS8000; PureData System for Operational Analytics with Storwize; PureFlex System with Storwize; Smart Analytics Systems with DS3xxx.
  • Open & Extensible.

IBM FlashSystem: designed for big data analytics applications, bringing extreme speed to applications and data
  • The extreme performance of IBM FlashSystem makes real-time business decisions possible.
  • It suits Hadoop systems with a modular data storage structure: some or all of the data can be kept on flash, and the rest on XIV.

IBM XIV: Optimized data workload diversity for Big Data & Analytics
  • IBM XIV delivers high performance without manual configuration and suits a wide variety of storage workloads.
  • IBM XIV is exceptionally efficient and offers industry-leading simplicity, with a friendly built-in interface.
  • IBM XIV offers enterprise-class resilience, ensuring data availability and business continuity.

XIV: born for analytics
  • Unmatched performance: scalable grid storage architecture; any read/write workload at any time; on-board flash.
  • Unmatched reliability: fine-grained data distribution; unrivaled disk rebuild times; enterprise-class availability.
  • Unmatched simplicity: simple planning, provisioning, and flexibility; zero maintenance and zero tuning once in production.
  • "What attracted us most to XIV was its outstanding performance. It is precisely because XIV gives our sophisticated analytics applications consistently high performance that we can deliver more value to our customers."

SAS and the XIV grid architecture: a perfect match
  • Massively parallel computing: sustained optimal performance.
  • Balanced Performance: no tuning, year in and year out.
  • Unprecedented Scalability: simply add SAS nodes and XIV modules together.

IBM SVC: Optimized data workload flexibility for Big Data & Analytics
IBM SVC adds flexibility to the IBM big data product line through the following capabilities:

  • Complete data virtualization and data mobility.
  • Advanced clustering and replication.
  • Multi-way mirroring with a read-preferred option.
  • Real-Time Compression.
  • Easy Tier hot-extent caching.

Storwize V7000/U and IBM SVC: design principles
Real-Time Compression is designed to:
  • Work on active primary data.
  • Run on a dedicated compression platform: the platform handles ALL the heavy lifting associated with compression.
  • Leave performance unaffected: a compressed file is modified in place efficiently.
  • Leave user applications unchanged: neither users nor admins need to change anything.
  • Leave processes unchanged: compression happens inline, not after the fact.
  • Use an industry-standard compression algorithm that has been in use for decades.

Stream computing & IBM Flash Systems
  • Data: do you just own it and keep it? Or do you analyze it and act on it?
  • Data in Motion; Data at Rest.

InfoSphere Streams: big data stream analytics
  • Built to analyze data in motion:
      • Many concurrent input streams.
      • Massive scalability.
  • Analyzes and processes many kinds of data:
      • Structured, unstructured, video, audio.
      • Advanced analytic operators.
  • Adaptive real-time analytics:
      • With data warehouses.
      • With Hadoop systems.

Stream computing represents a shift in the computing model
  • Traditional computing: historical fact finding.
      • Find and analyze information stored on disk.
      • Batch mode, pull model.
      • Query-driven: submits queries to static data.
  • Stream computing: current fact finding.
      • Analyze data in motion, before it is written to disk.
      • Low-latency mode, push model.
      • Data-driven: the analysis is genuinely driven by the data.

Real-time analytics: imagine trying to drink from a fire hydrant
  • Large volumes of data arrive from many diverse input sources.
  • Process and filter the data directly, without having to store it.
  • Keep only the data that has value.
  • Route data only to the users most interested in it.
  • Act as the data is produced.

Adaptive analytics: integrating data in motion and data at rest
  • Steps: 1. Data Ingest; 2. Bootstrap/Enrich; 3. Adaptive Analytics Model.
  • Data integration, data mining, machine learning, statistical modeling.
  • Data collection, preparation for online analysis, model validation.
  • Visualization of real-time and historical insights.
  • Components: InfoSphere Streams; InfoSphere BigInsights, Database & Warehouse.
  • Flows between them: data and control flow.

Adaptive real-time analytics
  • Large volumes of data from many diverse input sources.
  • A comprehensive view of past, present, and future.
  • Real-time analysis with low-latency results.
  • Full context for deep analysis.
  • Common analytics across data in motion and data at rest.
  • Adaptive: changes as circumstances change.
      • Adapts when unexpected behavior is detected.
      • Analyzes in depth when new meaning is recognized in the data.
      • Meaning that goes unnoticed at first may be recognized later.
      • Patterns that were not apparent at first can be found later.

How is InfoSphere Streams used?
  • Stock market: impact of weather on securities prices; analyzing market data at ultra-low latencies; momentum calculator.
  • Fraud prevention: detecting multi-party fraud; real-time fraud prevention.
  • e-Science: space weather prediction; detection of transient events; synchrotron atomic research; genomic research.
  • Transportation: intelligent traffic management; automotive telematics.
  • Energy & Utilities: transactive control; Phasor Monitoring Unit; down-hole sensor monitoring.
  • Natural systems: wildfire management; water management.
  • Telephony: CDR processing; social analysis; churn prediction; geomapping.
  • Other: manufacturing; text analysis; ERP for commodities.

Speed up the flow of data into the analytics system, accelerating toward the transaction.
An efficient, flexible infrastructure clearly speeds up that flow and balances the needs of different kinds of analytics.

A new infrastructure solution for big data analytics
  • IBM Big Data & Analytics Infrastructure: Data Zone, Application Zone.

IBM FlashSystem Core Launch Messaging
  • Experience real-time analytical insights with up to 50x better performance than enterprise disk systems, using IBM FlashCore technology.
  • Preserve and protect infrastructure continuity while scaling to over 2 petabytes of effective all-flash capacity under a single integrated interface.
  • Deliver agility and data economics with 4x greater capacity in less rack space than competitive all-flash products.
  • Synchronized and complementary to overarching storage messaging: accelerate time to insights through data without borders. IBM innovation frees data with agile, simple-to-use storage solutions delivering superior data economics.

IBM FlashSystem Family 2015 Theme
  • Drive a complete paradigm shift in enterprise storage with the all-new IBM FlashSystem family.
  • Time to insight. Time to value. Time to market.
  • IBM FlashSystem, it's about time.
  • Flash Realized!

IBM FlashSystem V9000 Foundational Pillars
  • IBM FlashCore Technology is the DNA of the FlashSystem family.

Introducing the New IBM FlashSystem Family Offerings (Powered by IBM FlashCore Technology)
  • IBM FlashSystem 900
      • Extreme Performance: delivers 100-microsecond response times.
      • Macro Efficiency: lowest-latency offering, with 40% greater capacity at a lower cost per capacity.
      • Enterprise Reliability: IBM-enhanced Micron MLC flash technology with Flash Wear Guarantee.
  • IBM FlashSystem V9000
      • Scalable Performance: grow capacity and performance with up to 2.2 PB scaling capability.
      • Enduring Economics: next-generation flash media with lower cost per capacity.
      • Agile Integration: fully integrated system management to simplify m
