Oracle英文版培訓課件之Data Warehouse:les04_第1頁
Oracle英文版培訓課件之Data Warehouse:les04_第2頁
Oracle英文版培訓課件之Data Warehouse:les04_第3頁
Oracle英文版培訓課件之Data Warehouse:les04_第4頁
Oracle英文版培訓課件之Data Warehouse:les04_第5頁
已閱讀5頁,還剩36頁未讀, 繼續(xù)免費閱讀

下載本文檔

版權說明:本文檔由用戶提供并上傳,收益歸屬內容提供方,若內容存在侵權,請進行舉報或認領

文檔簡介

Extraction,Transformation,

andLoading(ETL)ExtractionandTransportationObjectivesAftercompletingthislesson,youshouldbeabletodothefollowing:DescribethecoreETLframeworkinsidethedatabaseanditsintegrationadvantageExplaindatawarehousingextractionmethodsIdentifytransportationmethods:FlatfileDistributesoperationsTransportabletablespacesDescribetransformationflowOverviewLesson4:Extraction/TransportationLesson5:LoadingLesson6:TransformationWhatIsETL?ETLisanacronymforExtraction,Transformation,andLoading.ThefollowinghappenduringtheETLprocess:Thedesireddataisidentifiedandextractedfrommanydifferentsources.Sometransformationsmaytakeplaceduringthisextractionprocess.Afterextraction,thedatamustbetransportedtoatargetsystemoranintermediatesystemforfurtherprocessing.Dependingonthemethodoftransportation,sometransformationscanbedonesimultaneously.ETLreferstoabroadprocess.ExtractionMethodsExtractioncanbethoughtofintwoparts:ExtractionTransportationTherearetwoextractionmethods:LogicalPhysicalYourlogicalchoiceinfluencesthewaythedataisphysicallyextracted.Somecriteriaforchoosingacombination:BusinessneedsLocationofthesourceandtargetsystemsAvailabilityofthesourcesystemTimerequiredtoextractdataLogicalExtractionMethodsTherearetwokindsoflogicalextraction:FullextractionAlldataispulledLessinformationtotrackMoretimerequiredtopullthedataIncrementalextractionAsubsetofdataispulledMusttrackwhatdataneedstobepulledLesstimerequiredtopullthedataPhysicalExtractionMethodsTherearetwotypesofphysicalextraction.Onlineextraction:PullsdatafromthesourcesystemOfflineextraction:PullsdatafromastagingareaStagingareasincludeflatfiles,dumpfiles,andtransportabletablespaces.OfflineExtractionStagingareas:FlatfilesRequiresdatainapredefined,genericformatDumpfilesMustbeinanOracle-specificformatRedoandarchivelogsDatalocatedinspecialdumpfilesTransportabletablespacesPowerful,fastmethodformovinglargevolumesofdataImplementingMethodsofExtractionExtractingtoafile:SpoolingfromSQL*PlusUsingOCIorPro*CtodumptoafileUsingDataPumptoexporttoanOracledumpfileUsingexternaltablesExtractingthroughdistributedoperationsIncrementalExtractionUsingCDCCDCcancaptureandpublishcommittedchangedataineitherofthefollowingmodes:SynchronousTriggersonthesourcedatabaseallowchangedatatobecapturedimmediately.Changedataiscapturedaspartofthetransactionmodifyingthesourcetable.AsynchronousChangedataiscapturedafteraSQLstatementperformingDMLiscommittedusingtheredologs.AsynchronousChangeDataCaptureisbuiltonOracleStreams.PublishandSubscribeModelThepublisherperformsthefollowingtasks:IdentifiessourcetablesfromwhichthedatawarehouseisinterestedincapturingchangedataUsestheDBMS_CDC_PUBLISHpackageto:SetupthecaptureofdatafromthesourcetablesDetermineandadvancethechangesetsPublishthechangedataAllowscontrolledaccesstosubscribersusingtheSQLGRANT

andREVOKE

statementsPublishandSubscribeModelThesubscriberusestheDBMS_CDC_SUBSCRIBEpackageto:SubscribetosourcetablesExtendthewindowandcreatechangeviewPreparethesubscriberviewsViewdatastoredinchangetablesPurgethesubscriberviewRemovethesubscriberviewsSynchronousCDCSourcetablesSourcedatabasetransactionsSYNC_SOURCEChangesourceChangesetChangetablesSubscriber

viewsTriggerexecutionSourcedatabaseAsynchronousCDCAsynchronousCDC:CaptureschangedatafromredologfilesafterchangeshavebeencommittedtothesourcedatabaseModesaredependentonthelevelofsupplementalloggingusedonthesourcedatabaseUsesOracleStreamstocapturechangedatafromredologfilesHasthreesourcemodes:AsynchronousAutoLogmodeAsynchronousHotLogmodeAsynchronousDistributedHotLogmodeAsynchronousAutoLogModeSourcetablesSourcedatabasetransactionsLGWROnlineredologsDistributedAutoLogchangesetChangesetChange

tablesSubscriber

viewsSourcedatabaseStagingdatabaseDistributedAutoLogchangesourceRFSStandbyredologsStreams

captureLOG_ARCHIVE_DEST_2AsynchronousHotLogConfigurationSourcetablesSourcedatabasetransactionsHOTLOG_SOURCEChangeSourceChangesetChangetablesSubscriber

viewsStreamslocalcaptureSourcedatabaseLGWROnlineredologsAsynchronousDistributedHotLogModeSourcetablesSourcedatabasetransactionsLGWROnlineredologsDistributedHotLogchangesetChangesetChangetablesSubscriber

viewsSourcedatabaseStagingdatabaseDBlinkDistributedHotLogchangesourceDBlinkStreamspropagationPreparingtoPublishChangeData1. Gatherrequirementsfromthesubscribers.2. Determinewhichsourcedatabasecontainstherelevantsourcetables.3. Choosethecapturemode:SynchronousAsynchronousHotLogAsynchronousDistributedHotLogAsynchronousAutoLogEnsurethatthesourceandstagingdatabaseshaveappropriatedatabaseinitializationparametersset.Setupdatabaselinksbetweenthesourcedatabaseandthestagingdatabase.CreatingaPublisherUserThestagingdatabasepublishermustbegrantedthefollowingprivilegesandroles:EXECUTE_CATALOG_ROLEprivilegeSELECT_CATALOG_ROLEprivilegeCREATETABLEand

CREATESESSIONprivilegesEXECUTEonthe

DBMS_CDC_PUBLISHpackageCreateadefaulttablespacefor

thepublisher.SynchronousPublishing1. Createachangeset.2. Createachangetable.3. Grantaccesstosubscribers.BEGINDBMS_CDC_PUBLISH.CREATE_CHANGE_SET(change_set_name=>'CHICAGO_DAILY',description=>'Changesetforsaleshistoryinfo',change_source_name=>'SYNC_SOURCE');END;GRANTSELECTONducts_ctTOsubscriber1;AsynchronousDistributed

HotLogPublishingPreparethesourceandstagingdatabases:1. ConfigureOracleNetsothatthesourcedatabasecancommunicatewiththestagingdatabase.2. Setinitializationparametersonthesourcedatabase.

3. Spatible=10.2.0global_names=truejob_queue_processes=<currentvalue>+2open_links=4parallel_max_servers=<currentvalue>+3processes=<currentvalue>+4sessions=<currentvalue>+1streams_pool_size=<currentvalue>+20MBundo_retention=3600AsynchronousDistributed

HotLogPublishingPreparethestagingdatabase:Spatible=10.2.0global_names=truejava_pool_size=50000000open_links=4job_queue_processes=2parallel_max_servers=<current_value>+2processes=<current_value>+3sessions=<currentvalue>+1streams_pool_size=<current_value>+11MBundo_retention=3600AsynchronousDistributed

HotLogPublishingAlterthesourcedatabase:1. PlacethedatabaseintoFORCELOGGINGloggingmodetoprotectagainstunloggeddirectwrites.2. Enablesupplementallogging.3. Createanunconditionalloggrouponallcolumnstobecapturedinthesourcetable.ALTERDATABASEFORCELOGGING;ALTERDATABASEADDSUPPLEMENTALLOGDATA;ALTERTABLESH.PRODUCTSADDSUPPLEMENTALLOGGROUPlog_group_products(PROD_ID,PROD_NAME,PROD_LIST_PRICE)ALWAYS;AsynchronousDistributed

HotLogPublishingPublisherprivilegesonsourceandstagingdatabases:1. Createandgrantprivilegestothesourcedatabasepublisher.2. Createandgrantprivilegestothestagingdatabasepublisher.CREATEUSERsource_cdcpubIDENTIFIEDBYsource_cdcpubQUOTAUNLIMITEDONSYSTEMQUOTAUNLIMITEDONSYSAUX;GRANTCREATESESSIONTOsource_cdcpub;GRANTDBATOsource_cdcpub;GRANTCREATEDATABASELINKTOsource_cdcpub;GRANTEXECUTEonDBMS_CDC_PUBLISHTOsource_cdcpub;GRANTEXECUTE_CATALOG_ROLETOsource_cdcpub;GRANTSELECT_CATALOG_ROLETOsource_cdcpub;EXECUTEDBMS_STREAMS_AUTH.GRANT_ADMIN_PRIVILEGE(GRANTEE=>'source_cdcpub');AsynchronousDistributed

HotLogPublishingCreatesourceandstagingdatabaselinks:1. Createthesourcedatabaselink.2. Createthestagingdatabaselink.CREATEDATABASELINKstaging_dbCONNECTTOstaging_cdcpubIDENTIFIEDBYstaging_cdcpubUSING'staging_db';CREATEDATABASELINKsource_dbCONNECTTOsource_cdcpubIDENTIFIEDBYsource_cdcpubUSING'source_db';AsynchronousDistributed

HotLogPublishingCreatechangesourcesandchangesets:1. Createthechangesources.2. Createthechangesets.BEGINDBMS_CDC_PUBLISH.CREATE_HOTLOG_CHANGE_SOURCE(change_source_name=>'CHICAGO',description=>'testsource',source_database=>'source_db');END;DBMS_CDC_PUBLISH.CREATE_CHANGE_SET(change_set_name=>'CHICAGO_DAILY',description=>'changesetforproductinfo',change_source_name=>'CHICAGO',stop_on_ddl=>'y');END;AsynchronousDistributed

HotLogPublishingCreatethechangetablesonthestagingdatabase:BEGINDBMS_CDC_PUBLISH.CREATE_CHANGE_TABLE(owner=>'staging_cdcpub',change_table_name=>'products_ct',change_set_name=>'CHICAGO_DAILY',source_schema=>'SH',source_table=>'PRODUCTS',column_type_list=>'PROD_IDNUMBER(6),PROD_NAMEVARCHAR2(50),PROD_LIST_PRICENUMBER(8,2),JOB_IDVARCHAR2(10),DEPARTMENT_IDNUMBER(4)',capture_values=>'both',rs_id=>'y',row_id=>'n',...options_string=>'TABLESPACETS_CHICAGO_DAILY');END;AsynchronousDistributed

HotLogPublishingEnablethechangesourceandchangeset:1. Enablethechangesource.2. Enablethechangeset.3. Grantaccesstosubscribers.BEGINDBMS_CDC_PUBLISH.ALTER_HOTLOG_CHANGE_SOURCE(change_source_name=>'CHICAGO',enable_source=>'Y');END;BEGINDBMS_CDC_PUBLISH.ALTER_CHANGE_SET(change_set_name=>'CHICAGO_DAILY',enable_capture=>'y');END;SubscribingtoChangeData1. Findthesourcetablesforwhichthesubscriberhasaccessprivileges.2. Findthechangesetnamesandcolumnsforwhichthesubscriberhasaccessprivileges.SQL>SELECT*FROMALL_SOURCE_TABLES;SOURCE_SCHEMA_NAMESOURCE_TABLE_NAME------------------------------------SHPRODUCTSSQL>SELECTUNIQUECHANGE_SET_NAME,COLUMN_NAME,PUB_IDFROM2ALL_PUBLISHED_COLUMNSWHERESOURCE_SCHEMA_NAME='SH'AND3SOURCE_TABLE_NAME='PRODUCTS';CHANGE_SET_NAMECOLUMN_NAMEPUB_ID----------------------------------------------CHICAGO_DAILYPROD_ID41494CHICAGO_DAILYPROD_LIST_PRICE41494CHICAGO_DAILYPROD_NAME41494SubscribingtoChangeData3. Createasubscription.4. Subscribetoasourcetableandcolumns.BEGINDBMS_CDC_SUBSCRIBE.CREATE_SUBSCRIPTION(change_set_name=>'CHICAGO_DAILY',description=>'ChangedataforPRODUCTS',subscription_name=>'SALES_SUB');END;BEGINDBMS_CDC_SUBSCRIBE.SUBSCRIBE(subscription_name=>'SALES_SUB',source_schema=>'SH',source_table=>'PRODUCTS',column_list=>'PROD_ID,PROD_NAME,PROD_LIST_PRICE',subscriber_view=>'SALES_VIEW');END;SubscribingtoChangeData5. Activatethesubscription.6. Getthenextsetofchangedata.BEGINDBMS_CDC_SUBSCRIBE.ACTIVATE_SUBSCRIPTION(subscription_name=>'SALES_SUB');END;BEGINDBMS_CDC_SUBSCRIBE.EXTEND_WINDOW(subscription_name=>'SALES_SUB');END;SubscribingtoChangeData7. Querythesubscriberviews.8. Indicatethatthechangedataisnolongerneeded.9. Endthesubscription.SELECTPROD_ID,PROD_NAME,PROD_LIST_PRICEFROMSALES_VIEW;PROD_IDPROD_NAMEPROD_LIST_PRICE------------------------------------------------------30And2CrosscourtTeeKids14.9930And2CrosscourtTeeKids17.6610Gurfield&MurksPleatedTrousers17.9910Gurfield&MurksPleatedTrousers21.99BEGINDBMS_CDC_SUBSCRIBE.PURGE_WINDOW(subscription_name=>'SALES_SUB');END;AsynchronousDistributedHotLog

SourceDatabaseInitializationParameters

ForallOracleDatabase10greleases:3600UNDO_RETENTIONThecurrentvalue+(thenumberofchangesourcesplanned)SESSIONSThecurrentvalue+(4timesthenumberofchangesourcesplanned)PROCESSESThecurrentvalue+(3timesthenumberofchangesourcesplanned)PARALLEL_MAX_SERVERSShouldbeequaltothenumberofDistributedHotLogchangesourcesplannedOPEN_LINKSMaximumnumberofDBMS_JOBjobsthatcanrunsimultaneouslyplus2JOB_QUEUE_PROCESSESTRUEGLOBAL_NAMES10.2.0or10.0.0COMPATIBLE

ValueParameterAsynchronousDistributedHotLog

SourceDatabaseInitializationParameters

ForOracle9.2databases:Thecurrentvalue+(thenumberofchangesourcesplanned)PROCESSESThecurrentvalue+(3timesthenumberofchangesourcesplanned)PARALLEL_MAX_SERVERSThenumberofDistributedHotLogchangesourcesplannedOPEN_LINKSThenumberofchangesourcesplannedLOGMNR_MAX_PERSISTENT_SESSIONS1LOG_PARALLELISMMaximumnumberofDBMS_JOBjobsthatcanrunsimultaneouslyplus2JOB_QUEUE_PROCESSESTRUEGLOBAL_NAMES9.2.0COMPATIBLE

ValueParameterAsynchronousDistributedHotLog

StagingDatabaseInitializationParametersForOracleDatabase10gRelease2:Settothecurrentvalue+((thenumberofchangesourcesplanned)*(11MB))STREAMS_POOL_SIZEThecurrentvalue+(thenumberofchangesourcesplanned)SESSIONSThecurrentvalue+(3timesthenumberofchangesourcesplanned)PROCESSESThecurrentvalue+(2timesthenumberofchangesourcesplanned)PARALLEL_MAX_SERVERSEqualtothenumberofDistributedHotLogchangesourcesplanned,butnolessthan4OPEN_LINKS50000000JAVA_POOL_SIZETRUEGLOBAL_NAMES10.2.0COMPATIBLE

ValueParameterDataDictionaryViewsSupportingCDCCHANGE_SOURCESlistsexistingchangesources.CHANGE_SETSlistsexistingchangesets.CHANGE_PROPAGATIONSdescribesthestreamspropagationassociatedwithagivendistributedHotLogchangesourceonthesourcedatabase.CHANGE_TABLESlistsexistingchangetables.DBA_SOURCE_TABLESlistspublishedsourcetables.DBA_PUBLISHED_COLUMNSlistspublishedsourcetablecolumns.DBA_SUBSCRIPTIONSlistsallregisteredsubscriptions.DBA_SUBSCRIBED_TABLESlistspublishedtablestowhichsubscribershavesubscribed.DBA_SUBSCRIBED_COLUMNSliststhecolumnsoftablestowhichsubscribershavesubscribed.TransportationinaDataWareh

溫馨提示

  • 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
  • 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權益歸上傳用戶所有。
  • 3. 本站RAR壓縮包中若帶圖紙,網頁內容里面會有圖紙預覽,若沒有圖紙預覽就沒有圖紙。
  • 4. 未經權益所有人同意不得將文件中的內容挪作商業(yè)或盈利用途。
  • 5. 人人文庫網僅提供信息存儲空間,僅對用戶上傳內容的表現方式做保護處理,對用戶上傳分享的文檔內容本身不做任何修改或編輯,并不能對任何下載內容負責。
  • 6. 下載文件中如有侵權或不適當內容,請與我們聯(lián)系,我們立即糾正。
  • 7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。

評論

0/150

提交評論