Hadoop Common Errors and Solutions

Common problems and solutions for Hadoop, HBase, and Hive.

Common Hadoop cluster problems (posted 2015-03-24 10:29)


During installation, a network interruption caused the following problems.

Problem 1: installation hangs at "acquiring installation lock"

/tmp/scm_prepare_node.tYlmPfrT using SSH_CLIENT to get the SCM hostname: 03395022

opening logging file descriptor

Starting installation script... Acquiring installation lock... BEGIN flock 4

This step took about half an hour. After one uninstall and retry it took nearly an hour, but it eventually completed.

Problem 2: cannot select hosts

After a failed installation, the hosts cannot be selected again on retry.

Figure 1

Solution: clean up the files left behind by the failed installation.

See: Uninstalling Cloudera Manager 5.1.x and related software (official docs translation: High Availability)

Problem 3: reverse DNS (PTR) resolves to localhost

Description:

Reverse DNS is misconfigured, so the Cloudera Manager Server hostname cannot be resolved correctly.

Log:
Detecting Cloudera Manager Server...
Detecting Cloudera Manager Server...
BEGIN host -t PTR 98
92. domain name pointer localhost.
END (0)
using localhost as scm server hostname
BEGIN which python
/usr/bin/python
END (0)
BEGIN python -c 'import socket; import sys; s = socket.socket(socket.AF_INET); s.settimeout(5.0); s.connect((sys.argv[1], int(sys.argv[2]))); s.close();' localhost 7182
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 1, in connect
socket.error: [Errno 111] Connection refused
END (1)
could not contact scm server at localhost:7182, giving up
waiting for rollback request

Solution:

On the machine that cannot connect, move the /usr/bin/host file out of the way:

sudo mv /usr/bin/host /usr/bin/host.bak
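The installer's failing probe in the log above is just a plain TCP connect. A small sketch of the same check, useful for confirming whether the resolved hostname (here localhost) is really the problem before touching /usr/bin/host; the host and port are whatever the installer resolved, 7182 being Cloudera Manager's agent port:

```python
import socket

def scm_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout,
    mirroring the installer's inline python connectivity probe."""
    s = socket.socket(socket.AF_INET)
    s.settimeout(timeout)
    try:
        s.connect((host, int(port)))
        return True
    except (socket.error, OSError):
        return False
    finally:
        s.close()
```

Running `scm_reachable("localhost", 7182)` versus `scm_reachable("<server ip>", 7182)` distinguishes a reverse-DNS problem (IP works, localhost does not) from the server simply being down.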

Note: Cloudera's intent here is puzzling. The installer already has the Cloudera Manager Server's IP, yet it still reverse-resolves that IP to a hostname and connects to the hostname. Because reverse DNS was not configured, resolving the server's IP returned localhost, which caused the later connection failure. Deleting /usr/bin/host makes Cloudera Manager connect by IP directly, avoiding the error.

Problem 4: NTP

Problem description: Bad Health -- Clock Offset. The host's NTP service did not respond to a request for the clock offset.
Solution: configure the NTP service. References:

Configuring an NTP server on CentOS: /centos-ntp/

Commonly used NTP servers in China and their IPs: /note/171309770/

Edit the configuration file:

[root@work03 ~]# vim /etc/ntp.conf
# Use public servers from the project.
# Please consider joining the pool (/join.html).
server  prefer
server
server
restrict  mask  nomodify    <=== allow clients from the local network

Start ntp:

# service ntpd restart    <=== start the ntp service

Sync time on the clients (work02, work03):

ntpdate work01

Note: the NTP service takes about five minutes to become usable after starting. If a client syncs before then, it fails with "no server suitable for synchronization found".

Scheduled time sync:

On work02 and work03, add a crontab entry to sync time periodically:

crontab -e

00 12 * * * root /usr/sbin/ntpdate 21 >> /root/ntpdate.log 2>&1

Problem 2.2

Description:

Clock Offset
- Ensure that the host's hostname is configured properly.
- Ensure that port 7182 is accessible on the Cloudera Manager Server (check firewall rules).
- Ensure that ports 9000 and 9001 are free on the host being added.
- Check agent logs in /var/log/cloudera-scm-agent/ on the host being added (some of the logs can be found in the installation details).

Diagnosis:

Run 'ntpdc -c loopinfo' on the corresponding hosts (work02, work03):

[root@work03 work]# ntpdc -c loopinfo
ntpdc: read: Connection refused

Solution:

Start the NTP service and enable it at boot on all three machines:

chkconfig ntpd on

Problem 5: heartbeat. Error message: Installation failed. Failed to receive heartbeat from agent.
Solution: disable the firewall.

Problem 6: Unknown Health

After a restart: Request to the Host Monitor failed.

service --status-all | grep clo

Checking the scm-agent status on the machine: cloudera-scm-agent dead but pid file exists

Solution: restart the services:

service cloudera-scm-agent restart
service cloudera-scm-server restart

Problem 7: canonical name and hostname not consistent. Bad Health: The hostname and canonical name for this host are not consistent when checked from a Java process. canonical name: 4092 Monitor-HostMonitor throttling_logger WARNING (29 skipped) hostname work02 differs from the canonical
Solution: edit /etc/hosts so the FQDN and the hostname are the same. PS: this fixes it, though it is not obvious why the hostname and its alias must match.
/etc/hosts:
85 work01 work01
41 work02 work02
98 work03 work03

Problem 8: Concerning Health Issue -- Network Interface Speed
Description: The host has 2 network interface(s) that appear to be operating at less than full speed. Warning threshold: any.
Details: This is a host health test that checks for network interfaces that appear to be operating at less than full speed. A failure of this health test may indicate that network interface(s) may be configured incorrectly and may be causing performance problems. Use the ethtool command to check and configure the host's network interfaces to use the fastest available link speed and duplex mode.
Resolution: for this test we only adjusted the Cloudera Manager threshold, which is arguably not a real fix.

Problem 10: IOException thrown while collecting data from host: No route to host
Cause: the firewall was enabled on the agent.
Solution: service iptables stop

Problem 11: Cloudera recommends setting /proc/sys/vm/swappiness to 0. Current setting is 60. Use the sysctl command to change this setting at runtime and edit /etc/sysctl.conf for this setting to be saved after a reboot. You may continue with installation, but you may run into issues with Cloudera Manager reporting that your hosts are unhealthy because they are swapping. The following hosts are affected:
Solution:
# echo 0 > /proc/sys/vm/swappiness      (to apply now)
# sysctl -w vm.swappiness=0             (to make this persistent across reboots)

Problem 12

Clocks out of sync (sync against the USTC time server):

# echo "0 3 * * * /usr/sbin/ntpdate 10; /sbin/hwclock -w" >> /var/spool/cron/root
# service crond restart
# ntpdate 10

Problem 13

The host's NTP service did not respond to a request for the clock offset.

# service ntpd start
# ntpdc -c loopinfo    (health will be good if this command executes successfully)

Problem 14

The Cloudera Manager Agent is not able to communicate with this role's web server. One possible cause is that the metadata database cannot be connected to; check the database configuration.

Problem 15

Hive Metastore Server fails to start. Update the Hive metastore database configuration (whenever the hostname changes, the metastore database configuration must be updated as well).

General troubleshooting: for ordinary errors, read the error output and search the web for the key phrases. For unexplained failures (e.g. a namenode or datanode dying for no apparent reason), check the Hadoop logs ($HADOOP_HOME/logs) or the Hive logs.

Hadoop errors

Problem 16: datanode fails to start

After adding a datanode, it fails to start properly; the process dies shortly after launch. The namenode log shows:

2013-06-21 18:53:39,182 FATAL org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.getDatanode: Data node x.x.x.x:50010 is attempting to report storage ID DS-1357535176-x.x.x.x-50010-1371808472808. Node y.y.y.y:50010 is expected to serve this storage.

Cause:

The Hadoop installation package was copied with its data and tmp directories included (see my earlier "Hadoop installation" article), so the datanode was never formatted successfully.

Solution:

rm -rf /data/hadoop/hadoop-1.1.2/data
rm -rf /data/hadoop/hadoop-1.1.2/tmp
hadoop datanode -format

Problem 17: safe mode

2013-06-20 10:35:43,758 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop cause:node.SafeModeException: Cannot renew lease for DFSClient_hb_rs_,60020,1371631589073. Name node is in safe mode.

Solution:

hadoop dfsadmin -safemode leave

Problem 18: connection errors

2013-06-21 19:55:05,801 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to homename/x.x.x.x:9000 failed on local exception: java.io.EOFException
Possible causes: the namenode is listening on a loopback address :9000 instead of an externally reachable IP :9000; or iptables is blocking the port.

Solution: check the /etc/hosts configuration so that the hostname is bound to a non-loopback IP, and open the port in iptables.

Problem 19: namespaceID mismatch

ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /var/lib/hadoop-0.20/cache/hdfs/dfs/data: namenode namespaceID = 240012870; datanode namespaceID = 1462711424.

Problem: the namespaceID on the namenode does not match the namespaceID on the datanode.

Cause: every namenode format creates a new namespaceID, while tmp/dfs/data still holds the ID from the previous format. Formatting clears the namenode's data but not the datanode's, so the two namespaceIDs diverge and startup fails.

Solution: the page /wh62592855/archive/2010/07/21/5752199.aspx describes two fixes; we used the first:

(1) Stop the cluster services.

(2) On the problematic datanode, delete the data directory, i.e. the dfs.data.dir directory configured in hdfs-site.xml; on this machine it was /var/lib/hadoop-0.20/cache/hdfs/dfs/data/. (Note: we ran this step on every datanode and namenode. In case the fix fails, keep a backup copy of the data directory before deleting it.)

(3) Reformat the namenode.

(4) Restart the cluster.

Problem solved.

The side effect of this method is that all data on HDFS is lost. If HDFS holds important data, this method is not recommended; try the second method described on the referenced page instead.
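Before deleting anything, the mismatch can be confirmed by comparing the namespaceID lines in the two VERSION files (under dfs.name.dir/current/VERSION on the namenode and dfs.data.dir/current/VERSION on the datanode). A sketch, assuming the standard key=value format of Hadoop VERSION files:

```python
def namespace_id(version_text):
    """Extract the namespaceID value from the contents of a Hadoop VERSION file."""
    for line in version_text.splitlines():
        if line.startswith("namespaceID="):
            return int(line.split("=", 1)[1])
    raise ValueError("no namespaceID line found")

def ids_match(namenode_version, datanode_version):
    """True when namenode and datanode report the same namespaceID."""
    return namespace_id(namenode_version) == namespace_id(datanode_version)
```

If `ids_match` is False, the datanode's data directory was formatted against a different namenode incarnation, which is exactly the condition the error message reports.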

Problem 20: directory permissions

start-dfs.sh runs without errors and reports that the datanode started, but afterwards no datanode process exists. The log on the datanode machine shows the cause: incorrect permissions on the dfs.data.dir directory:

expected: drwxr-xr-x, current: drwxrwxr-x

Solution: check the directory configured as dfs.data.dir and correct its permissions.

Hive errors

Problem 21: NoClassDefFoundError

Could not initialize class java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.HbaseObjectWritable

Add protobuf-***.jar to the aux jars path:

// $HIVE_HOME/conf/hive-site.xml
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///data/hadoop/hive-0.10.0/lib/hive-hbase-handler-0.10.0.jar,file:///data/hadoop/hive-0.10.0/lib/hbase-0.94.8.jar,file:///data/hadoop/hive-0.10.0/lib/zookeeper-3.4.5.jar,file:///data/hadoop/hive-0.10.0/lib/guava-r09.jar,file:///data/hadoop/hive-0.10.0/lib/hive-contrib-0.10.0.jar,file:///data/hadoop/hive-0.10.0/lib/protobuf-java-2.4.0a.jar</value>
</property>

Problem 22: Hive dynamic partition error

[Fatal Error] Operator FS_2 (id=2): Number of dynamic partitions exceeded hive.exec.max.dynamic.partitions.pernode

hive> set hive.exec.max.dynamic.partitions.pernode=10000;

Problem 23: MapReduce processes exceed their memory limit (Hadoop Java heap space)

Add to mapred-site.xml:

// mapred-site.xml
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048m</value>
</property>

# $HADOOP_HOME/conf/hadoop-env.sh
export HADOOP_HEAPSIZE=5000

Problem 24: Hive created-files limit

[Fatal Error] total number of created files now is 100086, which exceeds 100000

hive> set hive.exec.max.created.files=655350;

Problem 25: Hive metastore connection timeout

FAILED: SemanticException org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out

Solution:

hive> set hive.metastore.client.socket.timeout=500;

Problem 26: java.io.IOException: error=7, Argument list too long

Task with the most failures (5):
Task ID: task_201306241630_0189_r_000009
URL: :50030/taskdetails.jsp?jobid=job_201306241630_0189&tipid=task_201306241630_0189_r_000009
Diagnostic Messages for this Task:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"164058872","reducesinkkey1":"djh,S1","reducesinkkey2":"20130117170703","reducesinkkey3":"xxx"},"value":{"_col0":"1","_col1":"xxx","_col2":"20130117170703","_col3":"164058872","_col4":"xxx,S1"},"alias":0}
    at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:270)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:520)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"164058872","reducesinkkey1":"xxx,S1","reducesinkkey2":"20130117170703","reducesinkkey3":"xxx"},"value":{"_col0":"1","_col1":"xxx","_col2":"20130117170703","_col3":"164058872","_col4":"djh,S1"},"alias":0}
    at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:258)
    ... 7 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20000]: Unable to initialize custom script.
    at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:354)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:249)
    ... 7 more
Caused by: java.io.IOException: Cannot run program "/usr/bin/python2.7": error=7, Argument list too long
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1042)
    at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:313)
    ... 15 more
Caused by: java.io.IOException: error=7, Argument list too long
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:135)
    at java.lang.ProcessImpl.start(ProcessImpl.java:130)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1023)
    ... 16 more

FAILED: Execution Error, return code 20000 from org.apache.hadoop.hive.ql.exec.MapRedTask. Unable to initialize custom script.

Solution:

Upgrade the kernel or reduce the number of partitions. See /jira/browse/HIVE-2372
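error=7 is E2BIG: the combined size of the argv strings and the environment passed to execve exceeded the kernel's limit, which on older kernels (before 2.6.23) was a fixed budget of roughly 128 KiB. A rough sketch of estimating that size; the accounting (one trailing NUL per string, KEY=VALUE per env entry) and the 128 KiB limit are simplifying assumptions, not the exact kernel formula:

```python
def exec_arg_size(argv, env):
    """Approximate the bytes execve must copy: every argv string and every
    KEY=VALUE environment entry, each with a trailing NUL byte."""
    total = sum(len(a) + 1 for a in argv)
    total += sum(len(k) + 1 + len(v) + 1 for k, v in env.items())  # k=v\0
    return total

def would_exceed(argv, env, limit=128 * 1024):
    """True if spawning this command would likely fail with E2BIG."""
    return exec_arg_size(argv, env) > limit
```

This is why a Hive job with very many partitions can fail here: the script child inherits a task environment whose size grows with the job configuration, eventually exceeding the limit regardless of how short the command line itself is.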

Problem 27: Hive runtime error

hive> show tables;
FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

Diagnosis:

hive -hiveconf hive.root.logger=DEBUG,console

13/07/15 16:29:24 INFO hive.metastore: Trying to connect to metastore with URI thrift://xxx.xxx.xxx.xxx:9083
13/07/15 16:29:24 WARN hive.metastore: Failed to connect to the MetaStore Server... org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused ... MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused)

The client tries port 9083, and netstat confirms nothing is listening on that port. The first guess was that the hive server had not started, but its process exists; it is simply listening on port 10000.

hive-site.xml shows the Hive client connecting to port 9083 while the hive server listens on its default port 10000. Root cause found.

Solution:

hive --service hiveserver -p 9083
# or edit the hive.metastore.uris entry in $HIVE_HOME/conf/hive-site.xml and change the port to 10000
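This mismatch could have been caught by comparing the port in hive.metastore.uris against the port the server actually listens on. A sketch of parsing that property's value (thrift://host:port, possibly comma-separated; the host names in the test are made up):

```python
def metastore_ports(uris_value):
    """Return (host, port) pairs from a hive.metastore.uris value such as
    'thrift://host1:9083,thrift://host2:9083'."""
    pairs = []
    for uri in uris_value.split(","):
        uri = uri.strip()
        if uri.startswith("thrift://"):
            hostport = uri[len("thrift://"):]
            host, _, port = hostport.rpartition(":")  # split on the last colon
            pairs.append((host, int(port)))
    return pairs
```

Each returned pair can then be probed with a plain TCP connect (as in the Problem 3 sketch) to see whether anything is actually listening there.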

using /usr/lib/hive as HIVE_HOME
using /var/run/cloudera-scm-agent/process/193-hive-HIVEMETASTORE as HIVE_CONF_DIR
using /usr/lib/hadoop as HADOOP_HOME
using /var/run/cloudera-scm-agent/process/193-hive-HIVEMETASTORE/yarn-conf as HADOOP_CONF_DIR
ERROR: Failed to find hive-hbase storage handler jars to add in hive-site.xml. Hive queries that use Hbase storage handler may not work until this is fixed.
Wed Oct 22 18:48:53 CST 2014
JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
using /usr/java/jdk1.7.0_45-cloudera as JAVA_HOME
using 5 as CDH_VERSION
(the same block repeats several times, and again for /var/run/cloudera-scm-agent/process/212-hive-metastore-create-tables)

Checked whether /usr/lib/hive was intact; it was.

3:21:09.801 PM FATAL org.apache.hadoop.hbase.master.HMaster
Unhandled exception. Starting shutdown. java.io.IOException: error or interrupted while splitting logs in [hdfs://master:8020/hbase/WALs/slave2,60020,1414202360923-splitting] Task = installed = 2 done = 1 error = 1
    at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:362)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:409)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitMetaLog(MasterFileSystem.java:301)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitMetaLog(MasterFileSystem.java:292)
    at org.apache.hadoop.hbase.master.HMaster.splitMetaLogBeforeAssignment(HMaster.java:1070)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:854)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:606)
    at java.lang.Thread.run(Thread.java:744)

3:46:12.903 PM FATAL org.apache.hadoop.hbase.master.HMaster
Unhandled exception. Starting shutdown. java.io.IOException: error or interrupted while splitting logs in [hdfs://master:8020/hbase/WALs/slave2,60020,1414202360923-splitting] Task = installed = 1 done = 0 error = 1
(same stack trace as above)

Solution:

Add the following to hbase-site.xml so that HBase does not perform distributed hlog splitting at cluster startup:

<property>
  <name>hbase.master.distributed.log.splitting</name>
  <value>false</value>
</property>

Then move the stale splitting directory aside:

[root@master ~]# hadoop fs -mv /hbase/WALs/slave2,60020,1414202360923-splitting /test
[root@master ~]# hadoop fs -ls /test

2014-10-28 14:31:32,879 INFO [hconnection-0xd18e8a7-shared--pool2-t224] (AsyncProcess.java:673) - #3, table=session_service_201410210000_201410312359, attempt=14/35 failed 1383 ops, last exception: org.apache.hadoop.hbase.RegionTooBusyException: org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, regionName=session_service_201410210000_201410312359,7499999991,1414203068872.08ee7bb71161cb24e18ddba4c14da0f2., server=slave1,60020,1414380404290, memstoreSize=271430320, blockingMemStoreSize=268435456
    at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2561)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:1963)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4050)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3361)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3265)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:26935)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175)
    at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1879)

Exception reference:
- ClockOutOfSyncException: thrown by the master when a RegionServer's clock skew is too large.
- DoNotRetryIOException: base class for exceptions that should not be retried, e.g. UnknownScannerException.
- DroppedSnapshotException: thrown when a snapshot was not correctly persisted to a file during a flush.
- HBaseIOException: all HBase-specific IOExceptions are subclasses of HBaseIOException.
- InvalidFamilyOperationException: HBase received a table-schema modification request naming an invalid column family.
- MasterNotRunningException: the master is not running.
- NamespaceExistException: the namespace already exists.
- NamespaceNotFoundException: the namespace cannot be found.
- NotAllMetaRegionsOnlineException: an operation requires all root and meta regions to be online, but they are not.
- NotServingRegionException: a request was sent to a RegionServer that is not responding, or the region is unavailable.
- PleaseHoldException: thrown when a RegionServer died and restarted so quickly that the master has not yet processed the old server instance, when an admin operation arrives while the master is still initializing, or when an operation targets a RegionServer that is still starting.
- RegionException: error while accessing a region.
- RegionTooBusyException: the RegionServer is busy and the request is blocked waiting for service.
- TableExistsException: the table already exists.
- TableInfoMissingException: the .tableinfo file cannot be found under the table directory.
- TableNotDisabledException: the table is not properly disabled.
- TableNotEnabledException: the table is not properly enabled.
- TableNotFoundException: the table cannot be found.
- UnknownRegionException: an unrecognized region was accessed.
- UnknownScannerException: an unrecognized scanner id was passed to the RegionServer.
- YouAreDeadException: thrown by the master when a RegionServer reports in after having been marked dead.
- ZooKeeperConnectionException: the client cannot connect to ZooKeeper.

INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher Waited 90779 ms on a compaction to clean up 'too many store files'; proceeding with flush of session_service_201410210000_201410312359,7656249951,1414481868315.bbf0a49fb8a9b650a584769ddd1fdd89.

When a MemStoreFlusher instance is created it starts MemStoreFlusher.FlushHandler threads; the number of threads is configured by hbase.hstore.flusher.count (default 1).
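The "Above memstore limit" error earlier fires when a region's memstore exceeds hbase.hregion.memstore.flush.size multiplied by hbase.hregion.memstore.block.multiplier; treating the defaults as 128 MiB and 2 (an assumption for the HBase version in use here), the logged blockingMemStoreSize=268435456 is exactly 128 MiB x 2. A sketch of the arithmetic:

```python
def blocking_memstore_size(flush_size=134217728, block_multiplier=2):
    """Writes to a region are rejected with RegionTooBusyException once its
    memstore exceeds flush_size * block_multiplier bytes
    (defaults here assume 128 MiB flush size and multiplier 2)."""
    return flush_size * block_multiplier

def writes_blocked(memstore_size, flush_size=134217728, block_multiplier=2):
    """True when a region of this memstore size would block incoming writes."""
    return memstore_size > blocking_memstore_size(flush_size, block_multiplier)
```

The logged memstoreSize=271430320 is indeed above 268435456, so the region was correctly refusing writes; raising the multiplier or speeding up flushes (more FlushHandler threads, fewer store files blocking compaction) are the usual levers.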

With one machine's disk full and another's not:
The cluster has 26,632 under-replicated blocks, out of 84,822 blocks in total. Percentage of under-replicated blocks: 31.40%. Warning threshold: 10.00%.
The cluster has 27,278 under-replicated blocks, out of 85,476 blocks in total. Percentage of under-replicated blocks: 31.91%. Warning threshold: 10.00%.

4:08:53.847 PM INFO org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher Flushed, sequenceid=45525, memsize=124.2M, hasBloomFilter=true, into tmp file hdfs://master:8020/hbase/data/default/session_service_201410260000_201410312359/a3b64675b0069b8323665274e2f95cdc/.tmp/b7fa4f5f85354ecc96aa48a09081f786
4:08:53.862 PM INFO org.apache.hadoop.hbase.regionserver.HStore Added hdfs://master:8020/hbase/data/default/session_service_201410260000_201410312359/a3b64675b0069b8323665274e2f95cdc/f/b7fa4f5f85354ecc96aa48a09081f786, entries=194552, sequenceid=45525, filesize=47.4M
4:09:00.378 PM WARN org.apache.hadoop.ipc.RpcServer (responseTooSlow): {"processingtimems":39279,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":":41284","starttimems":1414656501099,"queuetimems":0,"class":"HRegionServer","responsesize":16,"method":"Scan"}
4:09:00.379 PM WARN org.apache.hadoop.ipc.RpcServer RpcServer.responder callId: 33398 service: ClientService methodName: Scan size: 209 connection: :41284: output error
4:09:00.380 PM WARN org.apache.hadoop.ipc.RpcServer RpcServer.handler=79,port=60020: caught a ClosedChannelException, this means that the server was processing a request but the client went away. The error message was: null
4:09:00.381 PM INFO org.apache.hadoop.hbase.regionserver.HRegion Finished memstore flush of ~128.1M/134326016, currentsize=2.4M/2559256 for region session_service_201410260000_201410312359,6406249959,1414571385831.a3b64675b0069b8323665274e2f95cdc. in 8133 ms, sequenceid=45525, compaction requested=false

Problem 28: hbase

hbase.hre

溫馨提示

  • 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
  • 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
  • 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
  • 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
  • 5. 人人文庫網(wǎng)僅提供信息存儲空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負(fù)責(zé)。
  • 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請與我們聯(lián)系,我們立即糾正。
  • 7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時也不承擔(dān)用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。

評論

0/150

提交評論