Common Hardware Fault Examples (Sun)

1, If the system log /var/adm/messages* reports AFT-type errors for a CPU or memory, report the part for repair and have it replaced promptly.

Check that the number of CPUs is correct:

    psrinfo
    0 on-line since 01/26/07 11:22:07
    2 on-line since 01/26/07 11:22:05
    16 on-line since 01/26/07 11:22:07
    18 on-line since 01/26/07 11:22:07

Alternatively, /usr/platform/sun4u/sbin/prtdiag -v shows a more detailed system hardware configuration.

    prtdiag -v | more
    system configuration: sun microsystems sun4u sun fire v440
    system clock frequency: 177 mhz
    memory size: 4gb
    = cpus =
                   e$    cpu                   cpu   temperature
    cpu  freq      size  implementation        mask  die  amb.  status  location
    ---  --------  ----  --------------------  ----  ---  ----  ------  --------
    0    1593 mhz  1mb   sunw,ultrasparc-iiii  3.4   -    -     online  -
    1    1593 mhz  1mb   sunw,ultrasparc-iiii  3.4   -    -     online  -
    (remainder omitted)

a, Example of CPU error messages. In this example, CPU18 reports the error:

    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 289920 notice: aft0 ucc event detected by cpu18 in user mode at tl=0, errid 0x00420f56.380eacb0
    jun 27 17:50:30 v440 afsr 0x00000400.00000026 afar 0x000000a0.b2532b20
    jun 27 17:50:30 v440 fault_pc 0xfe1696a8 esynd 0x0026
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 173042 aft0 errid 0x00420f56.380eacb0 data bit 19 was in error and corrected
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 832860 aft2 errid 0x00420f56.380eacb0 pa=0x000000a0.b2532b00
    jun 27 17:50:30 v440 e$tag 0x000004a0.b2400001 e$state_0 shared
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 895151 aft2 e$data (0x00) 0xfe409880.fe40b1e4 0xfe4133f8.fe1ee4a4 ecc 0x03f
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 895151 aft2 e$data (0x10) 0xfe409370.fe40b120 0xfe40d77c.fe418850 ecc 0x17a
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 895151 aft2 e$data (0x20) 0xfe410fb8.fe40c874 0xfe07bdd0.fe406ad0 ecc 0x1a8
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 895151 aft2 e$data (0x30) 0xfe40fe14.fe4104c0 0xfe40df10.fe4052c4 ecc 0x14e
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 929717 aft2 d$ data not available
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 335345 aft2 i$ data not available

b, Example of memory error messages. The log shows that the memory module at /n0/sb4/p3/e1 j7300 is faulty.

    may 14 17:39:20 hdb-lc lw8: id 408692 kern.notice main, up 153 days 12:05:38, memory 8,512,064
    may 14 21:39:20 hdb-lc lw8: id 994892 kern.notice main, up 153 days 16:05:38, memory 8,208,768
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 838864 notice: aft0 first error ucc event detected by cpu19 in user mode at tl=0, errid 0x002f28e2.e95593c0
    may 14 22:25:38 hdb-lc afsr 0x00000400.00000001 afar 0x00000023.f67bce70
    may 14 22:25:38 hdb-lc fault_pc 0x100fb8e60 esynd 0x0001 /n0/sb4/p3/e1 j7300
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 450664 aft0 errid 0x002f28e2.e95593c0 check bit 0 was in error and corrected
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 248406 aft2 errid 0x002f28e2.e95593c0 pa=0x00000023.f67bce40
    may 14 22:25:38 hdb-lc e$tag 0x0000008f.d9249049 e$state_1 shared
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x00) 0xde10a064.80a3e000 0x1240027e.01000000 ecc 0x02a
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x10) 0xd80d601e.80a32005 0x0240015c.80a7202b ecc 0x13a
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x20) 0x124000f5.01000000 0xd2176000.d406a030 ecc 0x186
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x30) 0x80a2400a.1640002d 0x01000000.c65e6180 ecc 0x1e8
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 929717 aft2 d$ data not available
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 335345 aft2 i$ data not available
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 828558 notice: aft0 ucc event detected by cpu19 in user mode at tl=0, errid 0x002f28e2.e95593c0
    may 14 22:25:49 hdb-lc afsr 0x00200400.00000001 afar 0x00000023.f67bce70
    may 14 22:25:49 hdb-lc fault_pc 0x100fb8e60 esynd 0x0001 /n0/sb4/p3/e1 j7300
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 450664 aft0 errid 0x002f28e2.e95593c0 check bit 0 was in error and corrected
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 248406 aft2 errid 0x002f28e2.e95593c0 pa=0x00000023.f67bce40
    may 14 22:25:49 hdb-lc e$tag 0x0000008f.d9249049 e$state_1 shared
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x00) 0xde10a064.80a3e000 0x1240027e.01000000 ecc 0x02a
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x10) 0xd80d601e.80a32005 0x0240015c.80a7202b ecc 0x13a
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x20) 0x124000f5.01000000 0xd2176000.d406a030 ecc 0x186
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x30) 0x80a2400a.1640002d 0x01000000.c65e6180 ecc 0x1e8
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 929717 aft2 d$ data not available
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 335345 aft2 i$ data not available

2, Disk errors

a, In the output of the format command, the entry for the disk shows "type unknown" or "driver not found".

b, Check the disk information in the output of iostat -En, and note whether the Media Error value is 0.

c, For software RAID built with SDS (Solstice DiskSuite), report for repair promptly if any of the following occurs:

    - The metadb output contains a line that begins with an uppercase letter.
    - The metastat output shows a state other than okay for the RAID volume.
    - The system log contains warning messages about metadevices (md).

Examples of normal metadb and metastat output, followed by warning messages from the system log:

    bash-2.03# metadb
            flags           first blk       block count
         a m  p  luo        16              1034            /dev/dsk/c1t0d0s7
         a    p  luo        1050            1034            /dev/dsk/c1t0d0s7
         a    p  luo        2084            1034            /dev/dsk/c1t0d0s7
         a    p  luo        16              1034            /dev/dsk/c1t1d0s7
         a    p  luo        1050            1034            /dev/dsk/c1t1d0s7
         a    p  luo        2084            1034            /dev/dsk/c1t1d0s7

    bash-2.03# metastat | more
    d0: mirror
        submirror 0: d1
          state: okay
        submirror 1: d2
          state: okay
        pass: 1
        read option: roundrobin (default)
        write option: parallel (default)
        size: 55092864 blocks

    d1: submirror of d0
        state: okay
        size: 55092864 blocks
        stripe 0:
            device     start block  dbase  state  hot spare
            c1t0d0s0   0            no     okay

    d2: submirror of d0
        state: okay
        size: 55092864 blocks
        stripe 0:
            device     start block  dbase  state  hot spare
            c1t1d0s0   0            no     okay

    nov 13 20:25:23 v440 md_stripe: id 641072 kern.warning warning: md: d32: read error on /dev/dsk/c1t1d0s3
    nov 13 20:25:24 v440 md_mirror: id 104909 kern.warning warning: md: d32: /dev/dsk/c1t1d0s3 needs maintenance

d, The system log /var/adm/messages* contains block error messages for a disk. For example:

    nov 13 20:25:18 v440 scsi: id 107833 kern.warning warning: /pci9,600000/sunw,qlc2/fp0,0/ssdw500000e0114799c1,0 (ssd0):
    nov 13 20:25:18 v440 error for command: read(10) error level: retryable
    nov 13 20:25:18 v440 scsi: id 107833 kern.notice requested block: 111216288 error block: 111216305
    nov 13 20:25:18 v440 scsi: id 107833 kern.notice vendor: fujitsu serial number: 0530c049em
    nov 13 20:25:18 v440 scsi: id 107833 kern.notice sense key: media error
    nov 13 20:25:18 v440 scsi: id 107833 kern.notice asc: 0x11 (), ascq: 0x1, fru: 0x0
    nov 13 20:25:19 v440 scsi: id 243001 kern.warning warning: /pci9,600000/sunw,qlc2/fp0,0 (fcp0):
    nov 13 20:25:19 v440 fcp: wwn 0x500000e0114799c1 reset successfully
    nov 13 20:25:19 v440 scsi: id 107833 kern.warning warning: /pci9,600000/sunw,qlc2/fp0,0/ssdw500000e0114799c1,0 (ssd0):
    nov 13 20:25:19 v440 error for command: read(10) error level: retryable
    nov 13 20:25:19 v440 scsi: id 107833 kern.notice requested block: 111216288 error block: 111216305
    nov 13 20:25:19 v440 scsi: id 107833 kern.notice vendor: fujitsu serial number: 0530c049em
    nov 13 20:25:19 v440 scsi: id 107833 kern.notice sense key: media error
    nov 13 20:25:19 v440 scsi: id 107833 kern.notice asc: 0x11 (), ascq: 0x1, fru: 0x0
    nov 13 20:25:20 v440 scsi: id 243001 kern.warning warning: /pci9,600000/sunw,qlc2/fp0,0/ssdw500000e0114799c1,0 (ssd0):
    nov 13 20:25:20 v440 scsi transport failed: reason reset: retrying command
    nov 13 20:25:22 v440 scsi: id 107833 kern.warning warning: /pci9,600000/sunw,qlc2/fp0,0/ssdw500000e0114799c1,0 (ssd0):
    nov 13 20:25:22 v440 error for command: read(10) error level: retryable
    nov 13 20:25:22 v440 scsi: id 107833 kern.notice requested block: 111216288 error block: 111216305
    nov 13 20:25:22 v440 scsi: id 107833 kern.notice vendor: fujitsu serial number: 0530c049em
    nov 13 20:25:22 v440 scsi: id 107833 kern.notice sense key: media error
    nov 13 20:25:22 v440 scsi: id 107833 kern.notice asc: 0x11 (), ascq: 0x1, fru: 0x0

3, Network interface problems

For example, when the network connection drops and recovers intermittently, the system log /var/adm/messages* contains entries such as the following:

    mar 16 23:12:30 v440 genunix: id 408789 kern.warning warning: ce0: fault detected external to device; service degraded
    mar 16 23:12:30 v440 genunix: id 451854 kern.warning warning: ce0: xcvr addr:0x01 - link down
    mar 16 23:14:06 v440 genunix: id 408789 kern.notice notice: ce0: fault cleared external to device; service available
    mar 16 23:14:06 v440 genunix: id 451854 kern.notice notice: ce0: xcvr addr:0x01 - link up 100 mbps full duplex
    mar 16 23:14:16 v440 genunix: id 408789 kern.warning warning: ce0: fault detected external to device; service degraded
    mar 16 23:14:16 v440 genunix: id 451854 kern.warning warning: ce0: xcvr addr:0x01 - link down
    mar 16 23:14:54 v440 genunix: id 408789 kern.notice notice: ce0: fault cleared external to device; service available
    mar 16 23:14:54 v440 genunix: id 451854 kern.notice notice: ce0: xcvr addr:0x01 - link up 100 mbps full duplex
    mar 16 23:51:39 v440 genunix: id 408789 kern.warning warning: ce0: fault detected external to device; service degraded
    mar 16 23:51:39 v440 genunix: id 451854 kern.warning warning: ce0: xcvr addr:0x01 - link down
    mar 16 23:53:11 v440 genunix: id 408789 kern.notice notice: ce0: fault cleared external to device; service available
    mar 16 23:53:11 v440 genunix: id 451854 kern.notice notice: ce0: xcvr addr:0x01 - link up 100 mbps full duplex
    mar 16 23:53:23 v440 genunix: id 408789 kern.warning warning: ce0: fault detected external to device; service degraded
    mar 16 23:53:23 v440 genunix: id 451854 kern.warning warning: ce0: xcvr addr:0x01 - link down
    mar 16 23:54:02 v440 genunix: id 408789 kern.notice notice: ce0: fault cleared external to device; service available
    mar 16 23:54:02 v440 genunix: id 451854 kern.notice notice: ce0: xcvr addr:0x01 - link up 100 mbps full duplex

4, Identifying hardware faults with system configuration commands

These commands show the detailed status of the system's CPUs, memory, PCI, I/O, fans, power supply modules, temperature, OBP version, and the various indicator LEDs. For example, if a power supply has failed, its status field shows fault instead of ok. Example prtdiag -v output from a V440:

    root@v440# /usr/platform/sun4u/sbin/prtdiag -v | more
    system configuration: sun microsystems sun4u sun fire v440
    system clock frequency: 177 mhz
    memory size: 4gb

    = cpus =
                   e$    cpu                   cpu   temperature
    cpu  freq      size  implementation        mask  die  amb.  status  location
    ---  --------  ----  --------------------  ----  ---  ----  ------  --------
    0    1593 mhz  1mb   sunw,ultrasparc-iiii  3.4   -    -     online  -
    1    1593 mhz  1mb   sunw,ultrasparc-iiii  3.4   -    -     online  -

    = io devices =
    bus   freq  slot +  name +
    type  mhz   status  path                                 model
    ----  ----  ------  -----------------------------------  --------------
    pci   66    mb      pci108e,abba (network)               sunw,pci-ce
                okay    /pci1c,600000/network2
    pci   66    pci2    pci100b,35 (network)                 sunw,pci-ce
                okay    /pci1d,700000/network2
    pci   66    pci4    pci100b,35 (network)                 sunw,pci-x-qge
                okay    /pci1d,700000/pci1/network0
    pci   66    pci4    pci100b,35 (network)                 sunw,pci-x-qge
                okay    /pci1d,700000/pci1/network1
    pci   66    pci4    pci100b,35 (network)                 sunw,pci-x-qge
                okay    /pci1d,700000/pci1/network2
    pci   66    pci4    pci100b,35 (network)                 sunw,pci-x-qge
                okay    /pci1d,700000/pci1/network3
    pci   33    mb      isa/su (serial)
                okay    /pci1e,600000/isa7/serial0,3f8
    pci   33    mb      isa/su (serial)
                okay    /pci1e,600000/isa7/serial0,2e8
    pci   33    mb      isa/rmc-comm-rmc_comm (seria+
                okay    /pci1e,600000/isa7/rmc-comm0,3e8
    pci   33    pci0    pci100b,35 (network)                 sunw,pci-ce
                okay    /pci1e,600000/network2
    pci   33    mb      pciclass,0c0310 (usb)
                okay    /pci1e,600000/usba
    pci   33    mb      pciclass,0c0310 (usb)
                okay    /pci1e,600000/usbb
    pci   33    mb      pci10b9,5229 (ide)
                okay    /pci1e,600000/ided
    pci   66    mb      pci108e,abba (network)               sunw,pci-ce
                okay    /pci1f,700000/network1
    pci   66    mb      scsi-pci1000,30 (scsi-2)             lsi,1030
                okay    /pci1f,700000/scsi2
    pci   66    mb      scsi-pci1000,30 (scsi-2)             lsi,1030
                okay    /pci1f,700000/scsi2,1

    = memory configuration =
    segment table:
    base address   size  interleave factor  contains
    -------------------------------------------------------
    0x0            2gb   4                  bankids 0,1,2,3
    0x1000000000   2gb   4                  bankids 16,17,18,19

    bank table:
             physical location
    id   controllerid  groupid  size   interleave way
    -------------------------------------------------
    0    0             0        512mb  0,1,2,3
    1    0             1        512mb
    2    0             1        512mb
    3    0             0        512mb
    16   1             0        512mb  0,1,2,3
    17   1             1        512mb
    18   1             1        512mb
    19   1             0        512mb

    memory module groups:
    controllerid  groupid  labels       status
    ------------------------------------------
    0             0        c0/p0/b0/d0
    0             0        c0/p0/b0/d1
    0             1        c0/p0/b1/d0
    0             1        c0/p0/b1/d1
    1             0        c1/p0/b0/d0
    1             0        c1/p0/b0/d1
    1             1        c1/p0/b1/d0
    1             1        c1/p0/b1/d1

    = environmental status =
    fan speeds:
    location  sensor       status  speed
    ------------------------------------
    ft0/f0    tach         okay    3792 rpm
    ft1/f0    tach         okay    3994 rpm
    ft1/f1    tach         okay    3947 rpm
    ps0       ff_pdct_fan  okay
    ps1       ff_pdct_fan  okay

    temperature
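The AFT messages quoted in section 1 can be summarized per CPU rather than read line by line. The following is only a minimal sketch: the regular expression is modeled on the sample lines quoted in this document (lowercase, without brackets), and the function name `count_aft_events` and the `sample` list are hypothetical. Real /var/adm/messages output may differ in case and punctuation, so treat the pattern as an assumption to adjust.

```python
import re
from collections import Counter

# Assumption: pattern derived only from the sample lines in this document.
# IGNORECASE and the optional "]" allow for differently formatted syslog output.
AFT_EVENT = re.compile(
    r"aft0\]?\s+(?:first error\s+)?(\w+) event detected by cpu(\d+)",
    re.IGNORECASE,
)


def count_aft_events(lines):
    """Count AFT0 events, keyed by (cpu_id, event_type)."""
    counts = Counter()
    for line in lines:
        match = AFT_EVENT.search(line)
        if match:
            event_type = match.group(1).lower()
            cpu_id = int(match.group(2))
            counts[(cpu_id, event_type)] += 1
    return counts


# Lines copied from the examples in this document.
sample = [
    "jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 289920 notice: aft0 ucc event "
    "detected by cpu18 in user mode at tl=0, errid 0x00420f56.380eacb0",
    "may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 838864 notice: aft0 first error "
    "ucc event detected by cpu19 in user mode at tl=0, errid 0x002f28e2.e95593c0",
    "may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 828558 notice: aft0 ucc event "
    "detected by cpu19 in user mode at tl=0, errid 0x002f28e2.e95593c0",
    "may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 929717 aft2 d$ data not available",
]

if __name__ == "__main__":
    for (cpu_id, event_type), count in sorted(count_aft_events(sample).items()):
        print(f"cpu{cpu_id}: {count} {event_type} event(s)")
```

A CPU that keeps reappearing in such a summary is a candidate for the repair request described in section 1; the count alone does not say which component is at fault.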

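The metadb rule from section 2 (a replica whose flags contain an uppercase letter needs attention) can be checked mechanically. This is a rough sketch under stated assumptions: it expects each replica row to end with the first block, the block count, and a /dev/ path, as in the sample output in this document. The function name `find_bad_replicas` is hypothetical, and the `FAULTY` row is invented purely for illustration and is not taken from a real system.

```python
def find_bad_replicas(metadb_output):
    """Return device paths of replicas whose flags contain an uppercase letter."""
    bad = []
    for line in metadb_output.splitlines():
        fields = line.split()
        # Replica rows end with: <first blk> <block count> <device path>.
        # The header row and blank lines do not match and are skipped.
        if len(fields) < 4 or not fields[-1].startswith("/dev/"):
            continue
        flags = "".join(fields[:-3])
        if any(ch.isupper() for ch in flags):
            bad.append(fields[-1])
    return bad


# Healthy rows copied from the metadb example in this document.
HEALTHY = """\
        flags           first blk       block count
     a m  p  luo        16              1034            /dev/dsk/c1t0d0s7
     a    p  luo        1050            1034            /dev/dsk/c1t0d0s7
     a    p  luo        16              1034            /dev/dsk/c1t1d0s7
"""

# Hypothetical row with an uppercase flag, invented for illustration only.
FAULTY = HEALTHY + "      W   p  l          1050            1034            /dev/dsk/c1t1d0s7\n"

if __name__ == "__main__":
    print(find_bad_replicas(HEALTHY))
    print(find_bad_replicas(FAULTY))
```

The check only looks at the flags column, so it complements the metastat state and the md warnings in the system log rather than replacing them.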