2024 Pattern Recognition Lab Report

Beijing University of Posts and Telecommunications
Lab Report

Subject: Pattern Recognition
Class: 27
Major: Information Engineering
Name: 呂則東
No.: 20

Contents
Experiment 1: Bayes Classifier Design
  1.1 Experiment type
  1.2 Objective
  1.3 Environment
  1.4 Principle
  1.5 Content
  1.6 Requirements
  1.7 Code and results
    Minimum-error-rate Bayes classifier (program, results)
    Minimum-risk Bayes classifier (program, results)
Experiment 2: Linear Classifier Design Based on the Fisher Criterion
  2.1 Experiment type
  2.2 Objective
  2.3 Environment
  2.4 Principle
  2.5 Content
  2.6 Requirements
  2.7 Code and results
    Classifier design and classification results
    Results
    Analysis
Experiment 4: Nearest-Neighbor Classifier Design
  4.1 Experiment type
  4.2 Objective
  4.3 Environment
  4.4 Principle
  4.5 Content
  4.6 Requirements
  Code
  Results
  Answer to question 3
Summary

Experiment 1: Bayes Classifier Design

1.1 Experiment type:
Basic: Bayes classifier design.

1.2 Objective:
This experiment gives students a first understanding of pattern recognition. By designing their own classifier they gain a solid grasp of Bayes decision theory and of the design principle of a two-class classifier.

1.3 Environment:
MATLAB

1.4 Principle:
A minimum-risk Bayes decision is made in the following steps:

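(1) Given $P(\omega_i)$ and $p(X|\omega_i)$, $i=1,\dots,c$, and the observation $X$ to be recognized, compute the posterior probabilities with Bayes' formula:

$$P(\omega_j|X) = \frac{p(X|\omega_j)\,P(\omega_j)}{\sum_{i=1}^{c} p(X|\omega_i)\,P(\omega_i)}, \quad j=1,\dots,c$$

(2) Using the posterior probabilities and the decision table, compute the conditional risk of taking each action $\alpha_i$, $i=1,\dots,a$:

$$R(\alpha_i|X) = \sum_{j=1}^{c} \lambda(\alpha_i,\omega_j)\,P(\omega_j|X), \quad i=1,2,\dots,a$$

(3) Compare the $a$ conditional risks $R(\alpha_i|X)$, $i=1,\dots,a$, obtained in (2) and find the decision $\alpha_k$ with the smallest conditional risk, i.e.

$$R(\alpha_k|X) = \min_{i=1,\dots,a} R(\alpha_i|X)$$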

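The action $\alpha_k$ found in step (3) is the minimum-risk Bayes decision.

1.5 Content:
In cell recognition for a local region, the prior probabilities of the normal ($\omega_1$) and abnormal ($\omega_2$) classes are P($\omega_1$) = 0.9 (normal state) and P($\omega_2$) = 0.1 (abnormal state). A series of cells is to be classified, with the following observed values:

-3.9847 -3.5549 -1.2401 -0.9780 -0.7932 -2.8531
-2.7605 -3.7287 -3.5414 -2.2692 -3.4549 -3.0752
-3.9934  2.8792 -0.9780  0.7932  1.1882  3.0682
-1.5799 -1.4885 -0.7431 -0.4221 -1.1186  4.2532

The class-conditional probability densities are normal, with (mean, variance) equal to (-2, 0.25) for $\omega_1$ and (2, 4) for $\omega_2$.

[Figure: class-conditional probability density curves]

Classify the observations.

1.6 Requirements:
Design the classifier in MATLAB. The program statements must be commented, and the program must include a subroutine call. For the example above, plot the posterior probability curves and a diagram of the classification result. For the minimum-risk Bayes decision, use the following decision table.

Minimum-risk Bayes decision table:

Decision   State ω1   State ω2
α1         0          6
α2         1          0

Redesign the program for this case, plot the corresponding curves and the classification result, and compare the two results.

1.7 Code and results

Minimum-error-rate Bayes classifier

Program:

x = [-3.9847 -3.5549 -1.2401 -0.9780 -0.7932 -2.8531 ...
     -2.7605 -3.7287 -3.5414 -2.2692 -3.4549 -3.0752 ...
     -3.9934  2.8792 -0.9780  0.7932  1.1882  3.0682 ...
     -1.5799 -1.4885 -0.7431 -0.4221 -1.1186  4.2532];
pw1 = 0.9; pw2 = 0.1;     % prior probabilities of w1 (normal) and w2 (abnormal)
e1 = -2; a1 = 0.5;        % mean and standard deviation of p(x|w1)
e2 = 2;  a2 = 2;          % mean and standard deviation of p(x|w2)
m = numel(x);             % number of cells to classify
pw1_x = zeros(1, m);      % posterior probabilities for w1
pw2_x = zeros(1, m);      % posterior probabilities for w2
result = zeros(1, m);     % classification results
for i = 1:m
    % posterior probability of w1
    pw1_x(i) = (pw1*normpdf(x(i),e1,a1)) / (pw1*normpdf(x(i),e1,a1) + pw2*normpdf(x(i),e2,a2));
    % posterior probability of w2
    pw2_x(i) = (pw2*normpdf(x(i),e2,a2)) / (pw1*normpdf(x(i),e1,a1) + pw2*normpdf(x(i),e2,a2));
end
for i = 1:m
    if pw1_x(i) > pw2_x(i)    % compare the two posteriors
        result(i) = 0;        % normal cell
    else
        result(i) = 1;        % abnormal cell
    end
end
disp(result)
a = -5:0.05:5;                % sample points for plotting
n = numel(a);
pw1_plot = zeros(1, n);
pw2_plot = zeros(1, n);
for j = 1:n
    % posterior of each class at every sample point, for plotting
    pw1_plot(j) = (pw1*normpdf(a(j),e1,a1)) / (pw1*normpdf(a(j),e1,a1) + pw2*normpdf(a(j),e2,a2));
    pw2_plot(j) = (pw2*normpdf(a(j),e2,a2)) / (pw1*normpdf(a(j),e1,a1) + pw2*normpdf(a(j),e2,a2));
end
figure(1)
hold on
plot(a, pw1_plot, 'k-', a, pw2_plot, 'r-.')
for k = 1:m
    if result(k) == 0
        plot(x(k), -0.1, 'b*')    % normal cells are marked with *
    else
        plot(x(k), -0.1, 'rp')    % abnormal cells are marked with a pentagram
    end
end
legend('Posterior curve, normal cell', 'Posterior curve, abnormal cell', 'Normal cell', 'Abnormal cell')
xlabel('Observed value of the sample cell')
ylabel('Posterior probability')
title('Posterior probability curves')
grid on

Results

[Figure: posterior probability curves, with the decision results drawn in the same plot]

In the figure, the curve marked with * is the posterior probability curve for the abnormal class; the other, smooth curve is the posterior probability curve for the normal class. The decisions under the minimum-error-probability criterion are shown below the curves, where "*" stands for a cell judged normal and a pentagram for a cell judged abnormal.

Classification result for each cell:
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 1 0 1
where 0 means judged normal and 1 means judged abnormal.

Minimum-risk Bayes classifier

Program (main script):

x = [-3.9847, -3.5549, -1.2401, -0.9780, -0.7932, -2.8531, ...
     -2.7605, -3.7287, -3.5414, -2.2692, -3.4549, -3.0752, ...
     -3.9934,  2.8792, -0.9780,  0.7932,  1.1882,  3.0682, ...
     -1.5799, -1.4885, -0.7431, -0.4221, -1.1186,  4.2532];
disp(x)
pw1 = 0.9;
pw2 = 0.1;
[R1_x, R2_x, result] = danger(x, pw1, pw2)

Subroutine (danger.m):

function [R1_x, R2_x, result] = danger(x, pw1, pw2)
m = numel(x);             % number of cells to classify
R1_x = zeros(1, m);       % conditional risk of judging each sample normal
R2_x = zeros(1, m);       % conditional risk of judging each sample abnormal
result = zeros(1, m);     % classification results
e1 = -2; a1 = 0.5;        % p(x|w1): mean -2, variance 0.25
e2 = 2;  a2 = 2;          % p(x|w2): mean 2, variance 4
% loss values used by this program
% (note: they differ from the decision table in Section 1.6)
r11 = 0; r12 = 2;
r21 = 4; r22 = 0;
for i = 1:m
    % joint terms P(wi) p(x|wi); their sum is the evidence p(x)
    p1 = pw1*normpdf(x(i), e1, a1);
    p2 = pw2*normpdf(x(i), e2, a2);
    % conditional risks of the two decisions
    R1_x(i) = (r11*p1 + r21*p2) / (p1 + p2);
    R2_x(i) = (r12*p1 + r22*p2) / (p1 + p2);
end
for i = 1:m
    if R2_x(i) > R1_x(i)      % deciding "abnormal" is riskier than deciding "normal"
        result(i) = 0;        % judged normal (smaller loss), coded 0
    else
        result(i) = 1;        % judged abnormal, coded 1
    end
end
a = -5:0.05:5;                % sample points for plotting
n = numel(a);
R1_plot = zeros(1, n);
R2_plot = zeros(1, n);
for j = 1:n
    % conditional risks at every sample point, for plotting
    p1 = pw1*normpdf(a(j), e1, a1);
    p2 = pw2*normpdf(a(j), e2, a2);
    R1_plot(j) = (r11*p1 + r21*p2) / (p1 + p2);
    R2_plot(j) = (r12*p1 + r22*p2) / (p1 + p2);
end
figure(1)
hold on
plot(a, R1_plot, 'b-', a, R2_plot, 'g*-')
for k = 1:m
    if result(k) == 0
        plot(x(k), -0.1, 'b^')    % normal cells are marked with an upward triangle
    else
        plot(x(k), -0.1, 'go')    % abnormal cells are marked with a circle
    end
end
legend('Normal cell', 'Abnormal cell', 'Location', 'Best')
xlabel('Cell classification result')
ylabel('Conditional risk')
title('Risk decision curves')
grid on
end

Results

[Figure: minimum-risk Bayes decision, risk decision curves]

In the figure, the green curve marked with * is the conditional risk curve for the abnormal decision; the other, smooth blue curve is the conditional risk curve for the normal decision. The decisions under the minimum-risk Bayes criterion are shown below the curves, where an upward triangle stands for a cell judged normal and a circle for a cell judged abnormal.

Classification result for each cell:
1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 0 1 0 1
where 0 means judged normal and 1 means judged abnormal.

Comparison of the classification results:
Comparing the results in the two figures, the samples -3.9934 and -3.9847 are classified as normal by the minimum-error-rate Bayes decision but as abnormal by the minimum-risk Bayes decision, so the two methods give opposite answers for them. The reason is that the minimum-risk decision has one more factor influencing the outcome: the loss. In Figure 1 the posterior probabilities of the two classes at these two sample points differ only slightly, so once the decision table is taken into account the loss dominates and the decision is reversed. In addition, the minimum-error-rate Bayes decision is the minimum-risk Bayes decision under the 0-1 loss function, i.e. the former is a special case of the latter.

Experiment 2: Linear Classifier Design Based on the Fisher Criterion

2.1 Experiment type:
Design: linear classifier design (Fisher criterion).

2.2 Objective:
This experiment deepens students' understanding of classifier design. By designing their own classifier they gain a better grasp of linear classifiers, of how the Fisher criterion determines the optimal linear decision boundary, and of how the problem is solved with Lagrange multipliers.

2.3 Environment:
MATLAB

2.4 Principle:
The general form of a linear discriminant function can be written as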

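$$g(X) = W^T X + w_0$$

where $X = (x_1,\dots,x_d)^T$ is the sample vector, $W = (w_1,\dots,w_d)^T$ is the weight vector and $w_0$ is the threshold weight. Fisher's principle for choosing the projection direction $W$ is that the projections of the original sample vectors onto that direction should keep the classes as far apart as possible while keeping the samples of each class as tightly clustered as possible. The function used to evaluate a projection direction $W$ is:

$$J_F(W) = \frac{(\tilde m_1 - \tilde m_2)^2}{\tilde S_1^2 + \tilde S_2^2}$$

where $\tilde m_i$ and $\tilde S_i^2$ are the mean and the within-class scatter of class $i$ after projection. Its maximizer is

$$W^* = S_W^{-1}(m_1 - m_2)$$

The formula above is the solution for the optimal normal vector under the Fisher criterion, and it is an important one. An operation of this form is called a linear transformation: $(m_1 - m_2)$ is a vector and $S_W^{-1}$ is the inverse of $S_W$. If $(m_1 - m_2)$ is d-dimensional, then $S_W$ and $S_W^{-1}$ are both d×d, and the resulting $W^*$ is also a d-dimensional vector.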
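The Fisher direction described above can be sketched numerically. The following is a minimal sketch, not the report's program: the two small 2-D sample sets `X1` and `X2` are made-up values chosen only for illustration, and the criterion is evaluated in its equivalent matrix form $J_F(W) = (W^T(m_1-m_2))^2 / (W^T S_W W)$.

```matlab
% Hypothetical toy data: one sample per column (values are illustrative only)
X1 = [1 2 3; 2 3 3];
X2 = [6 7 8; 1 1 2];

% Class mean vectors
m1 = mean(X1, 2);
m2 = mean(X2, 2);

% Within-class scatter matrices, accumulated over all samples of each class
D1 = X1 - repmat(m1, 1, size(X1, 2));
D2 = X2 - repmat(m2, 1, size(X2, 2));
Sw = D1*D1' + D2*D2';

% Fisher direction: solve Sw * w = (m1 - m2) instead of forming inv(Sw)
w = Sw \ (m1 - m2);

% Fisher criterion written in terms of the original-space quantities
J = @(v) (v'*(m1 - m2))^2 / (v'*Sw*v);

fprintf('J(w) = %.4f, J(3w) = %.4f, J([1;0]) = %.4f\n', J(w), J(3*w), J([1; 0]));
```

Evaluating `J` at `w` and at a rescaled copy such as `3*w` gives the same value, which is the numerical counterpart of the statement that only the direction of the weight vector matters.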

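The vector $W^*$ is the solution that maximizes the Fisher criterion function $J_F(W)$, i.e. the best direction for projecting the d-dimensional X space onto the one-dimensional Y space under the Fisher criterion. Its components are the weights of the weighted sum applied to the original d-dimensional feature vector. The discussion above settles how the weight vector $W$ of the linear discriminant function is determined and how the d-dimensional vector that maximizes the Fisher criterion is computed, but the other term of the discriminant function, $w_0$, is still undetermined. It is usually chosen in one of the following ways:

$$w_0 = -\frac{\tilde m_1 + \tilde m_2}{2}$$

or

$$w_0 = -\frac{N_1 \tilde m_1 + N_2 \tilde m_2}{N_1 + N_2} = -\tilde m$$

or, when $P(\omega_1)$ and $P(\omega_2)$ are known,

$$w_0 = -\left[\frac{\tilde m_1 + \tilde m_2}{2} - \frac{\ln\left[P(\omega_1)/P(\omega_2)\right]}{N_1 + N_2 - 2}\right]$$

……

Once $w_0$ has been determined, classification follows the rule

$$W^T X > -w_0 \Rightarrow X \in \omega_1, \qquad W^T X < -w_0 \Rightarrow X \in \omega_2$$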
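As an illustration of the threshold choices and the classification rule, here is a minimal sketch under stated assumptions. The direction `w`, the class means and the sample counts are hypothetical toy values, and the prior-dependent threshold uses the sign convention of the first program in Section 2.7 (threshold $(\tilde m_1+\tilde m_2)/2 - \ln[P(\omega_1)/P(\omega_2)]/(N_1+N_2-2)$), which is an assumption about the intended formula.

```matlab
% Hypothetical toy quantities (not taken from the experiment data)
w  = [-7; 11.5];          % projection direction
m1 = [2; 8/3];            % mean of class 1 in the original space
m2 = [7; 4/3];            % mean of class 2 in the original space
N1 = 3;  N2 = 3;          % sample counts
P1 = 0.6; P2 = 0.4;       % prior probabilities

% Projected class means
mt1 = w' * m1;
mt2 = w' * m2;

% Three candidate values for w0
w0_mid    = -(mt1 + mt2)/2;                                 % midpoint of the projected means
w0_weight = -(N1*mt1 + N2*mt2)/(N1 + N2);                   % sample-count weighted mean
w0_prior  = -((mt1 + mt2)/2 - log(P1/P2)/(N1 + N2 - 2));    % priors known (assumed sign convention)

% Discriminant function and decision: class 1 if g > 0, otherwise class 2
g     = @(x, w0) w'*x + w0;
label = @(x, w0) 1 + (g(x, w0) < 0);

fprintf('w0: %.4f (midpoint), %.4f (weighted), %.4f (priors)\n', w0_mid, w0_weight, w0_prior);
fprintf('label of m1: %d, label of m2: %d\n', label(m1, w0_mid), label(m2, w0_mid));
```

With equal sample counts the first two choices coincide; the prior term moves the boundary slightly toward the class with the smaller prior.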

Determining the optimal linear decision boundary with the Fisher criterion is a well-known method; although it was proposed long ago, it is still in use.

2.5 Content:
Two classes of data, $\omega_1$ and $\omega_2$, are given, with known prior probabilities P($\omega_1$) = 0.6 and P($\omega_2$) = 0.4. The coordinates of the data points in $\omega_1$ are, in corresponding order:

x =
 0.2331  1.5207  0.6499  0.7757  1.0524  1.1974
 0.2908  0.2518  0.6682  0.5622  0.9023  0.1333
-0.5431  0.9407 -0.2126  0.0507 -0.0810  0.7315
 0.3345  1.0650 -0.0247  0.1043  0.3122  0.6655
 0.5838  1.1653  1.2653  0.8137 -0.3399  0.5152
 0.7226 -0.      0.4070 -0.1717 -1.0573 -0.2099

y =
2.3385 2.1946 1.6730 1.6365 1.7844 2.0155
2.0681 2.1213 2.4797 1.5118 1.9692 1.8340
1.8704 2.2948 1.7714 2.3939 1.5648 1.9329
2.2027 2.4568 1.7523 1.6991 2.4883 1.7259
2.0466 2.0226 2.3757 1.7987 2.0828 2.0798
1.9449 2.3801 2.2373 2.1614 1.9235 2.2604

z =
0.5338 0.8514 1.0831 0.4164 1.1176 0.5536
0.6071 0.4439 0.4928 0.5901 1.0927 1.0756
1.0072 0.4272 0.4353 0.9869 0.4841 1.0992
1.0299 0.7127 1.0124 0.4576 0.8544 1.1275
0.7705 0.4129 1.0085 0.7676 0.8418 0.8784
0.9751 0.7840 0.4158 1.0315 0.7533 0.9548

The corresponding three-dimensional coordinates of the data points in $\omega_2$ are:

x2 =
1.4010 1.2301 2.0814 1.1655 1.3740 1.1829
1.7632 1.9739 2.4152 2.5890 2.8472 1.9539
1.2500 1.2864 1.2614 2.0071 2.1831 1.7909
1.3322 1.1466 1.7087 1.5920 2.9353 1.4664
2.9313 1.8349 1.8340 2.5096 2.7198 2.3148
2.0353 2.6030 1.2327 2.1465 1.5673 2.9414

y2 =
1.0298 0.9611 0.9154 1.4901 0.8200 0.9399
1.1405 1.0678 0.8050 1.2889 1.4601 1.4334
0.7091 1.2942 1.3744 0.9387 1.2266 1.1833
0.8798 0.5592 0.5150 0.9983 0.9120 0.7126
1.2833 1.1029 1.2680 0.7140 1.2446 1.3392
1.1808 0.5503 1.4708 1.1435 0.7679 1.1288

z2 =
0.6210 1.3656 0.5498 0.6708 0.8932 1.4342
0.9508 0.7324 0.5784 1.4943 1.0915 0.7644
1.2159 1.3049 1.1408 0.9398 0.6197 0.6603
1.3928 1.4084 0.6909 0.8400 0.5381 1.3729
0.7731 0.7319 1.3439 0.8142 0.9586 0.7379
0.7548 0.7393 0.6739 0.8651 1.3699 1.1458

[Figure: distribution of the sample points]

2.6 Requirements:
Take the data above as samples. Following Fisher's principle for choosing the projection direction, i.e. that the projections of the original sample vectors should keep the classes as far apart as possible while keeping each class as tightly clustered as possible, find the function $J_F(W)$ that evaluates the projection direction, show it graphically and in the report, and find the $W^*$ that maximizes $J_F(W)$. Implement the Fisher linear classifier in MATLAB; the program statements must be commented. Using these results, decide which class each of the points (1, 1.5, 0.6), (1.2, 1.0, 0.55), (2.0, 0.9, 0.68), (1.2, 1.5, 0.89), (0.23, 2.33, 1.43) belongs to, and draw the corresponding classification figure, including the projections of the points onto $W^*$. Also answer the following question: explain why the scale factor of $W$ has no effect on the Fisher discriminant function.

2.7 Code and results

Classifier design and classification results

Program:

% Data entry: 6x6 coordinate matrices of class w1 (x1, y1, z1) and class w2 (x2, y2, z2)
x1 = [ 0.2331  1.5207  0.6499  0.7757  1.0524  1.1974
       0.2908  0.2518  0.6682  0.5622  0.9023  0.1333
      -0.5431  0.9407 -0.2126  0.0507 -0.0810  0.7315
       0.3345  1.0650 -0.0247  0.1043  0.3122  0.6655
       0.5838  1.1653  1.2653  0.8137 -0.3399  0.5152
       0.7226 -0.  0.4070 -0.1717 -1.0573 -0.2099];   % 2nd value of the last row is truncated in the source ("-0.")
y1 = [2.3385 2.1946 1.6730 1.6365 1.7844 2.0155
      2.0681 2.1213 2.4797 1.5118 1.9692 1.8340
      1.8704 2.2948 1.7714 2.3939 1.5648 1.9329
      2.2027 2.4568 1.7523 1.6991 2.4883 1.7259
      2.0466 2.0226 2.3757 1.7987 2.0828 2.0798
      1.9449 2.3801 2.2373 2.1614 1.9235 2.2604];
z1 = [0.5338 0.8514 1.0831 0.4164 1.1176 0.5536
      0.6071 0.4439 0.4928 0.5901 1.0927 1.0756
      1.0072 0.4272 0.4353 0.9869 0.4841 1.0992
      1.0299 0.7127 1.0124 0.4576 0.8544 1.1275
      0.7705 0.4129 1.0085 0.7676 0.8418 0.8784
      0.9751 0.7840 0.4158 1.0315 0.7533 0.9548];
x2 = [1.4010 1.2301 2.0814 1.1655 1.3740 1.1829
      1.7632 1.9739 2.4152 2.5890 2.8472 1.9539
      1.2500 1.2864 1.2614 2.0071 2.1831 1.7909
      1.3322 1.1466 1.7087 1.5920 2.9353 1.4664
      2.9313 1.8349 1.8340 2.5096 2.7198 2.3148
      2.0353 2.6030 1.2327 2.1465 1.5673 2.9414];
y2 = [1.0298 0.9611 0.9154 1.4901 0.8200 0.9399
      1.1405 1.0678 0.8050 1.2889 1.4601 1.4334
      0.7091 1.2942 1.3744 0.9387 1.2266 1.1833
      0.8798 0.5592 0.5150 0.9983 0.9120 0.7126
      1.2833 1.1029 1.2680 0.7140 1.2446 1.3392
      1.1808 0.5503 1.4708 1.1435 0.7679 1.1288];
z2 = [0.6210 1.3656 0.5498 0.6708 0.8932 1.4342
      0.9508 0.7324 0.5784 1.4943 1.0915 0.7644
      1.2159 1.3049 1.1408 0.9398 0.6197 0.6603
      1.3928 1.4084 0.6909 0.8400 0.5381 1.3729
      0.7731 0.7319 1.3439 0.8142 0.9586 0.7379
      0.7548 0.7393 0.6739 0.8651 1.3699 1.1458];
% Assemble the samples into 3x36 matrices, one sample per column
k = 0;
for i = 1:6
    for j = 1:6
        k = k + 1;
        w1(:,k) = [x1(i,j); y1(i,j); z1(i,j)];
        w2(:,k) = [x2(i,j); y2(i,j); z2(i,j)];
    end
end
% Mean vectors m1 and m2
m1 = mean(w1, 2);
m2 = mean(w2, 2);
% Within-class scatter matrices, accumulated over all 36 samples of each class
s1 = zeros(3);
s2 = zeros(3);
for i = 1:36
    s1 = s1 + (w1(:,i) - m1)*(w1(:,i) - m1)';
    s2 = s2 + (w2(:,i) - m2)*(w2(:,i) - m2)';
end
sw = s1 + s2;
% Projection direction and threshold w0
w_new = (sw \ (m1 - m2))';
m1_new = w_new*m1;
m2_new = w_new*m2;
Pw1 = 0.6;
Pw2 = 0.4;
w0 = (m1_new + m2_new)/2 - log(Pw1/Pw2)/(36 + 36 - 2);
% Classification of the five test points (one point per column)
x = [1    1.2  2.0  1.2  0.23
     1.5  1.0  0.9  1.5  2.33
     0.6  0.55 0.68 0.89 1.43];
m = 0; n = 0;
result1 = []; result2 = [];
for i = 1:5
    y(i) = w_new*x(:,i);
    if y(i) > w0
        m = m + 1;
        result1(:,m) = x(:,i);
    else
        n = n + 1;
        result2(:,n) = x(:,i);
    end
end
% Display the results
disp('Points assigned to class 1')
result1
disp('Points assigned to class 2')
result2
scatter3(w1(1,:), w1(2,:), w1(3,:), '+r'), hold on
scatter3(w2(1,:), w2(2,:), w2(3,:), 'sg'), hold on
scatter3(result1(1,:), result1(2,:), result1(3,:), 'k'), hold on
scatter3(result2(1,:), result2(2,:), result2(3,:), 'bd')
title('Spatial distribution of the sample points and test points')
legend('Sample points w1', 'Sample points w2', 'Test points assigned to class 1', 'Test points assigned to class 2')

Results

The classification result is as follows:

[Figure: spatial distribution of the sample points and test points]

Projection onto w

Program:

% The coordinate data are the same 6x6 matrices as in the program above
% (x1, y1, z1 for class w1; x2, y2, z2 for class w2), so run this after it.
% Class w2 coordinates as column vectors (copied first, because x2 is reused below)
x4 = x2(:); x5 = y2(:); x6 = z2(:);
% Class w1 coordinates as column vectors
x1 = x1(:); x2 = y1(:); x3 = z1(:);
% Sample mean vector m1 of class 1
m1 = [mean(x1), mean(x2), mean(x3)];
% Within-class scatter matrix S1 of class 1
S1 = zeros(3,3);
for i = 1:36
    d = [x1(i)-m1(1), x2(i)-m1(2), x3(i)-m1(3)];
    S1 = S1 + d'*d;
end
% Sample mean vector m2 of class 2
m2 = [mean(x4), mean(x5), mean(x6)];
% Within-class scatter matrix S2 of class 2
S2 = zeros(3,3);
for i = 1:36
    d = [x4(i)-m2(1), x5(i)-m2(2), x6(i)-m2(3)];
    S2 = S2 + d'*d;
end
% Total within-class scatter matrix Sw
Sw = S1 + S2;
% Between-class scatter matrix Sb
Sb = (m1 - m2)'*(m1 - m2);
% Optimal solution W (displayed)
W = Sw \ (m1 - m2)'
% Normalize W to a unit vector to simplify the projection
W = W/sqrt(sum(W.^2));
% Class means M1 and M2 in the one-dimensional Y space
y = zeros(1, 36);
for i = 1:36
    y(i) = W'*[x1(i) x2(i) x3(i)]';
end
M1 = mean(y)
for i = 1:36
    y(i) = W'*[x4(i) x5(i) x6(i)]';
end
M2 = mean(y)
% W0 from the formula for known P(w1) and P(w2)
% (the prior term here has the opposite sign to the threshold in the first program)
p1 = 0.6; p2 = 0.4;
W0 = -(M1 + M2)/2 + (log(p2/p1))/(36 + 36 - 2);
% Projection lengths of the samples along the optimal direction
X1 = [x1*W(1) + x2*W(2) + x3*W(3)]';
X2 = [x4*W(1) + x5*W(2) + x6*W(3)]';
% Coordinates of the projected samples
XX1 = [W(1)*X1; W(2)*X1; W(3)*X1];
XX2 = [W(1)*X2; W(2)*X2; W(3)*X2];
% Plot the sample points
figure(1)
plot3(x1, x2, x3, 'r*')     % class 1
hold on
plot3(x4, x5, x6, 'bp')     % class 2
legend('Class 1 points', 'Class 2 points')
title('Fisher linear discriminant')
W1 = 5*W;
% Draw the optimal direction
line([-W1(1), W1(1)], [-W1(2), W1(2)], [-W1(3), W1(3)], 'color', 'b');
% Classify the given points
a1 = [1, 1.5, 0.6]';
a2 = [1.2, 1.0, 0.55]';
a3 = [2.0, 0.9, 0.68]';
a4 = [1.2, 1.5, 0.89]';
a5 = [0.23, 2.33, 1.43]';
A = [a1 a2 a3 a4 a5]
n = size(A, 2);
% The code below needs no change when the test samples change
% Plot each test point and its projection onto the optimal direction
for k = 1:n
    A1 = A(:,k)'*W;
    A11 = W*A1;                 % projection of the test point
    y = W'*A(:,k) + W0;         % compare with 0: greater than 0 is class 1, less than 0 is class 2
    if y > 0
        plot3(A(1,k), A(2,k), A(3,k), 'go');        % class 1 point: green circle
        plot3(A11(1), A11(2), A11(3), 'go');        % its projection: green circle
    else
        plot3(A(1,k), A(2,k), A(3,k), 'm+');        % class 2 point: magenta cross
        plot3(A11(1), A11(2), A11(3), 'm+');        % its projection: magenta cross
    end
end
% Draw the optimal direction
line([-W1(1), W1(1)], [-W1(2), W1(2)], [-W1(3), W1(3)], 'color', 'k');
view([-37.5, 30]);
axis([-2, 3, -1, 3, -0.5, 1.5]);
grid on
hold off

[Figure: sample points, test points and their projections onto the optimal direction]

In the figure, the solid straight line at the bottom is the optimal projection direction, and the test points are projected onto it. Circles are points assigned to class 1 and crosses are points assigned to class 2.
The $W^*$ that maximizes $J_F(W)$ is $W^*$ = (-0.0798, 0., -0.0478).

Analysis:
Why the scale factor of w has no effect on the Fisher discriminant function: what this experiment needs is the direction of W, or equivalently the projection of the data onto that direction. The scale factor of W, i.e. how many times longer it is than the unit vector, therefore does not matter: however large it is, it cancels out when the projections are compared and has no practical effect.

Experiment 4: Nearest-Neighbor Classifier Design

4.1 Experiment type:
Design: nearest-neighbor classifier design.

4.2 Objective:
This experiment helps students understand the principle of the nearest-neighbor method. By programming this limiting case of a piecewise linear classifier, they learn the design of the k-nearest-neighbor and edited nearest-neighbor methods and the factors that affect the estimated error rate of the k-nearest-neighbor method.

4.3 Environment:
MATLAB

4.4 Principle:
The nearest-neighbor method can be extended to one that bases the decision on the k samples nearest to the test sample. The basic rule is: among all N samples find the k nearest neighbors of the test sample, and let the number of them that belong to each class be $k_i$, $i=1,\dots,c$. The decision rule is then:

If

$$k_j(X) = \max_{i=1,\dots,c} k_i(X)$$

then decide $X \in \omega_j$. (3-63)
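The voting rule can be sketched in a few lines. This is a minimal illustration rather than the report's code: the training points, labels, query point and `k` below are made-up values, and distances are plain Euclidean distances.

```matlab
% Hypothetical training set: one sample per column, with class labels 1 or 2
train  = [0 0; 1 0; 0 1; 5 5; 6 5; 5 6; 6 6]';
labels = [1 1 1 2 2 2 2];

query = [0.5; 0.5];   % sample to classify
k = 3;                % odd k avoids ties in the two-class case

% Euclidean distance from the query to every training sample
d = sqrt(sum((train - repmat(query, 1, size(train, 2))).^2, 1));

% Labels of the k nearest training samples
[~, idx] = sort(d);
nearest = labels(idx(1:k));

% k_i = number of neighbors in class i; decide for the class with the most votes
counts = [sum(nearest == 1), sum(nearest == 2)];
[~, decided] = max(counts);

fprintf('votes: class 1 = %d, class 2 = %d, decision = class %d\n', counts(1), counts(2), decided);
```

Changing `k` only changes how many sorted neighbors take part in the vote, which makes it easy to compare decisions for several values of k on the same data.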

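The k-nearest-neighbor method usually takes k odd so that, as in a vote, a tie between two classes cannot make the decision ambiguous. The basic idea of the edited nearest-neighbor method starts from the following observation: when samples of different classes overlap in their distributions, classification errors come mainly from the samples in the overlap region. In a reference sample set used for recognition, samples of different classes are interleaved in the overlap region, and this is what causes nearest-neighbor classification errors. If the samples near the class boundaries are screened in a suitable way, two goals can be reached at once: fewer samples and a higher correct recognition rate. To do this, the existing sample set can be used to edit itself. The principle is explained below for a two-class problem.

Suppose we have a sample set $\mathcal{X}^{N}$ containing N samples. We split it into two mutually independent subsets. One is used as the test set $\mathcal{X}^{NT}$ and the other as the reference set $\mathcal{X}^{NR}$, with sizes $N_T$ and $N_R$, $N_T + N_R = N$. The samples in $\mathcal{X}^{NT}$ are denoted $X_i$ ($i=1,\dots,N_T$) and the samples in $\mathcal{X}^{NR}$ are denoted $Y_j$ ($j=1,\dots,N_R$).

Splitting a sample set into two mutually independent subsets means that the two subsets have the same distribution after the split. For example, if a sample set is split into two independent halves, then in every subregion of the feature space both subsets hold the same proportion, i.e. roughly equal numbers of samples of each class. Note that, to achieve this for every subregion (from large regions down to small ones), the split is in practice made by drawing samples at random from the whole set.

Editing proceeds as follows. First, for every $X_i$ in $\mathcal{X}^{NT}$, find its nearest neighbor in $\mathcal{X}^{NR}$, written $Y_i(X_i)$ to indicate that $Y_i$ is the nearest reference sample of $X_i$. If $Y_i$ and $X_i$ do not belong to the same class, delete $X_i$ from $\mathcal{X}^{NT}$. What finally remains of $\mathcal{X}^{NT}$ is the edited sample set $\mathcal{X}^{NTE}$, which can replace the original sample set $\mathcal{X}^{N}$ as the reference set for classifying unknown samples.

After editing, $\mathcal{X}^{NTE}$ serves as the new training set, and $\mathcal{X}^{NR}$ provides the samples used to test its performance: if a training sample in $\mathcal{X}^{NT}$ turns out to hurt classification, it is edited out.

Editing can also be done with the k-nearest-neighbor method: for every sample $X_i$ in $\mathcal{X}^{NT}$, find its k nearest neighbors in $\mathcal{X}^{NR}$ and use the k-nearest-neighbor rule to decide whether $X_i$ is misclassified, which determines whether it is kept. The rest of the procedure is exactly the same as above.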
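The editing step can be sketched as follows. This is a minimal sketch under assumptions: the reference set `R`, the test set `T` and their labels are made-up values (two test samples are deliberately given the wrong label so that editing has something to remove), and `k = 1` gives the plain nearest-neighbor variant while a larger odd `k` gives the k-nearest-neighbor variant.

```matlab
% Hypothetical reference set (one sample per column) with class labels
R    = [0 0; 1 0; 0 1; 5 5; 6 5; 5 6]';
Rlab = [1 1 1 2 2 2];

% Hypothetical test set; the last two samples carry labels that disagree with their neighborhood
T    = [0.5 0.5; 5.5 5.5; 5.2 5.1; 0.2 0.8]';
Tlab = [1 2 1 2];

k = 1;                          % number of neighbors used for the editing decision
keep = true(1, size(T, 2));     % keep(i) is true if test sample i survives editing

for i = 1:size(T, 2)
    % squared Euclidean distances from test sample i to all reference samples
    d = sum((R - repmat(T(:,i), 1, size(R, 2))).^2, 1);
    [~, idx] = sort(d);
    % majority label among the k nearest reference samples
    pred = mode(Rlab(idx(1:k)));
    % discard the test sample if the reference set would misclassify it
    keep(i) = (pred == Tlab(i));
end

% Edited sample set and its labels
Tedit    = T(:, keep);
TeditLab = Tlab(keep);

fprintf('kept %d of %d test samples\n', sum(keep), numel(keep));
```

The surviving columns in `Tedit` would then play the role of the edited reference set in a subsequent nearest-neighbor classification.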

The edited nearest-neighbor method can also be applied to multi-class problems, and the editing step can be carried out more than once; repeating it several times is called the repeatedly edited nearest-neighbor method.

4.5 Content
As the MATLAB program below describes, there are two classes, x and y, whose samples are normally distributed with (mean, variance) equal to (2, 2) and (-2, 4). Each class contains 100 samples, listed below (in each block the first row holds the horizontal coordinates and the row beneath it the corresponding vertical coordinates) and shown in the figure.

clear all
close all
x = 2 + sqrt(2)*randn(2,100)
y = -2 + sqrt(4)*randn(2,100)
plot(x(1,:), x(2,:), 'ro');
hold on
plot(y(1,:), y(2,:), 'b*');
grid on
x1 = 1 + sqrt(2)*randn(2,2)
y1 = -1 + sqrt(4)*randn(2,3)
plot(x1(1,:), x1(2,:), 'gs');
hold on
plot(y1(1,:), y1(2,:), 'gd');

x =
Columns 1 through 8
2.3393 3.0777 1.8737 2.7339 2.3723 2.0607 2.7958 1.1613
4.0497 2.7254 2.9687 1.2959 0.4281 2.7207 1.9812 2.2180
Columns 9 through 16
1.5151 2.8422 0.6955 1.9956 -1.4693 3.1027 2.0873 1.5888
1.3233 1.8781 4.1631 1.6972 0.4532 0.6957 4.4584 2.4472
Columns 17 through 24
1.2210 2.7702 5.3554 2.5755 3.1579 1.4299 2.6070 0.5714
5.0188 1.7864 1.2680 1.4595 1.4049 2.2075 2.9997 2.4271
Columns 25 through 32
1.6260 2.3400 0.7291 1.4722 2.7524 3.1380 3.7800 0.9639
2.4397 2.5036 -0.1208 1.6814 0.1167 0.9358 0.9934 1.0406
Columns 33 through 40
1.6377 0.6335 0.5944 0.6900 2.7269 0.7390 -0.9497 2.2660
1.1267 0.9308 -0.2890 5.5362 1.9855 2.0577 0.2602 -2.1053
Columns 41 through 48
0.7235 0.8765 1.2597 -0.1537 5.8664 1.4564 0.8619 1.2108
1.8883 2.6320 1.5465 0.4295 4.6099 2.7126 0.8788 3.3248
Columns 49 through 56
1.7689 3.9543 2.1305 3.5910 3.4501 2.8476 2.2794 1.7877
1.2711 1.4769 0.6973 2.7846 2.4393 2.0672 3.1551 1.4484
Columns 57 through 64
3.9710 1.9122 3.8749 3.1035 2.1726 1.0068 3.2602 3.4937
-0.2553 3.2665 3.2834 1.1804 3.9009 4.2703 2.6435 3.7896
Columns 65 through 72
1.8904 0.9381 2.6499 1.1209 -0.7602 0.9878 3.3209 2.6825
2.2168 1.6666 0.6734 -0.2885 3.4517 -0.8319 3.5121 1.2336
Columns 73 through 80
1.3980 1.7750 3.0315 4.3814 0.0746 1.0233 0.7370 2.3106
2.8502 2.4497 2.4605 2.5328 1.8479 1.9284 1.9925 2.9242
Columns 81 through 88
1.6755 2.5795 2.0735 1.2835 4.4952 2.1099 1.5293 4.2977
1.6345 4.2585 4.4371 3.7047 3.7777 2.2204 1.7801 2.4882
Columns 89 through 96
0.5578 2.3503 1.9332 2.8394 1.9185 1.1366 2.6934 3.6876
2.7108 2. 3.1586 1.0979 2.2828 1.1050 0.7363 6.0314
Columns 97 through 100
0.8630 4.8719 0.7702 1.1824
2.5452 0.9279 2.4000 2.9063

y =
Columns 1 through 8
-4.0741 -2. 0.2772 2.5119 -2.6005 -2.7258 -0.9535 -2.6182
-3.6242 1.8317 -0.3543 -1.3855 -1.0020 -2.2239 -0.4317 -2.0749
Columns 9 through 16
1.1157 -2.9820 -0.7157 -4.8064 -2.1368 0.0204 -4.6403 -1.4024
-1.1429 -1.9817 -1.0701 3.5301 -0.3107 -2.6776 -3.6135 -3.0649
Columns 17 through 24
1.0383 0.0278 -3.8224 -1.7426 -0.8970 -3.1769 -2.6497 -1.0395
-2.4134 -3.3121 -3.0811 -2.6541 -0.7125 -3.1175 -1.9918 -2.0607
Columns 25 through 32
-2.9279 -5.6277 -3.2913 -0.3836 -6.5657 -6.9075 -4.9407 -2.4313
-3.3265 -2.0005 -1.7682 -2.1453 -0.7046 -3.2416 0.6713 -1.8313
Columns 33 through 40
1.1533 0.0380 -0.9046 0.4450 2.1135 -0.9418 -0.3634 0.2072
-0.2502 -1.9232 -1.7691 -3.5266 -1.1793 -2.3955 -1.7491 -2.2116
Columns 41 through 48
-2.7789 -0.4264 -1.5418 -4.5961 -3.1488 -3.2813 0.1521 0.3201
-2.8753 -3.1971 -5.1734 -1.7970 -1.2165 1.0112 1.8658 0.6457
Columns 49 through 56
1.7258 2.4568 1.0064 -2.8329 -2.2510 -2.2932 -0.4625 -2.1996
-3.2806 -1.3224 -1.5015 -3.4684 -3.3073 -1.0344 1.3012 -4.8685
Columns 57 through 64
-0.3933 -4.4781 -0.8298 -2.7694 -4.4012 -2.3012 -2.7238 4.4696
-1.9525 -3.9448 -4.2802 -2.8949 -7.0979 -1.7046 -2.1852 -2.5313
Columns 65 through 72
-0.2649 -0.5083 1.2236 -4.0723 -2.7961 -3.8391 -6.0491 -1.2649
0.8744 -3.6436 -1.7198 -1.0935 -1.5580 -0.2845 -1.1005 -7.1413
Columns 73 through 80
-0.7904 -4.9162 -0.8173 -3.0191 -1.5423 -3.4648 -3.9127 -1.0085
-0.1674 -1.8316 -1.9920 -6.5079 -1.1666 -3.2242 -2.6278 -1.7200
Columns 81 through 88
-3.0868 -2.2972 -3.3748 -1.1780 -3.1821 -3.7845 -1.4982 -1.7623
-0.3149 -4.3529 -1.3920 -1.1284 -0.0297 -5.7240 -2.0337 -2.7306
Columns 89 through 96
-2.0356 -3.1204 -1.6235 -1.6992 -4.8063 -2.9712 -5.5903 -3.5470
-2.3859 -0.1046 -0.6685 -3.2670 -0.1880 1.7160 -4. -3.9663
Columns 97 through 100
-1.6145 2.1930 -2.3963 -5.2112
-0.5505 -5.9320 1.6841 -3.7229

Use the k-nearest-neighbor method to classify the following samples: (-0.7303, 2.1624), (1.4445, -0.1649), (-1.2587, 0.9187), (1.2617, -0.2086), (0.7302, 1.6587).

4.6 Requirements:
Program in MATLAB to determine the classification and show it graphically. Analyze whether different values of k affect the classification, and show the result graphically. Answer the following question: in a two-dimensional space, class A has three training samples, shown as red points in the figure, and class B has four samples, shown as blue points.

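Questions:

(1) Under nearest-neighbor classification, what is the maximum number of boundaries between the two classes?

(2) Draw the boundaries that are actually used.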

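(3) The boundary between A1 and B4 is not used.

Code:

clear all
close all
x = 2 + sqrt(2)*randn(2,100)      % sample library: 100 samples with mean 2 and variance 2
y = -2 + sqrt(4)*randn(2,100)     % sample library: 100 samples with mean -2 and variance 4
x1 = 1 + sqrt(2)*randn(2,2)       % test samples x1
y1 = -1 + sqrt(4)*randn(2,3)      % test samples y1
stablex = x;                      % fixed copy of x, used for drawing several figures
stabley = y;                      % fixed copy of y, used for drawing several figures
z = [x1 y1];                      % merge the test samples into one matrix

figure(1);
plot(stablex(1,:), stablex(2,:), 'ro');     % plot the sample library
title('7-nearest-neighbor decision before editing');
hold on
plot(stabley(1,:), stabley(2,:), 'b*');     % plot the sample library
grid on
plot(x1(1,:), x1(2,:), 'gs');               % plot the test samples
hold on
plot(y1(1,:), y1(2,:), 'gd');               % plot the test samples
k = 7                                       % 7-nearest-neighbor decision before editing
% for i = 1:5
% tempx = stablex;
% tempy = stabley;
% x = jianji1(tempx, tempy, k);
% y = jianji2(tempx, tempy, k);
% plot(x(1,:), x(2,:), 'ko');
% plot(y(1,:), y(2,:), 'k*');
% end
for j = 1:2:numel(z)                        % classify the test samples
    q = [1:(numel(z)/2)];
    g = oushi(x, z, j);                     % Euclidean distances from the point in z to the x samples
    h = oushi(y, z, j);                     % Euclidean distances from the point in z to the y samples
    q((j+1)/2) = panjue(g, h, k);           % decision for the point in z
    if j < numel(x1)                        % the point originally comes from x1
        if q((j+1)/2) == 0                  % the point is assigned to class x
            plot(z(j), z(j+1), 'rs');       % red square
        else
            plot(z(j), z(j+1), 'bs');       % blue square
        end
    else                                    % the point originally comes from y1
        if q((j+1)/2) == 0
            plot(z(j), z(j+1), 'rd');       % red diamond
        else
            plot(z(j), z(j+1), 'bd');       % blue diamond
        end
    end
end
legend('Sample set x', 'Sample set y', 'Test samples x1 (covered)', 'Test samples y1 (covered)', ...
    'Class of point 1 in x1', 'Class of point 2 in x1', 'Class of point 1 in y1', 'Class of point 2 in y1', 'Class of point 3 in y1', 2);

figure(2);
plot(stablex(1,:), stablex(2,:), 'ro');
title('1-nearest-neighbor decision before editing');
hold on
plot(stabley(1,:), stabley(2,:), 'b*');
grid on
plot(x1(1,:), x1(2,:), 'gs');
hold on
plot(y1(1,:), y1(2,:), 'gd');
k = 1                                       % 1-nearest-neighbor decision before editing
% for i = 1:5
% tempx = stablex;
% tempy = stabley;
% x = jianji1(tempx, tempy, k);
% y = jianji2(tempx, tempy, k);
% plot(x(1,:), x(2,:), 'ko');
% plot(y(1,:), y(2,:), 'k*');
% end
for j = 1:2:numel(z)                        % classify the test samples
    q = [1:(numel(z)/2)];
    g = oushi(x, z, j);                     % Euclidean distances from the point in z to the x samples
    h = oushi(y, z, j);                     % Euclidean distances from the point in z to the y samples
    q((j+1)/2) = panjue(g, h, k);           % decision for the point in z
    if j < numel(x1)                        % the point originally comes from x1
        if q((j+1)/2) == 0                  % the point is assigned to class x
            plot(z(j), z(j+1), 'rs');       % red square
        else
            plot(z(j), z(j+1), 'bs');       % blue square
        end
    else                                    % the point originally comes from y1
        if q((j+1)/2) == 0
            plot(z(j), z(j+1), 'rd');       % red diamond
        else
            plot(z(j), z(j+1), 'bd');       % blue diamond
        end
    end
end
legend('Sample set x', 'Sample set y', 'Test samples x1 (covered)', 'Test samples y1 (covered)', ...
    'Class of point 1 in x1', 'Class of point 2 in x1', 'Class of point 1 in y1', 'Class of point 2 in y1', 'Class of point 3 in y1', 2);

figure(3);
plot(stablex(1,:), stablex(2,:), 'ro');
title('7-nearest-neighbor decision after editing');
hold on
plot(stabley(1,:), stabley(2,:), 'b*');
grid on
plot(x1(1,:), x1(2,:), 'gs');
hold on
plot(y1(1,:), y1(2,:), 'gd');
k = 7                                       % 7-nearest-neighbor decision after editing
% for i = 1:5
tempx = stablex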
