Week Six
Analyzing categorical data: Chi-squared tests

This week's lecture will cover
- Analysing categorical (nominal) data
- Chi-square test of differences between proportions
- Chi-square test of independence

SPSS one-sample nonparametric tests: chi-square test of the population distribution
(1) Purpose: use sample data to infer whether the population distribution differs significantly from a known distribution (a goodness-of-fit test). The test is suited to statistical inference on categorical data.
(2) Basic hypothesis: H0: the population distribution does not differ significantly from the theoretical distribution.
(3) Basic method: from the known population proportions, compute the expected frequency of each category in the sample, then measure the gap between the observed and expected frequencies, i.e. compute the chi-square value. A small chi-square value means the observed and expected frequencies differ little. If P is greater than α, H0 cannot be rejected and the population distribution is taken not to differ significantly from the known distribution; otherwise H0 is rejected.
(4) Basic steps in SPSS:
- Menu: Analyze > Nonparametric Tests > Chi-Square
- Move the variable to be tested into the Test Variable List box
- Set the range of values of the cases to be tested (Expected Range)
  - Get from data: the whole sample
  - Use specified range: a user-defined range of cases
- Specify the expected frequencies (Expected Values)
  - All categories equal: all categories have the same proportion
  - Values: user-defined proportions

Categorical variable
- Variables that describe categories of entities
- We deal with them all the time in statistics
- Making
  comparisons among variables: for example, whether consumers prefer a particular brand of a product over competing brands
- Checking whether there is a relationship between two categorical variables: for example, gender and preference for a product, i.e. whether the preference for a product is independent of gender

Chi-square test for differences between proportions
- This test deals with nominal data produced by a multinomial experiment.
- A multinomial experiment is a generalisation of a binomial experiment.
- The test addresses the null hypothesis that data in the target population follow a particular probability distribution.

Example 1
We might test whether consumers are indifferent to which of four materials (glass, plastic, steel or aluminium) is used to make soft drink containers. The null hypothesis is that they are indifferent (that is, equal proportions prefer glass, plastic, steel and aluminium).

Example 1: Data
Let $p_G$ be the probability that an individual selected at random will nominate glass as his/her preference if required to make a choice. Similarly for $p_P$ (plastic), $p_S$ (steel) and $p_A$ (aluminium).

Hypotheses
$H_0$: $p_G = p_P = p_S = p_A = 0.25$
$H_A$: at least one $p_i \neq 0.25$
The alternative is that at least one material is more preferred (or less preferred) than the others.

Example 1 cont.
Procedure: Select a random sample of, say, 100 consumers and determine their preferences.
Under the null hypothesis we expect 25 consumers to nominate glass, 25 to nominate plastic, 25 to nominate steel and 25 to nominate aluminium. These are the expected frequencies, $E_i$:
$E_i = n p_i$
We compare the expected frequencies with the sample results, the observed frequencies $O_i$. If they are approximately the same, the sample gives no reason to reject the null hypothesis.
$O_i \approx E_i \Rightarrow$ the data are consistent with $H_0$.

Example 1 cont., Chi square
$\chi^2_{G-1} = \sum_{i=1}^{G} \frac{(O_i - E_i)^2}{E_i}$
We require a test statistic to decide whether the difference is large enough to reject the null hypothesis. We use chi square with $G - 1$ degrees of freedom, where $G$ is the number of groups.
Suppose in our example 39 prefer glass, 16 prefer plastic, 20 prefer steel and 25 prefer aluminium. Recall that the expected frequencies were
all 25.
$\chi^2_3 = \frac{(39-25)^2}{25} + \frac{(16-25)^2}{25} + \frac{(20-25)^2}{25} + \frac{(25-25)^2}{25} = 12.08$

Obtain the critical value of chi square at the 5% significance level with 3 d.f. (Table E4, page 742, Berenson et al. 2013): critical $\chi^2_3 = 7.82$, i.e. there is only a 5 percent chance that $\chi^2_3 \geq 7.82$ if $H_0$ is true.

Comparison of chi square values
$\chi^2_3 = 12.08 > 7.82 \Rightarrow$ reject $H_0$.
Conclusion: at the 5% significance level there is sufficient evidence to reject the null hypothesis. At least one of the probabilities ($p_i$) differs from 0.25. The sample results indicate that the materials are not equally preferred by consumers in the target population. Thus the preferences for at least two of the materials differ.

Chi square test using SPSS
Example: Suppose that we want to test whether or not customers have a colour preference for packaging. Three different colours, Blue, Green and Purple, are considered. The null hypothesis is that they don't have a colour preference.
Use Analyse / Nonparametric Tests / Chi-Square. The default is that the probabilities are equal.

Main display colour
           Observed N   Expected N   Residual
Blue           26          30.0        -4.0
Green          37          30.0         7.0
Purple         27          30.0        -3.0
Total          90
- Observed N: numbers of consumers actually choosing particular colours.
- Expected N: numbers of consumers expected to choose particular colours if the null is true.
- The observed and expected counts are different, but different enough to reject the null?

Test Statistics
                 Main Display Colour
Chi-Square (a)        2.467
df                    2
Asymp. Sig.           .291
a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 30.0.
- Chi-Square: the chi-square statistic.
- df: degrees of freedom, groups - 1.
- Asymp. Sig.: check the sig value to test Ho.
Cannot reject the null (Ho) that all three colours are equally preferred because Sig. > 0.05.
Conclusion: At the 5% significance level there is insufficient evidence to conclude that consumers in the target population prefer at least one of the three packaging colours.

Tests of independence: Chi-squared test of a contingency table
This test addresses two different problem objectives:
- Are two nominal variables related?
- Are there differences among two or more populations of nominal variables?
Consider the following 3 features: height in centimetres, weight in kilograms and colour of eyes.
Whilst some people are tall and thin, on average taller
people weigh more than shorter people. Weight and height are not independent.
It seems unlikely that people with blue eyes weigh more, on average, than people with brown eyes. Weight and eye colour are almost certainly independent.

Frequency analysis under cross-classification
Purpose: understand how the data are distributed across the levels of different variables.
- Example: is academic performance associated with gender? (two variables)
- Example: are occupation, gender and fondness for shopping associated? (three variables)
Main steps of the analysis:
- Produce the cross-tabulation (contingency table)
- Analyse the relationship between the variables in the contingency table

Producing a cross-tabulation
Job title \ Income     High (persons)   Medium (persons)   Low (persons)
Senior engineer
Engineer
Assistant engineer
Technician
Total
What is a contingency table? Labels on the slide: column variable, row variable, region (control variable), frequency.

Producing a cross-tabulation: basic steps
(1) Menu: Analyze > Descriptive Statistics > Crosstabs
(2) Move one variable into the Row box as the row variable.
(3) Move one variable into the Column box as the column variable.
(4) Optionally move one or more variables into the Layer box as control variables. Layer settings for control variables: for variables in the same layer the numbers of levels add; for variables in different layers the numbers of levels multiply.
(5) Choose whether to display a bar chart for each group (Display clustered bar charts).

Producing a cross-tabulation: further calculations
Cells option: choose which percentages to output in the frequency table.
- Row: row percentages (Row pct)
- Column: column percentages (Col pct)
- Total: total percentages (Tot pct)

Analysing the relationship between the variables in a contingency table
Purpose: use the contingency table to test whether the row and column variables are independent.
Method: the chi-square test, which measures association in qualitative data.

Chi-square test: cross-tabulations of age by wage income
             Low   Medium   High
Young        400      0       0
Middle-aged    0    500       0
Old            0      0     600

             Low   Medium   High
Young          0      0     500
Middle-aged    0    600       0
Old          400      0       0

Chi-square test: basic steps
(1) H0: the row and column variables are not associated, i.e. they are independent.
(2) Construct the chi-square statistic
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$
The statistic follows a chi-square distribution with $(r-1)(c-1)$ degrees of freedom.
- Count: observed (actual) frequency
- Expected count: expected frequency (the distribution of the data if H0 holds)
- Residual: observed frequency minus expected frequency

          Excellent   Good   Fair   Pass   Total
Male          10        5      5      3      23
Female         8       12      4      1      25
Total         18       17      9      4      48
Percent     37.5     35.4   18.8    8.3     100

              No lung cancer   Lung cancer   Total
Non-smoker         7775             42        7817
Smoker             2099             49        2148
Total              9874             91        9965

Judging visually from charts whether two categorical variables are related:
1. Contingency table
2. 3-D column chart
3. 2-D bar chart
[Figures: 3-D column chart and 2-D bar chart of lung cancer status by smoking status; vertical axis from 0 to 8000]
The 3-D column chart shows the relative sizes of the frequencies clearly. The 2-D bar chart shows that the proportion of smokers with lung cancer is higher than the proportion of non-smokers with lung cancer.

Tests of independence cont.
Example 2
Suppose we interviewed 400 people and asked them which of three age groups they are in (under 25, 25 to 60, and over 60). We also asked for their response to the statement that "All imports of automobiles should be banned in order to protect the local industry" (agree, no view either way, disagree).

                 Attitudes towards banning imports
Age group     Agree   No view   Disagree   Total
Under 25        19       53        25        97
25 - 60         46       94        47       187
Over 60         30       56        30       116
Total           95      203       102       400

Tests of independence cont.
Example 2 cont.
Null hypothesis: answers to the two questions are independent.
Under the null:
Prob{over 60 and agree} = Prob{over 60} × Prob{agree} (multiplication rule for independent events)
Expected frequency = Prob{over 60} × Prob{agree} × sample size
Short-cut for expected frequencies, where $R_i$ is the total of row $i$, $C_j$ is the total of column $j$ and $n$ is the sample size:
$E_{ij} = \frac{R_i}{n} \cdot \frac{C_j}{n} \cdot n = \frac{R_i C_j}{n}$

Procedure
- We set up a cross-tabulation showing the observed frequencies of answers to the two questions.
- We calculate the expected frequencies.
Test
Our test is based on a comparison of the observed and expected frequencies.

Age * attitude to banning imports: Cross tabulation
                                  Attitude to ban imports
Age group                     Agree   No view   Disagree   Total
Under 25   Count               19.0     53.0      25.0      97.0
           Expected Count      23.0     49.2      24.7      96.9
25-60      Count               46.0     94.0      47.0     187.0
           Expected Count      44.4     94.9      47.7     187.0
Over 60    Count               30.0     56.0      30.0     116.0
           Expected Count      27.6     58.9      29.6     116.1
Total      Count               95.0    203.0     102.0     400.0
           Expected Count      95.0    203.0     102.0     400.0
- Calculation for the expected frequency of agree and over 60: 95 × 116 / 400.
- The count (observed) and the expected count are different, but different enough to reject the null?

Chi-squared test for independence
$\chi^2_{(r-1)(c-1)} = \sum_{i} \sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
Rationale: $O_{ij} \approx E_{ij} \Rightarrow$ the data are consistent with $H_0$.
Test statistic
We requi
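The two calculations described in these slides (the test of differences between proportions with G - 1 degrees of freedom, and the test of independence with (r - 1)(c - 1) degrees of freedom) can be sketched in plain Python as follows. This is a minimal illustration and not the SPSS procedure itself. The function names `chi2_sf`, `chi_square_gof` and `chi_square_independence` are hypothetical helpers introduced here. The p-value is computed from a standard recurrence for the chi-squared upper tail and corresponds to what SPSS labels Asymp. Sig.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable with integer df >= 1."""
    if df < 1 or df != int(df):
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms exist for df = 1 and df = 2; larger df are reached two at a time via
    # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half))
    else:
        k, q = 2, math.exp(-half)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def chi_square_gof(observed, probs=None):
    """Test of differences between proportions: returns (statistic, df, p_value).

    probs are the hypothesised category probabilities; equal probabilities are
    assumed when probs is None.
    """
    n = sum(observed)
    groups = len(observed)
    if probs is None:
        probs = [1.0 / groups] * groups
    expected = [n * p for p in probs]  # E_i = n * p_i
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = groups - 1
    return stat, df, chi2_sf(stat, df)


def chi_square_independence(table):
    """Test of independence for an r x c table: returns (statistic, df, p_value, expected)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Short-cut for expected frequencies: E_ij = R_i * C_j / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df), expected


if __name__ == "__main__":
    # Example 1: glass, plastic, steel, aluminium with equal hypothesised proportions
    print(chi_square_gof([39, 16, 20, 25]))
    # SPSS example: Blue, Green, Purple
    print(chi_square_gof([26, 37, 27]))
    # Example 2: age group (rows) by attitude to banning imports (columns)
    attitudes = [[19, 53, 25], [46, 94, 47], [30, 56, 30]]
    stat, df, p_value, expected = chi_square_independence(attitudes)
    print(stat, df, p_value)
```

Run as a script, the first two calls should reproduce the values quoted in the slides (12.08 with 3 d.f., and 2.467 with Asymp. Sig. .291). The third call evaluates the statistic for the Example 2 table, whose worked result is cut off in the text. In each case the null hypothesis is rejected at the 5% level when the returned p-value is below 0.05.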