Class 5: ANOVA (Analysis of Variance) and F-tests
I. What is ANOVA
What is ANOVA? ANOVA is the short name for the Analysis of Variance. The essence of ANOVA is to decompose the total variance of the dependent variable into two additive components, one for the structural part, and the other for the stochastic part, of a regression. Today we are going to examine the easiest case.
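II. ANOVA: An Introduction
Let the model be
$$y = X\beta + \varepsilon.$$
Assuming $x_i$ is a column vector (of length $p$) of independent variable values for the $i$th observation,
$$y_i = x_i'\beta + \varepsilon_i.$$
Then $x_i'b$ is the predicted value.
Sum of squares total:
$$\mathrm{SST} = \sum_i (y_i - \bar{y})^2 = \sum_i (y_i - x_i'b + x_i'b - \bar{y})^2$$
$$= \sum_i (y_i - x_i'b)^2 + 2\sum_i (y_i - x_i'b)(x_i'b - \bar{y}) + \sum_i (x_i'b - \bar{y})^2$$
$$= \sum_i e_i^2 + \sum_i (x_i'b - \bar{y})^2$$
because $\sum_i (y_i - x_i'b)(x_i'b - \bar{y}) = \sum_i e_i (x_i'b - \bar{y}) = 0$.
This is always true by OLS (with an intercept in the model, the residuals sum to zero and are orthogonal to the fitted values).
$$\mathrm{SST} = \mathrm{SSE} + \mathrm{SSR}$$
Important: the total variance of the dependent variable is decomposed into two additive parts: SSE, which is due to errors, and SSR, which is due to regression.
Geometric interpretation: [blackboard]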
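Decomposition of Variance
If we treat X as a random variable, we can decompose the total variance into the between-group portion and the within-group portion in any population:
$$V(y_i) = V(x_i'\beta) + V(\varepsilon_i) \qquad (1)$$
Proof:
$$V(y_i) = V(x_i'\beta + \varepsilon_i) = V(x_i'\beta) + V(\varepsilon_i) + 2\,\mathrm{Cov}(x_i'\beta, \varepsilon_i)$$
$$= V(x_i'\beta) + V(\varepsilon_i)$$
(by the assumption that $\mathrm{Cov}(x_k, \varepsilon) = 0$, for all possible $k$.)
The ANOVA table is to estimate the three quantities of equation (1) from the sample. As the sample size gets larger and larger, the ANOVA table will approach the equation closer and closer.
In a sample, the decomposition of estimated variance is not strictly true. We thus need to separately decompose sums of squares and degrees of freedom. Is ANOVA a misnomer?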
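III. ANOVA in Matrix
I will try to give a simplified representation of ANOVA as follows:
$$\mathrm{SST} = \sum_i (y_i - \bar{y})^2 = \sum_i y_i^2 - 2\bar{y}\sum_i y_i + n\bar{y}^2 = \sum_i y_i^2 - n\bar{y}^2$$
(because $\sum_i y_i = n\bar{y}$)
$$= y'y - n\bar{y}^2$$
(in your textbook, monster look)
$$\mathrm{SSE} = e'e$$
$$\mathrm{SSR} = \sum_i (x_i'b - \bar{y})^2 = \sum_i (x_i'b)^2 - 2\bar{y}\sum_i x_i'b + n\bar{y}^2 = \sum_i (x_i'b)^2 - n\bar{y}^2$$
(because $\sum_i x_i'b = n\bar{y}$, as always)
$$= b'X'Xb - n\bar{y}^2 = b'X'y - n\bar{y}^2$$
(in your textbook, monster look)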
IV. ANOVA Table

SOURCE       SS     DF      MS     F          with
Regression   SSR    DF(R)   MSR    MSR/MSE    DF(R), DF(E)
Error        SSE    DF(E)   MSE
Total        SST    DF(T)

Let us use a real example. Assume that we have a regression estimated to be
y = -1.70 + 0.840x
ANOVA Table

SOURCE       SS     DF    MS     F                    with
Regression   6.44   1     6.44   6.44/0.19 = 33.89    1, 18
Error        3.40   18    0.19
Total        9.84   19

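We know $\sum x_i = 100$, $\sum x_i^2 = 509.12$, and $n\bar{y}^2 = 125.0$. If we know that DF for SST = 19, what is $n$?
$n = 20$
$$\mathrm{SSR} = \sum_i (-1.70 + 0.840x_i)^2 - n\bar{y}^2 = 20 \times 1.7 \times 1.7 + 0.84 \times 0.84 \times 509.12 - 2 \times 1.7 \times 0.84 \times 100 - 125.0$$
$$= 6.44$$
SSE = SST - SSR = 9.84 - 6.44 = 3.40
DF (degrees of freedom): demonstration. Note: discount the intercept when calculating SST.
MS = SS/DF
p = 0.000 [ask students]. What does the p-value say?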
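The arithmetic behind the example table can be retraced in a few lines. The sketch below is only illustrative: it plugs in the figures used in the SSR computation (sum of x = 100, sum of x squared = 509.12, n times the squared mean of y = 125.0) together with the reported SST, and the variable names are chosen for this sketch rather than taken from the notes.

```python
# Rebuild the example ANOVA table from summary statistics.
# Inputs are the figures quoted in the worked example; names are illustrative.
n = 20                        # DF(T) = n - 1 = 19
b0, b1 = -1.70, 0.840         # estimated intercept and slope
sum_x, sum_x2 = 100.0, 509.12
n_ybar_sq = 125.0             # n * ybar^2
sst = 9.84

# SSR = sum((b0 + b1 * x_i)^2) - n * ybar^2, expanded term by term
ssr = n * b0**2 + b1**2 * sum_x2 + 2 * b0 * b1 * sum_x - n_ybar_sq
sse = sst - ssr

df_r = 1                      # one slope (the intercept is not counted)
df_e = n - 2                  # n minus the two estimated parameters
msr = ssr / df_r
mse = sse / df_e
f_stat = msr / mse

print(f"SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"MSR = {msr:.2f}, MSE = {mse:.2f}")
print(f"F({df_r}, {df_e}) = {f_stat:.1f}")
```

Because this version divides unrounded mean squares, its F differs slightly from the table's 6.44/0.19, which uses values rounded to two decimals.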
V. F-Tests
F-tests are more general than t-tests; t-tests can be seen as a special case of F-tests. If you have difficulty with F-tests, please ask your GSIs to review F-tests in the lab. An F-test takes the form of a ratio of two MS's.
An F statistic has two degrees of freedom associated with it: the degrees of freedom in the numerator and the degrees of freedom in the denominator.
An F statistic is usually larger than 1. An F statistic is interpreted as a test of whether the variance explained under the alternative hypothesis is due to chance. In other words, the null hypothesis is that the explained variance is due to chance, or that all the coefficients (other than the intercept) are zero.
The larger an F statistic, the stronger the evidence against the null hypothesis. There is a table in the back of your book from which you can find exact probability values.
In our example, the F is 34, which is highly significant.
VI. $R^2$
$R^2 = \mathrm{SSR}/\mathrm{SST}$
The proportion of variance explained by the model.
In our example,
$R^2 = 6.44/9.84 = 65.4\%$
VII. What happens if we add more independent variables.
1. SST stays the same.
2. SSR always increases (strictly, it never decreases).
3. SSE always decreases (strictly, it never increases).
4. $R^2$ always increases (strictly, it never decreases).
5. MSR usually increases.
6. MSE usually decreases.
7. The F statistic usually increases.
Exceptions to 5 and 7: irrelevant variables may not explain the variance but take up degrees of freedom. We really need to look at the results.
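The exception is easy to see numerically. The sketch below reuses SST = 9.84, SSR = 6.44, and n = 20 from the example and then assumes, purely for illustration, that a second and nearly irrelevant regressor raises SSR only to 6.45. The function name `model_f` and the 6.45 figure are hypothetical.

```python
def model_f(ssr, sst, n, p):
    """Overall F statistic for a model with p parameters (intercept included)."""
    sse = sst - ssr
    msr = ssr / (p - 1)       # regression mean square
    mse = sse / (n - p)       # error mean square
    return msr / mse

sst, n = 9.84, 20
f_one = model_f(ssr=6.44, sst=sst, n=n, p=2)   # the example model
# Hypothetical: an added, nearly irrelevant regressor lifts SSR only to 6.45.
f_two = model_f(ssr=6.45, sst=sst, n=n, p=3)

print(f"one regressor:  F = {f_one:.1f}")
print(f"two regressors: F = {f_two:.1f}")
```

SSR and $R^2$ still go up in the second model, but MSR is roughly halved and MSE rises slightly, so the F statistic drops.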
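VIII. Important: General Ways of Hypothesis Testing with F-Statistics.
All tests in linear regression can be performed with F-test statistics. The trick is to run "nested models."
Two models are nested if the independent variables in one model are a subset, or linear combinations of a subset, of the independent variables in the other model.
That is to say, if model A has independent variables $(1, x_1, x_2)$ and model B has independent variables $(1, x_1, x_2, x_3)$, A and B are nested. A is called the restricted model; B is called the less restricted or unrestricted model. We call A restricted because A implies that $\beta_3 = 0$. This is a restriction.
Another example: C has independent variables $(1, x_1, x_2 + x_3)$; D has $(1, x_2 + x_3)$.
C and A are not nested.
C and B are nested. One restriction in C: $\beta_2 = \beta_3$.
C and D are nested. One restriction in D: $\beta_1 = 0$.
D and A are not nested.
D and B are nested. Two restrictions in D: $\beta_2 = \beta_3$; $\beta_1 = 0$.
We can always test hypotheses implied in the restricted models. Steps: run two regressions for each hypothesis, one for the restricted model and one for the unrestricted model. The SST should be the same across the two models. What is different is SSE and SSR. That is, what is different is $R^2$. Let the subscripts $r$ and $u$ denote the restricted and unrestricted models, with $p_r$ and $p_u$ parameters:
$df(\mathrm{SSE}_r) = n - p_r$; $df(\mathrm{SSE}_u) = n - p_u$
Use the following formulas:
$$F = \frac{(\mathrm{SSE}_r - \mathrm{SSE}_u) / [df(\mathrm{SSE}_r) - df(\mathrm{SSE}_u)]}{\mathrm{SSE}_u / df(\mathrm{SSE}_u)}$$
or
$$F = \frac{(\mathrm{SSR}_u - \mathrm{SSR}_r) / [df(\mathrm{SSR}_u) - df(\mathrm{SSR}_r)]}{\mathrm{SSE}_u / df(\mathrm{SSE}_u)}$$
(proof: use SST = SSE + SSR)
Note, $df(\mathrm{SSE}_r) - df(\mathrm{SSE}_u) = df(\mathrm{SSR}_u) - df(\mathrm{SSR}_r) = \Delta df$,
where $\Delta df$ is the number of constraints (not the number of parameters) implied by the restricted model
or
$$F = \frac{(R_u^2 - R_r^2) / \Delta df}{(1 - R_u^2) / df(\mathrm{SSE}_u)}$$
Note that
$$F_{1,\,df} = t_{df}^2$$
That is, for 1 df tests, you can either do an F-test or a t-test. They yield the same result. Another way to look at it is that the t-test is a special case of the F-test, with the numerator DF being 1.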
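A minimal sketch of the nested-model test, assuming the usual forms of the statistic: the drop in SSE (or the gain in $R^2$) per constraint, divided by the unrestricted error mean square. The function names are hypothetical. The data are the earlier example, with the constant-only model as the restricted model, so that SSE_r = SST = 9.84 on 19 df.

```python
def nested_f_from_sse(sse_r, df_r, sse_u, df_u):
    """F for a restricted vs. unrestricted model, from error sums of squares."""
    q = df_r - df_u                       # number of constraints
    f = ((sse_r - sse_u) / q) / (sse_u / df_u)
    return f, q, df_u

def nested_f_from_r2(r2_r, r2_u, q, df_u):
    """The same statistic written with R^2 (both models share one SST)."""
    return ((r2_u - r2_r) / q) / ((1.0 - r2_u) / df_u)

# Restricted model: constant only, so SSE_r = SST = 9.84 with 19 df.
# Unrestricted model: the example regression, SSE_u = 3.40 with 18 df.
sst = 9.84
f_sse, q, df_u = nested_f_from_sse(sse_r=sst, df_r=19, sse_u=3.40, df_u=18)
f_r2 = nested_f_from_r2(r2_r=0.0, r2_u=(sst - 3.40) / sst, q=q, df_u=df_u)

print(f"F({q}, {df_u}) from SSE: {f_sse:.2f}")
print(f"F({q}, {df_u}) from R^2: {f_r2:.2f}")
print(f"equivalent |t| on {df_u} df: {f_sse ** 0.5:.2f}")
```

Both routes give the same F, and because there is only one constraint, its square root is the absolute t statistic for the slope.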
IX. Assumptions of F-tests
What assumptions do we need to make an ANOVA table work?
Not much of an assumption. All we need is the assumption that $(X'X)$ is not singular, so that the least squares estimate $b$ exists.
The assumption of $\mathrm{Cov}(X, \varepsilon) = 0$ is needed if you want the ANOVA table to be an unbiased estimate of the true ANOVA (equation 1) in the population. Reason: we want $b$ to be an unbiased estimator of $\beta$, and the covariance between $b$ and $\varepsilon$ to disappear.
For reasons I discussed earlier, the assumptions of homoscedasticity and no serial correlation are necessary for the estimation of $V(\varepsilon_i)$.
The normality assumption, that $\varepsilon_i$ follows a normal distribution, is needed for small samples.
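X. The Concept of Increment
Every time you put one more independent variable into your model, you get an increase in $R^2$. We sometimes call the increase "incremental $R^2$." What it means is that more variance is explained, or SSR is increased and SSE is reduced. What you should understand is that the incremental $R^2$ attributed to a variable is typically smaller than the $R^2$ obtained when other variables are absent, because correlated independent variables share explained variance.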
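XI. Consequences of Omitting Relevant Independent Variables
Say the true model is the following:
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \varepsilon_i.$$
But for some reason we only collect or consider data on $y$, $x_1$, and $x_2$. Therefore, we omit $x_3$ in the regression. That is, we omit $\beta_3 x_3$ in our model. We briefly discussed this problem before. The short story is that we are likely to have a bias due to the omission of a relevant variable in the model. This is so even though our primary interest is to estimate the effect of $x_1$ or $x_2$ on $y$.
Why? We will have a formal presentation of this problem.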
XII. Measures of Goodness-of-Fit
There are different ways to assess the goodness-of-fit of a model.
A. $R^2$
$R^2$ is a heuristic measure for the overall goodness-of-fit. It does not have an associated test statistic.
$R^2$ measures the proportion of the variance in the dependent variable that is "explained" by the model:
$$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}$$
B. Model F-test
The model F-test tests the joint hypothesis that all the model coefficients except for the constant term are zero.
Degrees of freedom associated with the model F-test:
Numerator: $p - 1$
Denominator: $n - p$.
C. t-tests for individual parameters
A t-test for an individual parameter tests the hypothesis that a particular coefficient is equal to a particular number (commonly zero).
$t_k = (b_k - \beta_{k0}) / SE_k$, where $SE_k$ is the square root of the $(k, k)$ element of $\mathrm{MSE}\,(X'X)^{-1}$, with degrees of freedom $= n - p$.
D. Incremental $R^2$
Relative to a restricted model, the gain in $R^2$ for the unrestricted model:
$\Delta R^2 = R_u^2 - R_r^2$
E. F-tests for Nested Models
It is the most general form of F-tests and t-tests.
$$F = \frac{(\mathrm{SSE}_r - \mathrm{SSE}_u) / [df(\mathrm{SSE}_r) - df(\mathrm{SSE}_u)]}{\mathrm{SSE}_u / df(\mathrm{SSE}_u)}$$
It is equivalent to a t-test if the unrestricted and restricted models differ by only one parameter.
It is equal to the model F-test if we set the restricted model to the constant-only model.
[Ask students] What are SST, SSE, and SSR, and their associated degrees of freedom, for the constant-only model?
Numerical Example
A sociological study is interested in understanding the social determinants of mathematical achievement among high school students. You are now asked to answer a series of questions. The data are real but have been tailored for educational purposes. The total number of observations is 400. The variables are defined as:
y: math score
x1: father's education
x2: mother's education
x3: family's socioeconomic status
x4: number of siblings
x5: class rank
x6: parents' total education (note: x6 = x1 + x2)
For the following regression models, we know:
Table 1

Model                          SST      SSR      SSE      DF     R^2
(1) y on (1 x1 x2 x3 x4)       34863    4201     ____     ____   ____
(2) y on (1 x6 x3 x4)          34863    ____     ____     396    .1065
(3) y on (1 x6 x3 x4 x5)       34863    10426    24437    395    .2991
(4) x5 on (1 x6 x3 x4)         ____     ____     269753   396    .0210

1. Please fill in the missing cells in Table 1.
2. Test the hypothesis that the effects of father's education (x1) and mother's education (x2) on math score are the same after controlling for x3 and x4.
3. Test the hypothesis that x6, x3, and x4 in Model (2) all have a zero effect on y.
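Before filling in the blanks, it may help to confirm the identities on the one complete row. The sketch below checks Model (3) against SST = SSR + SSE, error DF = n - p, and $R^2$ = SSR/SST; the same three relations are what the missing cells in the other rows depend on. Variable names are illustrative.

```python
# Check the ANOVA identities on Model (3), the only complete row of Table 1.
n = 400
n_params = 5                  # constant, x6, x3, x4, x5
sst, ssr, sse = 34863, 10426, 24437

df_error = n - n_params       # error degrees of freedom
r_squared = ssr / sst         # proportion of variance explained

print("SSR + SSE == SST:", ssr + sse == sst)
print("error DF:", df_error)
print(f"R^2 = {r_squared:.4f}")
```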