Chapter Seventeen
Correlation and Regression
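Learning Objectives
- Discuss the concepts of product moment correlation, partial correlation, and part correlation, and show how they provide a foundation for regression analysis.
- Explain the nature and methods of bivariate regression analysis and describe the general model, estimation of parameters, standardized regression coefficient, significance testing, prediction accuracy, residual analysis, and model cross-validation.
- Explain the nature and methods of multiple regression analysis.
- Describe specialized techniques used in multiple regression analysis, particularly stepwise regression, regression with dummy variables, and analysis of variance and covariance with regression.
- Discuss nonmetric correlation and its measures.

Chapter Outline
- Product Moment Correlation
- Partial Correlation
- Nonmetric Correlation
- Regression Analysis
- Bivariate Regression
- Statistics Associated with Bivariate Regression Analysis
- Conducting Bivariate Regression Analysis
- Multiple Regression
- Statistics Associated with Multiple Regression
- Conducting Multiple Regression Analysis
- Stepwise Regression
- Multicollinearity
- Relative Importance of Predictors
- Cross-Validation
- Regression with Dummy Variables
- Analysis of Variance and Regression
- Summary

Product Moment Correlation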
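The product moment correlation, r, is the statistic most commonly used to summarize the strength of association between two metric (interval or ratio scaled) variables, X and Y. It is an index used to determine whether a linear relationship exists between X and Y. Because it was originally proposed by Karl Pearson, it is also known as the Pearson correlation coefficient. It is also referred to as the simple correlation, the bivariate correlation, or merely the correlation coefficient.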
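From a sample of n observations on the variables X and Y, the product moment correlation, r, can be calculated as:

$$ r = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2 \sum_{i=1}^{n}(Y_i-\bar{Y})^2}} $$

Division of the numerator and denominator by (n - 1) gives:

$$ r = \frac{\sum_{i=1}^{n}\frac{(X_i-\bar{X})(Y_i-\bar{Y})}{n-1}}{\sqrt{\sum_{i=1}^{n}\frac{(X_i-\bar{X})^2}{n-1} \sum_{i=1}^{n}\frac{(Y_i-\bar{Y})^2}{n-1}}} = \frac{COV_{xy}}{S_x S_y} $$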
r varies between -1.0 and +1.0. The correlation coefficient between two variables is the same regardless of their underlying units of measurement.
Explaining Attitude Toward the City of Residence

Table 17.1
Respondent No.   Attitude Toward the City   Duration of Residence   Importance Attached to Weather
 1                6                         10                       3
 2                9                         12                      11
 3                8                         12                       4
 4                3                          4                       1
 5               10                         12                      11
 6                4                          6                       1
 7                5                          8                       7
 8                2                          2                       4
 9               11                         18                       8
10                9                          9                      10
11               10                         17                       8
12                2                          2                       5
The correlation coefficient may be calculated as follows:

$\bar{X}$ = (10 + 12 + 12 + 4 + 12 + 6 + 8 + 2 + 18 + 9 + 17 + 2)/12 = 9.333
$\bar{Y}$ = (6 + 9 + 8 + 3 + 10 + 4 + 5 + 2 + 11 + 9 + 10 + 2)/12 = 6.583

$\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})$
  = (10 - 9.33)(6 - 6.58) + (12 - 9.33)(9 - 6.58)
  + (12 - 9.33)(8 - 6.58) + (4 - 9.33)(3 - 6.58)
  + (12 - 9.33)(10 - 6.58) + (6 - 9.33)(4 - 6.58)
  + (8 - 9.33)(5 - 6.58) + (2 - 9.33)(2 - 6.58)
  + (18 - 9.33)(11 - 6.58) + (9 - 9.33)(9 - 6.58)
  + (17 - 9.33)(10 - 6.58) + (2 - 9.33)(2 - 6.58)
  = -0.3886 + 6.4614 + 3.7914 + 19.0814
  + 9.1314 + 8.5914 + 2.1014 + 33.5714
  + 38.3214 - 0.7986 + 26.2314 + 33.5714
  = 179.6668
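The hand calculation above can be checked with a short script. This is a minimal sketch using only the Table 17.1 data; the function name `product_moment_correlation` is an illustrative choice, not something defined in the chapter.

```python
import math

# Table 17.1: duration of residence (X) and attitude toward the city (Y)
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]


def product_moment_correlation(x, y):
    """Return Pearson's r for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of cross-products of deviations from the means
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the two sums of squares
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return cross / math.sqrt(ss_x * ss_y)


r = product_moment_correlation(duration, attitude)
print(f"r = {r:.4f}")
```

The result should agree, up to rounding, with the Multiple R value reported in Table 17.2.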
Decomposition of the Total Variation

$$ r^2 = \frac{\text{Explained variation}}{\text{Total variation}} = \frac{SS_x}{SS_y} = \frac{\text{Total variation} - \text{Error variation}}{\text{Total variation}} = \frac{SS_y - SS_{error}}{SS_y} $$
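[Figure: a nonlinear relationship for which r = 0]

Partial Correlation
A partial correlation coefficient measures the association between two variables after controlling for or adjusting for the effects of one or more additional variables. Each partial correlation has an order, which indicates how many variables are being controlled or adjusted for. The simple correlation coefficient, r, has a zero order, because it does not control for any additional variables while measuring the association between two variables.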
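As an illustration of a first-order partial correlation, the sketch below computes the correlation between attitude and duration while controlling for the importance attached to weather, using the Table 17.1 data. The formula used, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)), is the standard textbook expression and is an assumption here, since it does not survive in the slides; the function names are also illustrative.

```python
import math

# Table 17.1 data
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]      # Y
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]   # X
importance = [3, 11, 4, 1, 11, 1, 7, 4, 8, 10, 8, 5]    # Z (controlled variable)


def pearson_r(u, v):
    """Zero-order (simple) correlation between u and v."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cross = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    ss_u = sum((a - mean_u) ** 2 for a in u)
    ss_v = sum((b - mean_v) ** 2 for b in v)
    return cross / math.sqrt(ss_u * ss_v)


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, controlling for Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_xy = pearson_r(duration, attitude)
r_xz = pearson_r(duration, importance)
r_yz = pearson_r(attitude, importance)
r_xy_z = partial_correlation(r_xy, r_xz, r_yz)
print(f"zero-order r = {r_xy:.4f}, first-order partial r = {r_xy_z:.4f}")
```

Only one variable is controlled, so this is a first-order coefficient; higher orders would repeat the same adjustment on the first-order values.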
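Part Correlation Coefficient
The part correlation coefficient represents the correlation between Y and X when the linear effects of the other independent variables have been removed from X. The part correlation coefficient, $r_{y(x.z)}$, is calculated as follows:

$$ r_{y(x.z)} = \frac{r_{xy} - r_{yz} r_{xz}}{\sqrt{1 - r_{xz}^2}} $$

The partial correlation coefficient is generally viewed as more important than the part correlation coefficient.

Nonmetric Correlation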
Two measures of nonmetric correlation are Spearman's rho, $\rho_s$, and Kendall's tau, $\tau$.

Regression Analysis
Regression analysis examines associative relationships between a metric dependent variable and one or more independent variables in the following ways:
- Determine whether the independent variables explain a significant variation in the dependent variable: whether a relationship exists.
- Determine how much of the variation in the dependent variable can be explained by the independent variables: strength of the relationship.
- Determine the structure or form of the relationship: the mathematical equation relating the independent and dependent variables.
- Predict the values of the dependent variable.
- Control for other independent variables when evaluating the contributions of a specific variable or set of variables.

Regression analysis is concerned with the nature and degree of association between variables and does not imply or assume any causality.

Statistics Associated with Bivariate Regression Analysis
- Bivariate regression model. The basic regression equation is $Y_i = \beta_0 + \beta_1 X_i + e_i$, where Y = dependent or criterion variable, X = independent or predictor variable, $\beta_0$ = intercept of the line, $\beta_1$ = slope of the line, and $e_i$ is the error term associated with the ith observation.
- Coefficient of determination. The strength of association is measured by the coefficient of determination, $r^2$. It varies between 0 and 1 and signifies the proportion of the total variation in Y that is accounted for by the variation in X.
- Estimated or predicted value. The estimated or predicted value of $Y_i$ is $\hat{Y}_i = a + bx$, where $\hat{Y}_i$ is the predicted value of $Y_i$, and a and b are estimators of $\beta_0$ and $\beta_1$, respectively.
- Regression coefficient. The estimated parameter b is usually referred to as the non-standardized regression coefficient.
- Scattergram. A scatter diagram, or scattergram, is a plot of the values of two variables for all the cases or observations.
- Standard error of estimate. This statistic, SEE, is the standard deviation of the actual Y values from the predicted $\hat{Y}$ values.
- Standard error. The standard deviation of b, $SE_b$, is called the standard error.
- Standardized regression coefficient. Also termed the beta coefficient or beta weight, this is the slope obtained by the regression of Y on X when the data are standardized.
- Sum of squared errors. The distances of all the points from the regression line are squared and added together to arrive at the sum of squared errors, which is a measure of total error, $\sum e_j^2$.
- t statistic. A t statistic with n - 2 degrees of freedom can be used to test the null hypothesis that no linear relationship exists between X and Y, or $H_0: \beta_1 = 0$, where $t = b / SE_b$.

Conducting Bivariate Regression Analysis
Plot the Scatter Diagram
A scatter diagram, or scattergram, is a plot of the values of two variables for all the cases or observations. The most commonly used technique for fitting a straight line to a scattergram is the least-squares procedure. In fitting the line, the least-squares procedure minimizes the sum of squared errors, $\sum e_j^2$.

Fig. 17.2 Conducting Bivariate Regression Analysis
- Plot the Scatter Diagram
- Formulate the General Model
- Estimate the Parameters
- Estimate Standardized Regression Coefficients
- Test for Significance
- Determine the Strength and Significance of Association
- Check Prediction Accuracy
- Examine the Residuals
- Cross-Validate the Model
Formulate the Bivariate Regression Model
In the bivariate regression model, the general form of a straight line is:

$$ Y = \beta_0 + \beta_1 X $$

where
Y = dependent or criterion variable
X = independent or predictor variable
$\beta_0$ = intercept of the line
$\beta_1$ = slope of the line

The regression procedure adds an error term to account for the probabilistic or stochastic nature of the relationship:

$$ Y_i = \beta_0 + \beta_1 X_i + e_i $$

where $e_i$ is the error term associated with the ith observation.

[Fig. 17.3 Plot of Attitude with Duration. Horizontal axis: Duration of Residence; vertical axis: Attitude.]
[Fig. 17.4 Which Straight Line Is Best? Four candidate lines (Line 1 to Line 4) drawn through the same scatter plot.]
[Fig. 17.5 Bivariate Regression. The line $\beta_0 + \beta_1 X$ with observed values $Y_J$ and errors $e_J$.]
Estimate the Parameters
In most cases, $\beta_0$ and $\beta_1$ are unknown and are estimated from the sample observations using the equation

$$ \hat{Y}_i = a + b x_i $$

where $\hat{Y}_i$ is the estimated or predicted value of $Y_i$, and a and b are estimators of $\beta_0$ and $\beta_1$, respectively.

$$ b = \frac{COV_{xy}}{S_x^2} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(X_i-\bar{X})^2} = \frac{\sum_{i=1}^{n}X_iY_i - n\bar{X}\bar{Y}}{\sum_{i=1}^{n}X_i^2 - n\bar{X}^2} $$
The intercept, a, may then be calculated using:

$$ a = \bar{Y} - b\bar{X} $$

For the data in Table 17.1, the estimation of parameters may be illustrated as follows:

$\sum_{i=1}^{12} X_iY_i$ = (10)(6) + (12)(9) + (12)(8) + (4)(3) + (12)(10) + (6)(4)
  + (8)(5) + (2)(2) + (18)(11) + (9)(9) + (17)(10) + (2)(2)
  = 917

$\sum_{i=1}^{12} X_i^2$ = 10² + 12² + 12² + 4² + 12² + 6²
  + 8² + 2² + 18² + 9² + 17² + 2²
  = 1350

It may be recalled from earlier calculations of the simple correlation that:

$\bar{X}$ = 9.333
$\bar{Y}$ = 6.583

Given n = 12, b can be calculated as:

$$ b = \frac{917 - (12)(9.333)(6.583)}{1350 - (12)(9.333)^2} = 0.5897 $$

$$ a = \bar{Y} - b\bar{X} = 6.583 - (0.5897)(9.333) = 1.0793 $$
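The same estimates can be reproduced programmatically. The sketch below applies the computational form of the slope formula, b = (ΣXY - n·X̄·Ȳ) / (ΣX² - n·X̄²), followed by a = Ȳ - b·X̄, to the Table 17.1 data; `fit_bivariate` is a hypothetical helper name chosen for this example.

```python
# Table 17.1: duration of residence (X) and attitude toward the city (Y)
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]


def fit_bivariate(x, y):
    """Return the least-squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))   # 917 for Table 17.1
    sum_x2 = sum(xi * xi for xi in x)               # 1350 for Table 17.1
    b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
    a = mean_y - b * mean_x
    return a, b


a, b = fit_bivariate(duration, attitude)
print(f"a = {a:.4f}, b = {b:.4f}")
```

Because the script keeps full precision for the means, its output can differ from a hand calculation in the last decimal place.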
Estimate the Standardized Regression Coefficient
Standardization is the process by which the raw data are transformed into new variables that have a mean of 0 and a variance of 1 (Chapter 14). When the data are standardized, the intercept assumes a value of 0. The term beta coefficient or beta weight is used to denote the standardized regression coefficient.

$$ B_{yx} = B_{xy} = r_{xy} $$

There is a simple relationship between the standardized and non-standardized regression coefficients:

$$ B_{yx} = b_{yx}(S_x / S_y) $$
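A brief numerical check of this relationship, assuming the Table 17.1 data and sample standard deviations with an (n - 1) denominator: the standardized coefficient obtained as b·(Sx/Sy) should coincide with the simple correlation r. The helper names are illustrative only.

```python
import math

# Table 17.1: duration of residence (X) and attitude toward the city (Y)
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]


def mean(values):
    return sum(values) / len(values)


def sample_cov(u, v):
    """Sample covariance with an (n - 1) denominator."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


s_x = math.sqrt(sample_cov(duration, duration))   # standard deviation of X
s_y = math.sqrt(sample_cov(attitude, attitude))   # standard deviation of Y
cov_xy = sample_cov(duration, attitude)

b = cov_xy / s_x ** 2        # non-standardized slope
beta = b * (s_x / s_y)       # standardized coefficient B_yx
r = cov_xy / (s_x * s_y)     # simple correlation

print(f"b = {b:.4f}, beta = {beta:.4f}, r = {r:.4f}")
```

In the bivariate case the two quantities are algebraically identical, which is why Table 17.2 reports the same value for Beta and Multiple R.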
Test for Significance
The statistical significance of the linear relationship between X and Y may be tested by examining the hypotheses:

$$ H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0 $$

A t statistic with n - 2 degrees of freedom can be used, where

$$ t = \frac{b}{SE_b} $$

$SE_b$ denotes the standard deviation of b and is called the standard error.

Using a computer program, the regression of attitude on duration of residence, using the data shown in Table 17.1, yielded the results shown in Table 17.2. The intercept, a, equals 1.0793, and the slope, b, equals 0.5897. Therefore, the estimated equation is:

Attitude ($\hat{Y}$) = 1.0793 + 0.5897 (Duration of residence)

The standard error, or standard deviation of b, is estimated as 0.07008, and the value of the t statistic as t = 0.5897/0.07008 = 8.414, with n - 2 = 10 degrees of freedom. From Table 4 in the Statistical Appendix, we see that the critical value of t with 10 degrees of freedom and $\alpha$ = 0.05 is 2.228 for a two-tailed test. Since the calculated value of t is larger than the critical value, the null hypothesis is rejected.
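The t test can be sketched in code as follows. The chapter obtains SEb from a computer program; this example assumes the standard bivariate result SEb = SEE / sqrt(Σ(Xi - X̄)²), with SEE = sqrt(SSres / (n - 2)), which is not spelled out in the slides. The critical value 2.228 is the one quoted in the text; variable names are illustrative.

```python
import math

# Table 17.1: duration of residence (X) and attitude toward the city (Y)
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]

n = len(duration)
mean_x = sum(duration) / n
mean_y = sum(attitude) / n

ss_x = sum((x - mean_x) ** 2 for x in duration)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(duration, attitude))

b = s_xy / ss_x              # slope
a = mean_y - b * mean_x      # intercept

# Residual sum of squares and standard error of estimate
ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(duration, attitude))
see = math.sqrt(ss_res / (n - 2))

# Standard error of the slope (assumed standard formula, see lead-in)
se_b = see / math.sqrt(ss_x)
t = b / se_b

T_CRITICAL = 2.228           # two-tailed, 10 df, alpha = 0.05 (from the text)
reject_null = abs(t) > T_CRITICAL

print(f"SEb = {se_b:.5f}, t = {t:.3f}, reject H0: {reject_null}")
```

Hard-coding the critical value keeps the sketch dependency-free; a statistics library could supply it instead.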
Determine the Strength and Significance of Association
The total variation, $SS_y$, may be decomposed into the variation accounted for by the regression line, $SS_{reg}$, and the error or residual variation, $SS_{error}$ or $SS_{res}$, as follows:

$$ SS_y = SS_{reg} + SS_{res} $$

where

$$ SS_y = \sum_{i=1}^{n}(Y_i - \bar{Y})^2 $$

$$ SS_{reg} = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2 $$

$$ SS_{res} = \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2 $$

[Fig. 17.6 Decomposition of the Total Variation in Bivariate Regression: total variation $SS_y$, explained variation $SS_{reg}$, residual variation $SS_{res}$.]
The strength of association may then be calculated as follows:

$$ r^2 = \frac{SS_{reg}}{SS_y} = \frac{SS_y - SS_{res}}{SS_y} $$

To illustrate the calculations of $r^2$, let us consider again the effect of duration of residence on attitude toward the city. It may be recalled from earlier calculations of the simple correlation coefficient that:

$$ SS_y = \sum_{i=1}^{n}(Y_i - \bar{Y})^2 = 120.9168 $$
The predicted values ($\hat{Y}$) can be calculated using the regression equation:

Attitude ($\hat{Y}$) = 1.0793 + 0.5897 (Duration of residence)

For the first observation in Table 17.1, this value is: $\hat{Y}$ = 1.0793 + 0.5897 x 10 = 6.9763. For each successive observation, the predicted values are, in order, 8.1557, 8.1557, 3.4381, 8.1557, 4.6175, 5.7969, 2.2587, 11.6939, 6.3866, 11.1042, and 2.2587.

Therefore,

$SS_{reg} = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2$
  = (6.9763 - 6.5833)² + (8.1557 - 6.5833)²
  + (8.1557 - 6.5833)² + (3.4381 - 6.5833)²
  + (8.1557 - 6.5833)² + (4.6175 - 6.5833)²
  + (5.7969 - 6.5833)² + (2.2587 - 6.5833)²
  + (11.6939 - 6.5833)² + (6.3866 - 6.5833)²
  + (11.1042 - 6.5833)² + (2.2587 - 6.5833)²
  = 0.1544 + 2.4724 + 2.4724 + 9.8922 + 2.4724
  + 3.8643 + 0.6184 + 18.7021 + 26.1182
  + 0.0387 + 20.4385 + 18.7021
  = 105.9524

$SS_{res} = \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2$
  = (6 - 6.9763)² + (9 - 8.1557)² + (8 - 8.1557)²
  + (3 - 3.4381)² + (10 - 8.1557)² + (4 - 4.6175)²
  + (5 - 5.7969)² + (2 - 2.2587)² + (11 - 11.6939)²
  + (9 - 6.3866)² + (10 - 11.1042)² + (2 - 2.2587)²
  = 14.9644

It can be seen that $SS_y = SS_{reg} + SS_{res}$. Furthermore,

$r^2 = SS_{reg}/SS_y$ = 105.9524/120.9168 = 0.8762
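The decomposition can be verified numerically. This sketch refits the line to the Table 17.1 data, computes the three sums of squares from their definitions, and confirms that the explained and residual parts add up to the total; all variable names are illustrative.

```python
# Table 17.1: duration of residence (X) and attitude toward the city (Y)
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]

n = len(duration)
mean_x = sum(duration) / n
mean_y = sum(attitude) / n

# Least-squares slope and intercept
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(duration, attitude))
     / sum((x - mean_x) ** 2 for x in duration))
a = mean_y - b * mean_x

predicted = [a + b * x for x in duration]

ss_y = sum((y - mean_y) ** 2 for y in attitude)                      # total
ss_reg = sum((y_hat - mean_y) ** 2 for y_hat in predicted)           # explained
ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(attitude, predicted))  # residual

r_squared = ss_reg / ss_y

print(f"SSy = {ss_y:.4f}, SSreg = {ss_reg:.4f}, SSres = {ss_res:.4f}")
print(f"r^2 = {r_squared:.4f}")
```

Small differences from the hand-computed 105.9524 arise because the worked example rounds the predicted values to four decimals.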
Another, equivalent test for examining the significance of the linear relationship between X and Y (significance of b) is the test for the significance of the coefficient of determination. The hypotheses in this case are:

$$ H_0: R^2_{pop} = 0 \qquad H_1: R^2_{pop} > 0 $$

The appropriate test statistic is the F statistic:

$$ F = \frac{SS_{reg}}{SS_{res}/(n-2)} $$

which has an F distribution with 1 and n - 2 degrees of freedom. The F test is a generalized form of the t test (see Chapter 15). If a random variable is t distributed with n degrees of freedom, then $t^2$ is F distributed with 1 and n degrees of freedom. Hence, the F test for testing the significance of the coefficient of determination is equivalent to testing the following hypotheses:

$$ H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0 $$

or

$$ H_0: \rho = 0 \qquad H_1: \rho \neq 0 $$
From Table 17.2, it can be seen that:

$r^2$ = 105.9522/(105.9522 + 14.9644) = 0.8762

which is the same as the value calculated earlier. The value of the F statistic is:

F = 105.9522/(14.9644/10) = 70.8027

with 1 and 10 degrees of freedom. The calculated F statistic exceeds the critical value of 4.96 determined from Table 5 in the Statistical Appendix. Therefore, the relationship is significant at $\alpha$ = 0.05, corroborating the results of the t test.
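A minimal sketch of this F test, using the sums of squares reported in Table 17.2 and the critical value 4.96 quoted in the text. It also checks the stated equivalence with the t test by comparing F with the square of the reported t value (8.414); the function name is a hypothetical choice.

```python
def f_statistic_bivariate(ss_reg, ss_res, n):
    """F = SSreg / (SSres / (n - 2)), with 1 and n - 2 degrees of freedom."""
    return ss_reg / (ss_res / (n - 2))


# Values reported in Table 17.2
SS_REG = 105.95222
SS_RES = 14.96444
N = 12

F_CRITICAL = 4.96            # 1 and 10 df, alpha = 0.05 (from the text)
T_REPORTED = 8.414           # t statistic for the slope in Table 17.2

f_value = f_statistic_bivariate(SS_REG, SS_RES, N)
significant = f_value > F_CRITICAL
t_squared = T_REPORTED ** 2  # should be close to F in the bivariate case

print(f"F = {f_value:.4f}, t^2 = {t_squared:.4f}, significant: {significant}")
```

The small gap between F and t² comes only from the rounding of the reported t value.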
Table 17.2 Bivariate Regression

Multiple R        0.93608
R²                0.87624
Adjusted R²       0.86387
Standard Error    1.22329

ANALYSIS OF VARIANCE
              df    Sum of Squares    Mean Square
Regression     1    105.95222         105.95222
Residual      10     14.96444           1.49644
F = 70.80266        Significance of F = 0.0000

VARIABLES IN THE EQUATION
Variable      b          SEb        Beta (β)    T        Significance of T
Duration      0.58972    0.07008    0.93608     8.414    0.0000
(Constant)    1.07932    0.74335                1.452    0.1772
Check Prediction Accuracy
To estimate the accuracy of predicted values, $\hat{Y}$, it is useful to calculate the standard error of estimate, SEE.

$$ SEE = \sqrt{\frac{\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2}{n-2}} $$

or

$$ SEE = \sqrt{\frac{SS_{res}}{n-2}} $$

or more generally, if there are k independent variables,

$$ SEE = \sqrt{\frac{SS_{res}}{n-k-1}} $$

For the data given in Table 17.2, the SEE is estimated as follows:

$$ SEE = \sqrt{14.9644/(12-2)} = 1.22329 $$

Assumptions
- The error term is normally distributed. For each fixed value of X, the distribution of Y is normal.
- The means of all these normal distributions of Y, given X, lie on a straight line with slope b.
- The mean of the error term is 0.
- The variance of the error term is constant. This variance does not depend on the values assumed by X.
- The error terms are uncorrelated. In other words, the observations have been drawn independently.

Multiple Regression
The general form of the multiple regression model is as follows:

$$ Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_k X_k + e $$

which is estimated by the following equation:

$$ \hat{Y} = a + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_k X_k $$

As before, the coefficient a represents the intercept, but the b's are now the partial regression coefficients.

Statistics Associated with Multiple Regression
- Adjusted R². R², the coefficient of multiple determination, is adjusted for the number of independent variables and the sample size to account for diminishing returns. After the first few variables, the additional independent variables do not make much contribution.
- Coefficient of multiple determination. The strength of association in multiple regression is measured by the square of the multiple correlation coefficient, R², which is also called the coefficient of multiple determination.
- F test. The F test is used to test the null hypothesis that the coefficient of multiple determination in the population, $R^2_{pop}$, is zero. This is equivalent to testing the null hypothesis $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$. The test statistic has an F distribution with k and (n - k - 1) degrees of freedom.
- Partial F test. The significance of a partial regression coefficient, $\beta_i$, of $X_i$ may be tested using an incremental F statistic. The incremental F statistic is based on the increment in the explained sum of squares resulting from the addition of the independent variable $X_i$ to the regression equation after all the other independent variables have been included.
- Partial regression coefficient. The partial regression coefficient, $b_1$, denotes the change in the predicted value, $\hat{Y}$, per unit change in $X_1$ when the other independent variables, $X_2$ to $X_k$, are held constant.

Conducting Multiple Regression Analysis
Partial Regression Coefficients
To understand the meaning of a partial regression coefficient, let us consider a case in which there are two independent variables, so that:

$$ \hat{Y} = a + b_1 X_1 + b_2 X_2 $$

First, note that the relative magnitude of the partial regression coefficient of an independent variable is, in general, different from that of its bivariate regression coefficient.

The interpretation of the partial regression coefficient, $b_1$, is that it represents the expected change in Y when $X_1$ is changed by one unit but $X_2$ is held constant or otherwise controlled. Likewise, $b_2$ represents the expected change in Y for a unit change in $X_2$, when $X_1$ is held constant. Thus, calling $b_1$ and $b_2$ partial regression coefficients is appropriate.

It can also be seen that the combined effects of $X_1$ and $X_2$ on Y are additive. In other words, if $X_1$ and $X_2$ are each changed by one unit, the expected change in Y would be ($b_1 + b_2$).

Suppose one was to remove the effect of $X_2$ from $X_1$. This could be done by running a regression of $X_1$ on $X_2$. In other words, one would estimate the equation $\hat{X}_1 = a + bX_2$ and calculate the residual $X_r = (X_1 - \hat{X}_1)$. The partial regression coefficient, $b_1$, is then equal to the bivariate regression coefficient, $b_r$, obtained from the equation $\hat{Y} = a + b_r X_r$.
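The residual argument above can be demonstrated on the Table 17.1 data, taking duration as X1, importance attached to weather as X2, and attitude as Y. The sketch regresses X1 on X2, regresses Y on the resulting residuals, and compares that slope with the coefficient of X1 from a direct two-predictor least-squares solution. The closed-form solution of the normal equations and the helper names are assumptions introduced for this example.

```python
# Table 17.1 data
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]      # Y
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]   # X1
importance = [3, 11, 4, 1, 11, 1, 7, 4, 8, 10, 8, 5]    # X2


def centered_cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def fit_line(x, y):
    """Least-squares intercept and slope for the regression of y on x."""
    slope = centered_cross(x, y) / centered_cross(x, x)
    intercept = sum(y) / len(y) - slope * sum(x) / len(x)
    return intercept, slope


# Step 1: regress X1 on X2 and keep the residuals Xr = X1 - X1_hat.
a_12, b_12 = fit_line(importance, duration)
x_r = [x1 - (a_12 + b_12 * x2) for x1, x2 in zip(duration, importance)]

# Step 2: regress Y on the residuals; the slope is b_r.
_, b_r = fit_line(x_r, attitude)

# Direct two-predictor solution for b1 from the normal equations.
s11 = centered_cross(duration, duration)
s22 = centered_cross(importance, importance)
s12 = centered_cross(duration, importance)
s1y = centered_cross(duration, attitude)
s2y = centered_cross(importance, attitude)
b1_direct = (s1y * s22 - s2y * s12) / (s11 * s22 - s12 ** 2)

print(f"b_r = {b_r:.5f}, b1 = {b1_direct:.5f}")
```

Both routes give the same number because removing the part of X1 that X2 can predict leaves only the variation in X1 that the partial coefficient is meant to describe.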
Extension to the case of k variables is straightforward. The partial regression coefficient, $b_1$, represents the expected change in Y when $X_1$ is changed by one unit and $X_2$ through $X_k$ are held constant. It can also be interpreted as the bivariate regression coefficient, b, for the regression of Y on the residuals of $X_1$, when the effect of $X_2$ through $X_k$ has been removed from $X_1$.

The relationship of the standardized to the non-standardized coefficients remains the same as before:

$B_1 = b_1(S_{x1}/S_y)$
...
$B_k = b_k(S_{xk}/S_y)$

The estimated regression equation is:

$\hat{Y}$ = 0.33732 + 0.48108 $X_1$ + 0.28865 $X_2$
or
Attitude = 0.33732 + 0.48108 (Duration) + 0.28865 (Importance)

Table 17.3 Multiple Regression

Multiple R        0.97210
R²                0.94498
Adjusted R²       0.93276
Standard Error    0.85974

ANALYSIS OF VARIANCE
              df    Sum of Squares    Mean Square
Regression     2    114.26425         57.13213
Residual       9      6.65241          0.73916
F = 77.29364        Significance of F = 0.0000

VARIABLES IN THE EQUATION
Variable      b          SEb        Beta (β)    T        Significance of T
IMPORTANCE    0.28865    0.08608    0.31382     3.353    0.0085
DURATION      0.48108    0.05895    0.76363     8.160    0.0000
(Constant)    0.33732    0.56736                0.595    0.5668
Strength of Association

$$ SS_y = SS_{reg} + SS_{res} $$

where

$$ SS_y = \sum_{i=1}^{n}(Y_i - \bar{Y})^2 $$

$$ SS_{reg} = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2 $$

$$ SS_{res} = \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2 $$

The strength of association is measured by the square of the multiple correlation coefficient, R², which is also called the coefficient of multiple determination.

$$ R^2 = \frac{SS_{reg}}{SS_y} $$

R² is adjusted for the number of independent variables and the sample size by using the following formula:

$$ \text{Adjusted } R^2 = R^2 - \frac{k(1 - R^2)}{n - k - 1} $$
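The adjustment can be checked against the two regression tables. The sketch below assumes the denominator n - k - 1, which is the form consistent with the Adjusted R² values reported in Tables 17.2 and 17.3; the function name is illustrative.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 = R^2 - k(1 - R^2) / (n - k - 1)."""
    return r_squared - k * (1 - r_squared) / (n - k - 1)


N = 12  # respondents in Table 17.1

# Bivariate model (Table 17.2): one independent variable
adj_bivariate = adjusted_r_squared(0.87624, N, k=1)

# Multiple regression (Table 17.3): two independent variables
adj_multiple = adjusted_r_squared(0.94498, N, k=2)

print(f"bivariate: {adj_bivariate:.5f}, multiple: {adj_multiple:.5f}")
```

The penalty grows with k, so adding a weak predictor can lower the adjusted value even though the unadjusted R² never decreases.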