A PowerPoint Presentation Package to Accompany Applied Statistics in Business & Economics, 4th edition

Chapter 4: Descriptive Statistics

Chapter Contents
- 4.1 Numerical Description
- 4.2 Measures of Center
- 4.3 Measures of Variability
- 4.4 Standardized Data
- 4.5 Percentiles, Quartiles, and Box Plots
- 4.6 Correlation and Covariance
- 4.7 Grouped Data
- 4.8 Skewness and Kurtosis

Chapter Learning Objectives
- LO4-1: Explain the concepts of center, variability, and shape.
- LO4-2: Use Excel to obtain descriptive statistics and visual displays.
- LO4-3: Calculate and interpret common measures of center.
- LO4-4: Calculate and interpret common measures of variability.
- LO4-5: Transform a data set into standardized values.
- LO4-6: Apply the empirical rule and recognize outliers.
- LO4-7: Calculate quartiles and other percentiles.
- LO4-8: Make and interpret box plots.
- LO4-9: Calculate and interpret a correlation coefficient and covariance.
- LO4-10: Calculate the mean and standard deviation from grouped data.
- LO4-11: Assess skewness and kurtosis in a sample.

4.1 Numerical Description

LO4-1: Explain the concepts of center, variability, and shape.

Three key characteristics of numerical data: center, variability, and shape.

LO4-2: Use Excel to obtain descriptive statistics and visual displays.

[Figure: Excel histogram display for Table 4.3]

4.2 Measures of Center

LO4-3: Calculate and interpret common measures of center.

Mean
- The mean is a familiar measure of center.
- Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$
- Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
- In Excel, use the function =AVERAGE(Data), where Data is an array of data values (Formulas > Insert Function > Statistical > AVERAGE).

The mean balances the positive and negative deviations $(x_i - \bar{x})$ from the mean, in that $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$.
Median
- The median (M) is the 50th percentile or midpoint of the sorted sample data.
- M separates the upper and lower halves of the sorted observations.
- If n is odd, the median is the middle observation in the data array.
- If n is even, the median is the average of the middle two observations in the data array.
- Excel: Formulas > Insert Function > Statistical > MEDIAN

Mode
- The mode is the most frequently occurring data value.
- A data set may have multiple modes or no mode.
- The mode is most useful for discrete or categorical data with only a few distinct data values. For continuous data or data with a wide range, the mode is rarely useful.

Shape
- Compare the mean and median, or look at the histogram, to determine the degree of skewness.
- Figure 4.10 shows prototype population shapes with varying degrees of skewness.

Geometric mean
- The geometric mean (G) is a multiplicative average: $G = \sqrt[n]{x_1 x_2 \cdots x_n}$.
- Excel: Formulas > Insert Function > Statistical > GEOMEAN

Growth rates
- A variation on the geometric mean is used to find the average growth rate for a time series: $GR = \sqrt[n-1]{x_n / x_1} - 1$.
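As a cross-check outside Excel, the measures of center described above can be sketched with Python's standard `statistics` module. The sample values are hypothetical (they are not the textbook's data) and were chosen so the results are easy to verify by hand.

```python
import statistics

# Hypothetical sample (not from the textbook), easy to check by hand.
data = [2, 4, 4, 8]

mean = statistics.mean(data)                # arithmetic mean, like =AVERAGE(Data)
median = statistics.median(data)            # n is even: average of the two middle values
modes = statistics.multimode(data)          # list of every most-frequent value
geo_mean = statistics.geometric_mean(data)  # n-th root of the product, like =GEOMEAN(Data)

# The positive and negative deviations about the mean cancel out.
deviation_sum = sum(x - mean for x in data)

print(mean, median, modes, round(geo_mean, 4), deviation_sum)
```

Here the geometric mean (4) falls below the arithmetic mean (4.5), which is the usual ordering for positive data that are not all equal.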
For example, from 2006 to 2010, JetBlue Airlines revenues are:

| Year | Revenue (mil) |
|---|---|
| 2006 | 2,361 |
| 2007 | 2,843 |
| 2008 | 3,392 |
| 2009 | 3,292 |
| 2010 | 3,779 |

The average growth rate is $GR = \sqrt[4]{3779/2361} - 1 = 1.125 - 1 = 0.125$, or 12.5% per year.

Midrange
- The midrange is the point halfway between the lowest and highest values of x: $\text{Midrange} = (x_{\min} + x_{\max})/2$.
- It is easy to use but sensitive to extreme data values.
- For the J.D. Power quality data, the midrange (126.5) is higher than the mean (114.70) or the median (113).

Trimmed mean
- To calculate the trimmed mean, first remove the highest and lowest k percent of the observations.
- For example, for the n = 33 P/E ratios, we want a 5 percent trimmed mean (i.e., k = 0.05).
- To determine how many observations to trim, multiply k by n: 0.05 × 33 = 1.65, which rounds to 2 observations. So we would remove the two smallest and two largest observations before averaging the remaining values.
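The growth-rate, midrange, and trimmed-mean calculations above can be sketched in Python as follows. The function names are illustrative, and the trimming rule (round k·n to the nearest whole number, as in the 1.65 → 2 example) is an assumption taken from the text; Excel's TRIMMEAN applies its own rounding convention, so its result may differ.

```python
def average_growth_rate(series):
    """Geometric average growth rate: (x_n / x_1) ** (1 / (n - 1)) - 1."""
    periods = len(series) - 1
    return (series[-1] / series[0]) ** (1 / periods) - 1


def midrange(values):
    """Point halfway between the smallest and largest values."""
    return (min(values) + max(values)) / 2


def trimmed_mean(values, k):
    """Drop round(k * n) observations from each end, then average the rest.

    Assumption: k * n is rounded to the nearest integer, as in the text.
    """
    ordered = sorted(values)
    cut = round(k * len(ordered))
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


# JetBlue revenues (millions), 2006-2010, from the example above.
revenue = [2361, 2843, 3392, 3292, 3779]
growth = average_growth_rate(revenue)
print(f"{growth:.1%}")  # 12.5%

# Number of observations trimmed from each end for n = 33, k = 0.05.
print(round(0.05 * 33))  # 2
```

Only the first and last revenue figures enter the growth-rate formula, so the dip in 2009 does not affect the result.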
Here is a summary of all the measures of central tendency for the J.D. Power data. The trimmed mean mitigates the effects of very high values, but still exceeds the median.

| Measure | Value | Excel |
|---|---|---|
| Mean | 114.70 | =AVERAGE(Data) |
| Median | 113 | =MEDIAN(Data) |
| Mode | 111 | =MODE.SNGL(Data) |
| Geometric mean | 113.35 | =GEOMEAN(Data) |
| Midrange | 126.5 | =(MIN(Data)+MAX(Data))/2 |
| 5% trimmed mean | 113.94 | =TRIMMEAN(Data, 0.1) |

4.3 Measures of Variability

LO4-4: Calculate and interpret common measures of variability.

Variation is the "spread" of data points about the center of the distribution in a sample. Consider the following measures of variability:

| Statistic | Formula | Excel | Pro | Con |
|---|---|---|---|---|
| Range | $x_{\max} - x_{\min}$ | =MAX(Data)-MIN(Data) | Easy to calculate. | Sensitive to extreme data values. |
| Sample variance ($s^2$) | $\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$ | =VAR.S(Data) | Plays a key role in mathematical statistics. | Nonintuitive meaning. |
| Sample standard deviation ($s$) | $\sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}}$ | =STDEV.S(Data) | Most common measure. Uses the same units as the raw data (dollars, grams, etc.). | Nonintuitive meaning. |
| Sample coefficient of variation (CV) | $100 \times \frac{s}{\bar{x}}$ | None | Measures relative variation in percent, so data sets can be compared. | Requires nonnegative data. |
| Mean absolute deviation (MAD) | $\frac{\sum_{i=1}^{n}\lvert x_i - \bar{x}\rvert}{n}$ | =AVEDEV(Data) | Easy to understand. | Lacks "nice" theoretical properties. |

Variance and standard deviation
- Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}$ (Excel: Formulas > Insert Function > Statistical > VAR.P)
- Population standard deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}}$ (Excel: Formulas > Insert Function > Statistical > STDEV.P)

Coefficient of variation
- Useful for comparing variables measured in different units or with different means.
- A unit-free measure of dispersion.
- Expressed as a percent of the mean.
- Only appropriate for nonnegative data. It is undefined if the mean is zero or negative.

Mean absolute deviation
- This statistic reveals the average distance from the center.
- Absolute values must be used, since otherwise the deviations around the mean would sum to zero.
- It is stated in the unit of measurement.
- The MAD is appealing because of its simple interpretation.
- Excel: Formulas > Insert Function > Statistical > AVEDEV
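The variability measures in this section can be computed side by side with a short Python sketch. The helper name `variability_summary` and the sample values are hypothetical; the `statistics` functions used (`variance`, `stdev`, `pvariance`, `pstdev`) correspond to the sample (n − 1) and population (N) denominators described above.

```python
import statistics


def variability_summary(values):
    """Return common measures of variability for a list of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return {
        "range": max(values) - min(values),
        "sample_variance": statistics.variance(values),
        "sample_sd": s,
        # CV is only meaningful for nonnegative data with a positive mean.
        "cv_percent": 100 * s / mean,
        # MAD uses absolute deviations; signed deviations would sum to zero.
        "mad": sum(abs(x - mean) for x in values) / n,
        "population_variance": statistics.pvariance(values),
        "population_sd": statistics.pstdev(values),
    }


# Hypothetical sample (not from the textbook); its mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
summary = variability_summary(data)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

For this sample the population standard deviation is exactly 2, while the sample version is slightly larger because it divides by n − 1 rather than n.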
Central tendency vs. dispersion: manufacturing
- Take frequent samples to monitor quality.

4.4 Standardized Data

Chebyshev's theorem
- For any population with mean $\mu$ and standard deviation $\sigma$, the percentage of observations that lie within k standard deviations of the mean must be at least $100\,[1 - 1/k^2]$.
- For k = 2 standard deviations, $100\,[1 - 1/2^2] = 75\%$, so at least 75.0% will lie within $\mu \pm 2\sigma$.
- For k = 3 standard deviations, $100\,[1 - 1/3^2] = 88.9\%$, so at least 88.9% will lie within $\mu \pm 3\sigma$.
- Although applicable to any data set, these limits tend to be rather wide.

The empirical rule
- The normal distribution is symmetric and is also known as the bell-shaped curve.
- The empirical rule states that for data from a normal distribution, we expect the interval $\mu \pm k\sigma$ to contain a known percentage of the data.
- For k = 1, 68.26% will lie within $\mu \pm 1\sigma$.
- For k = 2, 95.44% will lie within $\mu \pm 2\sigma$.
- For k = 3, 99.73% will lie within $\mu \pm 3\sigma$.
- Note: no upper bound is given. Data values outside $\mu \pm 3\sigma$ are rare.

Standardized variables

LO4-5: Transform a data set into standardized values.
- A standardized variable (z) redefines each observation in terms of the number of standard deviations from the mean.
- Standardization formula for a population: $z_i = \frac{x_i - \mu}{\sigma}$
- Standardization formula for a sample (for n > 30): $z_i = \frac{x_i - \bar{x}}{s}$
- A negative z value means the observation is to the left of the mean. A positive z means the observation is to the right of the mean.
- Excel: Formulas > Insert Function > Statistical > STANDARDIZE

LO4-6: Apply the empirical rule and recognize outliers.

Estimating sigma
- For a normal distribution, the range of values is almost $6\sigma$ (from $\mu - 3\sigma$ to $\mu + 3\sigma$).
- If you know the range R (high − low), you can estimate the standard deviation as $\sigma = R/6$.
- This is useful for approximating the standard deviation when only R is known.
- This estimate depends on the assumption of normality.

4.5 Percentiles, Quartiles, and Box Plots

Percentiles
- Percentiles are data that have been divided into 100 groups. For example, suppose you score in the 83rd percentile on a standardized test. That means that 83% of the test-takers scored below you.
- Deciles are data that have been divided into 10 groups.
- Quintiles are data that have been divided into 5 groups.
- Quartiles are data that have been divided into 4 groups.
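The Section 4.4 formulas (Chebyshev's bound, sample z-scores, the range-based estimate of sigma) and the "percent scoring below you" reading of a percentile can be sketched together in Python. All function names and sample values here are hypothetical, and `percent_below` is only one simple way to express a percentile rank.

```python
import statistics


def chebyshev_min_percent(k):
    """Minimum percent of observations within k standard deviations (k > 1)."""
    return 100 * (1 - 1 / k ** 2)


def standardize(values):
    """Sample z-scores: (x - mean) / s for each observation."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return [(x - mean) / s for x in values]


def estimate_sigma_from_range(low, high):
    """Rough estimate R / 6; assumes the data are approximately normal."""
    return (high - low) / 6


def percent_below(values, x):
    """Percent of observations strictly below x."""
    return 100 * sum(v < x for v in values) / len(values)


print(chebyshev_min_percent(2))            # 75.0
print(round(chebyshev_min_percent(3), 1))  # 88.9

scores = [1, 2, 3]  # hypothetical sample with mean 2 and s = 1
z = standardize(scores)
print(z)  # [-1.0, 0.0, 1.0]
```

Comparing the Chebyshev minimums (75% and 88.9%) with the empirical-rule figures (95.44% and 99.73%) shows how much tighter the limits become once normality is assumed.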
LO4-7: Calculate quartiles and other percentiles.
- Percentiles may be used to establish benchmarks for comparison purposes (e.g., health care, manufacturing, and banking industries use 5th, 25th, 50th, 75th, and 90th percentiles).
- Quartiles (25, 50, and 75 percent) are commonly used to assess financial performance and stock portfolios.
- Percentiles can be used in employee merit evaluation and salary benchmarking.
- Excel: Formulas > Insert Function > Statistical > PERCENTILE.INC
- Excel: Formulas > Insert Function > Statistical > QUARTILE.INC

Quartiles
- Quartiles are scale points that divide the sorted data into four groups of approximately equal size.
- The three values that separate the four groups are called Q1, Q2, and Q3, respectively.

Lower 25% | Q1 | Second 25% | Q2 | Third 25% | Q3 | Upper 25%
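For readers working outside Excel, quartiles and other cut points can be sketched with `statistics.quantiles` from Python's standard library. The data are hypothetical. The `method="inclusive"` option interpolates linearly at rank p·(n − 1) in the sorted data, which, as far as I understand, is the same convention PERCENTILE.INC and QUARTILE.INC use; treat that correspondence as an assumption to verify against your own spreadsheet.

```python
import statistics

# Hypothetical sample (not from the textbook).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 returns the three quartile cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# n=10 returns the nine decile cut points.
deciles = statistics.quantiles(data, n=10, method="inclusive")

print(q1, q2, q3)   # 3.0 5.0 7.0
print(deciles[0])   # first decile, interpolated between 1 and 2
```

Other quartile conventions, including the method of medians, can give slightly different values on small samples, so it helps to state which method was used.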
The second quartile Q2 is the median, a measure of central tendency. Q1 and Q3 measure dispersion, since the interquartile range Q3 − Q1 measures the degree of spread in the middle 50 percent of data values.

Lower 50% | Q2 | Upper 50%
Lower 25% | Q1 | Middle 50% | Q3 | Upper 25%

The first quartile Q1 is the median of the data values below Q2, and the third quartile Q3 is the median of the data values above Q2.
- For the first half of the data, 50% lie above and 50% lie below Q1.
- For the second half of the data, 50% lie above and 50% lie below Q3.

Quartiles: the method of medians

For small data sets, find quartiles using the method of medians:
- Step 1: Sort the observations.
- Step 2: Find the median Q2.
- Step 3: Find the median of the data values that lie below Q2.
- Step 4: Find the median of the data values that lie above Q2.

Example: P/E ratios and quartiles

So, to summarize:

Lower 25% of P/E ratios | Q1 = 27 | Second 25% of P/E ratios | Q2 = 35.5 | Third 25% of P/E ratios | Q3 = 40.5 | Upper 25% of P/E ratios

These quartiles express central tendency and dispersion. What is the interquartile range?

Box plots

LO4-8: Make and interpret box plots.
- A useful tool of exploratory data analysis (EDA).
- Also called a box-and-whisker plot.
- Based on a five-number summary: $x_{\min}$, Q1, Q2, Q3, $x_{\max}$.
- Consider the five-number summary for the previous P/E ratios example: 7, 27, 35.5, 40.5, 49.
- A box plot shows variability and shape.

[Figure: box plot with dot plot (MegaStat descriptive statistics)]

Box plots: fences and unusual data values

Use quartiles to detect unusual data points by defining fences using the following formulas:

| | Inner fences | Outer fences |
|---|---|---|
| Lower fence | $Q_1 - 1.5\,(Q_3 - Q_1)$ | $Q_1 - 3.0\,(Q_3 - Q_1)$ |
| Upper fence | $Q_3 + 1.5\,(Q_3 - Q_1)$ | $Q_3 + 3.0\,(Q_3 - Q_1)$ |

Values outside the inner fences are unusual, while those outside the outer fences are outliers.

[Figure: illustration of the fences]
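The method of medians and the five-number summary described earlier in this section can be sketched in Python as follows. The function names and data are hypothetical. One assumption is labeled in the code: when n is odd, the middle observation is left out of both halves, which is one common reading of "the data values that lie below (above) Q2".

```python
import statistics


def quartiles_by_medians(values):
    """Quartiles via the method of medians (Steps 1-4 in the text)."""
    ordered = sorted(values)                  # Step 1: sort the observations
    n = len(ordered)
    q2 = statistics.median(ordered)           # Step 2: median of all data
    half = n // 2
    lower = ordered[:half]
    # Assumption: for odd n the middle observation belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    q1 = statistics.median(lower)             # Step 3: median of the lower half
    q3 = statistics.median(upper)             # Step 4: median of the upper half
    return q1, q2, q3


def five_number_summary(values):
    """Return (xmin, Q1, Q2, Q3, xmax)."""
    q1, q2, q3 = quartiles_by_medians(values)
    return min(values), q1, q2, q3, max(values)


# Hypothetical small sample (not the textbook's P/E ratios).
data = [8, 3, 5, 1, 7, 2, 6, 4]
summary = five_number_summary(data)
iqr = summary[3] - summary[1]
print(summary, iqr)  # (1, 2.5, 4.5, 6.5, 8) 4.0
```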
For example, consider the J.D. Power quality data, with Q1 = 107 and Q3 = 126:

| | Inner fences | Outer fences |
|---|---|---|
| Lower fence | 107 − 1.5 (126 − 107) = 78.5 | 107 − 3.0 (126 − 107) = 50 |
| Upper fence | 126 + 1.5 (126 − 107) = 154.5 | 126 + 3.0 (126 − 107) = 183 |

There is one outlier (170) that lies above the inner fence. There are no extreme outliers that exceed the outer fence.

Truncate the whiskers at the fences and display unusual values and outliers as dots. Based on these fences, there is only one outlier.

Midhinge

The midhinge is the average of the first and third quartiles: $\text{Midhinge} = (Q_1 + Q_3)/2$. The name derives from the idea that, if the "box" were folded in half, it would resemble a "hinge".
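The fence arithmetic and the midhinge can be checked with a few lines of Python. This is a minimal sketch: the function names and the labels returned by `classify` are illustrative, and the quartiles (107 and 126) and the flagged value (170) are the ones used in the fence example above.

```python
def fences(q1, q3):
    """Inner and outer fences as (lower, upper) pairs."""
    iqr = q3 - q1
    return {
        "inner": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "outer": (q1 - 3.0 * iqr, q3 + 3.0 * iqr),
    }


def classify(x, q1, q3):
    """Say where x falls relative to the inner and outer fences."""
    limits = fences(q1, q3)
    inner_lo, inner_hi = limits["inner"]
    outer_lo, outer_hi = limits["outer"]
    if x < outer_lo or x > outer_hi:
        return "beyond outer fences"
    if x < inner_lo or x > inner_hi:
        return "beyond inner fences"
    return "inside fences"


def midhinge(q1, q3):
    """Average of the first and third quartiles."""
    return (q1 + q3) / 2


q1, q3 = 107, 126  # quartiles from the fence example above
limits = fences(q1, q3)
print(limits["inner"])        # (78.5, 154.5)
print(limits["outer"])        # (50.0, 183.0)
print(classify(170, q1, q3))  # beyond inner fences
print(midhinge(q1, q3))       # 116.5
```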