Duke@Coursera Data Analysis and Statistical Inference, Unit 3: Foundations for Inference

统计学家 (Statistician)

Published 2019-04-10 16:47:40

Included in the column: 机器学习与统计学 (Machine Learning and Statistics)

I. sampling variability & CLT

sampling distribution
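As a rough illustration of a sampling distribution, the sketch below repeatedly draws samples from a skewed population and looks at how the sample means behave. The exponential population, the sample size of 50, and the 5000 repetitions are assumptions made for this example only; they do not come from the course material.

```python
import random
import statistics

random.seed(0)

# Hypothetical right-skewed population: exponential with mean 1 and SD 1.
def sample_mean(n):
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

n = 50
means = [sample_mean(n) for _ in range(5000)]

# The CLT suggests the sample means are centered near the population mean
# and have a spread near sigma / sqrt(n).
center = statistics.fmean(means)
spread = statistics.stdev(means)
print(round(center, 3), round(spread, 3), round(1 / n ** 0.5, 3))
```

The spread of the sample means is what the later sections call the standard error.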

II. confidence interval

A plausible range of values for the population parameter is called a confidence interval.

‣ If we report a point estimate, we probably won’t hit the exact population parameter.

‣ If we report a range of plausible values, we have a good shot at capturing the parameter.

Confidence interval for a population mean: computed as the sample mean plus/minus a margin of error (the critical value corresponding to the middle XX% of the normal distribution times the standard error of the sampling distribution).

Conditions for this confidence interval:

1. Independence: Sampled observations must be independent.

‣ random sample/assignment

‣ if sampling without replacement, n < 10% of population

2. Sample size/skew: n ≥ 30, larger if the population distribution is very skewed.
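The recipe "sample mean plus/minus critical value times standard error" can be sketched as follows. The function name `mean_confidence_interval` and the data are hypothetical, and the sketch assumes the conditions above (independence, n ≥ 30) already hold.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev

def mean_confidence_interval(data, confidence=0.95):
    """Normal-based confidence interval for a population mean."""
    n = len(data)
    x_bar = fmean(data)
    se = stdev(data) / sqrt(n)  # standard error of the sample mean
    # Critical value cutting off the middle `confidence` share of N(0, 1).
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin_of_error = z_star * se
    return x_bar - margin_of_error, x_bar + margin_of_error

# Hypothetical sample of 40 observations (made up for illustration).
data = [2, 3, 5, 1, 4] * 8
low, high = mean_confidence_interval(data)
print(round(low, 3), round(high, 3))
```

Raising `confidence` widens the interval, because the critical value grows while the standard error stays the same.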

III. accuracy vs. precision

‣ Suppose we took many samples and built a confidence interval from each sample using the equation point estimate ± 1.96 × SE.

‣ Then about 95% of those intervals would contain the true population mean (μ).

‣ Commonly used confidence levels in practice are 90%, 95%, 98%, and 99%.
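The "about 95% of those intervals" statement can be checked by simulation. This is a minimal sketch assuming a hypothetical normal population (mean 10, SD 2) and samples of size 50; none of these numbers come from the course.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev

random.seed(1)

# Hypothetical population and study design.
MU, SIGMA, N, REPS = 10.0, 2.0, 50, 2000
z_star = NormalDist().inv_cdf(0.975)  # critical value for 95% confidence

def interval_covers_mu():
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    margin_of_error = z_star * stdev(sample) / sqrt(N)
    return abs(fmean(sample) - MU) <= margin_of_error

# Share of the simulated intervals that contain the true mean.
coverage = sum(interval_covers_mu() for _ in range(REPS)) / REPS
print(coverage)
```

The confidence level describes the long-run behavior of the method, not the chance that one particular interval is correct.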

IV. required sample size for ME

backtracking to n for a given ME

Given a target margin of error, a confidence level, and information on the variability of the sample (or the population), we can determine the required sample size to achieve the desired margin of error.
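Solving ME = z* × s / √n for n gives n = (z* × s / ME)², rounded up to the next whole observation. The sketch below applies that rearrangement; the function name and the inputs (SD of 18, target ME of 4) are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sd, margin_of_error, confidence=0.95):
    """Smallest n whose normal-based margin of error is at most the target."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round up: a fractional observation is not possible, and rounding
    # down would leave the margin of error slightly above the target.
    return ceil((z_star * sd / margin_of_error) ** 2)

n = required_sample_size(sd=18, margin_of_error=4)
n_half_me = required_sample_size(sd=18, margin_of_error=2)
print(n, n_half_me)
```

Because n grows with the square of 1/ME, halving the margin of error requires roughly four times as many observations.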

V. another introduction to inference

two competing claims…

1. “There is nothing going on.”

Promotion and gender are independent, there is no gender discrimination, and the observed difference in proportions is simply due to chance. → Null hypothesis

2. “There is something going on.”

Promotion and gender are dependent, there is gender discrimination, and the observed difference in proportions is not due to chance. → Alternative hypothesis

Since it was quite unlikely to obtain results like the actual data or something more extreme in the simulations (male promotions being 30% or more higher than female promotions), we decided to reject the null hypothesis in favor of the alternative.
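A simulation of this kind can be sketched as a randomization test: shuffle the promotion outcomes across the files as if gender did not matter, and count how often the shuffled difference is at least as large as the observed one. The counts below (24 male and 24 female files, 21 and 14 promotions) are assumptions chosen to give a difference of about 30%; they are not stated in the text above.

```python
import random

random.seed(2)

# Hypothetical counts, not given in the text.
N_MALE, N_FEMALE = 24, 24
PROMOTED_MALE, PROMOTED_FEMALE = 21, 14

observed_diff = PROMOTED_MALE / N_MALE - PROMOTED_FEMALE / N_FEMALE

# 1 = promoted, 0 = not promoted, pooled across both groups.
outcomes = [1] * (PROMOTED_MALE + PROMOTED_FEMALE)
outcomes += [0] * (N_MALE + N_FEMALE - len(outcomes))

def shuffled_diff():
    """Difference in promotion rates after randomly reassigning outcomes."""
    random.shuffle(outcomes)
    male, female = outcomes[:N_MALE], outcomes[N_MALE:]
    return sum(male) / N_MALE - sum(female) / N_FEMALE

REPS = 10_000
# Small tolerance guards against floating-point ties.
extreme = sum(shuffled_diff() >= observed_diff - 1e-12 for _ in range(REPS))
p_value = extreme / REPS
print(round(observed_diff, 3), p_value)
```

A small share of shuffles reaching the observed difference is what "quite unlikely under the null hypothesis" means in practice.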

recap: hypothesis testing framework

‣ We start with a null hypothesis (H0) that represents the status quo.

‣ We also have an alternative hypothesis (HA) that represents our research question, i.e. what we’re testing for.

‣ We conduct a hypothesis test under the assumption that the null hypothesis is true, either via simulation (end of Unit 1) or via theoretical methods, i.e. methods that rely on the CLT (in this unit).

‣ If the test results suggest that the data do not provide convincing evidence for the alternative hypothesis, we stick with the null hypothesis. If they do, then we reject the null hypothesis in favor of the alternative.

VI. hypothesis testing

hypotheses

null - H0: Often either a skeptical perspective or a claim to be tested.

alternative - HA: Represents an alternative claim under consideration and is often represented by a range of possible parameter values.

The skeptic will not abandon H0 unless the evidence in favor of HA is so strong that she rejects H0 in favor of HA.

p-value

P(observed or more extreme outcome | H0 true)

decision based on the p-value

‣ We use the test statistic to calculate the p-value: the probability of observing data at least as favorable to the alternative hypothesis as our current data set, if the null hypothesis were true.

‣ If the p-value is low (lower than the significance level, α, which is usually 5%), we say that it would be very unlikely to observe the data if the null hypothesis were true, and hence reject H0.

‣ If the p-value is high (higher than α), we say that it is likely to observe the data even if the null hypothesis were true, and hence do not reject H0.

interpreting the p-value

‣ If in fact college students have been in 3 exclusive relationships on average, there is a 21% chance that a random sample of 50 college students would yield a sample mean of 3.2 or higher.

‣ This is a pretty high probability, so we think that a sample mean of 3.2 or more exclusive relationships is likely to happen simply by chance.

making a decision

‣ Since the p-value is high (higher than 5%), we fail to reject H0.

‣ The data do not provide convincing evidence that college students have been in more than 3 relationships on average.

‣ The difference between the null value of 3 relationships and the observed sample mean of 3.2 relationships is due to chance or sampling variability.
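The 21% figure can be approximately reproduced with a one-sided z-test. The text gives the null value (3), the sample mean (3.2), and the sample size (50) but not the sample standard deviation; the value `s = 1.74` below is an assumption, chosen because it yields a p-value close to the quoted one.

```python
from math import sqrt
from statistics import NormalDist

null_mean, x_bar, n = 3.0, 3.2, 50  # values quoted in the text
s = 1.74                            # ASSUMED sample SD, not given in the text

se = s / sqrt(n)                    # standard error of the sample mean
z = (x_bar - null_mean) / se        # test statistic
p_value = 1 - NormalDist().cdf(z)   # P(sample mean >= 3.2 | H0 true)

alpha = 0.05
reject_h0 = p_value < alpha
print(round(z, 2), round(p_value, 2), reject_h0)
```

Under this assumption the p-value comes out at roughly 0.21, well above α = 0.05, which matches the decision not to reject H0.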

two-sided tests

‣ Often, instead of looking for a divergence from the null in a specific direction, we might be interested in divergence in any direction.

‣ We call such hypothesis tests two-sided (or two-tailed).

‣ The definition of a p-value is the same regardless of whether we do a one-sided or two-sided test; however, the calculation is slightly different, since we need to consider “at least as extreme as the observed outcome” in both directions.
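For a symmetric sampling distribution such as the normal, "both directions" amounts to doubling the one-tail area. A minimal sketch, with the hypothetical helper `p_value_from_z`, assuming that in the one-sided case the observed statistic falls in the direction of the alternative:

```python
from statistics import NormalDist

def p_value_from_z(z, two_sided=False):
    """Tail area beyond |z| under N(0, 1); doubled for a two-sided test."""
    tail = 1 - NormalDist().cdf(abs(z))
    return 2 * tail if two_sided else tail

z = 1.96  # hypothetical test statistic
one_sided = p_value_from_z(z)
two_sided = p_value_from_z(z, two_sided=True)
print(round(one_sided, 3), round(two_sided, 3))
```

The same test statistic therefore gives weaker evidence against H0 in a two-sided test than in a one-sided test.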

VII. inference for other estimators

nearly normal sampling distributions

unbiased estimator

An important assumption about point estimates is that they are unbiased, i.e. the sampling distribution of the estimate is centered at the true population parameter it estimates.

‣ That is, an unbiased estimate does not naturally over- or underestimate the parameter; it provides a “good” estimate.

‣ The sample mean is an example of an unbiased point estimate, as well as others we just listed.

confidence intervals for nearly normal point estimates

hypothesis testing for nearly normal point estimates
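For a point estimate whose sampling distribution is nearly normal, both procedures plausibly take the same general form as the one described earlier for the sample mean. The block below is a sketch under that assumption; the symbols z* (critical value) and SE (standard error of the estimate) are notation introduced here for illustration.

```latex
\text{confidence interval: } \quad \text{point estimate} \pm z^{\star} \times SE

\text{test statistic: } \quad Z = \frac{\text{point estimate} - \text{null value}}{SE}
```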

VIII. decision errors

hypothesis test as a trial

If we again think of a hypothesis test as a criminal trial, then it makes sense to frame the verdict in terms of the null and alternative hypotheses:

H0 : Defendant is innocent

HA : Defendant is guilty

type 1 error rate

‣ We reject H0 when the p-value is less than 0.05 (α = 0.05).

‣ This means that, for those cases where H0 is actually true, we do not want to incorrectly reject it more than 5% of those times.

‣ In other words, when using a 5% significance level there is about a 5% chance of making a Type 1 error if the null hypothesis is true.

P(Type 1 error | H0 true) = α

‣ This is why we prefer small values of α: increasing α increases the Type 1 error rate.
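The identity P(Type 1 error | H0 true) = α can be checked by simulating many studies in which H0 really is true and counting the rejections. The population (mean 0, known SD 1), the sample size, and the number of repetitions are assumptions for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(3)

# Hypothetical setup in which the null hypothesis is true by construction.
NULL_MEAN, SIGMA, N, ALPHA, REPS = 0.0, 1.0, 40, 0.05, 4000
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-sided critical value

def rejects_true_null():
    sample = [random.gauss(NULL_MEAN, SIGMA) for _ in range(N)]
    z = (fmean(sample) - NULL_MEAN) / (SIGMA / sqrt(N))
    return abs(z) > z_crit

# Share of simulated studies that wrongly reject H0.
type1_rate = sum(rejects_true_null() for _ in range(REPS)) / REPS
print(type1_rate)
```

The rejection rate should land close to `ALPHA`; rerunning with a larger `ALPHA` raises it accordingly.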

choosing α

type 2 error rate

If the alternative hypothesis is actually true, what is the chance that we make a Type 2 error, i.e. we fail to reject the null hypothesis even when we should reject it?

‣ The answer is not obvious.

‣ If the true population average is very close to the null value, it will be difficult to detect a difference (and reject H0).

‣ If the true population average is very different from the null value, it will be easier to detect a difference.

‣ Clearly, β depends on the effect size (δ), the difference between the point estimate and the null value.
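How β shrinks as the effect size grows can be sketched for a one-sided (upper-tail) z-test. The function name, the standard deviation, and the sample size are hypothetical, and `effect_size` is taken here as the true mean minus the null value.

```python
from math import sqrt
from statistics import NormalDist

def type2_error_rate(effect_size, sd, n, alpha=0.05):
    """beta for an upper-tail z-test when the true mean exceeds the
    null value by `effect_size`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    se = sd / sqrt(n)
    # Power: chance the test statistic clears the critical value.
    power = 1 - std_normal.cdf(z_crit - effect_size / se)
    return 1 - power

# Hypothetical design: SD 1.74, n = 50.
betas = [type2_error_rate(delta, sd=1.74, n=50) for delta in (0.1, 0.5, 1.0)]
print([round(b, 3) for b in betas])
```

With an effect size of zero, β equals 1 − α, and it falls toward zero as the true mean moves away from the null value.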

IX. significance vs. confidence level

agreement of CI and HT

‣ A two-sided hypothesis test with a threshold of α is equivalent to a confidence interval with CL = 1 − α.

‣ A one-sided hypothesis test with a threshold of α is equivalent to a confidence interval with CL = 1 − (2 × α).

‣ If H0 is rejected, a confidence interval that agrees with the result of the hypothesis test should not include the null value.

‣ If we fail to reject H0, a confidence interval that agrees with the result of the hypothesis test should include the null value.
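The agreement between a two-sided test at level α and a confidence interval at level 1 − α can be checked numerically. This is a sketch with a hypothetical helper and made-up estimates; it returns both decisions so they can be compared.

```python
from statistics import NormalDist

def two_sided_test_and_ci(estimate, se, null_value, alpha=0.05):
    """Return (test rejects H0, CI at level 1 - alpha excludes null value)."""
    std_normal = NormalDist()
    z_star = std_normal.inv_cdf(1 - alpha / 2)
    z = abs(estimate - null_value) / se
    p_value = 2 * (1 - std_normal.cdf(z))
    low, high = estimate - z_star * se, estimate + z_star * se
    rejects = p_value < alpha
    ci_excludes_null = not (low <= null_value <= high)
    return rejects, ci_excludes_null

# Hypothetical estimates tested against a null value of 3.
for estimate in (3.2, 3.6):
    print(estimate, two_sided_test_and_ci(estimate, se=0.246, null_value=3.0))
```

Both flags match in each case, because |z| > z* is the same condition as the null value lying more than z* × SE from the estimate.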

X. statistical vs. practical significance

‣ Real differences between the point estimate and null value are easier to detect with larger samples.

‣ However, very large samples will result in statistical significance even for tiny differences between the sample mean and the null value (effect size), even when the difference is not practically significant.
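To see the effect of sample size, the sketch below keeps a tiny difference fixed and only changes n. The numbers (sample mean 50.1 against a null value of 50, SD 10) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_p(x_bar, null_mean, sd, n):
    """Upper-tail p-value of a z-test for the mean."""
    z = (x_bar - null_mean) / (sd / sqrt(n))
    return 1 - NormalDist().cdf(z)

# Same 0.1-unit difference, increasingly large samples.
p_values = {n: one_sided_p(x_bar=50.1, null_mean=50.0, sd=10.0, n=n)
            for n in (100, 10_000, 1_000_000)}
for n, p in p_values.items():
    print(n, p)
```

The difference never changes, yet the p-value drops below any usual α once n is large enough, so significance alone says nothing about whether a 0.1-unit difference matters.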

Originally published: 2015-05-06

This article is shared from the WeChat official account 机器学习与统计学 (Machine Learning and Statistics).
