Duke@coursera Data Analysis and Statistical Inference, unit 4: inference for numerical variables

inference for numerical variables

I. hypothesis testing for paired data

hypotheses for paired means:
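
For paired data the test is usually set up on the per-pair differences: the null hypothesis says the mean difference is zero (H0: μ_diff = 0) and the alternative says it is not (HA: μ_diff ≠ 0). The sketch below illustrates that setup. The function name `paired_z_test` and the data are made up for illustration, and the normal approximation assumes a reasonably large number of pairs.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def paired_z_test(before, after):
    """Two-sided test of H0: mu_diff = 0 based on per-pair differences."""
    if len(before) != len(after):
        raise ValueError("paired data need two samples of equal length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    x_bar_diff = mean(diffs)            # point estimate of mu_diff
    se = stdev(diffs) / sqrt(n)         # standard error of the mean difference
    z = (x_bar_diff - 0) / se           # null value is 0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return x_bar_diff, se, z, p_value

# Hypothetical data: 36 subjects measured twice
before = [50 + (i % 7) for i in range(36)]
after = [b + (2 if i % 3 else -1) for i, b in enumerate(before)]

x_bar_diff, se, z, p_value = paired_z_test(before, after)
print(f"mean diff = {x_bar_diff:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p_value:.5f}")
```

The key design point is that the two columns are collapsed into a single column of differences first, so the rest is a one-sample test.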

II. confidence intervals for paired data

estimating the difference between paired means:
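
A common form of this interval is point estimate ± critical value × SE, applied to the per-pair differences. The sketch below assumes a large enough number of pairs for the normal critical value to be reasonable; `paired_mean_ci` and the data are hypothetical names and values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def paired_mean_ci(before, after, confidence=0.95):
    """Confidence interval for the mean of the paired differences."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    point_estimate = mean(diffs)
    se = stdev(diffs) / sqrt(n)
    # Critical value z* for the requested confidence level
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * se
    return point_estimate - margin, point_estimate + margin

# Hypothetical data: 36 subjects measured twice
before = [50 + (i % 7) for i in range(36)]
after = [b + (2 if i % 3 else -1) for i, b in enumerate(before)]

lower, upper = paired_mean_ci(before, after)
print(f"95% CI for the mean difference: ({lower:.3f}, {upper:.3f})")
```

If the interval excludes 0, that agrees with rejecting the null hypothesis of no difference at the matching significance level.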

III. comparing independent means

Conditions for inference for comparing two independent means:

1. Independence:

✓ within groups: sampled observations must be independent

‣ random sample/assignment

‣ if sampling without replacement, n < 10% of population

✓ between groups: the two groups must be independent of each other (non-paired)

2. Sample size/skew: Each sample size must be at least 30 (n1 ≥ 30 and n2 ≥ 30), larger if the population distributions are very skewed.

testing for a difference between independent means

‣ null hypothesis: no difference

‣ alternative hypothesis: some difference

‣ same conditions and SE as the confidence interval
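
Because the test and the interval share one standard error, both can be computed together. The sketch below uses the usual formula for two independent groups, SE = sqrt(s1²/n1 + s2²/n2), with a normal critical value (so it assumes both samples meet the size condition above). The function name `compare_independent_means` and the data are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def compare_independent_means(sample1, sample2, confidence=0.95):
    """CI and two-sided test for mu1 - mu2 with independent groups."""
    n1, n2 = len(sample1), len(sample2)
    diff = mean(sample1) - mean(sample2)             # point estimate
    se = sqrt(variance(sample1) / n1 + variance(sample2) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    ci = (diff - z_star * se, diff + z_star * se)
    z = (diff - 0) / se                              # H0: no difference
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, se, ci, z, p_value

# Hypothetical data: two unrelated groups of 40 and 35 observations
group_a = [10 + (i % 5) for i in range(40)]
group_b = [11 + (i % 5) for i in range(35)]

diff, se, ci, z, p_value = compare_independent_means(group_a, group_b)
print(f"diff = {diff:.3f}, SE = {se:.3f}, CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"z = {z:.2f}, p = {p_value:.4f}")
```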

summary

IV. bootstrapping

‣ An alternative approach to constructing confidence intervals is bootstrapping.

‣ This term comes from the phrase “pulling oneself up by one’s bootstraps”, which is a metaphor for accomplishing an impossible task without any outside help.

‣ In this case the (im)possible task is estimating a population parameter, and we’ll accomplish it using data from only the given sample.

bootstrapping scheme

(1) take a bootstrap sample - a random sample taken with replacement from the original sample, of the same size as the original sample

(2) calculate the bootstrap statistic - a statistic such as mean, median, proportion, etc. computed on the bootstrap sample

(3) repeat steps (1) and (2) many times to create a bootstrap distribution - a distribution of bootstrap statistics
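
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the helper names, the data, the number of repetitions and the fixed seed are all assumptions, and the percentile interval at the end is just one way of turning a bootstrap distribution into an interval.

```python
import random
from statistics import median

def bootstrap_distribution(sample, statistic, n_boot=2000, seed=0):
    """Repeat steps (1) and (2) n_boot times and collect the statistics."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    n = len(sample)
    boot_stats = []
    for _ in range(n_boot):
        # (1) resample with replacement, same size as the original sample
        resample = rng.choices(sample, k=n)
        # (2) compute the statistic on this bootstrap sample
        boot_stats.append(statistic(resample))
    return boot_stats                  # (3) the bootstrap distribution

def percentile_interval(boot_stats, confidence=0.95):
    """Take the middle `confidence` share of the bootstrap distribution."""
    ordered = sorted(boot_stats)
    alpha = (1 - confidence) / 2
    lo_idx = int(alpha * len(ordered))
    hi_idx = int((1 - alpha) * len(ordered)) - 1
    return ordered[lo_idx], ordered[hi_idx]

# Hypothetical sample of 20 observations
sample = [3, 5, 7, 8, 9, 10, 12, 13, 15, 21,
          4, 6, 7, 9, 11, 12, 14, 16, 18, 30]

boot_medians = bootstrap_distribution(sample, median)
lower, upper = percentile_interval(boot_medians)
print(f"sample median = {median(sample)}, 95% bootstrap interval = ({lower}, {upper})")
```

Passing the statistic as a function means the same code works for the mean, a proportion, or any other summary.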

bootstrapping limitations

‣ Conditions are not as rigid as for CLT-based methods.

‣ However, if the bootstrap distribution is extremely skewed or sparse, the bootstrap interval might be unreliable.

‣ A representative sample is required for generalizability. If the sample is biased, the estimates resulting from this sample will also be biased.

bootstrap vs. sampling distribution

‣ The sampling distribution is created by sampling (with replacement) from the population.

‣ The bootstrap distribution is created by sampling (with replacement) from the sample.

‣ Both are distributions of sample statistics.

V. t distribution

‣ when n is small and σ is unknown (almost always the case), use the t distribution to address the uncertainty of the standard error estimate

‣ bell shaped but with thicker tails than the normal

‣ observations are more likely to fall beyond 2 SDs from the mean

‣ the extra thick tails help mitigate the effect of a less reliable estimate for the standard error of the sampling distribution

‣ always centered at 0 (like the standard normal)

‣ has one parameter: degrees of freedom (df) - determines the thickness of the tails

‣ remember, the normal distribution has two parameters: mean and SD
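
To make the "thicker tails" point concrete, the sketch below compares the probability of landing more than 2 units from the center under the t distribution (for a few df values) with the same probability under the standard normal. It uses only the standard library: the t density is written out from its textbook formula and integrated numerically with Simpson's rule. The function names and the chosen df values are illustrative assumptions.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_two_sided_tail(cutoff, df, steps=2000):
    """P(|T| > cutoff), via Simpson's rule on [-cutoff, cutoff]."""
    h = 2 * cutoff / steps             # steps must be even for Simpson's rule
    total = t_pdf(-cutoff, df) + t_pdf(cutoff, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(-cutoff + i * h, df)
    inside = total * h / 3             # probability within the cutoff
    return 1 - inside

normal_tail = 2 * (1 - NormalDist().cdf(2))
print(f"normal: P(|Z| > 2) = {normal_tail:.4f}")
for df in (2, 5, 30):
    print(f"df = {df:>2}: P(|T| > 2) = {t_two_sided_tail(2, df):.4f}")
```

The tail probability shrinks toward the normal value as df grows, which is why the two distributions are nearly interchangeable for large samples.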

t statistic
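
For reference, the statistic generally takes the form below. The single-mean case on the right is given as an illustration; the symbols x̄ (sample mean), μ0 (null value), s (sample SD) and n (sample size) are notation assumed here rather than taken from the notes.

```latex
T = \frac{\text{point estimate} - \text{null value}}{SE},
\qquad
\text{for one mean: }\;
T = \frac{\bar{x} - \mu_0}{s / \sqrt{n}},
\quad df = n - 1
```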

VI. inference for a small sample mean

VII. inference for comparing two small sample means

VIII. comparing more than two means
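
Comparing more than two means is commonly handled with analysis of variance (ANOVA), which compares variability between groups to variability within groups through an F statistic. The sketch below is offered under that assumption; the function name `anova_f` and the three small groups are invented for illustration, and it only computes the statistic and its degrees of freedom, not a p-value.

```python
from statistics import mean

def anova_f(groups):
    """F statistic for one-way ANOVA: between-group vs within-group variability."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    k, n = len(groups), len(all_values)

    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variability of observations around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical data: three groups of three observations each
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
f_stat, df_between, df_within = anova_f(groups)
print(f"F = {f_stat:.2f} with df = ({df_between}, {df_within})")
```

A large F suggests that at least one group mean differs from the others; it does not say which one.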

This article is shared from the WeChat official account 机器学习与统计学 (tjxj666).

Originally published: 2015-05-07
