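This article collects my notes from studying Bayes' theorem and its applications, with the aim of providing a self-study tutorial that goes from simple to advanced.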
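Consider two events: the ball drawn is black, and the ball comes from jar 1.
Suppose we know P(black), the probability of drawing a black ball, and P(jar1|black), the probability that the ball came from jar 1 given that it is black (this is the quantity we want to compute, so for now just treat it as an unknown). Then we can compute the joint probability: P(black and jar1) = P(black)*P(jar1|black)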
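Alternatively, we can first choose jar 1 and then draw a black ball from it, which gives the joint probability:
P(black and jar1) = P(jar1)*P(black|jar1)
Combining the two formulas above, we get: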
P(black)*P(jar1|black) = P(jar1)*P(black|jar1)

P(jar1|black) = P(jar1)*P(black|jar1) / P(black)
= P(jar1)*P(black|jar1) / (P(jar1)*P(black|jar1) + P(jar2)*P(black|jar2))
= 0.5 * 0.75 / (0.5 * 0.75 + 0.5 * 0.5)
= 0.6
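The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `posterior_jar1` is made up for illustration, and the inputs are the values used in the derivation (P(jar1) = 0.5, P(black|jar1) = 0.75, P(black|jar2) = 0.5).

```python
from fractions import Fraction

def posterior_jar1(p_jar1, p_black_given_jar1, p_black_given_jar2):
    """P(jar1 | black) by Bayes' theorem when there are exactly two jars."""
    p_jar2 = 1 - p_jar1
    # Total probability of drawing a black ball: the denominator P(black).
    p_black = p_jar1 * p_black_given_jar1 + p_jar2 * p_black_given_jar2
    return p_jar1 * p_black_given_jar1 / p_black

result = posterior_jar1(Fraction(1, 2), Fraction(3, 4), Fraction(1, 2))
print(result, float(result))  # 3/5 0.6
```

Using `Fraction` keeps the result exact, so it can be compared directly with the hand calculation.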
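Note: this simple example serves as an introduction, and the solution shown is not the conventional one. In this example P(black) is easy to compute, so a single step is enough to find the probability that the black ball came from jar 1.

P(H|E) = P(H)*P(E|H) / P(E)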
where

- | denotes a conditional probability (so that (A|B) means A given B).
- H stands for any hypothesis whose probability may be affected by data (called evidence below). Often there are competing hypotheses, and the task is to determine which is the most probable.
- E corresponds to new data that were not used in computing the prior probability.
- P(H), the prior probability, is the estimate of the probability of the hypothesis H before the data E, the current evidence, is observed.
- P(H|E), the posterior probability, is the probability of H given E, i.e., after E is observed. This is what we want to know: the probability of a hypothesis given the observed evidence.
- P(E|H) is the probability of observing E given H. As a function of E with H fixed, this is the likelihood; it indicates the compatibility of the evidence with the given hypothesis. The likelihood function is a function of the evidence, E, while the posterior probability is a function of the hypothesis, H.
- P(E) is sometimes termed the marginal likelihood or "model evidence". This factor is the same for all possible hypotheses being considered (as is evident from the fact that the hypothesis H does not appear anywhere in the symbol, unlike for all the other factors), so this factor does not enter into determining the relative probabilities of different hypotheses.

For different values of H, only the factors P(H) and P(E|H), both in the numerator, affect the value of P(H|E): the posterior probability of a hypothesis is proportional to its prior probability (its inherent likeliness) and the newly acquired likelihood (its compatibility with the new observed evidence).
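The Bayesian method originates from an essay that Thomas Bayes wrote during his lifetime to solve an "inverse probability" problem. Before Bayes, "forward probability" could already be computed, for example: "suppose a closed bag contains N white balls and M black balls; what is the probability that a ball drawn at random is black?" A natural question in the reverse direction is: if we do not know the ratio of black to white balls in the bag beforehand, can we infer that ratio by drawing one (or more) balls at random and observing their colors?
This mirrors everyday experience: what we observe are surface results, and the underlying nature of things, like the ratio of black to white balls in the closed bag above, is hard to see directly. So, based on our observations, we propose guesses or hypotheses and then evaluate the probability of each one; that is, we compute how plausible the different guesses are, which is the posterior probability. For a continuous hypothesis space, we compute a probability density function over the guesses instead. In the end we arrive at the most plausible guess.
In short, the Bayesian method is a divide-and-conquer idea: a probability that is hard to compute directly is estimated from prior knowledge and the likelihood. It also reflects how we keep updating our earlier understanding as our observations deepen.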
P(H|D) = P(H)*P(D|H)/P(D)
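Bayesian inference consists of two steps. The first step is to enumerate, based on the observed data, all the distinct models, also called hypotheses. The second step is to use the models to predict the probability of unknown phenomena. At this point we do not pick the single most plausible model; instead we take a weighted average of the predictions of all the models, where each weight is the probability of the corresponding model.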
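At a school, 60% of the students are boys and 40% are girls. Boys always wear trousers; half of the girls wear trousers and half wear skirts. Suppose you are very nearsighted and are walking across campus without your glasses. A student wearing trousers walks toward you, but you cannot tell whether it is a boy or a girl. What is the probability that this student is a boy?
D: the student is wearing trousers.
H: A: the student is a boy; B: the student is a girl.
To find: P(A|D) = P(boy|trousers)
In table form: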
Hypothesis | Prior P(H) | Likelihood P(D\|H) | P(H)*P(D\|H) | Posterior P(H\|D) |
---|---|---|---|---|
A | 0.6 | 1 | 0.6 | 3/4 |
B | 0.4 | 0.5 | 0.2 | 1/4 |
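The table method can be written as a small reusable function: multiply each prior by its likelihood, then divide by the sum of those products. The sketch below is illustrative only; the name `bayes_table` and the dictionary layout are assumptions, not part of the original notes.

```python
from fractions import Fraction

def bayes_table(priors, likelihoods):
    """Return {hypothesis: posterior} given dicts of priors and likelihoods."""
    # Column 4 of the table: P(H) * P(D|H) for every hypothesis.
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    # The sum of that column is P(D), the normalizing constant.
    total = sum(unnormalized.values())
    return {h: u / total for h, u in unnormalized.items()}

priors = {"A": Fraction(6, 10), "B": Fraction(4, 10)}   # A: boy, B: girl
likelihoods = {"A": Fraction(1), "B": Fraction(1, 2)}   # P(trousers | H)
posterior = bayes_table(priors, likelihoods)
print(posterior["A"], posterior["B"])  # 3/4 1/4
```

The same function works for any finite set of mutually exclusive hypotheses, which is why the table layout is convenient for the later examples.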
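The color mix of M&M's differs between production years. In a 1994 bag, the mix is 30% brown, 20% yellow, 20% red, 10% green, 10% orange, and 10% tan; in a 1996 bag, it is 13% brown, 14% yellow, 13% red, 20% green, 16% orange, and 24% blue. Suppose you hold two M&M's, one orange and one green; one came from a 1994 bag and the other from a 1996 bag. What is the probability that the orange one came from the 1994 bag?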
Approach:
For convenience, the likelihoods of all hypotheses can be multiplied by the same constant factor without changing the result.
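One way to set up the M&M problem is with two hypotheses: A, the orange one is from 1994 and the green one from 1996; B, the orange one is from 1996 and the green one from 1994. The sketch below assumes equal priors for A and B (the problem gives no reason to prefer either) and deliberately leaves the likelihoods as products of percentages, that is, scaled by a common factor of 10,000, to show that the scaling cancels out. Variable names are illustrative.

```python
from fractions import Fraction

# Color mixes in percent, as given in the problem statement.
mix94 = {"brown": 30, "yellow": 20, "red": 20, "green": 10, "orange": 10, "tan": 10}
mix96 = {"brown": 13, "yellow": 14, "red": 13, "green": 20, "orange": 16, "blue": 24}

# Assumption: both assignments are equally likely a priori.
priors = {"A": Fraction(1, 2), "B": Fraction(1, 2)}

# Likelihoods in "percent squared" units; the common factor 100*100 cancels.
likelihoods = {
    "A": mix94["orange"] * mix96["green"],  # orange from 1994, green from 1996
    "B": mix96["orange"] * mix94["green"],  # orange from 1996, green from 1994
}

unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(unnormalized.values())
posterior = {h: u / total for h, u in unnormalized.items()}
print(posterior["A"], posterior["B"])  # 5/9 4/9
```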
Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice?
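Put another way: in the game show, Monty shows you three closed doors. There is a prize behind each door; one is a car and the other two are worthless. You pick a door, and if the car is behind it, you win the car. Suppose you pick door A, leaving doors B and C. Before opening door A, Monty, who knows where the car is, opens one of the other two doors that has no car behind it, say B (choosing at random if neither B nor C hides the car). Monty then asks whether you want to stick with A or switch to C.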
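The Monty Hall problem fits the same prior-times-likelihood pattern. The sketch below rests on an assumption about the host that should be stated loudly: Monty knows where the car is, never opens your door or the car's door, and chooses at random between B and C when the car is behind A. Under a different host behavior the likelihoods, and so the answer, would change.

```python
from fractions import Fraction

# Hypotheses: the car is behind door A, B, or C, equally likely a priori.
priors = {"A": Fraction(1, 3), "B": Fraction(1, 3), "C": Fraction(1, 3)}

# Data D: you picked A, Monty opened B, and there was no car behind B.
# Likelihoods P(D | H) under the assumed host behavior described above.
likelihoods = {
    "A": Fraction(1, 2),  # car behind A: Monty opens B or C with equal chance
    "B": Fraction(0),     # car behind B: Monty would never open B
    "C": Fraction(1),     # car behind C: Monty is forced to open B
}

unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(unnormalized.values())
posterior = {h: u / total for h, u in unnormalized.items()}
print(posterior["A"], posterior["C"])  # 1/3 2/3
```

Under these assumptions, switching to C wins with probability 2/3, while sticking with A wins with probability 1/3.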
Suppose that an individual’s probability of having cancer, assigned according to the general prevalence of cancer, is 1%. This is known as the ‘base rate’ or prior (i.e. before being informed about the particular case at hand) probability of having cancer.
Next, suppose that the person is 65 years old. Let us assume that cancer and age are related. The probability that a person has cancer when they are 65 years old is known as the ‘current probability’, where “current” refers to the theorised situation upon finding out information about the particular case at hand.
Then, suppose that the probability of being 65 years old is 0.2%, and that the probability of a person diagnosed with cancer being 65 years old is 0.5% (a greater probability than that of being 65 years old).
Knowing this, along with the base rate, what is the probability of having cancer given that a person is 65 years old?
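To restate: suppose each person in the population has a 1% probability of having cancer. People aged 65 make up 0.2% of the population. Among people who have cancer, 0.5% are 65 years old. Given a 65-year-old, what is the probability that this person has cancer?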
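Plugging the three numbers from the problem into Bayes' theorem gives the answer directly. This is a minimal sketch; the variable names are chosen for illustration.

```python
from fractions import Fraction

p_cancer = Fraction(1, 100)            # base rate (prior): P(cancer)
p_age65 = Fraction(2, 1000)            # P(65 years old)
p_age65_given_cancer = Fraction(5, 1000)  # P(65 years old | cancer)

# Bayes' theorem: P(cancer | 65) = P(65 | cancer) * P(cancer) / P(65)
p_cancer_given_age65 = p_age65_given_cancer * p_cancer / p_age65
print(p_cancer_given_age65, float(p_cancer_given_age65))  # 1/40 0.025
```

So the probability is 2.5%: higher than the 1% base rate, but still small.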
It may come as a surprise that even though being 65 years old increases the risk of having cancer, that person’s probability of having cancer is still fairly low. This is because the base rate of cancer (regardless of age) is low. This illustrates both the importance of base rate, as well as that it is commonly neglected. Base rate neglect leads to serious misinterpretation of statistics; therefore, special care should be taken to avoid such mistakes. Becoming familiar with Bayes’ theorem is one way to combat the natural tendency to neglect base rates.
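When using Bayes' theorem to solve a problem, pay attention to where the prior probability comes from. In a few cases it can be assumed freely, but in most cases it has to be computed from a model or from good statistical data. Only with enough observations, or with a suitable likelihood model, does the influence of the prior become small.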
Two people have left traces of their own blood at the scene of a crime. A suspect, Oliver, is tested and found to have type O blood. The blood groups of the two traces are found to be of type O (a common type in the local population, having frequency 60%) and of type AB (a rare type, with frequency 1%). Do these data (the blood types found at the scene) give evidence in favour of the proposition that Oliver was one of the two people whose blood was found at the scene?
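To restate: blood from two people was found at a crime scene, one trace of the common type O (frequency 60% in the population) and one of type AB (frequency 1% in the population). A man named Oliver is identified as a suspect and is found to have type O blood. How strongly does this evidence support the claim that one of the traces at the scene came from Oliver?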
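Since no prior for Oliver's involvement is given, one common way to analyze this problem is to compare how likely the evidence is under the two competing hypotheses, that is, to compute a likelihood ratio. The sketch below assumes that any unknown person who left blood is drawn independently from the population with the stated frequencies; the variable names are illustrative.

```python
from fractions import Fraction

p_O = Fraction(60, 100)   # population frequency of type O
p_AB = Fraction(1, 100)   # population frequency of type AB

# H1: Oliver left one of the traces. His type O accounts for the O trace,
# so the one unknown person must be type AB.
like_h1 = p_AB

# H2: Oliver left neither trace. Two unknown people must be one O and one AB,
# in either order, hence the factor 2.
like_h2 = 2 * p_O * p_AB

likelihood_ratio = like_h1 / like_h2
print(likelihood_ratio)  # 5/6
```

Under these assumptions the ratio is slightly below 1, so the blood-type data count (weakly) against the proposition that Oliver was at the scene, even though his blood type matches one of the traces.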
According to the CDC, “Compared to nonsmokers, men who smoke are about 23 times more likely to develop lung cancer and women who smoke are about 13 times more likely.” If you learn that a woman has been diagnosed with lung cancer, and you know nothing else about her, what is the probability that she is a smoker?
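To restate: according to the CDC, compared with nonsmokers, men who smoke are about 23 times more likely to develop lung cancer, and women who smoke are about 13 times more likely. A woman has been diagnosed with lung cancer; what is the probability that she is a smoker?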
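The answer depends on the prior probability that a woman smokes, which the problem statement does not give. The sketch below therefore takes that prior as a parameter. The value of 20% used in the example call is purely hypothetical, chosen only to show the calculation, and "13 times more likely" is read as a likelihood ratio of 13 to 1.

```python
from fractions import Fraction

def p_smoker_given_cancer(prior_smoker, relative_risk):
    """P(smoker | lung cancer), treating relative_risk as P(cancer|smoker) / P(cancer|nonsmoker)."""
    # Only the ratio of the likelihoods matters, so use relative_risk and 1.
    smoker = prior_smoker * relative_risk
    nonsmoker = (1 - prior_smoker) * 1
    return smoker / (smoker + nonsmoker)

# HYPOTHETICAL prior: 20% of women smoke (not from the source).
posterior = p_smoker_given_cancer(Fraction(20, 100), 13)
print(posterior)  # 13/17
```

With real data, replace the hypothetical prior with the observed fraction of women who smoke.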
You’re about to get on a plane to Seattle. You want to know if you should bring an umbrella. You call 3 random friends of yours who live there and ask each independently if it’s raining. Each of your friends has a 2/3 chance of telling you the truth and a 1/3 chance of messing with you by lying. All 3 friends tell you that “Yes” it is raining. What is the probability that it’s actually raining in Seattle?
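To restate: you plan to travel to Seattle but are not sure whether it will rain. You call three friends who live in Seattle and do not know one another. Each friend has a 2/3 chance of telling you the truth and a 1/3 chance of lying to you. All three friends tell you that it is raining. What is the probability that it is actually raining in Seattle?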
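Let A be the hypothesis that it is raining, D the three "yes" answers, and X the prior probability of rain. If X is 0.5, the probability that it is raining in Seattle is P(A|D) = 8/9.
According to weather records from 1965-99, it rains in Seattle for 822 hours a year, which is about 10% of the year. That is, the prior probability of rain in Seattle is 10%. The posterior probability computed with this prior is P(A|D) = 8/17.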
Another way to think about this uses odds. A base rate of 10% corresponds to prior odds of 1:9 for rain versus no rain. Each friend is twice as likely to tell the truth as to lie, so each friend contributes evidence in favor of rain with a likelihood ratio, or Bayes factor, of 2. Multiplying the prior odds by the three likelihood ratios (2 * 2 * 2 = 8) yields posterior odds of 8:9, which corresponds to a probability of 8/17, or 0.47.
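Both routes to the answer can be checked in code. The sketch below (function and variable names are illustrative) computes the posterior directly for the two priors discussed, 0.5 and 0.1, and then repeats the 10% case using odds.

```python
from fractions import Fraction

def rain_posterior(prior, n_yes=3, p_truth=Fraction(2, 3)):
    """P(rain | n_yes independent friends all say yes)."""
    like_rain = p_truth ** n_yes        # raining: every friend told the truth
    like_dry = (1 - p_truth) ** n_yes   # not raining: every friend lied
    numerator = prior * like_rain
    return numerator / (numerator + (1 - prior) * like_dry)

print(rain_posterior(Fraction(1, 2)))   # 8/9
print(rain_posterior(Fraction(1, 10)))  # 8/17

# Odds form for the 10% prior: posterior odds = prior odds * likelihood ratio.
prior_odds = Fraction(1, 9)
likelihood_ratio = Fraction(2) ** 3     # each friend contributes a factor of 2
posterior_odds = prior_odds * likelihood_ratio
odds_probability = posterior_odds / (1 + posterior_odds)
print(posterior_odds, odds_probability)  # 8/9 8/17
```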
Suppose a drug test is 99% sensitive and 99% specific. That is, the test will produce 99% true positive results for drug users and 99% true negative results for non-drug users. Suppose that 0.5% of people are users of the drug. If a randomly selected individual tests positive, what is the probability that he is a user?
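To restate: suppose a drug test has a false positive rate (lack of specificity) and a false negative rate (lack of sensitivity) of 1% each. About 0.5% of the population has used the drug. If a randomly selected individual tests positive, what is the probability that he has used the drug?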
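The drug test problem is a direct application of Bayes' theorem with two hypotheses, user and non-user. A minimal sketch, with illustrative variable names:

```python
from fractions import Fraction

sensitivity = Fraction(99, 100)   # P(positive | user)
specificity = Fraction(99, 100)   # P(negative | non-user)
prevalence = Fraction(5, 1000)    # P(user), the prior

# P(positive) by the law of total probability.
true_pos = sensitivity * prevalence
false_pos = (1 - specificity) * (1 - prevalence)

p_user_given_positive = true_pos / (true_pos + false_pos)
print(p_user_given_positive, round(float(p_user_given_positive), 3))  # 99/298 0.332
```

Even with a 99% accurate test, only about one in three positives is a true user, because users are so rare in the population.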
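In applications of probability theory, Bayes' rule relates the odds of event A1 versus event A2 before and after conditioning on another event B. The prior odds are the ratio of the prior probabilities of the two events. The posterior odds are the ratio of their posterior probabilities after conditioning on B. The posterior odds equal the prior odds multiplied by the likelihood ratio (also called the Bayes factor).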
# O: odds
# ^: likelihood ratio (Bayes factor)
# P: probability
O(A1:A2|B) = ^(A1:A2|B) * O(A1:A2)
^(A1:A2|B) = P(B|A1) / P(B|A2)
O(A1:A2) = P(A1) / P(A2)
O(A1:A2|B) = P(A1|B) / P(A2|B)
For the drug test example above, expressed in the odds form of Bayes' rule:
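A sketch of the odds-form calculation for the drug test, taking A1 = "user", A2 = "non-user", and B = "tests positive". The numbers are those given in the problem (1% false positive rate, 1% false negative rate, 0.5% prevalence); the variable names are illustrative.

```python
from fractions import Fraction

# Prior odds O(A1:A2) = P(user) / P(non-user)
prior_odds = Fraction(5, 1000) / Fraction(995, 1000)

# Likelihood ratio ^(A1:A2|B) = P(positive | user) / P(positive | non-user)
likelihood_ratio = Fraction(99, 100) / Fraction(1, 100)

# Posterior odds O(A1:A2|B) = likelihood ratio * prior odds
posterior_odds = likelihood_ratio * prior_odds

# Convert odds back to a probability: p = odds / (1 + odds)
posterior_probability = posterior_odds / (1 + posterior_odds)
print(prior_odds, likelihood_ratio, posterior_odds, posterior_probability)
# 1/199 99 99/199 99/298
```

The posterior odds of 99:199 correspond to a probability of 99/298, about 33%.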
All the code for this Bayes tutorial is listed here in IPython notebook format.