1 简介
气象部门在发布预报时,发布的是一定区域范围的网格化(或站点化)的气象要素结果,以降水预报为例,
图1 14时的降水预测与观测值对比
2 评价指标及其python实现
假设有两个类别,正和负,分别用1,0表示,如下表格。
预测 负例 | 预测 正例 | |
---|---|---|
真实 负例 | TN(True negative) | FP(False positive) |
真实 正例 | FN(False negative) | TP(True positive) |
该表格称为 混淆矩阵(confusion matrix)。
在实际应用中,我们不仅希望Accuracy高,还希望模型对每个类别都有很强的分类能力,即recall 和 precision都要高。
气象上的降水评价指标基本都建立在二分类基础上。
以上面的y_pre 和 y_obs 为例,共计有3600个格点,选定一个阈值rain_threshold ,格点数值 >= rain_threshold 即为正例, 否则为负例。这里采取晴雨分类,即rain_threshold = 0.1
构建混淆矩阵,晴为负例,雨为正例,如下:
预测 晴 | 预测 雨 | ||
---|---|---|---|
真实 晴 | TN: 1968 | FP: 52 | 2020 |
真实 雨 | FN: 458 | TP: 1122 | 1580 |
2426 | 1174 |
类比到气象上,概念一致,只是换了名称。
预测 晴 | 预测 雨 | ||
---|---|---|---|
真实 晴 | correctnegatives | falsealarms(误警) | 2020 |
真实 雨 | misses(漏报) | hits(击中) | 1580 |
2426 | 1174 |
代码如下:
def prep_clf(obs,pre, threshold=0.1):
'''
func: 计算二分类结果-混淆矩阵的四个元素
inputs:
obs: 观测值,即真实值;
pre: 预测值;
threshold: 阈值,判别正负样本的阈值,默认0.1,气象上默认格点 >= 0.1才判定存在降水。
returns:
hits, misses, falsealarms, correctnegatives
#aliases: TP, FN, FP, TN
'''
#根据阈值分类为 0, 1
obs = np.where(obs >= threshold, 1, 0)
pre = np.where(pre >= threshold, 1, 0)
# True positive (TP)
hits = np.sum((obs == 1) & (pre == 1))
# False negative (FN)
misses = np.sum((obs == 1) & (pre == 0))
# False positive (FP)
falsealarms = np.sum((obs == 0) & (pre == 1))
# True negative (TN)
correctnegatives = np.sum((obs == 0) & (pre == 0))
return hits, misses, falsealarms, correctnegatives
def precision(obs, pre, threshold=0.1):
'''
func: 计算精确度precision: TP / (TP + FP)
inputs:
obs: 观测值,即真实值;
pre: 预测值;
threshold: 阈值,判别正负样本的阈值,默认0.1,气象上默认格点 >= 0.1才判定存在降水。
returns:
dtype: float
'''
TP, FN, FP, TN = prep_clf(obs=obs, pre = pre, threshold=threshold)
return TP / (TP + FP)
def recall(obs, pre, threshold=0.1):
'''
func: 计算召回率recall: TP / (TP + FN)
inputs:
obs: 观测值,即真实值;
pre: 预测值;
threshold: 阈值,判别正负样本的阈值,默认0.1,气象上默认格点 >= 0.1才判定存在降水。
returns:
dtype: float
'''
TP, FN, FP, TN = prep_clf(obs=obs, pre = pre, threshold=threshold)
return TP / (TP + FN)
def ACC(obs, pre, threshold=0.1):
'''
func: 计算准确度Accuracy: (TP + TN) / (TP + TN + FP + FN)
inputs:
obs: 观测值,即真实值;
pre: 预测值;
threshold: 阈值,判别正负样本的阈值,默认0.1,气象上默认格点 >= 0.1才判定存在降水。
returns:
dtype: float
'''
TP, FN, FP, TN = prep_clf(obs=obs, pre = pre, threshold=threshold)
return (TP + TN) / (TP + TN + FP + FN)
def FSC(obs, pre, threshold=0.1):
'''
func:计算f1 score = 2 * ((precision * recall) / (precision + recall))
'''
precision_socre = precision(obs, pre, threshold=threshold)
recall_score = recall(obs, pre, threshold=threshold)
return 2 * ((precision_socre * recall_score) / (precision_socre + recall_score))
由以上四个基本指标,引申出许多气象降水评价指标。
1 物理概念
def TS(obs, pre, threshold=0.1):
'''
func: 计算TS评分: TS = hits/(hits + falsealarms + misses)
alias: TP/(TP+FP+FN)
inputs:
obs: 观测值,即真实值;
pre: 预测值;
threshold: 阈值,判别正负样本的阈值,默认0.1,气象上默认格点 >= 0.1才判定存在降水。
returns:
dtype: float
'''
hits, misses, falsealarms, correctnegatives = prep_clf(obs=obs, pre = pre, threshold=threshold)
return hits/(hits + falsealarms + misses)
def ETS(obs, pre, threshold=0.1):
'''
ETS - Equitable Threat Score
details in the paper:
Winterrath, T., & Rosenow, W. (2007). A new module for the tracking of
radar-derived precipitation with model-derived winds.
Advances in Geosciences,10, 77–83. https://doi.org/10.5194/adgeo-10-77-2007
Args:
obs (numpy.ndarray): observations
pre (numpy.ndarray): prediction
threshold (float) : threshold for rainfall values binaryzation
(rain/no rain)
Returns:
float: ETS value
'''
hits, misses, falsealarms, correctnegatives = prep_clf(obs=obs, pre = pre,
threshold=threshold)
num = (hits + falsealarms) * (hits + misses)
den = hits + misses + falsealarms + correctnegatives
Dr = num / den
ETS = (hits - Dr) / (hits + misses + falsealarms - Dr)
return ETS
def FAR(obs, pre, threshold=0.1):
'''
func: 计算误警率。falsealarms / (hits + falsealarms)
FAR - false alarm rate
Args:
obs (numpy.ndarray): observations
pre (numpy.ndarray): prediction
threshold (float) : threshold for rainfall values binaryzation
(rain/no rain)
Returns:
float: FAR value
'''
hits, misses, falsealarms, correctnegatives = prep_clf(obs=obs, pre = pre,
threshold=threshold)
return falsealarms / (hits + falsealarms)
def MAR(obs, pre, threshold=0.1):
'''
func : 计算漏报率 misses / (hits + misses)
MAR - Missing Alarm Rate
Args:
obs (numpy.ndarray): observations
pre (numpy.ndarray): prediction
threshold (float) : threshold for rainfall values binaryzation
(rain/no rain)
Returns:
float: MAR value
'''
hits, misses, falsealarms, correctnegatives = prep_clf(obs=obs, pre = pre,
threshold=threshold)
return misses / (hits + misses)
def POD(obs, pre, threshold=0.1):
'''
func : 计算命中率 hits / (hits + misses)
pod - Probability of Detection
Args:
obs (numpy.ndarray): observations
pre (numpy.ndarray): prediction
threshold (float) : threshold for rainfall values binaryzation
(rain/no rain)
Returns:
float: PDO value
'''
hits, misses, falsealarms, correctnegatives = prep_clf(obs=obs, pre = pre,
threshold=threshold)
return hits / (hits + misses)
def BIAS(obs, pre, threshold = 0.1):
'''
func: 计算Bias评分: Bias = (hits + falsealarms)/(hits + misses)
alias: (TP + FP)/(TP + FN)
inputs:
obs: 观测值,即真实值;
pre: 预测值;
threshold: 阈值,判别正负样本的阈值,默认0.1,气象上默认格点 >= 0.1才判定存在降水。
returns:
dtype: float
'''
hits, misses, falsealarms, correctnegatives = prep_clf(obs=obs, pre = pre,
threshold=threshold)
return (hits + falsealarms) / (hits + misses)
HSS公式如下:
def HSS(obs, pre, threshold=0.1):
'''
HSS - Heidke skill score
Args:
obs (numpy.ndarray): observations
pre (numpy.ndarray): pre
threshold (float) : threshold for rainfall values binaryzation
(rain/no rain)
Returns:
float: HSS value
'''
hits, misses, falsealarms, correctnegatives = prep_clf(obs=obs, pre = pre,
threshold=threshold)
HSS_num = 2 * (hits * correctnegatives - misses * falsealarms)
HSS_den = (misses**2 + falsealarms**2 + 2*hits*correctnegatives +
(misses + falsealarms)*(hits + correctnegatives))
return HSS_num / HSS_den
def BSS(obs, pre, threshold=0.1):
'''
BSS - Brier skill score
Args:
obs (numpy.ndarray): observations
pre (numpy.ndarray): prediction
threshold (float) : threshold for rainfall values binaryzation
(rain/no rain)
Returns:
float: BSS value
'''
obs = np.where(obs >= threshold, 1, 0)
pre = np.where(pre >= threshold, 1, 0)
obs = obs.flatten()
pre = pre.flatten()
return np.sqrt(np.mean((obs - pre) ** 2))
def MAE(obs, pre):
"""
Mean absolute error
Args:
obs (numpy.ndarray): observations
pre (numpy.ndarray): prediction
Returns:
float: mean absolute error between observed and simulated values
"""
obs = pre.flatten()
pre = pre.flatten()
return np.mean(np.abs(pre - obs))
def RMSE(obs, pre):
"""
Root mean squared error
Args:
obs (numpy.ndarray): observations
pre (numpy.ndarray): prediction
Returns:
float: root mean squared error between observed and simulated values
"""
obs = obs.flatten()
pre = pre.flatten()
return np.sqrt(np.mean((obs - pre) ** 2))
threshold : mm | 0.1 | 10 | 25 | 50 | 100 |
---|---|---|---|---|---|
降雨类型 | 小雨 | 中雨 | 大雨 | 暴雨 | 特大暴雨 |
选取上述例子,来看在不同阈值下各评分函数数值。
在真实的检验中,y_obs并不是均匀网格的,而是站点分布的,依据相同思路,比较区域内的所有站点预测和站点观测值,也能得到对应评分。
[1]
Kong et al, 2008: http://www.gyqx.ac.cn/article/2018/02/20180217.htm#bkong2008
[2]
rainymotion v0.1 github: https://github.com/hydrogo/rainymotion/blob/master/rainymotion/metrics.py
[3]
分类模型评估指标——准确率、精准率、召回率、F1、ROC曲线、AUC曲线: https://easyai.tech/ai-definition/accuracy-precision-recall-f1-roc-auc/