I want to compute the entropy of a list in MySQL. At the moment I run the query below and do the calculation in Python:
select group_concat(first_name), last_name
from table
group by last_name
What I am looking for is an equivalent of
entropy(first_name)
that returns a single number for each group, similar to this kind of numeric usage:
std(age)/avg(age)
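For reference, the Python side of this workflow can be sketched as below. This is a minimal sketch that assumes "entropy" means the Shannon entropy (in bits) of the first_name frequencies within each last_name group. The helper names `shannon_entropy` and `entropy_by_group` and the sample rows are hypothetical, not taken from the question.

```python
import math
from collections import Counter, defaultdict

def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    # p * log2(1/p), written with total/c so a single-valued group yields 0.0
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def entropy_by_group(rows):
    """rows: iterable of (first_name, last_name); returns {last_name: entropy}."""
    groups = defaultdict(list)
    for first_name, last_name in rows:
        groups[last_name].append(first_name)
    return {last: shannon_entropy(firsts) for last, firsts in groups.items()}

# Hypothetical sample data standing in for the query result
rows = [("Ann", "Lee"), ("Bob", "Lee"), ("Ann", "Lee"), ("Cy", "Ng")]
result = entropy_by_group(rows)
```

Fetching plain (first_name, last_name) rows instead of a group_concat string avoids having to split names on commas in Python.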
Edit, partial answer: thanks to @IVO GELOV for a very efficient approximation:
SELECT LOG2(COUNT(DISTINCT column)) FROM Table
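To see how rough this approximation can be, note that LOG2 of the distinct count is the entropy of a uniform distribution over the distinct values, so it is an upper bound on the Shannon entropy and is exact only when every value is equally frequent. The sketch below illustrates this with hypothetical sample lists. The function names are illustrative only.

```python
import math
from collections import Counter

def exact_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def log2_distinct(values):
    """Python counterpart of LOG2(COUNT(DISTINCT column))."""
    return math.log2(len(set(values)))

# Hypothetical data: one evenly spread group, one dominated by a single name
uniform = ["Ann", "Bob", "Cy", "Dee"]
skewed = ["Ann"] * 7 + ["Bob"]

uniform_gap = log2_distinct(uniform) - exact_entropy(uniform)  # no gap when uniform
skewed_gap = log2_distinct(skewed) - exact_entropy(skewed)     # positive when skewed
```

The more skewed the frequencies within a group, the more the approximation overstates the entropy.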
Answered on 2022-05-20 14:01:57
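Building on the solution above and an approximation of the t-test, we arrived at a weighted entropy that can be compared across groups. Not pretty, but it works like a charm: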
CASE
WHEN count(*)-1 < 6 THEN (1 + LOG2(COUNT(distinct first_name)))*5.61*power(count(*)-1,-0.71)
WHEN count(*)-1 >= 6 and count(*)-1 < 27 THEN (1 + LOG2(COUNT(distinct first_name)))*2.2*power(count(*)-1,-0.081)
ELSE (1 + LOG2(COUNT(distinct first_name)))*1.815*power(count(*)-1,-0.02)
END as entropy
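Defined only for groups with count(*) > 1, since count(*)-1 would otherwise be 0 and a negative power of 0 is undefined.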
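For checking results outside the database, the expression can be mirrored in Python. This is only a transliteration sketch, assuming every branch tests count(*) - 1 and that the constants are exactly those shown in the answer. The function name `weighted_entropy` and its parameters are hypothetical.

```python
import math

def weighted_entropy(n_rows, n_distinct):
    """Mirror of the CASE expression.

    n_rows     -- count(*) for the group
    n_distinct -- COUNT(DISTINCT first_name) for the group
    """
    if n_rows <= 1:
        raise ValueError("defined only for groups with more than one row")
    base = 1 + math.log2(n_distinct)
    d = n_rows - 1  # the count(*) - 1 term used by every branch
    if d < 6:
        return base * 5.61 * d ** -0.71
    if d < 27:
        return base * 2.2 * d ** -0.081
    return base * 1.815 * d ** -0.02

# Example: a group of 10 rows containing 4 distinct first names
score = weighted_entropy(10, 4)
```

The three branches apply different weights depending on group size, so small groups are scaled more aggressively than large ones.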
https://stackoverflow.com/questions/72317567