Q: Python clustering "purity" metric

Stack Overflow user

Asked 2015-12-03 00:14:17

Answers: 1, Views: 20.5K, Followers: 0, Votes: 11

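I am using the Gaussian Mixture Model (GMM) from sklearn.mixture to cluster my dataset.

I can use the score() function to compute the log probability under the model.

However, I am looking for a metric called "purity", which is defined in this article.

How can I implement it in Python? My current implementation looks like this: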

import numpy as np
# GaussianMixture replaces the GMM class, which was removed in scikit-learn 0.20.
from sklearn.mixture import GaussianMixture

# X is a 1000 x 2 array (1000 samples of 2 coordinates).
# It is actually a 2 dimensional PCA projection of data
# extracted from the MNIST dataset, but this random array
# is equivalent as far as the code is concerned.
X = np.random.rand(1000, 2)

# Fit a 3-component mixture with diagonal covariances and assign each sample to a cluster.
clusterer = GaussianMixture(n_components=3, covariance_type='diag')
clusterer.fit(X)
cluster_labels = clusterer.predict(X)

# Now I can count the labels for each cluster.
count0 = list(cluster_labels).count(0)
count1 = list(cluster_labels).count(1)
count2 = list(cluster_labels).count(2)

But I cannot loop over each cluster to compute the confusion matrix (as described in this question).
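For reference, here is a minimal sketch of one way purity could be computed, assuming the usual definition: each cluster is credited with the size of its most frequent true class, and the sum over clusters is divided by the total number of samples. The linked article is not reproduced here, so its definition may differ. Purity also needs ground-truth labels, which the random X above does not have; the names purity_score and y_true and the toy labels below are hypothetical and only illustrate the calculation.

```python
from collections import Counter, defaultdict


def purity_score(y_true, y_pred):
    """Return the fraction of samples in the majority true class of their cluster."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    # Map each cluster id to a histogram of the true classes it contains.
    per_cluster = defaultdict(Counter)
    for true_label, cluster_id in zip(y_true, y_pred):
        per_cluster[cluster_id][true_label] += 1
    # Sum the size of the dominant class in every cluster, then normalize.
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(y_true)


# Hypothetical toy labels, not taken from the question's data.
y_true = [0, 0, 0, 1, 1, 1, 2, 2]
cluster_labels = [1, 1, 0, 0, 0, 0, 2, 2]
purity = purity_score(y_true, cluster_labels)
print(purity)
```

The per-cluster histograms play the role of the columns of a confusion (contingency) matrix, so no explicit loop over cluster indices is needed. With scikit-learn available, sklearn.metrics.cluster.contingency_matrix can build the same table, after which the column-wise maxima are summed and divided by the number of samples.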

python

scikit-learn

cluster-analysis

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/34047540
