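How can I plot (in Python) the distance graph for a given minimum number of points in DBSCAN?
I am looking for the knee and the corresponding epsilon value.
In sklearn, I don't see any method that returns such distances. Am I missing something?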
Posted on 2018-03-21 12:16:05
To get the distances, you can use the following function:
import math
import numpy as np
import pandas as pd

def k_distances(X, n=None, dist_func=None):
    """Return the sorted array of k-distances.

    X - DataFrame or ndarray with one observation per row
    n - number of neighbors included in the returned distance (default: number of attributes + 1)
    dist_func - function that computes the distance between two observations in X (default: Euclidean)
    """
    if isinstance(X, pd.DataFrame):
        X = X.values
    # Default to number of attributes + 1, as documented above
    k = X.shape[1] + 1 if n is None else n
    if dist_func is None:
        # Euclidean distance: square root of the sum of squared differences between attributes
        dist_func = lambda x, y: math.sqrt(np.sum((x - y) ** 2))
    # Pairwise distances; row i holds the distances from observation i to every observation
    distances = np.array([[dist_func(x, y) for y in X] for x in X])
    # Sort each row so that column 0 is the distance to itself (0) and column k is the k-th nearest neighbor
    distances.sort(axis=1)
    return np.sort(distances[:, k])

X should be a pandas.DataFrame or a numpy.ndarray. n is the number of neighbors in the d-neighborhood. You should know this number in advance. By default it is the number of attributes + 1.
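The nested Python loops over all pairs become slow as the number of observations grows. As a minimal sketch, assuming only the Euclidean distance is needed and the full pairwise matrix fits in memory, the same k-distances can be computed with NumPy broadcasting. The name `k_distances_vectorized` is introduced here for illustration and is not part of the answer above or of any library.

```python
import numpy as np


def k_distances_vectorized(X, n):
    """Sorted distance from each row of X to its n-th nearest other row (Euclidean)."""
    X = np.asarray(X, dtype=float)
    # Pairwise differences via broadcasting: shape (m, m, attributes)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Sort each row; column 0 is the zero distance from a point to itself
    dist.sort(axis=1)
    return np.sort(dist[:, n])


# Four points on a line: the nearest-neighbor distances are 1, 1, 2 and 5
points = np.array([[0.0], [1.0], [3.0], [8.0]])
print(k_distances_vectorized(points, 1))
```

The intermediate array holds one entry per pair of observations and per attribute, so this trades memory for speed and is only practical for moderately sized data.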
To plot these distances, you can use the following code:
import matplotlib.pyplot as plt
d = k_distances(X,n,dist_func)
plt.plot(d)
plt.ylabel("k-distances")
plt.grid(True)
plt.show()

https://stackoverflow.com/questions/43160240
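The question asks for the knee and the corresponding epsilon, which the plot only shows visually. One rough way to read it off the sorted k-distance array is to take the point that lies farthest below the straight line joining the first and last values. This is a heuristic sketch and not something the answer or sklearn provides; `find_knee` is a hypothetical helper, and it assumes the array is sorted in ascending order, has at least two values, and is not constant.

```python
import numpy as np


def find_knee(d):
    """Index and value of the point of sorted curve d farthest below its end-to-end chord."""
    d = np.asarray(d, dtype=float)
    x = np.arange(len(d), dtype=float)
    # Rescale both axes to [0, 1] so the result does not depend on units
    x_n = x / x[-1]
    y_n = (d - d[0]) / (d[-1] - d[0])
    # After rescaling the chord is y = x; a curve that stays flat and then rises lies below it
    idx = int(np.argmax(x_n - y_n))
    return idx, d[idx]


# Toy k-distance curve: flat at first, then rising sharply
d = np.array([0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 2.0, 5.0])
idx, eps = find_knee(d)
print(idx, eps)  # prints: 5 0.3
```

The returned value is only a candidate for epsilon. It is worth checking it against the plot, since noisy curves can have several bends.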