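Conventional blur algorithms such as Gaussian blur also blur the edges in an image. In many cases we want to keep the image's texture and edges while smoothing away fine detail, and Photoshop's Surface Blur is a good fit for that.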
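As implemented in the code below, each output pixel is a weighted average over its window:

$$x = \frac{\sum_{i=1}^{(2r+1)^2} \max\left(0,\ 1 - \frac{|x_i - x_1|}{2.5\,Y}\right) x_i}{\sum_{i=1}^{(2r+1)^2} \max\left(0,\ 1 - \frac{|x_i - x_1|}{2.5\,Y}\right)}$$

where r is the radius, Y is the threshold, x_1 is the level of the current pixel, x_i is the value of a pixel inside the window, and x is the resulting level of the current pixel.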
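A tiny numeric example may make these definitions more concrete. It assumes each neighbor's weight is max(0, 1 - |x_i - x_1| / (2.5 Y)), which is what the code in this post computes. The helper name `surface_blur_pixel` and the sample values are made up for illustration.

```python
import numpy as np

def surface_blur_pixel(window, center, threshold):
    """Weighted average of `window` around `center` (hypothetical helper)."""
    window = np.asarray(window, dtype=np.float64)
    # Weight drops linearly with the difference from the center pixel
    # and is clamped to 0 once the difference reaches 2.5 * threshold.
    weights = np.maximum(0.0, 1.0 - np.abs(window - center) / (2.5 * threshold))
    return float((weights * window).sum() / weights.sum())

# Three pixels from a flat region (near 100) and two from across an edge (near 200).
window = [100, 102, 98, 200, 205]
result = surface_blur_pixel(window, center=100, threshold=20)
print(result)           # approximately 100.0: the two far-side pixels get zero weight
print(np.mean(window))  # 141.0: a plain average smears the edge
```

Pixels that differ from the center by more than 2.5 Y contribute nothing, which is why the edge survives while small variations are averaged out.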
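The code below is based on a Python version that circulates widely online, with a few small optimizations and fixes. It uses numba's CPU JIT, which gives roughly a 10x speedup, although it is still slower than C++.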
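import numba as nb
import numpy as np

@nb.jit(nopython=True)
def Surface_blur(I_in, thre, radius):
    # I_in: 2-D (single-channel) float array. Convert uint8 input to float
    # first, otherwise aa - p0 wraps around for negative differences.
    I_out = I_in.copy()
    row, col = I_in.shape
    # Pixels closer than `radius` to the border are left unchanged.
    for ii in range(radius, row - radius):
        for jj in range(radius, col - radius):
            p0 = I_in[ii, jj]
            aa = I_in[ii-radius: ii+radius+1, jj-radius: jj+radius+1]
            # Weight falls linearly to 0 at |x_i - x_1| = 2.5 * thre.
            mask_2 = 1 - np.abs(aa - p0) / (2.5 * thre)
            mask_3 = mask_2 * (mask_2 > 0)
            t1 = aa * mask_3
            I_out[ii, jj] = t1.sum() / mask_3.sum()
    return I_out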
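To sanity-check the result without numba, a vectorized NumPy variant can help. The sketch below is not the version this post is based on: it loops over the window offsets instead of over the pixels, but applies the same weights to every pixel that has a full window. The function name `surface_blur_numpy` and the synthetic test image are assumptions for illustration.

```python
import numpy as np

def surface_blur_numpy(img, thre, radius):
    """Vectorized surface blur for a 2-D array; border pixels are left unchanged."""
    src = img.astype(np.float64)  # float avoids uint8 wrap-around in the subtraction
    rows, cols = src.shape
    out = src.copy()
    if rows <= 2 * radius or cols <= 2 * radius:
        return out  # image too small for even one full window
    center = src[radius:rows - radius, radius:cols - radius]
    num = np.zeros_like(center)  # weighted sum of neighbor values
    den = np.zeros_like(center)  # sum of weights (at least 1, from the center pixel)
    # Loop over the (2*radius + 1)**2 window offsets instead of over pixels.
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = src[radius + dy:rows - radius + dy,
                          radius + dx:cols - radius + dx]
            w = np.maximum(0.0, 1.0 - np.abs(shifted - center) / (2.5 * thre))
            num += w * shifted
            den += w
    out[radius:rows - radius, radius:cols - radius] = num / den
    return out

# Hypothetical test image: a step edge at column 16 with mild Gaussian noise.
rng = np.random.default_rng(0)
img = np.zeros((32, 32))
img[:, 16:] = 200.0
noisy = img + rng.normal(0.0, 3.0, img.shape)
smoothed = surface_blur_numpy(noisy, thre=20, radius=3)
```

On this image the noise in the flat regions is reduced while the step at column 16 stays sharp. For a color image, one option is to call the function once per channel and stack the results.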
from numba import cuda

@cuda.jit
def image_process_cuda(img_cuda, result_img_cuda, y_size, x_size):
    # One thread per pixel: y indexes rows, x indexes columns.
    y = cuda.threadIdx.y + cuda.blockDim.y * cuda.blockIdx.y
    x = cuda.threadIdx.x + cuda.blockDim.x * cuda.blockIdx.x
    radius = 8
    thre = 20
    # Only pixels with a full window are processed; the border is not written.
    if radius <= y < y_size - radius and radius <= x < x_size - radius:
        a = 0.0  # weighted sum of pixel values
        b = 0.0  # sum of weights
        # Only channel 0 is read; the result is written to all three channels.
        x_1 = float(img_cuda[y, x, 0])
        for x_index in range(x - radius, x + radius + 1):
            for y_index in range(y - radius, y + radius + 1):
                x_i = float(img_cuda[y_index, x_index, 0])
                temp_b = 1 - abs(x_i - x_1) / 2.5 / thre
                if temp_b <= 0:
                    continue
                b += temp_b
                a += temp_b * x_i
        for i in range(3):
            result_img_cuda[y, x, i] = int(round(a / b))
Usage:
img = ...
ori_image_cuda = cuda.to_device(img)
copy_image_cuda = cuda.to_device(copy_image)
image_process_cuda[blocks_per_grid, threads_per_block](ori_image_cuda, copy_image_cuda, y_size, x_size)
result_img = copy_image_cuda.copy_to_host()
Compared with the CPU-accelerated version, this can be more than 100 times faster.
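The call above uses `blocks_per_grid`, `threads_per_block`, `y_size`, and `x_size` without showing how they are set. One plausible way to pick the launch configuration is sketched below. The helper name `launch_config` and the 16x16 block size are assumptions, not values from the original code. It also seems reasonable to take `y_size, x_size = img.shape[:2]` and to make `copy_image` a copy of `img`, so that border pixels the kernel never writes keep their original values, but that too is a guess.

```python
import math

def launch_config(y_size, x_size, threads_per_block=(16, 16)):
    """Return (blocks_per_grid, threads_per_block) for a y_size x x_size image.

    Hypothetical helper; the 16x16 default block size is an arbitrary choice.
    """
    tx, ty = threads_per_block
    # The kernel maps the CUDA x dimension to columns and y to rows,
    # so the grid is sized as (columns, rows), rounding up so that
    # every pixel is covered by some thread.
    blocks_per_grid = (math.ceil(x_size / tx), math.ceil(y_size / ty))
    return blocks_per_grid, threads_per_block

# Example: a 1080-row by 1920-column image.
blocks_per_grid, threads_per_block = launch_config(1080, 1920)
print(blocks_per_grid, threads_per_block)  # (120, 68) (16, 16)
```

Rounding up launches some threads outside the image; the bounds check at the top of the kernel makes those threads do nothing.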