快速高斯模糊算法

刚才发现一份快速高斯模糊的实现。

源地址为:http://incubator.quasimondo.com/processing/gaussian_blur_1.php

作者信息为:
 Fast Gaussian Blur v1.3
 by Mario Klingemann <http://incubator.quasimondo.com>
processing源码: http://incubator.quasimondo.com/processing/fastblur.pde
效果图:
转为C语言实现版本。
 
代码如下:
// Fast Gaussian Blur v1.3
// by Mario Klingemann <http://incubator.quasimondo.com>
// C version updated and performance optimization by tntmonks(http://tntmonks.cnblogs.com)
// One of my first steps with Processing. I am a fan
// of blurring. Especially as you can use blurred images
// as a base for other effects. So this is something I
// might get back to in later experiments.
//
// What you see is an attempt to implement a Gaussian Blur algorithm
// which is exact but fast. I think that this one should be
// relatively fast because it uses a special trick by first
// making a horizontal blur on the original image and afterwards
// making a vertical blur on the pre-processed image. This
// is a mathematical correct thing to do and reduces the
// calculation a lot.
//
// In order to avoid the overhead of function calls I unrolled
// the whole convolution routine in one method. This may not
// look nice, but brings a huge performance boost.
//
//
// v1.1: I replaced some multiplications by additions
//       and added aome minor pre-caclulations.
//       Also add correct rounding for float->int conversion
//
// v1.2: I completely got rid of all floating point calculations
//       and speeded up the whole process by using a
//       precalculated multiplication table. Unfortunately
//       a precalculated division table was becoming too
//       huge. But maybe there is some way to even speed
//       up the divisions.
//
// v1.3: Fixed a bug that caused blurs that start at y>0
//	 to go wrong. Thanks to Jeroen Schellekens for 
//       finding it!

void GaussianBlur(unsigned char* img, unsigned  int x, unsigned int y, unsigned int w, unsigned int h, unsigned int comp, unsigned int radius)
{
	unsigned int i, j ;
	radius = min(max(1, radius), 248); 
	unsigned int kernelSize = 1 + radius * 2; 
	unsigned int* kernel = (unsigned int*)malloc(kernelSize* sizeof(unsigned int));
	memset(kernel, 0, kernelSize* sizeof(unsigned int)); 
	unsigned int(*mult)[256] = (unsigned int(*)[256])malloc(kernelSize * 256 * sizeof(unsigned int)); 
	memset(mult, 0, kernelSize * 256 * sizeof(unsigned int));
	unsigned	int sum = 0;
	for (i = 1; i < radius; i++){
		unsigned int szi = radius - i;
		kernel[radius + i] = kernel[szi] = szi*szi;
		sum += kernel[szi] + kernel[szi];
		for (j = 0; j < 256; j++){
			mult[radius + i][j] = mult[szi][j] = kernel[szi] * j;
		}
	}
	kernel[radius] = radius*radius;
	sum += kernel[radius];
	for (j = 0; j < 256; j++){
		 
		mult[radius][j] = kernel[radius] * j;
	}

	unsigned int   cr, cg, cb;
	unsigned int   xl, yl, yi, ym, riw;
	unsigned int   read, ri, p,   n;
	unsigned	int imgWidth = w;
	unsigned	int imgHeight = h;
	unsigned	int imageSize = imgWidth*imgHeight;
	unsigned char *	rgb = (unsigned char *)malloc(sizeof(unsigned char) * imageSize * 3);
	unsigned char *	r = rgb;
	unsigned char *	g = rgb + imageSize;
	unsigned char *	b = rgb + imageSize * 2;
	unsigned char *	rgb2 = (unsigned char *)malloc(sizeof(unsigned char) * imageSize * 3);
	unsigned char *	r2 = rgb2;
	unsigned char *	g2 = rgb2 + imageSize;
	unsigned char *	b2 = rgb2 + imageSize * 2;
 
	for (size_t yh = 0; yh < imgHeight; ++yh) {
 
		for (size_t xw = 0; xw < imgWidth; ++xw) {
			n = xw + yh* imgWidth;
			p = n*comp;
			r[n] = img[p];
			g[n] = img[p + 1];
			b[n] = img[p + 2];
		}
	}
	
	x = max(0, x);
	y = max(0, y);
	w = x + w - max(0, (x + w) - imgWidth);
	h = y + h - max(0, (y + h) - imgHeight);
	yi = y*imgWidth;
 
	for (yl = y; yl < h; yl++){
 
		for (xl = x; xl < w; xl++){
			cb = cg = cr = sum = 0;
			ri = xl - radius; 
			for (i = 0; i < kernelSize; i++){
				read = ri + i;
				if (read >= x && read < w)
				{
					read += yi;
					cr += mult[i][r[read]];
					cg += mult[i][g[read]];
					cb += mult[i][b[read]];
					sum += kernel[i];
				}
			}
			ri = yi + xl;
			r2[ri] = cr / sum;
			g2[ri] = cg / sum;
			b2[ri] = cb / sum;
		}
		yi += imgWidth;
	}
	yi = y*imgWidth;
 
	for (yl = y; yl < h; yl++){
		ym = yl - radius;
		riw = ym*imgWidth; 
		for (xl = x; xl < w; xl++){
			cb = cg = cr = sum = 0;
			ri = ym;
			read = xl + riw; 
			for (i = 0; i < kernelSize; i++){
				if (ri < h && ri >= y)
				{
					cr += mult[i][r2[read]];
					cg += mult[i][g2[read]];
					cb += mult[i][b2[read]];
					sum += kernel[i];
				}
				ri++;
				read += imgWidth;
			}
			p = (xl + yi)*comp;
			img[p] = (unsigned char)(cr / sum);
			img[p + 1] = (unsigned char)(cg / sum);
			img[p + 2] = (unsigned char)(cb / sum);
		}
		yi += imgWidth;
	}

	free(rgb);
	free(rgb2);
	free(kernel);
	free(mult);
}

  该代码,将二维数组进一步优化后可提升一定的速度。

在博主机子上测试一张5000x3000的图像,模糊半径为10的情况下,耗时4s.

本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。

发表于

我来说两句

0 条评论
登录 后参与评论

相关文章

来自专栏一心无二用,本人只专注于基础图像算法的实现与优化。

SSE图像算法优化系列二:高斯模糊算法的全面优化过程分享(二)。

      相关链接: 高斯模糊算法的全面优化过程分享(一)      在高斯模糊算法的全面优化过程分享(一)一文中我们已经给出了一种相当高性能的高斯模糊过程,...

46460
来自专栏GreenLeaves

SQL学习之高级联结(自联结、自然联结、外联接)

create table Customers( Id int identity(1000000,1), Company varchar(30) null, Na...

27470
来自专栏ATYUN订阅号

NLP项目:使用NLTK和SpaCy进行命名实体识别

命名实体识别(NER)是信息提取的第一步,旨在在文本中查找和分类命名实体转换为预定义的分类,例如人员名称,组织,地点,时间,数量,货币价值,百分比等。NER用于...

98740
来自专栏CreateAMind

Curiosity-driven Exploration 好奇心代码阅读

25320
来自专栏iOSDevLog

Turi Create 机器学习模型实战:你也能轻松做出Prisma 风格的图片!

如果你一直有关注Apple去年所发布的消息,就会知道他们在机器学习上投入了大量心力。自他们去年在WWDC 2017上推出Core ML以来,已经有大量结合机器学...

41520
来自专栏ATYUN订阅号

使用Tensor2Tensor和10行代码训练尖端语言翻译神经网络

有许多库可以帮助人们构建深度学习应用程序,但如果想使用最新架构的最先进模型和最少的代码,有这样一个API脱颖而出:Google的Tensor2Tensor。我通...

33830
来自专栏工科狗和生物喵

【图】Dijkstra算法

13530
来自专栏ml

nyoj-----284坦克大战(带权值的图搜索)

坦克大战 时间限制:1000 ms  |  内存限制:65535 KB 难度:3 描述 Many of us had played the game "Batt...

32350
来自专栏BestSDK

Pytorch 0.3.0 发布:新增张量函数,支持模型移植

根据官方文档的介绍,此次增加了多个函数和功能,多方面的性能均实现提升。 重大变更 0.3 版本中删掉了 Variable.reinforce() 等随机函数,因...

37880
来自专栏逍遥剑客的游戏开发

有向无环图的自动布局算法

40850

扫码关注云+社区

领取腾讯云代金券