Colorizing Old Black-and-White Photos with the OpenCV DNN Module (Python/C++ Source Included)

Color Space

Published 2022-04-06 20:39:46

Part of the column: OpenCV与AI深度学习

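Overview

This article walks through an example of colorizing old black-and-white photos with the OpenCV DNN module, with source code in both Python and C++.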

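Background

This project builds on Colorful Image Colorization, research by Richard Zhang, Phillip Isola, and Alexei A. Efros at the University of California, Berkeley. Paper: https://arxiv.org/pdf/1603.08511.pdf , authors' GitHub project: https://github.com/richzhang/colorization/tree/caffe

As the original paper explains, the authors embrace the underlying uncertainty of the problem by posing it as a classification task, and they use class rebalancing at training time to increase the diversity of colors in the result. At test time the method runs as a single feed-forward pass through a CNN (convolutional neural network), which was trained on over a million color images.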

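The color space used in this project is "Lab". The CIELAB color space (also written CIE L*a*b*, or simply "Lab") was defined by the International Commission on Illumination (CIE) in 1976. It expresses a color as three values: L* for lightness, a* for the green-red axis, and b* for the blue-yellow axis.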
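To make the three Lab values concrete, here is a minimal pure-Python sketch of the standard sRGB to CIELAB conversion for a single pixel. It assumes 8-bit sRGB input and the D65 reference white; the function name `srgb_to_lab` is made up for this illustration, and in the scripts below `cv2.cvtColor` does this work instead.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIE L*a*b* (D65 reference white)."""

    def to_linear(c):
        # Undo the sRGB gamma curve.
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = to_linear(r), to_linear(g), to_linear(b)

    # Linear sRGB -> CIE XYZ.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # D65 reference white used for normalization.
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    lightness = 116 * fy - 16      # L*: 0 (black) to 100 (white)
    a_star = 500 * (fx - fy)       # a*: negative = green, positive = red
    b_star = 200 * (fy - fz)       # b*: negative = blue, positive = yellow
    return lightness, a_star, b_star


print(srgb_to_lab(255, 255, 255))  # white: L* near 100, a* and b* near 0
print(srgb_to_lab(255, 0, 0))      # red: positive a*
```

Because L* spans 0 to 100, subtracting 50 (as the scripts below do before feeding the network) roughly centers the input around zero. Any gray pixel has a* and b* close to 0, which is why a black-and-white photo carries only the L channel.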

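The deep learning process: as noted above, the method runs as a feed-forward pass through a CNN at test time and was trained on over a million color images. In other words, each color photo is decomposed with the Lab model, and the "L" channel is used as the input feature while the "a" and "b" channels are used as the classification labels. For simplicity, we split the data into two parts, "L" and "a+b", as shown in the block diagram: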
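The split into an input feature and a class label can be sketched as follows. This is a simplified illustration, not the authors' training code: it assigns each pixel to the single nearest (a, b) bin center, whereas the original training setup may encode labels more softly. The four centers are hypothetical values on a 10-unit grid; the released model uses 313 centers stored in pts_in_hull.npy.

```python
import math

# Hypothetical subset of (a, b) bin centers on a 10-unit grid.
# The released model uses 313 such centers (pts_in_hull.npy).
CENTERS = [(-20.0, 30.0), (0.0, 0.0), (20.0, 40.0), (60.0, 50.0)]


def nearest_bin(a, b, centers=CENTERS):
    """Return the index of the bin center closest to (a, b)."""
    return min(range(len(centers)), key=lambda i: math.dist((a, b), centers[i]))


def split_training_pixel(lab_pixel, centers=CENTERS):
    """Split one Lab pixel into the input feature (L) and a class label (bin index)."""
    lightness, a, b = lab_pixel
    return lightness, nearest_bin(a, b, centers)


feature, label = split_training_pixel((53.2, 58.0, 47.0))
print(feature, label)
```

Treating color as a bin index is what turns colorization into a classification problem: the network predicts a score for every bin at every pixel instead of regressing two numbers directly.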

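With the trained model (which is publicly available), we can colorize a new black-and-white photo. The photo serves as the model's input, that is, the "L" component. The model outputs the other two components, "a" and "b", which, once combined with the original "L", give back a full color photo, as shown below:

In short, the final model was produced by applying a deep learning algorithm (a feed-forward CNN) to the 1.3 million photos of ImageNet, a dataset covering a wide variety of objects and scenes. It is available at: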

https://github.com/richzhang/colorization/tree/caffe/models

Results

http://mpvideo.qpic.cn/0bc3zeacoaaapuahm2qkrzrfbsode7eqajya.f10002.mp4?dis_k=b64e1733323d6a15b8607e27c213c164&dis_t=1649248574&vid=wxv_2269138147215392768&format_id=10002&support_redirect=0&mmversion=false

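Detailed Steps and Demo

The OpenCV DNN module can load models trained with Caffe directly. The steps for loading and testing the model are as follows:

[1] Download the model and configuration files: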

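wget http://eecs.berkeley.edu/~rich.zhang/projects/2016_colorization/files/demo_v2/colorization_release_v2.caffemodel -O ./model/colorization_release_v2.caffemodel

The code below also loads colorization_deploy_v2.prototxt and pts_in_hull.npy from the same ./model/ directory; both come from the authors' repository.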
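Before running the scripts, it can help to confirm that all three files are in place. The following is a small sketch using only the standard library; the helper name `missing_model_files` is made up here, and the file names and the ./model directory are taken from the paths used in the Python and C++ code below.

```python
from pathlib import Path

# File names as referenced by the Python and C++ code in this article.
REQUIRED_FILES = (
    "colorization_deploy_v2.prototxt",
    "colorization_release_v2.caffemodel",
    "pts_in_hull.npy",
)


def missing_model_files(model_dir="./model"):
    """Return the names of required files that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


missing = missing_model_files()
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All model files found.")
```

The C++ version embeds the cluster centers in the source, so it needs only the first two files.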

[2] Prepare a test image, an old black-and-white photo:

[3] Test the code:

Python OpenCV implementation:


# Importing libraries
import numpy as np
import matplotlib.pyplot as plt
import cv2

print(cv2.__version__)

# Path of our caffemodel, prototxt, and numpy files
prototxt = "./model/colorization_deploy_v2.prototxt"
caffe_model = "./model/colorization_release_v2.caffemodel"
pts_npy = "./model/pts_in_hull.npy"

img_path = './imgs/11.jpg'
# Loading our model
net = cv2.dnn.readNetFromCaffe(prototxt, caffe_model)
pts = np.load(pts_npy)
 
layer1 = net.getLayerId("class8_ab")
print(layer1)
layer2 = net.getLayerId("conv8_313_rh")
print(layer2)
pts = pts.transpose().reshape(2, 313, 1, 1)
net.getLayer(layer1).blobs = [pts.astype("float32")]
net.getLayer(layer2).blobs = [np.full([1, 313], 2.606, dtype="float32")]

# Converting the image into RGB and plotting it
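# Read image from the path
test_image = cv2.imread(img_path)
# cv2.imread returns None instead of raising when the file cannot be read
if test_image is None:
    raise FileNotFoundError(f"Cannot read image: {img_path}")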
# Convert image into gray scale
test_image = cv2.cvtColor(test_image, cv2.COLOR_BGR2GRAY)
# Convert image from gray scale to RGB format
test_image = cv2.cvtColor(test_image, cv2.COLOR_GRAY2RGB)
# Check image using matplotlib
plt.imshow(test_image)
plt.show()

# Converting the RGB image into LAB format
# Normalizing the image
normalized = test_image.astype("float32") / 255.0
# Converting the image into LAB
lab_image = cv2.cvtColor(normalized, cv2.COLOR_RGB2LAB)
# Resizing the image
resized = cv2.resize(lab_image, (224, 224))
# Extracting the value of L for LAB image
L = cv2.split(resized)[0]
L -= 50   # OR we can write L = L - 50

# Predicting a and b values
# Setting input
net.setInput(cv2.dnn.blobFromImage(L))
# Finding the values of 'a' and 'b'
ab = net.forward()[0, :, :, :].transpose((1, 2, 0))
# Resizing
ab = cv2.resize(ab, (test_image.shape[1], test_image.shape[0]))

# Combining L, a, and b channels
L = cv2.split(lab_image)[0]
# Combining L,a,b
LAB_colored = np.concatenate((L[:, :, np.newaxis], ab), axis=2)
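# Checking the LAB image (raw Lab values fall outside [0, 1], so matplotlib
# clips them; this view is only a rough sanity check, not a true preview)
plt.imshow(LAB_colored)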
plt.title('LAB image')
plt.show()

# Converting LAB image to RGB
RGB_colored = cv2.cvtColor(LAB_colored, cv2.COLOR_LAB2RGB)
# Limits the values in array
RGB_colored = np.clip(RGB_colored, 0, 1)
# Changing the pixel intensity back to [0,255],as we did scaling during pre-processing and converted the pixel intensity to [0,1]
RGB_colored = (255 * RGB_colored).astype("uint8")
# Checking the image
plt.imshow(RGB_colored)
plt.title('Colored Image')
plt.show()

# Saving the colored image
# Converting RGB to BGR
RGB_BGR = cv2.cvtColor(RGB_colored, cv2.COLOR_RGB2BGR)
# Saving the image in desired path
cv2.imwrite('result.jpg', RGB_BGR)
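The two blobs injected into "conv8_313_rh" and "class8_ab" near the top of the script can look opaque, so here is a NumPy sketch of what they appear to compute together. It rests on an assumption: that the deploy prototxt places a softmax between the two layers, so the scale of 2.606 sharpens the per-pixel distribution over color bins and the 1x1 convolution with the cluster centers as weights takes its weighted mean. The function `decode_ab` and the three toy centers are illustrative only; check colorization_deploy_v2.prototxt for the exact graph.

```python
import numpy as np


def decode_ab(logits, centers, scale=2.606):
    """Turn per-pixel bin scores into (a, b) values.

    logits:  array of shape (Q, H, W), one score per color bin and pixel
    centers: array of shape (Q, 2), the (a, b) center of each bin
    """
    scaled = logits * scale                               # role of conv8_313_rh
    scaled = scaled - scaled.max(axis=0, keepdims=True)   # numerical stability
    probs = np.exp(scaled)
    probs /= probs.sum(axis=0, keepdims=True)             # softmax over the Q bins

    # Same layout as in the script: kernel 0 holds every a value, kernel 1 every b value.
    kernels = centers.transpose().reshape(2, -1, 1, 1)
    # A 1x1 convolution is a weighted sum over the bin axis at each pixel.
    return np.einsum("oq,qhw->ohw", kernels[:, :, 0, 0], probs)


# Toy example: 3 hypothetical bins, an image of 1 x 2 pixels.
centers = np.array([[-20.0, 30.0], [0.0, 0.0], [60.0, 50.0]])
logits = np.array([[[0.0, 1.0]], [[0.0, 1.0]], [[10.0, 1.0]]])
ab = decode_ab(logits, centers)
print(ab.shape)
print(ab[:, 0, 0], ab[:, 0, 1])
```

In the toy input, the first pixel strongly favors the last bin and decodes to almost exactly that bin's center, while the second pixel scores all bins equally and decodes to the plain average of the centers. A scale above 1 pushes results toward the first behavior, which favors more saturated colors over washed-out averages.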

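C++ OpenCV implementation:

// ImgColorization_OpenCV_DNN.cpp : This file contains the "main" function. Program execution begins and ends here.
// WeChat official account: OpenCV与AI深度学习
#include "pch.h"  // Visual Studio precompiled header; remove if your project does not use one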
#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>

using namespace cv;
using namespace cv::dnn;
using namespace std;

// the 313 ab cluster centers from pts_in_hull.npy (already transposed)
static float hull_pts[] = {
  -90., -90., -90., -90., -90., -80., -80., -80., -80., -80., -80., -80., -80., -70., -70., -70., -70., -70., -70., -70., -70.,
  -70., -70., -60., -60., -60., -60., -60., -60., -60., -60., -60., -60., -60., -60., -50., -50., -50., -50., -50., -50., -50., -50.,
  -50., -50., -50., -50., -50., -50., -40., -40., -40., -40., -40., -40., -40., -40., -40., -40., -40., -40., -40., -40., -40., -30.,
  -30., -30., -30., -30., -30., -30., -30., -30., -30., -30., -30., -30., -30., -30., -30., -20., -20., -20., -20., -20., -20., -20.,
  -20., -20., -20., -20., -20., -20., -20., -20., -20., -10., -10., -10., -10., -10., -10., -10., -10., -10., -10., -10., -10., -10.,
  -10., -10., -10., -10., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 10., 10., 10., 10., 10., 10., 10.,
  10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 20., 20., 20., 20., 20., 20., 20., 20., 20., 20., 20., 20., 20., 20., 20.,
  20., 20., 20., 30., 30., 30., 30., 30., 30., 30., 30., 30., 30., 30., 30., 30., 30., 30., 30., 30., 30., 30., 40., 40., 40., 40.,
  40., 40., 40., 40., 40., 40., 40., 40., 40., 40., 40., 40., 40., 40., 40., 40., 50., 50., 50., 50., 50., 50., 50., 50., 50., 50.,
  50., 50., 50., 50., 50., 50., 50., 50., 50., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60.,
  60., 60., 60., 70., 70., 70., 70., 70., 70., 70., 70., 70., 70., 70., 70., 70., 70., 70., 70., 70., 70., 70., 70., 80., 80., 80.,
  80., 80., 80., 80., 80., 80., 80., 80., 80., 80., 80., 80., 80., 80., 80., 80., 90., 90., 90., 90., 90., 90., 90., 90., 90., 90.,
  90., 90., 90., 90., 90., 90., 90., 90., 90., 100., 100., 100., 100., 100., 100., 100., 100., 100., 100., 50., 60., 70., 80., 90.,
  20., 30., 40., 50., 60., 70., 80., 90., 0., 10., 20., 30., 40., 50., 60., 70., 80., 90., -20., -10., 0., 10., 20., 30., 40., 50.,
  60., 70., 80., 90., -30., -20., -10., 0., 10., 20., 30., 40., 50., 60., 70., 80., 90., 100., -40., -30., -20., -10., 0., 10., 20.,
  30., 40., 50., 60., 70., 80., 90., 100., -50., -40., -30., -20., -10., 0., 10., 20., 30., 40., 50., 60., 70., 80., 90., 100., -50.,
  -40., -30., -20., -10., 0., 10., 20., 30., 40., 50., 60., 70., 80., 90., 100., -60., -50., -40., -30., -20., -10., 0., 10., 20.,
  30., 40., 50., 60., 70., 80., 90., 100., -70., -60., -50., -40., -30., -20., -10., 0., 10., 20., 30., 40., 50., 60., 70., 80., 90.,
  100., -80., -70., -60., -50., -40., -30., -20., -10., 0., 10., 20., 30., 40., 50., 60., 70., 80., 90., -80., -70., -60., -50.,
  -40., -30., -20., -10., 0., 10., 20., 30., 40., 50., 60., 70., 80., 90., -90., -80., -70., -60., -50., -40., -30., -20., -10.,
  0., 10., 20., 30., 40., 50., 60., 70., 80., 90., -100., -90., -80., -70., -60., -50., -40., -30., -20., -10., 0., 10., 20., 30.,
  40., 50., 60., 70., 80., 90., -100., -90., -80., -70., -60., -50., -40., -30., -20., -10., 0., 10., 20., 30., 40., 50., 60., 70.,
  80., -110., -100., -90., -80., -70., -60., -50., -40., -30., -20., -10., 0., 10., 20., 30., 40., 50., 60., 70., 80., -110., -100.,
  -90., -80., -70., -60., -50., -40., -30., -20., -10., 0., 10., 20., 30., 40., 50., 60., 70., 80., -110., -100., -90., -80., -70.,
  -60., -50., -40., -30., -20., -10., 0., 10., 20., 30., 40., 50., 60., 70., -110., -100., -90., -80., -70., -60., -50., -40., -30.,
  -20., -10., 0., 10., 20., 30., 40., 50., 60., 70., -90., -80., -70., -60., -50., -40., -30., -20., -10., 0.
};

int main()
{
  string modelTxt = "./model/colorization_deploy_v2.prototxt";
  string modelBin = "./model/colorization_release_v2.caffemodel";
  string img_path = "./imgs/10.jpg";
   
  Mat img = imread(img_path);
  if (img.empty())
  {
    cout << "Can't read image from file: " << img_path << endl;
    return 2;
  }
  // fixed input size for the pretrained network
  const int W_in = 224;
  const int H_in = 224;
  Net net = dnn::readNetFromCaffe(modelTxt, modelBin);

  // setup additional layers:
  int sz[] = { 2, 313, 1, 1 };
  const Mat pts_in_hull(4, sz, CV_32F, hull_pts);
  Ptr<dnn::Layer> class8_ab = net.getLayer("class8_ab");
  class8_ab->blobs.push_back(pts_in_hull);
  Ptr<dnn::Layer> conv8_313_rh = net.getLayer("conv8_313_rh");
  conv8_313_rh->blobs.push_back(Mat(1, 313, CV_32F, Scalar(2.606)));
  // extract L channel and subtract mean
  Mat lab, L, input;
  img.convertTo(img, CV_32F, 1.0 / 255);
  cvtColor(img, lab, COLOR_BGR2Lab);
  extractChannel(lab, L, 0);
  resize(L, input, Size(W_in, H_in));
  input -= 50;
  // run the L channel through the network
  Mat inputBlob = blobFromImage(input);
  net.setInput(inputBlob);
  Mat result = net.forward();
  // retrieve the calculated a,b channels from the network output
  // the output blob is NCHW, while cv::Size takes (width, height)
  Size siz(result.size[3], result.size[2]);
  Mat a = Mat(siz, CV_32F, result.ptr(0, 0));
  Mat b = Mat(siz, CV_32F, result.ptr(0, 1));
  resize(a, a, img.size());
  resize(b, b, img.size());
  // merge, and convert back to BGR
  Mat color, chn[] = { L, a, b };
  merge(chn, 3, lab);
  cvtColor(lab, color, COLOR_Lab2BGR);
  imshow("color", color);
  imshow("original", img);
  waitKey();
  return 0;
}

[4] Results: