How to Write a Node.js CLI using OpenCV with Neural Network Models for Image Classification

AI研习社

Published 2018-12-19 10:59:52

Collected in the column: AI研习社

This article is a technical blog post compiled by AI研习社. Original title: How to Write a Node.js CLI using OpenCV with Neural Network Models for Image Classification. Author | Jeff Galbraith. Translators | 苏珊娜•本森, 康奈尔•斯摩. Proofreader | 酱番梨. Editor | 菠萝妹. Original link: https://itnext.io/how-to-write-a-node-js-cli-using-opencv-with-neural-network-models-for-image-classification-57785d6f09fe

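An image classified with the SSD COCO model (yes, that's my pickup truck).

In this article we will learn three things (all of them struggles I had to work through while creating my project on GitHub):

How to use git-lfs (Git Large File Storage) to upload large files to a GitHub project.
How to create a Node CLI (command line interface).
How to classify images with a deep neural network.

Each topic has its own section, so you can read everything or jump straight to the part that interests you.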

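Backstory

Before we start, here is how all this came about. Where I work, we use embedded cameras to do analytics (such as detecting oil or gas leaks). When an alarm is raised, a snapshot is captured from the MPEG stream. Another project on my team uses a Python program to classify these snapshots, and I was curious whether the same could be done with Node. I had never worked with neural networks before, so this was a challenge for me. I started with TensorFlow.js, but I needed the tfjs-node package to convert our existing models into a "web-friendly" model.

Then I found a great article on Medium by Vincent Mühler called "Node.js meets OpenCV’s Deep Neural Networks — Fun with Tensorflow and Caffe". It introduced me to OpenCV through the Node package opencv4nodejs. From there I got to work and ended up with fairly good results.

After putting together the package and a README, I set out to publish the project on GitHub, but the model files were too large! So I started learning git-lfs (Git Large File Storage). After a few days of struggling (on limited bandwidth, since I was camping at the time), I figured it out. Then came the npm problem: I tried to publish to npm, but after packing, the upload to the registry failed with "JavaScript heap out of memory", again because the package with everything in it was too large!

I still have not managed to get the package onto the npm registry, so I need to explore other approaches. If you have solved the large-package problem, please let me know how you did it.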

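GitHub and Large Files

First, GitHub has a file size limit. According to its documentation, each file must be smaller than 100 MB. So if a model file is larger than that, pushing it to a regular repository will not work.

Enter git-lfs. This Git extension lets you track large files in Git and on GitHub. It is free at first, but beyond certain limits GitHub starts charging. GitKraken also supports git-lfs, which is nice!

GitKraken's LFS support

Did you notice the LFS label at the end of the file entries? Nice.

OK, so it is not always smooth sailing. Before you put large files into your repository, you must initialize git-lfs and tell it which file types your project needs to track. See the git-lfs documentation for details.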
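Before committing, it can help to know which files would hit the limit. The following is a minimal sketch, not part of the original project: findLargeFiles is a hypothetical helper that walks a folder and lists files larger than a given size, with the default set to 100 MiB as an assumption about where GitHub draws the line.

```javascript
const fs = require('fs')
const path = require('path')

// Assumed per-file limit for regular (non-LFS) files on GitHub
const GITHUB_LIMIT_BYTES = 100 * 1024 * 1024

/**
 * Recursively lists files under `dir` that are larger than `limit` bytes.
 * Skips the .git and node_modules folders.
 * (Hypothetical helper, for illustration only.)
 */
const findLargeFiles = (dir, limit = GITHUB_LIMIT_BYTES) => {
  const found = []
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (entry.name === '.git' || entry.name === 'node_modules') continue
    const fullPath = path.join(dir, entry.name)
    if (entry.isDirectory()) {
      found.push(...findLargeFiles(fullPath, limit))
    } else if (entry.isFile() && fs.statSync(fullPath).size > limit) {
      found.push(fullPath)
    }
  }
  return found
}

// Usage: findLargeFiles('.').forEach((file) => console.log(file))
```

Any file it reports is a candidate for tracking with git lfs track before the first commit.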

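Creating a Node CLI

I'm sure you have heard of a CLI: a command line interface. It lets users interact with a program from the command line. By creating a Node CLI, your Node package can be run just like a native program.

For example, to run a Node package called "classify", you would normally do the following (from inside the classify folder):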

node index.js [arguments]

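You can also install the package globally, which adds it to your PATH. Install it with the following command (from inside the classify folder):

npm install -g .

This installs the package in the current folder globally and makes it available as the "classify" command (npm takes the command name from the bin field in package.json).

Now you can run the following from the command line: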

classify --image <imagePath> --filter ./filter.txt --confidence 50

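CLI Usage Output

Every CLI should print usage output so that users understand how to use it. For "classify", it looks like this:

Of course, there are packages that help with this. Here I use command-line-usage and command-line-args, and we will look at what each one does.

But before we do that, how does the system know to run a JavaScript file with Node when we never typed "node" on the command line?

It comes down to the first line of the script. On Linux, that line is called the shebang, and it tells the system which interpreter to run the script with. For a Node script it is typically:

#!/usr/bin/env node

This line tells the system to use "node" as the interpreter for the script, so when you build a CLI it should always be the very first line of your JavaScript file.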

command-line-usage

command-line-usage is very simple to use; it defines what the user sees in the usage output.

It is set up like this:

const commandLineUsage = require('command-line-usage')

const sections = [
  {
    header: 'classify',
    content: 'Classifies an image using machine learning from passed in image path.'
  },
  {
    header: 'Options',
    optionList: [
      {
        name: 'image',
        typeLabel: '{underline imagePath}',
        description: '[required] The image path.'
      },
      {
        name: 'confidence',
        typeLabel: '{underline value}',
        description: '[optional; default 50] The minimum confidence level to use for classification (ex: 50 for 50%).'
      },
      {
        name: 'filter',
        typeLabel: '{underline filterFile}',
        description: '[optional] A filter file used to filter out classification not wanted.'
      },
      {
        name: 'quick',
        description: '[optional; default slow] Use quick classification, but may be more inaccurate.'
      },
      {
        name: 'version',
        description: 'Application version.'
      },
      {
        name: 'help',
        description: 'Print this usage guide.'
      }
    ]
  }
]
const usage = commandLineUsage(sections)

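Then, to print the usage output, the code is simply:

console.log(usage)

command-line-args

Once again, this is quite easy to use. Just make sure you process and validate everything: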

const fs = require('fs')
const path = require('path')
const commandLineArgs = require('command-line-args')

/**
 * Returns true if the passed in object is empty
 * @param {Object} obj
 */
const isEmptyObject = (obj) => {
  return JSON.stringify(obj) === JSON.stringify({})
}

const optionDefinitions = [
  { name: 'image', alias: 'i', type: String },
  { name: 'confidence', alias: 'c', type: Number },
  { name: 'filter', alias: 'f', type: String },
  { name: 'quick', alias: 'q' },
  { name: 'version', alias: 'v' },
  { name: 'help', alias: 'h' }
]

let options
try {
  options = commandLineArgs(optionDefinitions)
}
catch (e) {
  console.error()
  console.error('classify:', e.name, e.optionName)
  console.log(usage)
  process.exit(1)
}

// check for help
if (isEmptyObject(options) || 'help' in options) {
  console.log(usage)
  process.exit(1)
}

// check for version
if ('version' in options) {
  let pkg = require('./package.json')
  console.log(pkg.version)
  process.exit(1)
}

let imagePath
// check for path
if ('image' in options) {
  imagePath = options.image
}
if (!imagePath) {
  console.error('"--image imagePath" is required.')
  process.exit(1)
}
if (!fs.existsSync(imagePath)) {
  console.log(`exiting: could not find image: ${imagePath}`)
  process.exit(2)
}

let confidence = 50 // default
if ('confidence' in options) {
  confidence = options.confidence
}

// validate confidence
if (confidence < 0) {
  console.error(`Negative numbers are not valid for 'confidence'.`)
  process.exit(1)
}
if (confidence > 100) {
  console.error(`A value greater than 100 is not valid for 'confidence'.`)
  process.exit(1)
}

// convert the percentage to a fraction (50 -> 0.5)
confidence = confidence / 100.0

let filterItems = []
if ('filter' in options) {
  const filterFile = options.filter
  // verify file exists
  if (!fs.existsSync(filterFile)) {
    console.log(`exiting: could not find filter file: ${filterFile}`)
    process.exit(2)
  }
  filterItems = fs.readFileSync(filterFile).toString().split('\n')
}

// get quick option, if available - default to slow
let quick = false
if ('quick' in options) {
  quick = true
}

// get data file based on model and quick options
// NOTE: `model` is not defined in this excerpt; it is assumed to be set
// elsewhere in the program (for example to 'coco')
let dataFile
if (model === 'coco') {
  if (quick) {
    dataFile = 'coco300'
  }
  else {
    dataFile = 'coco512'
  }
}
else if (model === 'inception') {
  dataFile = 'inception224'
}
if (!dataFile) {
  console.error(`'${model}' is not valid model.`)
  process.exit(1)
}

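Notice the --version option: all it does is read the version from package.json and print it. That way the version only has to be maintained in one place.

The rest of the processing checks whether each option was used and, if so, validates it, and so on.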
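The confidence check can also be collected into one small function, which makes it easy to test in isolation. This is a minimal sketch rather than the project's code: toConfidenceFraction is a hypothetical name, and rejecting non-numeric input (for example the NaN produced when a non-number is passed for a Number option) is an added assumption.

```javascript
/**
 * Converts a user-supplied percentage (0-100) into a fraction (0-1).
 * Falls back to `fallback` when the option was not given at all.
 * Throws a RangeError for anything that is not a number in range.
 * (Hypothetical helper, for illustration only.)
 */
const toConfidenceFraction = (value, fallback = 50) => {
  const percent = value === undefined ? fallback : value
  // Number.isFinite does not coerce, so null, strings and NaN are rejected
  if (!Number.isFinite(percent) || percent < 0 || percent > 100) {
    throw new RangeError(`'confidence' must be a number between 0 and 100, got: ${value}`)
  }
  return percent / 100
}

// Usage: const confidence = toConfidenceFraction(options.confidence)
```

Throwing instead of calling process.exit inside the helper keeps the exit-code decision in one place, at the top level of the CLI.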

Once we have collected all the data needed for classification, we can start classifying.
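One detail worth watching in the option handling: splitting the filter file on '\n' keeps blank lines and, for files with Windows line endings, a trailing '\r' on every entry, so those entries would never match a class name. A small sketch of a more forgiving parser, assuming one class name per line (parseFilterList is a hypothetical name):

```javascript
/**
 * Splits the contents of a filter file into class names,
 * one per line, ignoring blank lines and surrounding whitespace.
 * (Hypothetical helper, for illustration only.)
 */
const parseFilterList = (text) =>
  text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)

// Usage: filterItems = parseFilterList(fs.readFileSync(filterFile, 'utf8'))
```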

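Image Classification with OpenCV

Now that we have collected and validated the arguments from the user's interaction with the CLI, the real fun can begin. The high-level processing is not as hard as you might think.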

// OpenCV
const cv = require('opencv4nodejs')

// initialize model from prototxt and modelFile
let net
if (dataFile === 'coco300' || dataFile === 'coco512') {
  net = cv.readNetFromCaffe(prototxt, modelFile)
}

// read the image
const img = cv.imread(imagePath)

// starting time of classification
let start = new Date()

// get predictions
const predictions = predict(img).filter((item) => {
  // filter out what we don't want
  if (item.confidence < confidence) {
    return false
  }
  // user wants to filter items
  if (filterItems.length > 0) {
    if (filterItems.indexOf(classes[item.classIndex]) < 0) {
      return false
    }
  }
  return true
})

// end of classification
let end = new Date()
finalize(start, end)

// write updated image with new name
updateImage(imagePath, img, predictions)

Not too bad. Note that this is where the data supplied by the user, the confidence threshold and the filter list, is actually put to use.
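The filtering step can be pulled out into a standalone function so it can be tried without OpenCV. This is a sketch under assumptions: makePredictionFilter is a hypothetical name, and the class names and predictions below are made-up sample data. As in the code above, a non-empty filter list behaves as an allow list.

```javascript
/**
 * Builds a predicate that keeps a prediction only if it meets the minimum
 * confidence and, when a filter list is given, its class name is on the list.
 * (Hypothetical helper, for illustration only.)
 */
const makePredictionFilter = (minConfidence, filterItems, classes) => (item) => {
  if (item.confidence < minConfidence) return false
  if (filterItems.length > 0 && !filterItems.includes(classes[item.classIndex])) return false
  return true
}

// Made-up sample data, for illustration only
const sampleClasses = ['background', 'person', 'car', 'truck']
const samplePredictions = [
  { classIndex: 3, confidence: 0.92 },
  { classIndex: 1, confidence: 0.64 },
  { classIndex: 2, confidence: 0.12 }
]

const kept = samplePredictions.filter(makePredictionFilter(0.5, ['truck'], sampleClasses))
console.log(kept) // only the 'truck' prediction remains
```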

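However, most of the work happens in the predict function, which returns the predictions, and it in turn relies on several helper functions for extracting the data.

Let's see how predict is implemented: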

/**
 * Predicts classifications based on passed in image
 * @param {Object} img The image to use for predictions
 */
const predict = (img) => {
  // white is the better padding color
  const white = new cv.Vec(255, 255, 255)

  // resize to model size
  const theImage = img.resizeToMax(modelData.size, modelData.size).padToSquare(white)

  // network accepts blobs as input
  const inputBlob = cv.blobFromImage(theImage)
  net.setInput(inputBlob)

  // forward pass input through entire network, will return
  // classification result as (coco: 1x1xNxM Mat) (inception: 1xN Mat)
  let outputBlob = net.forward()

  if (dataFile === 'coco300' || dataFile === 'coco512') {
    // extract NxM Mat from 1x1xNxM Mat
    outputBlob = outputBlob.flattenFloat(outputBlob.sizes[2], outputBlob.sizes[3])
    // pass original image
    return extractResultsCoco(outputBlob, img)
  }
}

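First, the models are pre-trained on images of a fixed size: one at 300x300 and the other at 512x512. The 300x300 model is faster because it has less data to process. The 512x512 model is slower because it processes more data, but its predictions are generally more accurate.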
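The choice between the two model sizes can be expressed as a small lookup. This is a sketch, not the project's code: MODEL_DATA, selectDataFile and getModelData are hypothetical names, the 300 and 512 sizes come from the text, and the 224 size is only inferred from the name "inception224".

```javascript
// Hypothetical lookup table mapping each data file to its input size
const MODEL_DATA = {
  coco300: { size: 300 },
  coco512: { size: 512 },
  inception224: { size: 224 } // assumed from the name
}

/**
 * Picks the data file name from the model name and the quick flag.
 */
const selectDataFile = (model, quick) => {
  if (model === 'coco') return quick ? 'coco300' : 'coco512'
  if (model === 'inception') return 'inception224'
  throw new Error(`'${model}' is not a valid model.`)
}

/**
 * Returns the data file name together with its input size.
 */
const getModelData = (model, quick) => {
  const dataFile = selectDataFile(model, quick)
  return { dataFile, ...MODEL_DATA[dataFile] }
}

console.log(getModelData('coco', true)) // { dataFile: 'coco300', size: 300 }
```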

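The code above also resizes the input image so that it matches the size the model was trained on. If the image is not square, it is padded to a square. White is used for the padding because it affects the image less than black does.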
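To make the resize-and-pad step concrete, here is the arithmetic on its own, without OpenCV. It is a sketch under assumptions: fitToSquare is a hypothetical helper, and the exact rounding used by the library's resizeToMax may differ from Math.round.

```javascript
/**
 * Computes the size an image has after being scaled so that its longer
 * side equals `size`, plus the total padding needed to make it square.
 * (Hypothetical helper, for illustration only.)
 */
const fitToSquare = (cols, rows, size) => {
  const scale = size / Math.max(cols, rows)
  const width = Math.round(cols * scale)
  const height = Math.round(rows * scale)
  return { width, height, padX: size - width, padY: size - height }
}

console.log(fitToSquare(1024, 768, 512))
// { width: 512, height: 384, padX: 0, padY: 128 }
```

A 1024x768 photo fed to the 512x512 model is therefore scaled to 512x384 and needs 128 rows of white padding.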

Then the image is converted into a "blob" and passed to net.setInput. Remember, we saw this code earlier:

// initialize model from prototxt and modelFile
let net
if (dataFile === 'coco300' || dataFile === 'coco512') {
  net = cv.readNetFromCaffe(prototxt, modelFile)
}

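I'm showing it again in case you forgot (note that net is defined globally, which is why predict can use it without it being passed in).

So, as you can see, the last step in the function above is to extract the results: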

/**
 * Extracts results from a network outputBlob
 * @param {Object} outputBlob The outputBlob returned from net.forward()
 * @param {Object} img The image used for classification
 */
const extractResultsCoco = (outputBlob, img) => {
  return Array(outputBlob.rows).fill(0)
    .map((res, i) => {
      // get class index
      const classIndex = outputBlob.at(i, 1);
      const confidence = outputBlob.at(i, 2);
      // output blobs are in a percentage
      const bottomLeft = new cv.Point(
        outputBlob.at(i, 3) * img.cols,
        outputBlob.at(i, 6) * img.rows
      );
      const topRight = new cv.Point(
        outputBlob.at(i, 5) * img.cols,
        outputBlob.at(i, 4) * img.rows
      );
      // create a rect
      const rect = new cv.Rect(
        bottomLeft.x,
        topRight.y,
        topRight.x - bottomLeft.x,
        bottomLeft.y - topRight.y
      );

      return ({
        classIndex,
        confidence,
        rect
      })
    })
}

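This is where we read the class "index" (which maps to a class name), the "confidence" level of the classification, and the bounding "rect" of the detected object. These are our "predictions", which we then filter.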
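The coordinate conversion can be tried on a plain array. The sketch below assumes the row layout implied by the indexing in extractResultsCoco: column 1 is the class index, column 2 the confidence, and columns 3 to 6 are xMin, yMin, xMax, yMax as fractions of the image size. toPixelRect is a hypothetical helper and the sample row is made up.

```javascript
/**
 * Converts one detection row into a class index, a confidence and a
 * rectangle in pixel coordinates.
 * (Hypothetical helper, for illustration only.)
 * @param {number[]} row One row of the flattened output
 * @param {number} cols Image width in pixels
 * @param {number} rows Image height in pixels
 */
const toPixelRect = (row, cols, rows) => {
  const [, classIndex, confidence, xMin, yMin, xMax, yMax] = row
  return {
    classIndex,
    confidence,
    rect: {
      x: xMin * cols,
      y: yMin * rows,
      width: (xMax - xMin) * cols,
      height: (yMax - yMin) * rows
    }
  }
}

// Made-up sample row for a 400x200 image
console.log(toPixelRect([0, 7, 0.75, 0.25, 0.5, 0.75, 1.0], 400, 200))
// { classIndex: 7, confidence: 0.75, rect: { x: 100, y: 100, width: 200, height: 100 } }
```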

Remember seeing this code above as well?

// write updated image with new name
updateImage(imagePath, img, predictions)

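This code draws the filtered predictions onto the image and writes it to a new file, so the user can see each prediction along with its "confidence" level. The following functions are used to render the result: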

/**
 * Generate a random color
 */
const getRandomColor = () => new cv.Vec(Math.random() * 255, Math.random() * 255, Math.random() * 255);

/**
 * Returns a function that, for each prediction, draws a rect area with random color
 * @param {Array} predictions Array of predictions
 */
const makeDrawClassDetections = (predictions) => (drawImg, getColor, thickness = 2) => {
  predictions
    .forEach((p) => {
      let color = getColor()
      let confidence = p.confidence
      let rect = p.rect
      let className = classes[p.classIndex]
      drawRect(className, confidence, drawImg, rect, color, { thickness })
    })
  return drawImg
}

/*
  Take the original image and add rectangles on predictions.
  Write it to a new file.
 */
const updateImage = (imagePath, img, predictions) => {
  // get the filename and replace last occurrence of '.' with '_classified.'
  const filename = imagePath.replace(/^.*[\\\/]/, '').replace(/\.([^.]*)$/, `_classified_${dataFile}_${Math.round(confidence * 100.0)}.` + '$1')

  // get function to draw rect around predicted object
  const drawClassDetections = makeDrawClassDetections(predictions);

  // draw a rect around predicted object
  drawClassDetections(img, getRandomColor);

  // write updated image to current directory
  cv.imwrite('./' + filename, img)
}

/**
 * Draws a rect and label in the specified area
 * @param {String} className Predicted class name (identified object)
 * @param {Number} confidence The confidence level (ie: .80 = 80%)
 * @param {Object} image The image
 * @param {Object} rect The rect area
 * @param {Object} color The color to use
 * @param {Object} [opts={ thickness: 2 }] Options (currently only supports thickness)
 */
const drawRect = (className, confidence, image, rect, color, opts = { thickness: 2 }) => {
  let level = Math.round(confidence * 100.0)
  image.drawRectangle(
    rect,
    color,
    opts.thickness,
    cv.LINE_8
  )
  // draw the label (className and confidence level)
  let label = className + ': ' + level
  image.putText(label, new cv.Point2(rect.x, rect.y + 20), cv.FONT_ITALIC, .65, color, 2)
}

I won't explain this code in detail, because it is fairly ordinary JavaScript (and it is well commented).
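One piece worth isolating is how updateImage derives the output file name. A possible alternative to the regular expressions, sketched here with Node's path module, is shown below; classifiedName is a hypothetical name, and rounding the percentage is an assumption made to avoid floating-point noise in the name.

```javascript
const path = require('path')

/**
 * Builds the output file name, for example
 * 'truck.jpg' -> 'truck_classified_coco512_50.jpg'.
 * (Hypothetical helper, for illustration only.)
 * @param {string} imagePath Path of the source image
 * @param {string} dataFile Name of the model data file
 * @param {number} confidence Confidence threshold as a fraction (0-1)
 */
const classifiedName = (imagePath, dataFile, confidence) => {
  // drop any directory part, accepting both '/' and '\' separators
  const base = imagePath.split(/[\\/]/).pop()
  const { name, ext } = path.parse(base)
  const percent = Math.round(confidence * 100)
  return `${name}_classified_${dataFile}_${percent}${ext}`
}

console.log(classifiedName('/tmp/photos/truck.jpg', 'coco512', 0.5))
// truck_classified_coco512_50.jpg
```

Unlike the regular-expression version, this also leaves file names without an extension intact.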

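Caveats

You should always apply some filtering, typically on the confidence level. I usually use 50 as the threshold, but sometimes I lower it to 30. Want to know why? Because this is what you can run into:

Classification results without confidence filtering

If an image is too "busy", you get a lot of classifications, and most of them are false. Most have a confidence level below 10. Experiment with the confidence threshold to see which value works best. And yes, this is the same image as the first one in this article (ha, did I make you scroll back to the top?).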
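A quick way to experiment with the threshold is to count how many predictions survive at several values. The sketch below uses made-up confidence values purely for illustration, and countByThreshold is a hypothetical helper.

```javascript
/**
 * For each threshold (given as a percentage), counts how many predictions
 * have a confidence at or above it.
 * (Hypothetical helper, for illustration only.)
 */
const countByThreshold = (predictions, thresholds) =>
  thresholds.map((percent) => ({
    percent,
    kept: predictions.filter((p) => p.confidence >= percent / 100).length
  }))

// Made-up confidences resembling a "busy" image: a few strong hits, many weak ones
const busyImage = [0.95, 0.62, 0.41, 0.08, 0.05, 0.03].map((confidence) => ({ confidence }))

console.log(countByThreshold(busyImage, [0, 10, 30, 50]))
// kept counts: 6 at 0%, 3 at 10%, 3 at 30%, 2 at 50%
```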

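Examples

Train, unclassified

Train, classified

Royals, unclassified

Royals, classified

Harry, show some teeth! Then you would be a person with 100% confidence. Joking aside, this was a lot of fun! I'm still learning, and there is a lot more to learn. I hope what I have written helps you with your own learning.

You can find the complete project on GitHub.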

Shared from the AI研习社 WeChat official account. Originally published 2018-11-26.
