
Well, I decided to work through my own question to solve the problem above. What I wanted was to implement a simple OCR using the KNearest or SVM features in OpenCV. Below is what I did and how. (It is just for learning how to use KNearest for simple OCR purposes.)

1) My first question was about the letter_recognition.data file that comes with the OpenCV samples. I wanted to know what is inside that file.

It contains a letter, along with 16 features of that letter.

<a href="https://stackoverflow.com/questions/1270798/how-to-create-data-fom-image-like-letter-image-recognition-dataset-from-uci"><code>This SOF</code></a> question helped me find it. These 16 features are explained in the paper <a href="http://cns-classes.bu.edu/cn550/Readings/frey-slate-91.pdf" rel="noreferrer"><code>Letter Recognition Using Holland-Style Adaptive Classifiers</code></a>.
(Although I didn't understand some of the features at the end.)
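As a rough illustration of that layout, here is a minimal sketch that splits one comma-separated row into a label and 16 integer features. The function name `parse_letter_row` and the sample row are assumptions made for illustration; the label mapping mirrors the `ord(ch)-ord('A')` converter used in the question.

```python
def parse_letter_row(row):
    """Split one 'letter,f1,...,f16' row into (label_index, features)."""
    fields = row.strip().split(',')
    letter, raw_features = fields[0], fields[1:]
    if len(raw_features) != 16:
        raise ValueError("expected 16 features, got %d" % len(raw_features))
    label_index = ord(letter) - ord('A')   # 'A' -> 0, ..., 'Z' -> 25
    features = [int(v) for v in raw_features]
    return label_index, features


# Illustrative row in the same shape as the data file (treat the values as made up).
label, features = parse_letter_row("T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8")
print(label, len(features))
```

Each row therefore becomes one training sample: the features go into the samples matrix and the label index into the responses column.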

2) I knew that without understanding all those features, it would be difficult to use that method. I tried some other papers, but all were a little difficult for a beginner.

<code>So I just decided to take all the pixel values as my features.</code> (I was not worried about accuracy or performance; I just wanted it to work, even with minimal accuracy.)
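To make "pixel values as features" concrete, the sketch below flattens a small grid of pixel values into a single feature list, row by row. It uses plain Python lists instead of NumPy arrays, and the helper name `flatten_pixels` and the 10x10 test pattern are hypothetical.

```python
def flatten_pixels(grid):
    """Flatten a 2-D grid of pixel values row by row into one feature list."""
    return [float(value) for row in grid for value in row]


# A hypothetical 10x10 binary crop: a vertical stroke in column 4.
crop = [[255 if col == 4 else 0 for col in range(10)] for _ in range(10)]
features = flatten_pixels(crop)
print(len(features))
```

A 10x10 crop always yields 100 values, which is why the sample arrays in the code have 100 columns.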

I took the image below as my training data:

<img src="https://i.stack.imgur.com/IwQY6.png" alt="enter image description here">

(I know the amount of training data is small. But since all the letters are of the same font and size, I decided to try with this.)

To prepare the data for training, I wrote a small script in OpenCV. It does the following:

<ol>
<li>It loads the image.</li>
<li>It selects the digits (by finding contours and applying constraints on the area and height of the letters to avoid false detections).</li>
<li>It draws a bounding rectangle around one letter and waits for a <code>manual key press</code>. We then press the digit key corresponding to the letter in the box.</li>
<li>Once the corresponding digit key is pressed, it resizes the box to 10x10 and saves the 100 pixel values in one array (here, samples) and the manually entered digit in another array (here, responses).</li>
<li>It then saves both arrays in separate txt files.</li>
</ol>
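The labeling step in the list above can be sketched without any GUI by simulating the key presses. This is only an illustration: `key_to_digit` and `collect_labels` are hypothetical helpers, the `0xFF` mask is an assumption about platforms that set higher bits in key codes, and unlike the script it stops on Esc instead of exiting the program.

```python
DIGIT_KEYS = range(48, 58)   # ASCII codes for '0'..'9'
ESC_KEY = 27


def key_to_digit(key):
    """Return the digit for a key code, or None if the key is not 0-9."""
    key &= 0xFF   # assumption: keep only the low byte of the key code
    if key in DIGIT_KEYS:
        return key - ord('0')
    return None


def collect_labels(crops, keys):
    """Pair each crop with its typed digit; skip non-digit keys, stop on Esc."""
    samples, responses = [], []
    for crop, key in zip(crops, keys):
        if key & 0xFF == ESC_KEY:
            break
        digit = key_to_digit(key)
        if digit is not None:
            samples.append(crop)
            responses.append(digit)
    return samples, responses


# Simulated session: three crops, the second key press ('x') is ignored.
samples, responses = collect_labels(["crop_a", "crop_b", "crop_c"],
                                    [ord('3'), ord('x'), ord('7')])
print(responses)
```

Keeping samples and responses in step like this is what makes row i of the samples file correspond to row i of the responses file.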

At the end of the manual classification, all the digits in the training data (train.png) have been labeled by hand, and the image will look like below:

<img src="https://i.stack.imgur.com/jyAhT.png" alt="enter image description here">

Below is the code I used for the above purpose (of course, not so clean):

<pre><code>import sys

import numpy as np
import cv2

im = cv2.imread('pitrain.png')
im3 = im.copy()

gray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray,(5,5),0)
thresh = cv2.adaptiveThreshold(blur,255,1,1,11,2)

################# Now finding Contours ###################

contours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)

samples = np.empty((0,100))
responses = []
keys = [i for i in range(48,58)]

for cnt in contours:
    # Skip tiny contours (noise, specks)
    if cv2.contourArea(cnt)&gt;50:
        [x,y,w,h] = cv2.boundingRect(cnt)

        # Keep only boxes tall enough to be a digit
        if h&gt;28:
            cv2.rectangle(im,(x,y),(x+w,y+h),(0,0,255),2)
            roi = thresh[y:y+h,x:x+w]
            roismall = cv2.resize(roi,(10,10))
            cv2.imshow('norm',im)
            key = cv2.waitKey(0)

            if key == 27: # (escape to quit)
                sys.exit()
            elif key in keys:
                # Store the typed digit and the 100 pixel values of the crop
                responses.append(int(chr(key)))
                sample = roismall.reshape((1,100))
                samples = np.append(samples,sample,0)

responses = np.array(responses,np.float32)
responses = responses.reshape((responses.size,1))
print("training complete")

np.savetxt('generalsamples.data',samples)
np.savetxt('generalresponses.data',responses)
</code></pre>

<hr>

Now we move on to the training and testing part.

For the testing part, I used the image below, which has the same type of letters I used for training.

<img src="https://i.stack.imgur.com/dPaE8.png" alt="enter image description here">

For training, we do as follows:

<ol>
<li>Load the txt files we saved earlier.</li>
<li>Create an instance of the classifier we are using (here, KNearest).</li>
<li>Use the KNearest.train function to train on the data.</li>
</ol>

For testing purposes, we do as follows:

<ol>
<li>Load the image used for testing.</li>
<li>Process the image as before and extract each digit using contour methods.</li>
<li>Draw a bounding box for it, resize it to 10x10, and store its pixel values in an array as before.</li>
<li>Use the KNearest.find_nearest() function to find the training item nearest to the one we gave. (If we are lucky, it recognises the correct digit.)</li>
</ol>
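Conceptually, training a nearest-neighbour classifier just stores the labeled samples, and a query returns the label that is most common among the k closest stored samples. The sketch below shows that idea in plain Python; it is not OpenCV's implementation, and the function name, the squared Euclidean distance, and the toy 4-value vectors are assumptions for illustration.

```python
from collections import Counter


def find_nearest(train_samples, train_labels, query, k=1):
    """Return the majority label among the k training samples closest to query."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Sort training indices by distance to the query and keep the k closest.
    order = sorted(range(len(train_samples)),
                   key=lambda i: squared_distance(train_samples[i], query))
    nearest_labels = [train_labels[i] for i in order[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy 4-value "images" standing in for the 100-value pixel vectors.
train_samples = [[0, 0, 255, 255], [255, 255, 0, 0], [0, 10, 250, 255]]
train_labels = [1, 7, 1]
print(find_nearest(train_samples, train_labels, [5, 0, 240, 250], k=1))
```

With k = 1, as in the script, the prediction is simply the label of the single closest training sample, which works well when test digits share the font and size of the training digits.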

I included the last two steps (training and testing) in a single script below:

<pre><code>import cv2
import numpy as np

####### training part ############### 
samples = np.loadtxt('generalsamples.data',np.float32)
responses = np.loadtxt('generalresponses.data',np.float32)
responses = responses.reshape((responses.size,1))

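# OpenCV 2.4 API. In OpenCV 3 and later the equivalent calls are
# cv2.ml.KNearest_create(), model.train(samples, cv2.ml.ROW_SAMPLE, responses)
# and model.findNearest(sample, k).
model = cv2.KNearest()
model.train(samples,responses)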

############################# testing part #########################

im = cv2.imread('pi.png')
out = np.zeros(im.shape,np.uint8)
gray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
thresh = cv2.adaptiveThreshold(gray,255,1,1,11,2)

contours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)

for cnt in contours:
    if cv2.contourArea(cnt)&gt;50:
        [x,y,w,h] = cv2.boundingRect(cnt)
        if h&gt;28:
            cv2.rectangle(im,(x,y),(x+w,y+h),(0,255,0),2)
            roi = thresh[y:y+h,x:x+w]
            roismall = cv2.resize(roi,(10,10))
            roismall = roismall.reshape((1,100))
            roismall = np.float32(roismall)
            # k = 1: take the label of the single closest training sample
            retval, results, neigh_resp, dists = model.find_nearest(roismall, k = 1)
            string = str(int((results[0][0])))
            cv2.putText(out,string,(x,y+h),0,1,(0,255,0))

cv2.imshow('im',im)
cv2.imshow('out',out)
cv2.waitKey(0)
</code></pre>

And it worked. Below is the result I got:

<img src="https://i.stack.imgur.com/xS3gF.png" alt="enter image description here">

<hr>

Here it worked with 100% accuracy. I assume this is because all the digits are of the same kind and the same size.
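The post does not show how the 100% figure was computed. If a hand-labeled list of the test digits is available, one way to measure it is the small sketch below; the function name `accuracy` and the example labels are hypothetical.

```python
def accuracy(predicted, expected):
    """Fraction of predictions that match the expected labels."""
    if len(predicted) != len(expected):
        raise ValueError("label lists must have the same length")
    if not expected:
        raise ValueError("no labels to compare")
    correct = sum(1 for p, e in zip(predicted, expected) if p == e)
    return correct / float(len(expected))


# Hypothetical predictions against hand-entered ground truth.
print(accuracy([3, 1, 4, 1], [3, 1, 4, 2]))
```

Measuring on digits that were not part of the training image gives a more honest number than re-testing on the training data itself.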

But anyway, this is a good starting point for beginners (I hope so).


For those who are interested in C++ code, refer to the code below.
Thanks to Abid Rahman for the nice explanation.

<hr>

The procedure is the same as above, but the contour finding uses only first-hierarchy-level contours, so that the algorithm uses only the outer contour of each digit.
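The idea of visiting only first-level contours can be sketched in Python by following the "next sibling" links of the hierarchy. In OpenCV each hierarchy entry has the layout [next, previous, first_child, parent], with -1 meaning "none". The function name and the hand-written hierarchy below are hypothetical, and the sketch assumes, like the C++ loop, that contour 0 is on the first level.

```python
def top_level_contours(hierarchy):
    """Yield indices of first-level contours by following next-sibling links.

    Each entry is [next, previous, first_child, parent]; -1 means "none".
    """
    if not hierarchy:
        return
    i = 0
    while i >= 0:
        yield i
        i = hierarchy[i][0]   # jump to the next contour on the same level


# Hypothetical hierarchy: contours 0 and 2 are outer, 1 is a hole inside 0
# (for example the inside of a '0'), and 3 is a hole inside 2.
hierarchy = [[2, -1, 1, -1],
             [-1, -1, -1, 0],
             [-1, 0, 3, -1],
             [-1, -1, -1, 2]]
print(list(top_level_contours(hierarchy)))
```

Skipping the holes this way means digits such as 0, 6, 8 and 9 produce one sample each instead of one per inner loop.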

<h2>Code for creating sample and Label data</h2>

<pre class="lang-cpp prettyprint-override"><code>//Process image to extract contour
Mat thr,gray,con;
Mat src=imread("digit.png",1);
cvtColor(src,gray,CV_BGR2GRAY);
threshold(gray,thr,200,255,THRESH_BINARY_INV); //Threshold to find contour
thr.copyTo(con);

// Create sample and label data
vector&lt; vector &lt;Point&gt; &gt; contours; // Vector for storing contour
vector&lt; Vec4i &gt; hierarchy;
Mat sample;
Mat response_array; 
findContours( con, contours, hierarchy,CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE ); //Find contour

// Iterate through first hierarchy level contours; hierarchy[i][0] is -1 after the last one
for( int i = 0; i &gt;= 0 &amp;&amp; i &lt; (int)contours.size(); i = hierarchy[i][0] )
{
    Rect r= boundingRect(contours[i]); //Find bounding rect for each contour
    rectangle(src,Point(r.x,r.y), Point(r.x+r.width,r.y+r.height), Scalar(0,0,255),2,8,0);
    Mat ROI = thr(r); //Crop the image
    Mat tmp1, tmp2;
    resize(ROI,tmp1, Size(10,10), 0,0,INTER_LINEAR ); //resize to 10X10
    tmp1.convertTo(tmp2,CV_32FC1); //convert to float
    sample.push_back(tmp2.reshape(1,1)); // Store sample data
    imshow("src",src);
    int c=waitKey(0); // Read corresponding label for contour from keyboard
    c-=0x30; // Convert ASCII to integer value
    response_array.push_back(c); // Store label to a mat
    rectangle(src,Point(r.x,r.y), Point(r.x+r.width,r.y+r.height), Scalar(0,255,0),2,8,0);
}

// Store the data to file
Mat response,tmp;
tmp=response_array.reshape(1,1); //make continuous
tmp.convertTo(response,CV_32FC1); // Convert to float

FileStorage Data("TrainingData.yml",FileStorage::WRITE); // Store the sample data in a file
Data &lt;&lt; "data" &lt;&lt; sample;
Data.release();

FileStorage Label("LabelData.yml",FileStorage::WRITE); // Store the label data in a file
Label &lt;&lt; "label" &lt;&lt; response;
Label.release();
cout&lt;&lt;"Training and Label data created successfully....!! "&lt;&lt;endl;

imshow("src",src);
waitKey();
</code></pre>

<h2>Code for training and testing</h2>

<pre class="lang-cpp prettyprint-override"><code>Mat thr,gray,con;
Mat src=imread("dig.png",1);
cvtColor(src,gray,CV_BGR2GRAY);
threshold(gray,thr,200,255,THRESH_BINARY_INV); // Threshold to create input
thr.copyTo(con);


// Read stored sample and label for training
Mat sample;
Mat response,tmp;
FileStorage Data("TrainingData.yml",FileStorage::READ); // Read training data to a Mat
Data["data"] &gt;&gt; sample;
Data.release();

FileStorage Label("LabelData.yml",FileStorage::READ); // Read label data to a Mat
Label["label"] &gt;&gt; response;
Label.release();


KNearest knn;
knn.train(sample,response); // Train with sample and responses
cout&lt;&lt;"Training completed.....!!"&lt;&lt;endl;

vector&lt; vector &lt;Point&gt; &gt; contours; // Vector for storing contour
vector&lt; Vec4i &gt; hierarchy;

//Create input sample by contour finding and cropping
findContours( con, contours, hierarchy,CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE );
Mat dst(src.rows,src.cols,CV_8UC3,Scalar::all(0));

// Iterate through each contour on the first hierarchy level; hierarchy[i][0] is -1 after the last one
for( int i = 0; i &gt;= 0 &amp;&amp; i &lt; (int)contours.size(); i = hierarchy[i][0] )
{
    Rect r= boundingRect(contours[i]);
    Mat ROI = thr(r);
    Mat tmp1, tmp2;
    resize(ROI,tmp1, Size(10,10), 0,0,INTER_LINEAR );
    tmp1.convertTo(tmp2,CV_32FC1);
    float p=knn.find_nearest(tmp2.reshape(1,1), 1); // Label of the nearest training sample
    char name[4];
    sprintf(name,"%d",(int)p);
    putText( dst,name,Point(r.x,r.y+r.height) ,0,1, Scalar(0, 255, 0), 2, 8 );
}

imshow("src",src);
imshow("dst",dst);
imwrite("dest.jpg",dst);
waitKey();
</code></pre>

<h2>Result</h2>

In the result, the dot in the first line is detected as 8, because we haven't trained for the dot. Also, I am treating every contour in the first hierarchy level as a sample input; the user can avoid this by filtering on the contour area.

<img src="https://i.stack.imgur.com/Hm0B8.jpg" alt="Results">
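A size filter of the kind suggested for skipping the dot might look like the sketch below. It works on (x, y, w, h) bounding boxes and uses bounding-box area as a stand-in for contour area; the function name is hypothetical and the thresholds are borrowed from the Python answer (area above 50, height above 28), so they would need tuning for other images.

```python
def keep_digit_boxes(boxes, min_area=50, min_height=28):
    """Keep (x, y, w, h) boxes large enough to be a digit."""
    return [(x, y, w, h) for (x, y, w, h) in boxes
            if w * h > min_area and h > min_height]


# Hypothetical boxes: two digits and one small dot.
boxes = [(10, 5, 20, 32), (40, 30, 4, 4), (60, 5, 18, 31)]
print(keep_digit_boxes(boxes))
```

Applying the same filter during both labeling and testing keeps the classifier from ever being asked about shapes it was not trained on.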


I had some problems generating the training data, because it was sometimes hard to identify the last selected letter, so I rotated the image by 1.5 degrees. Now each character is selected in order, and the test still shows a 100% accuracy rate after training. Here is the code:
<pre><code>import numpy as np
import cv2

def rotate_image(image, angle):
    image_center = tuple(np.array(image.shape[1::-1]) / 2)
    rot_mat = cv2.getRotationMatrix2D(image_center, angle, 1.0)
    result = cv2.warpAffine(image, rot_mat, image.shape[1::-1], flags=cv2.INTER_LINEAR)
    return result

img = cv2.imread('training_image.png')
cv2.imshow('orig image', img)
whiteBorder = [255,255,255]
# extend the image border
image1 = cv2.copyMakeBorder(img, 80, 80, 80, 80, cv2.BORDER_CONSTANT, None, whiteBorder)
# rotate the image 1.5 degrees clockwise for ease of data entry
image_rot = rotate_image(image1, -1.5)
#crop_img = image_rot[y:y+h, x:x+w]
cropped = image_rot[70:350, 70:710]
cv2.imwrite('rotated.png', cropped)
cv2.imshow('rotated image', cropped)
cv2.waitKey(0)
</code></pre>
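For readers curious what the rotation matrix contains, the sketch below builds the 2x3 affine matrix in plain Python using the formula given in the OpenCV documentation for `getRotationMatrix2D`. The helper names and the example center are hypothetical; positive angles rotate counter-clockwise on screen, so -1.5 is a slight clockwise turn.

```python
import math


def rotation_matrix_2d(center, angle_degrees, scale=1.0):
    """2x3 affine matrix for a rotation about `center` in image coordinates."""
    cx, cy = center
    alpha = scale * math.cos(math.radians(angle_degrees))
    beta = scale * math.sin(math.radians(angle_degrees))
    return [[alpha, beta, (1 - alpha) * cx - beta * cy],
            [-beta, alpha, beta * cx + (1 - alpha) * cy]]


def apply_affine(matrix, point):
    """Map an (x, y) point through a 2x3 affine matrix."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


m = rotation_matrix_2d((400, 220), -1.5)
print(apply_affine(m, (400, 220)))   # the center stays fixed, up to rounding
```

Because the center is a fixed point of the transform, padding the image with a white border first, as the script does, keeps the rotated corners from being cut off.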
For sample data, I made some changes to the script, like this:
<pre><code>import sys
import numpy as np
import cv2

def sort_contours(contours, x_axis_sort='LEFT_TO_RIGHT', y_axis_sort='TOP_TO_BOTTOM'):
    # initialize the reverse flag
    x_reverse = False
    y_reverse = False
    if x_axis_sort == 'RIGHT_TO_LEFT':
        x_reverse = True
    if y_axis_sort == 'BOTTOM_TO_TOP':
        y_reverse = True

    boundingBoxes = [cv2.boundingRect(c) for c in contours]

    # sorting on x-axis
    sortedByX = zip(*sorted(zip(contours, boundingBoxes),
                            key=lambda b:b[1][0], reverse=x_reverse))

    # sorting on y-axis
    (contours, boundingBoxes) = zip(*sorted(zip(*sortedByX),
                                            key=lambda b:b[1][1], reverse=y_reverse))
    # return the list of sorted contours and bounding boxes
    return (contours, boundingBoxes)

im = cv2.imread('rotated.png')
im3 = im.copy()

gray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray,(5,5),0)
thresh = cv2.adaptiveThreshold(blur,255,1,1,11,2)

contours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
contours, boundingBoxes = sort_contours(contours, x_axis_sort='LEFT_TO_RIGHT', y_axis_sort='TOP_TO_BOTTOM')

samples = np.empty((0,100))
responses = []
keys = [i for i in range(48,58)]

for cnt in contours:
    if cv2.contourArea(cnt)&gt;50:
        [x,y,w,h] = cv2.boundingRect(cnt)

        if h&gt;28 and h &lt; 40:
            cv2.rectangle(im,(x,y),(x+w,y+h),(0,0,255),2)
            roi = thresh[y:y+h,x:x+w]
            roismall = cv2.resize(roi,(10,10))
            cv2.imshow('norm',im)
            key = cv2.waitKey(0)

            if key == 27: # (escape to quit)
                sys.exit()
            elif key in keys:
                responses.append(int(chr(key)))
                sample = roismall.reshape((1,100))
                samples = np.append(samples,sample,0)

responses = np.array(responses,np.ubyte)
responses = responses.reshape((responses.size,1))
print(&quot;training complete&quot;)

np.savetxt('generalsamples.data',samples,fmt='%i')
np.savetxt('generalresponses.data',responses,fmt='%i')
</code></pre>
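An alternative way to get a stable reading order, without rotating the image, is to group boxes into text rows with a tolerance and then sort each row from left to right. This is not the approach used in the script above; it is a hedged sketch with a hypothetical function name, and the `row_tolerance` value is an assumption that would need tuning.

```python
def sort_reading_order(boxes, row_tolerance=10):
    """Sort (x, y, w, h) boxes top-to-bottom, then left-to-right within a row.

    Boxes whose top edges differ by at most `row_tolerance` pixels from the
    first box of the current row are treated as the same text row.
    """
    rows = []
    for box in sorted(boxes, key=lambda b: b[1]):      # by top edge
        if rows and abs(box[1] - rows[-1][0][1]) <= row_tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])
    ordered = []
    for row in rows:
        ordered.extend(sorted(row, key=lambda b: b[0]))  # by left edge
    return ordered


# Hypothetical boxes: two rows whose top edges jitter by a few pixels.
boxes = [(50, 12, 20, 30), (10, 10, 20, 30), (30, 58, 20, 30), (10, 60, 20, 30)]
print(sort_reading_order(boxes))
```

The tolerance absorbs small differences in glyph height and baseline, so digits on one printed line stay together even when their top edges are not perfectly aligned.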

I am trying to implement a "Digit Recognition OCR" in OpenCV-Python (cv2). It is just for learning purposes. I would like to learn both KNearest and SVM features in OpenCV. 

I have 100 samples (i.e. images) of each digit. I would like to train with them.

There is a sample <code>letter_recog.py</code> that comes with the OpenCV samples. But I still couldn't figure out how to use it. I don't understand what the samples, responses, etc. are. Also, it loads a txt file at the start, which I didn't understand at first.

Later, after searching a little, I found a letter_recognition.data in the cpp samples. I used it and wrote some code for cv2.KNearest modeled on letter_recog.py (just for testing):

<pre><code>import numpy as np
import cv2

fn = 'letter-recognition.data'
a = np.loadtxt(fn, np.float32, delimiter=',', converters={ 0 : lambda ch : ord(ch)-ord('A') })
samples, responses = a[:,1:], a[:,0]

model = cv2.KNearest()
retval = model.train(samples,responses)
retval, results, neigh_resp, dists = model.find_nearest(samples, k = 10)
print(results.ravel())
</code></pre>

It gave me an array of size 20000, and I don't understand what it is.

Questions:

1) What is the letter_recognition.data file? How do I build that file from my own data set?

2) What does <code>results.ravel()</code> denote?

3) How can we write a simple digit recognition tool using the letter_recognition.data file (either KNearest or SVM)?

Simple Digit Recognition OCR in OpenCV-Python
