
Photo similarity check: appending loop results to a DataFrame

Stack Overflow user
Asked 2021-07-17 07:56:03
Answers: 1, Views: 33, Followers: 0, Votes: 1

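I have two folders of photos. The second folder is supposed to consist entirely of copies of photos in the first folder. My job is to confirm that the second folder really does consist entirely of copies.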

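My script takes one photo from folder 2 and compares it with every photo in folder 1. Each comparison produces a similarity value. If the similarity value is greater than 16 (indicating a positive match), a counter variable is incremented by 1. Once the photo from folder 2 has been checked against all photos in folder 1, the counter is checked. If it is still zero, the photo is added to a list. This part of the code works and I am happy with it.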
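The per-photo decision described above can be sketched in plain Python, leaving the image matching itself aside. The function name `classify_photo` and the sample scores are hypothetical; only the threshold of 16 comes from the description.

```python
def classify_photo(scores, threshold=16):
    """Split one photo's similarity scores into a match flag and near matches.

    scores: dict mapping a candidate path to its similarity value
            (hypothetical input shape, not taken from the original script).
    Returns (has_match, near_matches), where near_matches holds the
    (path, score) pairs with 0 < score <= threshold.
    """
    has_match = any(score > threshold for score in scores.values())
    near_matches = [(path, score) for path, score in scores.items()
                    if 0 < score <= threshold]
    return has_match, near_matches


# Illustrative values only.
has_match, near = classify_photo({"a.jpg": 0, "b.jpg": 11, "c.jpg": 3})
print(has_match, near)
```

A photo with `has_match == False` is one that needs the manual check, and `near` lists the candidates worth looking at.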

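The problem is that I also want to collect the near matches from folder one (i.e. the photos that have a similarity ranking from 1 to 16 with the photo from folder 2), so that I can check those photos manually. I also want these results in DataFrame format so they can easily be rendered into a visual HTML page. Here is the end result I want: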

import pandas as pd

# Raw strings keep the backslashes in the Windows paths literal.
data = {'Photo': [r'C:\Lucy Maud in Garden.jpg', r'C:\Henry by car.jpg', r'C:\Lucy and Henry arms together.jpg', r'C:\Lucy Maud with dog.jpg'],
        'NearMatch': [r'C:\Lucy Maud in Garden2.jpg', r'C:\Henry by car2.jpg', r'C:\Lucy and Henry arms together2.jpg', r'C:\Lucy Maud with dog2.jpg'],
        'Similarity': [1, 2, 1, 11]}

df = pd.DataFrame(data, columns=['Photo', 'NearMatch', 'Similarity'])
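
Since the goal is an HTML page, here is a minimal sketch of that last step, assuming the table has the three columns shown above. `DataFrame.to_html` returns the table markup as a string; the single sample row is illustrative.

```python
import pandas as pd

# One illustrative row in the desired layout.
df = pd.DataFrame({
    "Photo": [r"C:\Henry by car.jpg"],
    "NearMatch": [r"C:\Henry by car2.jpg"],
    "Similarity": [2],
})

# index=False drops the numeric row index from the rendered table.
html = df.to_html(index=False)
print(html)
```

The resulting string can be written to a file or embedded in a larger page template.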

Here is my code:

from __future__ import division

import cv2
import numpy as np
import glob
import pandas as pd

    # Sift and Flann
sift = cv2.SIFT_create()


index_params = dict(algorithm=0, trees=5)
search_params = dict()
flann = cv2.FlannBasedMatcher(index_params, search_params)

#prep the empty lists

countInner = 0
countOuter = 1
countNoMatch = 0
nearMatch = []
nearMatch2 = []
listOfSimilarities = []
listOfDisimilarities = []

# Load all the images

folder1 = r"C:/ProbablyDups/**"
folder2 = r"C:/DefinitiveCopy/**"


extensionsOnly = ('.jpeg','.jpg','.png','.tif','.tiff','.gif')

siftOut1 = {}
for a in glob.iglob(folder1,recursive=True):
    if not a.lower().endswith(extensionsOnly):
        continue
    image1 = cv2.imread(a)
    kp_1, desc_1 = sift.detectAndCompute(image1, None)
    siftOut1[a]=(kp_1,desc_1)

siftOut2 = {}
for a in glob.iglob(folder2,recursive=True):
    if not a.lower().endswith(extensionsOnly):
        continue
    image1 = cv2.imread(a)
    kp_1, desc_1 = sift.detectAndCompute(image1, None)
    siftOut2[a]=(kp_1,desc_1)

#Compare photos in loops
for a in glob.iglob(folder1,recursive=True):
    if not a.lower().endswith(extensionsOnly):
        continue

    (kp_1,desc_1) = siftOut1[a]

    for b in glob.iglob(folder2,recursive=True):


        if not b.lower().endswith(extensionsOnly):

            continue

        if b.lower().endswith(extensionsOnly):

            countInner += 1


        (kp_2,desc_2) = siftOut2[b]

        matches = flann.knnMatch(desc_1, desc_2, k=2)

        good_points = []

        for m, n in matches:
            if m.distance < 0.6*n.distance:
                good_points.append(m)

        number_keypoints = 0
        if len(kp_1) >= len(kp_2):
            number_keypoints = len(kp_1)
        else:
            number_keypoints = len(kp_2)

        percentage_similarity = int(float(len(good_points)) / number_keypoints * 100)
        # add a tick to the counter if there is positive match
        if percentage_similarity > 16:
            countNoMatch =+1
        #part that is not working:
        if percentage_similarity < 16 and percentage_similarity > 0:
            nearMatch.append(a)
            nearMatch2.append(b)
            listOfSimilarities.append(percentage_similarity)
    
    if countNoMatch == 0:
        listOfDisimilarities.append(a)
        df2=pd.DataFrame({"NoMatch":listOfDisimilarities})
        zippedList =  list(zip(nearMatch,nearMatch2, listOfSimilarities))
        print(zippedList)
        nearMatch = []
        nearMatch2 = []
        final_df = pd.concat(zippedList, ignore_index=True)
    
    countNoMatch = 0
    if a.lower().endswith(extensionsOnly):
        countOuter += 1
print(final_df)

df.to_csv(r"C:/Documents/NearMatch.csv")

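What I have tried:

I tried adding a new check inside the loop that asks, at the point of comparison: is this similarity ranking between 1 and 16? If so, append to the list nearMatch2. Then, when the loop finishes, the code asks a new question: is the counter (which indicates that there was no positive match greater than 16) zero? If so, zip the following lists together: nearMatch2, nearMatch and listOfSimilarities (which holds the ranking numbers).

The problem is that when everything is done I get the data as a list of tuples, but I don't know how to turn it into a DataFrame. I have tried append, assign, loc and iloc, and concat, but none of them work. With concat, the error I get is: Error: cannot concatenate object of type '<class 'tuple'>'; only Series and DataFrame objs are valid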
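
The error arises because `pd.concat` only joins existing Series and DataFrame objects. A list of tuples can instead be passed straight to the `pd.DataFrame` constructor. A minimal sketch, with made-up paths and scores standing in for the zipped list:

```python
import pandas as pd

# Illustrative stand-in for list(zip(nearMatch, nearMatch2, listOfSimilarities)).
zipped_list = [
    ("folder1/a.jpg", "folder2/a_copy.jpg", 11),
    ("folder1/b.jpg", "folder2/b_copy.jpg", 2),
]

# Each tuple becomes one row; columns names the tuple positions.
final_df = pd.DataFrame(zipped_list, columns=["Photo", "NearMatch", "Similarity"])
print(final_df)
```

Building the DataFrame once, after the loops have finished, also avoids re-creating it on every iteration.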

1 Answer

Stack Overflow user

Answered 2021-07-18 01:57:24

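Got it working: I found something called extend, which adds the items of one list onto another. It is still not entirely elegant, though, so other solutions are welcome.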

from __future__ import division

import cv2
import numpy as np
import glob
import pandas as pd



    # Sift and Flann
sift = cv2.SIFT_create()


index_params = dict(algorithm=0, trees=5)
search_params = dict()
flann = cv2.FlannBasedMatcher(index_params, search_params)

# Load all the images1

countInner = 0
countOuter = 1
countNoMatch = 0
nearMatch = []
nearMatch2 = []
listOfSimilarities = []
listOfDisimilarities = []  # appended to below, so it must be initialised here
nearMatchAgg = []
nearMatch2Agg = []
listOfSimilaritiesAgg = []


folder1 = r"/media/folderTwo/**"
folder2 = r"/media/folderOne/**"


extensionsOnly = ('.jpeg','.jpg','.png','.tif','.tiff','.gif')

siftOut1 = {}
for a in glob.iglob(folder1,recursive=True):
    if not a.lower().endswith(extensionsOnly):
        continue
    image1 = cv2.imread(a)
    kp_1, desc_1 = sift.detectAndCompute(image1, None)
    siftOut1[a]=(kp_1,desc_1)

siftOut2 = {}
for a in glob.iglob(folder2,recursive=True):
    if not a.lower().endswith(extensionsOnly):
        continue
    image1 = cv2.imread(a)
    kp_1, desc_1 = sift.detectAndCompute(image1, None)
    siftOut2[a]=(kp_1,desc_1)


for a in glob.iglob(folder1,recursive=True):
    if not a.lower().endswith(extensionsOnly):
        continue

    (kp_1,desc_1) = siftOut1[a]

    for b in glob.iglob(folder2,recursive=True):


        if not b.lower().endswith(extensionsOnly):

            continue

        if b.lower().endswith(extensionsOnly):

            countInner += 1

        # print(countInner, "", countOuter, "", countNoMatch)

        # you don't need this when you are comparing two folders
        # if countInner <= countOuter:

        #     continue


        (kp_2,desc_2) = siftOut2[b]

        matches = flann.knnMatch(desc_1, desc_2, k=2)

        good_points = []

        for m, n in matches:
            if m.distance < 0.6*n.distance:
                good_points.append(m)

        number_keypoints = 0
        if len(kp_1) >= len(kp_2):
            number_keypoints = len(kp_1)
        else:
            number_keypoints = len(kp_2)

        percentage_similarity = int(float(len(good_points)) / number_keypoints * 100)
        # print(percentage_similarity)
        if percentage_similarity > 16:
            countNoMatch += 1
        # near match: scores from 1 to 16 inclusive
        if 0 < percentage_similarity <= 16:
            nearMatch.append(a)
            nearMatch2.append(b)
            listOfSimilarities.append(percentage_similarity)
    
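    if countNoMatch == 0:
        listOfDisimilarities.append(a)
        df2=pd.DataFrame({"NoMatch":listOfDisimilarities})
        nearMatchAgg.extend(nearMatch)
        nearMatch2Agg.extend(nearMatch2)
        listOfSimilaritiesAgg.extend(listOfSimilarities)
    # reset the per-photo lists on every iteration, so near matches of an
    # already-matched photo do not leak into the next photo's results
    nearMatch = []
    nearMatch2 = []
    listOfSimilarities=[]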
    
    zippedList = list(zip(nearMatchAgg,nearMatch2Agg, listOfSimilaritiesAgg))
    
    countNoMatch = 0
    if a.lower().endswith(extensionsOnly):
        countOuter += 1
dfObj = pd.DataFrame(zippedList, columns = ['Original', 'Title' , 'Similarity'])

dfObj.to_csv(r"C:/Documents/PhotoResults.csv")
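
One possible tidier arrangement, offered as a sketch rather than a drop-in replacement: collect each near match as one row in a single list and build the DataFrame once at the end. The function `collect_near_matches`, its `similarity` callback and the fake scores are all hypothetical; in the script above, the callback would wrap the SIFT/FLANN comparison. The column names follow the answer's code.

```python
import pandas as pd


def collect_near_matches(photos, candidates, similarity, threshold=16):
    """Return (near_match_df, unmatched) for photos with no positive match.

    similarity: callable (photo, candidate) -> score; a hypothetical
    stand-in for the SIFT/FLANN percentage computed in the script.
    """
    rows = []
    unmatched = []
    for photo in photos:
        scores = [(cand, similarity(photo, cand)) for cand in candidates]
        if any(score > threshold for _, score in scores):
            continue  # confirmed copy, nothing to review
        unmatched.append(photo)
        rows.extend(
            {"Original": photo, "Title": cand, "Similarity": score}
            for cand, score in scores
            if 0 < score <= threshold
        )
    df = pd.DataFrame(rows, columns=["Original", "Title", "Similarity"])
    return df, unmatched


# Illustrative scores only.
fake_scores = {("p1.jpg", "c1.jpg"): 40, ("p1.jpg", "c2.jpg"): 5,
               ("p2.jpg", "c1.jpg"): 0, ("p2.jpg", "c2.jpg"): 9}
near_df, unmatched = collect_near_matches(
    ["p1.jpg", "p2.jpg"], ["c1.jpg", "c2.jpg"],
    lambda a, b: fake_scores[(a, b)])
print(near_df)
print(unmatched)
```

Keeping one list of row dictionaries means there are no parallel lists to keep in sync or to reset between photos.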
Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/68416339
