
A More Stable Gesture Recognition Method: Hand Skeleton and Keypoint Detection

Author: Color Space
Published: 2021-03-10
From the column: OpenCV与AI深度学习

Overview

This post introduces and demonstrates how to extract the hand skeleton and keypoints with MediaPipe, and how to build gesture recognition on top of that output.

Introduction

MediaPipe was introduced in an earlier article; see:

Google Open-Sources Gesture Recognition, Based on TF Lite/MediaPipe

What can it do, and which languages and platforms does it support? See the two figures below:

This post focuses on hand skeleton and keypoint extraction; other features are left for interested readers to explore on their own. GitHub: https://github.com/google/mediapipe

Demo Results

Hand skeleton extraction and keypoint annotation:

Gesture recognition for 0-6:

Implementation Steps

For details, refer to:

https://google.github.io/mediapipe/solutions/hands

(1) Install MediaPipe: run pip install mediapipe

(2) Download the hand detection and hand landmark models from:

https://github.com/google/mediapipe/tree/master/mediapipe/modules/hand_landmark

(3) Test code (real-time webcam test):

Code language: python

import cv2
import mediapipe as mp

mp_drawing = mp.solutions.drawing_utils
mp_hands = mp.solutions.hands

hands = mp_hands.Hands(
    min_detection_confidence=0.5, min_tracking_confidence=0.5)
cap = cv2.VideoCapture(0)
while cap.isOpened():
  success, image = cap.read()
  if not success:
    print("Ignoring empty camera frame.")
    continue

  # Flip for a selfie view and convert BGR to RGB for MediaPipe.
  image = cv2.cvtColor(cv2.flip(image, 1), cv2.COLOR_BGR2RGB)
  image.flags.writeable = False
  results = hands.process(image)

  image.flags.writeable = True
  image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
  if results.multi_hand_landmarks:
    for hand_landmarks in results.multi_hand_landmarks:
      mp_drawing.draw_landmarks(
          image, hand_landmarks, mp_hands.HAND_CONNECTIONS)
  cv2.imshow('result', image)
  if cv2.waitKey(5) & 0xFF == 27:  # press Esc to quit
    break
cv2.destroyAllWindows()
hands.close()
cap.release()

Output and results:

Image detection (multiple hands supported):

Code language: python

import cv2
import mediapipe as mp
from os import listdir

mp_drawing = mp.solutions.drawing_utils
mp_hands = mp.solutions.hands

# For static images:
hands = mp_hands.Hands(
    static_image_mode=True,
    max_num_hands=5,
    min_detection_confidence=0.2)
img_path = './multi_hands/'
save_path = './'
index = 0
file_list = listdir(img_path)
for filename in file_list:
  index += 1
  file_path = img_path + filename
  # Read an image and flip it around the y-axis for correct handedness
  # output.
  image = cv2.flip(cv2.imread(file_path), 1)
  # Convert the BGR image to RGB before processing.
  results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

  # Print handedness and draw hand landmarks on the image.
  print('Handedness:', results.multi_handedness)
  if not results.multi_hand_landmarks:
    continue
  image_height, image_width, _ = image.shape
  annotated_image = image.copy()
  for hand_landmarks in results.multi_hand_landmarks:
    print('hand_landmarks:', hand_landmarks)
    print(
        f'Index finger tip coordinates: (',
        f'{hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP].x * image_width}, '
        f'{hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP].y * image_height})'
    )
    mp_drawing.draw_landmarks(
        annotated_image, hand_landmarks, mp_hands.HAND_CONNECTIONS)
  cv2.imwrite(
      save_path + str(index) + '.png', cv2.flip(annotated_image, 1))
hands.close()

Summary and Follow-up Notes

Summary: MediaPipe's hand detection and skeleton extraction are more stable than traditional methods, and the model also provides 3D coordinates for the finger joints, which is a great help for gesture recognition and further gesture-driven development.
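As a small illustration of that 3D output, the snippet below prints the wrist coordinates (a minimal sketch, assuming results comes from hands.process() as in the webcam example above):

Code language: python

# Each landmark carries normalized x, y plus a relative depth z.
# Assumes `results` and `mp_hands` from the webcam example above.
if results.multi_hand_landmarks:
    wrist = results.multi_hand_landmarks[0].landmark[mp_hands.HandLandmark.WRIST]
    print(wrist.x, wrist.y, wrist.z)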

Additional notes:

(1) The hand landmark numbering and ordering are defined as in the figure below (21 landmarks per hand, indexed 0-20; 0 is the wrist, and 4, 8, 12, 16, 20 are the thumb, index, middle, ring, and pinky fingertips):
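The same ordering can also be printed directly from the HandLandmark enum in the standard MediaPipe Python API; a minimal sketch:

Code language: python

import mediapipe as mp

mp_hands = mp.solutions.hands

# Print the 21 landmark indices and names in their defined order:
# 0 = WRIST, 4 = THUMB_TIP, 8 = INDEX_FINGER_TIP, 12 = MIDDLE_FINGER_TIP,
# 16 = RING_FINGER_TIP, 20 = PINKY_TIP.
for landmark in mp_hands.HandLandmark:
    print(landmark.value, landmark.name)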

(2) The landmark coordinates (x, y, z) are output as normalized values less than 1 and must be converted to pixel coordinates before they can be drawn on the image; you can trace the drawing call in the source above to its definition to see how this is done. A demo helper is given below. Also note that the z coordinate increases as the hand moves toward the screen and decreases as it moves away.

Code language: python

def Normalize_landmarks(image, hand_landmarks):
  new_landmarks = []
  for i in range(0, len(hand_landmarks.landmark)):
    float_x = hand_landmarks.landmark[i].x
    float_y = hand_landmarks.landmark[i].y
    # The z coordinate increases as the hand approaches the screen and
    # decreases as it moves away.
    float_z = hand_landmarks.landmark[i].z
    print(float_z)
    width = image.shape[1]
    height = image.shape[0]
    # Returns a pixel (x, y) tuple, or None if the point lies outside the
    # image.
    pt = mp_drawing._normalized_to_pixel_coordinates(float_x, float_y,
                                                     width, height)
    new_landmarks.append(pt)
  return new_landmarks
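For reference, here is a minimal usage sketch of the helper above (illustrative code of my own, assuming image and hand_landmarks come from the detection loop shown earlier; _normalized_to_pixel_coordinates returns None for points outside the image, so those are skipped):

Code language: python

# Hypothetical usage inside the webcam loop shown earlier:
landmarks_px = Normalize_landmarks(image, hand_landmarks)
for i, pt in enumerate(landmarks_px):
    if pt is None:
        continue  # landmark fell outside the image bounds
    # Draw each keypoint and its index at the recovered pixel position.
    cv2.circle(image, pt, 3, (0, 255, 0), -1)
    cv2.putText(image, str(i), pt, cv2.FONT_HERSHEY_SIMPLEX,
                0.4, (0, 0, 255), 1)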

(3) Building on this, you can write a simple gesture recognizer, or a small app that reacts to the hand moving toward or away from the screen. Of course, the landmark coordinates alone are not enough; you may also need to compute joint angles, track previous states, and so on. For example, see the sketch below:
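The post does not include the recognition logic behind the 0-6 demo, so the following is a loose illustration only: a minimal finger-counting heuristic of my own (not the author's code) that compares fingertip and middle-joint positions in normalized coordinates. A robust version would add joint angles and temporal smoothing, as noted above.

Code language: python

import mediapipe as mp

mp_hands = mp.solutions.hands

def count_fingers(hand_landmarks):
    # Rough 0-5 finger count from normalized landmarks; assumes an upright
    # hand facing the camera. A simplified heuristic, not a robust method.
    lm = hand_landmarks.landmark
    count = 0
    # Non-thumb fingers: counted as extended if the tip is above the PIP
    # joint (image y grows downward, so "above" means smaller y).
    finger_joints = [
        (mp_hands.HandLandmark.INDEX_FINGER_TIP,
         mp_hands.HandLandmark.INDEX_FINGER_PIP),
        (mp_hands.HandLandmark.MIDDLE_FINGER_TIP,
         mp_hands.HandLandmark.MIDDLE_FINGER_PIP),
        (mp_hands.HandLandmark.RING_FINGER_TIP,
         mp_hands.HandLandmark.RING_FINGER_PIP),
        (mp_hands.HandLandmark.PINKY_TIP,
         mp_hands.HandLandmark.PINKY_PIP),
    ]
    for tip, pip in finger_joints:
        if lm[tip].y < lm[pip].y:
            count += 1
    # Thumb (very crude): counted as extended if the tip lies farther from
    # the pinky base than the thumb IP joint does, measured along x.
    pinky_mcp_x = lm[mp_hands.HandLandmark.PINKY_MCP].x
    if (abs(lm[mp_hands.HandLandmark.THUMB_TIP].x - pinky_mcp_x) >
            abs(lm[mp_hands.HandLandmark.THUMB_IP].x - pinky_mcp_x)):
        count += 1
    return count

Calling count_fingers(hand_landmarks) inside the webcam loop and drawing the result with cv2.putText gives a rough 0-5 counter; distinguishing a "6" gesture as in the demo would need extra rules (for example, thumb and pinky extended while the other fingers are curled).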

Other demos and related code are published in the author's Knowledge Planet (知识星球) community; readers who need them can join to get access. A C++ version, if time permits, will also be published there later.

Originally published 2021-03-08. Shared from the WeChat public account OpenCV与AI深度学习 via the Tencent Cloud self-media syndication program.
