# Package posedetection_python_module

Pose Detection Module
## Overview

The Pose Detection model detects 17 2D keypoints for a person, which makes it possible to estimate and compare poses. Pose estimation enables interesting applications such as full-body gesture control, quantifying physical exercises, and AR/VR experiences that overlay digital content guided by the pose.
## Architecture

The model is based on MobileNetV2 as the backbone, followed by a simple encoder-decoder architecture that predicts a heatmap for each of the keypoints.

Since the model expects an image/frame containing a single person, it either has to be coupled with a person detector model or be fed video/frames containing a single person (a sketch of the detector-coupling pattern follows).
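When frames may contain multiple people, one common pattern is to run a person detector first, crop each detected box, run `detect_pose` on the crop, and shift the keypoints back into full-frame coordinates. A minimal sketch (the `person_boxes` list stands in for the output of any person detector, e.g. a YOLO variant; it is not part of this module):

```python
import cv2
import posedetection_python_module as pd

pd.load_model("./posedetection.bin")
frame = cv2.imread("./test.jpg")

# Hypothetical detector output: one (x, y, w, h) box per person.
person_boxes = [(40, 30, 200, 400)]

all_keypoints = []
for (x, y, w, h) in person_boxes:
    crop = frame[y:y + h, x:x + w]  # single-person crop, as the model expects
    kps = pd.detect_pose(crop)      # [17, 3] rows of [y, x, confidence]
    kps[:, 0] += y                  # shift y back to full-frame coordinates
    kps[:, 1] += x                  # shift x back to full-frame coordinates
    all_keypoints.append(kps)
```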
## Features

- The model runs at very high speed, so chaining it after a person detector model adds very little cost.
- Works well with videos that contain a single person in the frame.
- Highly portable, since it is a compiled Python module.
- Minimal dependencies (all dependencies are standard DLLs).
- The only expected Python dependencies are `numpy` and `opencv` (for reading/displaying images).
## Limitations

- No person tracker is built in, so the module is only useful when a single person is present in the video, ideally against a relatively stable background (a jitter-smoothing sketch follows).
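Because there is no built-in tracker, raw per-frame keypoints can jitter on video. A common workaround, sketched below under the assumption of a single person in frame, is simple exponential smoothing across frames (this helper is illustrative and not part of the module):

```python
import numpy as np

def smooth_keypoints(prev, current, alpha=0.5):
    # Exponentially smooth two [17, 3] keypoint arrays across frames.
    # alpha near 1.0 trusts the new frame more; near 0.0 smooths harder.
    # Illustrative helper only; not part of the module.
    if prev is None:
        return current
    smoothed = alpha * current + (1.0 - alpha) * prev
    smoothed[:, 2] = current[:, 2]  # keep the fresh confidence values
    return smoothed
```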
## Benchmarks (show me the stats!)

Hardware | OS | Time* (ms)
---|---|---
Intel i5-8300H CPU @ 2.30GHz | Windows | ~5 ms

- *Time measured in the Python runtime, averaged over 100 loops.
- *Time measured with inputs up to full-HD resolution.
- *(All resolutions are resized to the fixed input size specified by the model before further processing.)*
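For a comparable measurement on other hardware, the stated methodology (Python runtime, averaged over 100 loops) can be approximated roughly as follows (file paths are placeholders):

```python
import time
import cv2
import posedetection_python_module as pd

pd.load_model("./posedetection.bin")
frame = cv2.imread("./test.jpg")  # any input up to full-HD; resized internally

pd.detect_pose(frame)             # warm-up call, excluded from timing
n_loops = 100
start = time.perf_counter()
for _ in range(n_loops):
    pd.detect_pose(frame)
elapsed_ms = (time.perf_counter() - start) * 1000.0 / n_loops
print(f"average inference time: {elapsed_ms:.2f} ms")
```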
## Usage

```python
# Python dependencies
import numpy as np
import cv2
import posedetection_python_module as pd

# Keypoint index pairs defining the skeleton edges to draw
joint_pairs = [[0, 1], [1, 3], [0, 2], [2, 4],
               [5, 6], [5, 7], [7, 9], [6, 8],
               [8, 10], [5, 11], [6, 12], [11, 12],
               [11, 13], [12, 14], [13, 15], [14, 16]]

frame = cv2.imread("./test.jpg")      # read the frame/image [BGR UINT8 data]

pd.load_model("./posedetection.bin")  # initialize the model and load weights
keypoints = pd.detect_pose(frame)     # [17, 3] array of [y, x, confidence] rows

# Plot keypoints on the frame
keypoint_threshold = 0.2
for pair in joint_pairs:
    y1_k = int(keypoints[pair[0], 0])
    x1_k = int(keypoints[pair[0], 1])
    confidence_1 = keypoints[pair[0], 2]
    y2_k = int(keypoints[pair[1], 0])
    x2_k = int(keypoints[pair[1], 1])
    confidence_2 = keypoints[pair[1], 2]
    if confidence_1 > keypoint_threshold and confidence_2 > keypoint_threshold:
        cv2.line(frame, (x1_k, y1_k), (x2_k, y2_k), (255, 255, 0), 2)
        cv2.circle(frame, (x1_k, y1_k), 1, (0, 255, 255), 2)
        cv2.circle(frame, (x2_k, y2_k), 1, (0, 255, 255), 2)

cv2.imshow("pose", frame)
cv2.waitKey(0)
```
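The same pattern extends to single-person video. A sketch using `cv2.VideoCapture` (the video path is a placeholder):

```python
import cv2
import posedetection_python_module as pd

pd.load_model("./posedetection.bin")
cap = cv2.VideoCapture("./single_person.mp4")  # placeholder path

while True:
    ok, frame = cap.read()
    if not ok:
        break                                  # end of stream
    keypoints = pd.detect_pose(frame)          # [17, 3] rows of [y, x, confidence]
    for y, x, confidence in keypoints:
        if confidence > 0.2:
            cv2.circle(frame, (int(x), int(y)), 2, (0, 255, 255), -1)
    cv2.imshow("pose", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):      # press q to quit
        break

cap.release()
cv2.destroyAllWindows()
```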
## Resources
- Pose-Estimation and tracking: https://arxiv.org/abs/1804.06208
- MobileNet: https://arxiv.org/abs/1704.04861
- YOLO: https://arxiv.org/abs/1506.02640
## How to install

Once the SDK has been downloaded into a local directory, follow these steps:

- `cd` into the directory, i.e. to the root of the directory.
- Make sure all the requirements have been fulfilled as stated in the requirements.
- On **Windows**, run the following command at the root of the directory: `pip install .`
- On **Linux**, run the following command at the root of the directory: `pip install .`
## Functions

### `def detect_pose(frame, in_center=True)`

Detect keypoints in a given frame to estimate the pose of a person. Expects UINT8 data in BGR format, as typically produced by `cv2.imread` or `cv2.VideoCapture()`-based sources.

Inputs:

- `frame`: [Height, Width, 3] numpy array containing UINT8 data in BGR format (it may also work with RGB data, but BGR is expected).
- `in_center`: boolean. If True, assumes the person is relatively in the center of the frame. Default: True.

Returns:

- `keypoints`: numpy array of FLOAT32 data with shape [17, 3]; each row has the format [y, x, confidence], with (y, x) being the coordinates on the frame/image passed.
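Note the row order: rows come back as `[y, x, confidence]`, while OpenCV drawing functions take `(x, y)` points. A small illustrative helper (not part of the module) for filtering by confidence and swapping to `(x, y)`:

```python
import numpy as np

def to_xy_points(keypoints, threshold=0.2):
    # keypoints: the [17, 3] FLOAT32 array from detect_pose, rows [y, x, confidence].
    # Returns integer (x, y) points for keypoints above the confidence threshold.
    confident = keypoints[keypoints[:, 2] > threshold]
    return confident[:, [1, 0]].astype(int)  # swap columns to (x, y) for OpenCV
```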
### `def load_model(weightsPath)`

Initialize the model and load weights from the path specified by the `weightsPath` argument.

Inputs:

- `weightsPath`: str, /path/to/weights; path to the file to load weights from, generally with a `.bin` extension.