Package posedetection_python_module

Pose Detection Module


The pose detection model detects 17 2D keypoints for a person, which makes it possible to estimate and compare poses. Pose estimation enables applications such as full-body gesture control, quantifying physical exercises, and AR/VR experiences that overlay digital content guided by the pose.

The architecture uses MobileNetV2 as the backbone, followed by a simple encoder-decoder that predicts a heatmap for each keypoint.
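To make "a heatmap for each keypoint" more concrete, here is a minimal sketch of one common way such heatmaps are turned into [y, x, confidence] rows: take the peak of each heatmap and rescale it to frame coordinates. This is an illustration only, not the module's internal code. The function name `decode_heatmaps` and the [H, W, K] heatmap layout are assumptions.

```python
import numpy as np

def decode_heatmaps(heatmaps, frame_height, frame_width):
    """Hypothetical helper (not part of the module).

    heatmaps: assumed layout [H, W, K], one 2D heatmap per keypoint.
    Returns a [K, 3] FLOAT32 array with rows [y, x, confidence].
    """
    hm_h, hm_w, num_keypoints = heatmaps.shape
    keypoints = np.zeros((num_keypoints, 3), dtype=np.float32)
    for k in range(num_keypoints):
        # Position of the strongest response in the k-th heatmap.
        flat_index = int(np.argmax(heatmaps[:, :, k]))
        row, col = divmod(flat_index, hm_w)
        # Rescale from heatmap coordinates to frame coordinates.
        keypoints[k, 0] = row * frame_height / hm_h
        keypoints[k, 1] = col * frame_width / hm_w
        # Use the peak value as the confidence score.
        keypoints[k, 2] = heatmaps[row, col, k]
    return keypoints
```

Real decoders often refine the peak to sub-pixel accuracy. That step is left out here to keep the idea visible.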

Because the model expects an image or frame containing a single person, it must either be coupled with a person detector model or be given images/video frames that contain only one person.
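Coupling with a person detector could look roughly like the sketch below: crop the detected person, run pose detection on the crop, then shift the keypoints back into full-frame coordinates. The helper `detect_pose_in_box` and the (x_min, y_min, x_max, y_max) box format are assumptions for illustration. `detect_fn` stands for any callable with the behaviour of `detect_pose`.

```python
import numpy as np

def detect_pose_in_box(frame, box, detect_fn):
    """Hypothetical helper (not part of the module).

    frame: [Height, Width, 3] image.
    box: (x_min, y_min, x_max, y_max) in pixels, from any person detector.
    detect_fn: callable returning rows of [y, x, confidence] for an image.
    """
    height, width = frame.shape[:2]
    # Clamp the box to the frame so slicing never goes out of bounds.
    x_min = max(0, int(box[0]))
    y_min = max(0, int(box[1]))
    x_max = min(width, int(box[2]))
    y_max = min(height, int(box[3]))

    # Copy the crop into contiguous memory; a compiled module may need this
    # (an assumption, not stated by the module's documentation).
    crop = np.ascontiguousarray(frame[y_min:y_max, x_min:x_max])

    keypoints = np.array(detect_fn(crop), dtype=np.float32)
    # Keypoints are relative to the crop; shift them back to the full frame.
    keypoints[:, 0] += y_min
    keypoints[:, 1] += x_min
    return keypoints
```

With the module loaded, a call might look like `detect_pose_in_box(frame, box, pd.detect_pose)`, where `box` comes from a detector of your choice.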


  • The model runs at very high speed, so running it after a person detector model adds very little cost.
  • Works well with videos that have a single person in the frame.
  • Highly portable, being a compiled Python module.
  • Minimal dependencies (all dependencies are standard DLLs).
  • The only expected Python dependencies are numpy and opencv (for reading and displaying images).


  • Limitation: no person tracker is built in, so the module is only useful when the video contains a single person against a relatively stable background.

Benchmarks (show me the stats!)

Architecture                   OS        Time* (ms)
Intel i5-8300H CPU @ 2.30GHz   Windows   ~5

  • *Time measured in the Python runtime, averaged over 100 loops.
  • *Time measured with inputs up to full-HD resolution.
  • All inputs are resized to the fixed input size specified by the model before further processing.
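A measurement of this kind can be reproduced with a small timing loop. The sketch below averages the wall-clock time of a callable over 100 loops. The helper name `average_runtime_ms` and the warm-up iterations are assumptions added for illustration, not part of the module or of the original benchmark.

```python
import time

def average_runtime_ms(fn, n_loops=100, n_warmup=5):
    """Hypothetical helper: average wall-clock time of fn() in milliseconds."""
    # Warm-up calls (an added assumption) keep one-off start-up costs
    # out of the measurement.
    for _ in range(n_warmup):
        fn()
    start = time.perf_counter()
    for _ in range(n_loops):
        fn()
    elapsed_seconds = time.perf_counter() - start
    return elapsed_seconds * 1000.0 / n_loops
```

With the model loaded and a frame read, `average_runtime_ms(lambda: pd.detect_pose(frame))` would give a figure comparable to the table above. Results depend on the hardware.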


#pairs of keypoint indices to connect when drawing the pose
joint_pairs = [[0, 1], [1, 3], [0, 2], [2, 4],
               [5, 6], [5, 7], [7, 9], [6, 8],
               [8, 10], [5, 11], [6, 12], [11, 12],
               [11, 13], [12, 14], [13, 15], [14, 16]]

#python dependencies
import numpy as np
import cv2

frame = cv2.imread("./test.jpg")            #read the frame/image,[BGR UINT8 data]

import posedetection_python_module as pd    
pd.load_model("./posedetection.bin")        #Initialize the model and load weights

keypoints = pd.detect_pose(frame)           #returns [17,3] array, each row [y,x,confidence]

#plot keypoints on the frame. 
keypoint_threshold = 0.2
for pair in joint_pairs:
    y1_k = int(keypoints[pair[0],0])
    x1_k = int(keypoints[pair[0],1])
    confidence_1 = keypoints[pair[0],2]

    y2_k = int(keypoints[pair[1],0])
    x2_k = int(keypoints[pair[1],1])
    confidence_2 = keypoints[pair[1],2]

    if  confidence_1 > keypoint_threshold and confidence_2 > keypoint_threshold:
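        #draw the connecting line, then a small circle at each of its two keypoints
        cv2.line(frame, (x1_k, y1_k), (x2_k, y2_k), (255, 255, 0), 2)
        cv2.circle(frame, (x1_k, y1_k), 1, (0, 255, 255), 2)
        cv2.circle(frame, (x2_k, y2_k), 1, (0, 255, 255), 2)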



How to install

Once the SDK has been downloaded into a local directory, follow these steps:

  • cd into the root of that directory.
  • Make sure all the requirements have been fulfilled, as stated in the requirements.
  • On Windows, run the following command at the root of the directory:

    pip install .

  • On Linux, run the following command at the root of the directory:

    pip install .


def detect_pose(frame, in_center=True)

Detects keypoints in a given frame to estimate the pose of a person. Expects UINT8 data in BGR format, as typically produced by cv2.imread or cv2.VideoCapture() based sources.


frame: numpy array of shape [Height, Width, 3] containing UINT8 data in BGR format. RGB data may also work, but BGR is expected.
in_center: boolean. If True, assumes the person is roughly in the center of the frame. Default: True.


keypoints: numpy array of FLOAT32 data with shape [17, 3]. Each row has the format [y, x, confidence], where (y, x) are coordinates on the frame/image that was passed in.
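Because each row is ordered [y, x, confidence] while drawing functions such as cv2.line take points as (x, y), a small conversion step is often handy. The sketch below keeps only the keypoints above a confidence threshold and returns them as integer (x, y) pixel positions. The helper name `confident_points` is an assumption. The default threshold of 0.2 is the value used in the usage example.

```python
def confident_points(keypoints, threshold=0.2):
    """Hypothetical helper: map keypoint index -> (x, y) for confident rows.

    keypoints: rows of [y, x, confidence], for example the [17, 3] array
    returned by detect_pose. Plain nested lists work as well.
    """
    points = {}
    for index, (y, x, confidence) in enumerate(keypoints):
        if confidence > threshold:
            # Swap to (x, y) order and truncate to integer pixel positions.
            points[index] = (int(x), int(y))
    return points
```

Keypoints missing from the returned dictionary fell at or below the threshold and can be skipped when drawing.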

def load_model(weightsPath)

Initializes the model and loads weights from the path specified by the weightsPath argument.


weightsPath: str. Path to the file to load weights from (/path/to/weights), generally with a .bin extension.