Package `object_detection_python_module`

ObjectDetection Model:

Overview

The ObjectDetection model detects objects belonging to the categories of the COCO dataset. The architecture uses MobileNetV2 as the backbone and a YOLO-style architecture for anchors and bounding-box generation. For any input image, the model predicts bounding boxes for all detected objects in image-space coordinates.

The model handles all preprocessing and postprocessing steps, so image data from any source can be passed in directly, which makes it a plug-and-play model. Its high speed makes it useful where the object detector is the first step in a pipeline, such as pose detection, or for fast labelling of images that contain object categories the detector was trained on.

Features

Works in real time even with many object categories present.
Works with all versions of Python 3 (tested with Python 3 only).
Highly portable, being a compiled Python module.
Minimal dependencies (all dependencies are standard DLLs).
The only expected Python dependencies are numpy and opencv (for reading and displaying images).

Limitations

High speed is obtained at the cost of accuracy, so the model may not be suitable where high accuracy is a must.

Benchmarks

Hardware	OS	Time* (ms)
Intel i5-8300H CPU @ 2.30GHz	Windows	< 20
NVIDIA GTX 1050	Windows	~ 8

*Time measured in the Python runtime, averaged over 100 loops.
*Time measured with inputs up to HD resolution with a number of objects present.
(All inputs are resized to the fixed input size specified by the model before further processing.)
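
The averaging procedure described in the notes above can be reproduced with a small timing harness. This is a minimal sketch, not the script used to produce the table: `average_ms` and `fake_detect` are hypothetical names introduced for illustration, the warm-up calls are an assumption, and `fake_detect` stands in for `od.detect_object` so the snippet runs without the compiled module.

```python
import time

import numpy as np


def average_ms(fn, frame, loops=100, warmup=5):
    """Return the mean wall-clock time of fn(frame) in milliseconds."""
    for _ in range(warmup):              # warm-up calls are not timed (assumption)
        fn(frame)
    start = time.perf_counter()
    for _ in range(loops):
        fn(frame)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / loops


def fake_detect(frame):
    """Hypothetical stand-in for od.detect_object; returns an empty [0,6] array."""
    return np.zeros((0, 6), dtype=np.float32)


frame = np.zeros((720, 1280, 3), dtype=np.uint8)   # synthetic HD-sized BGR frame
print(f"{average_ms(fake_detect, frame):.3f} ms per call")
```

To time the real model, pass a function that calls `od.detect_object` in place of `fake_detect` after `od.load_model` has been called.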

Usage


import numpy as np
import cv2

frame = cv2.imread("./test.jpg")                 # read the frame/image, format: BGR uint8

import object_detection_python_module as od      # for CPU
# import object_detection_cuda_python_module as od   # for GPU/CUDA

od.load_model("./objectDetection.bin")           # initialize the model and load weights
obj_bboxes = od.detect_object(frame, conf_threshold=0.85, nms_threshold=0.45)   # detect objects

# display: each row is [x1, y1, x2, y2, confidence, class_id]
for x1, y1, x2, y2, confidence, class_id in obj_bboxes:
    cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 1)

cv2.imshow("frame",frame)
cv2.waitKey(0)
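
When the detector is the first step of a pipeline, the returned rows are typically turned into image crops for the next stage. The sketch below shows one way to do that with numpy only. `crop_detections` is a hypothetical helper, not part of the module, and the boxes are synthetic data standing in for a real `od.detect_object` result.

```python
import numpy as np


def crop_detections(frame, boxes, class_id=None):
    """Return image crops for rows of [x1, y1, x2, y2, confidence, class_id]."""
    height, width = frame.shape[:2]
    crops = []
    for x1, y1, x2, y2, _confidence, cls in boxes:
        if class_id is not None and int(cls) != class_id:
            continue                     # skip categories we were not asked for
        # Clamp the box to the image bounds before slicing.
        left, right = max(0, int(x1)), min(width, int(x2))
        top, bottom = max(0, int(y1)), min(height, int(y2))
        if right > left and bottom > top:
            crops.append(frame[top:bottom, left:right])
    return crops


# Synthetic data in place of a real od.detect_object() result.
frame = np.zeros((100, 200, 3), dtype=np.uint8)
boxes = np.array([[10, 20, 60, 80, 0.90, 0],
                  [150, 10, 260, 50, 0.88, 2]], dtype=np.float32)

person_crops = crop_detections(frame, boxes, class_id=0)   # class 0 is PERSON
all_crops = crop_detections(frame, boxes)
```

The crops are views into `frame`, so copy them with `.copy()` if the frame is modified afterwards (for example by drawing rectangles on it).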

Resources:

MobileNet: https://arxiv.org/abs/1704.04861
YOLO: https://arxiv.org/abs/1506.02640

How to install

Once the SDK has been downloaded into a local directory, follow these steps.

cd into the root of that directory.
Make sure all the requirements have been fulfilled, as stated in requirements.
On Windows, run the following command at the root of the directory:

pip install .

On Linux, run the following command at the root of the directory:

pip install .
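
After installation, a quick check that the module is visible to the interpreter can save some debugging. A minimal sketch using only the standard library; `is_installed` is a hypothetical helper name.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the import system can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Prints True once `pip install .` has succeeded in the active environment.
print(is_installed("object_detection_python_module"))
```

This only confirms that the module can be found. It does not load the compiled extension, so missing DLLs would still surface on the first real import.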

Functions

def detect_object(frame, conf_threshold=0.85, nms_threshold=0.45, class_index=-1, max_count=50)

Detects objects from the COCO dataset categories in a given frame. Expects UINT8 data in BGR format, as generally produced by cv2.imread or cv2.VideoCapture() based sources.

Inputs:

frame: [Height,Width,3] numpy array containing UINT8 data in BGR format. RGB data may also work, but BGR is expected.

conf_threshold: float, confidence threshold; only objects with confidence above this threshold are returned.

nms_threshold: float, threshold used to suppress overlapping bboxes during Non-Maximum Suppression.

class_index: float, a value in the interval [0, 79], used to detect only objects of a specific category. For example, a value of 0 detects only PERSON in the given frame.

max_count: int, maximum number of objects that can be detected in a single frame/image.

Returns:

predictions: numpy array of shape [?,6] with float32 data, where each row has the format [x1,y1,x2,y2,confidence,class_id]; (x1,y1) is the top-left corner and (x2,y2) the bottom-right corner, i.e. (left,top,right,bottom).
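
To make the roles of conf_threshold and nms_threshold concrete, here is a minimal sketch of greedy, class-agnostic Non-Maximum Suppression in numpy. It is an illustration of the general technique, not the module's internal implementation, which is compiled and not documented here. The helper names `iou` and `nms` and the sample rows are hypothetical.

```python
import numpy as np


def iou(box, others):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], others[:, 0])
    y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[2], others[:, 2])
    y2 = np.minimum(box[3], others[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / (area + areas - inter)


def nms(preds, conf_threshold=0.85, nms_threshold=0.45):
    """Greedy NMS over rows of [x1, y1, x2, y2, confidence, class_id]."""
    preds = preds[preds[:, 4] > conf_threshold]      # drop low-confidence rows
    preds = preds[np.argsort(-preds[:, 4])]          # highest confidence first
    keep = []
    while len(preds) > 0:
        best, rest = preds[0], preds[1:]
        keep.append(best)
        if len(rest) == 0:
            break
        # Keep only boxes that do not overlap the chosen box too much.
        preds = rest[iou(best[:4], rest[:, :4]) <= nms_threshold]
    return np.array(keep, dtype=np.float32).reshape(-1, 6)


preds = np.array([[10, 10, 110, 110, 0.95, 0],
                  [12, 12, 112, 112, 0.90, 0],      # near-duplicate of the first box
                  [200, 200, 260, 260, 0.92, 2],
                  [300, 300, 340, 340, 0.50, 1]],   # below conf_threshold
                 dtype=np.float32)
kept = nms(preds)   # two rows remain: confidences 0.95 and 0.92
```

A lower nms_threshold suppresses more aggressively, since boxes with smaller overlap are then treated as duplicates.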

def load_model(weightsPath)

Initializes the model and loads weights from the path specified by the weightsPath argument.

Inputs:

weightsPath: str, /path/to/weights; path of the file to load weights from, generally with a .bin extension.
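
This documentation does not say how load_model behaves when the path is wrong, so it can help to validate the path before calling it. A minimal sketch, assuming only that the weights are a regular file on disk; `resolve_weights` is a hypothetical helper, not part of the module.

```python
from pathlib import Path


def resolve_weights(weights_path):
    """Return the absolute path as a string, or raise if the file is missing."""
    path = Path(weights_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"weights file not found: {path}")
    return str(path)


# Intended use (requires the installed module and a real weights file):
# import object_detection_python_module as od
# od.load_model(resolve_weights("./objectDetection.bin"))
```

Passing an absolute path also removes any dependence on the current working directory of the calling process.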