Package object_detection_cuda_python_module
ObjectDetection Model:
Overview
The ObjectDetection model detects objects belonging to the categories present in the COCO dataset.
The architecture uses MobileNetV2 as the backbone and the YOLO architecture for anchor and bounding-box generation.
For any image, the model predicts bounding boxes for all detected objects in image-space coordinates.
The model handles all preprocessing and postprocessing steps, so the user can pass image data from any source, which makes it a plug-and-play model.
Its high speed makes it useful where the object detector is the first step in a pipeline, such as pose detection, or for fast image labelling of objects already present in the dataset on which the object detector was trained.
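As an illustration of the "first step in a pipeline" use case, the sketch below cuts one crop per detection so that a later stage (for example a pose estimator) can work on each object separately. It assumes the `[x1, y1, x2, y2, confidence, class_id]` row layout documented under Functions; `crop_detections` is a hypothetical helper, not part of the module, and the frame and predictions are synthetic stand-ins for real detector output.

```python
import numpy as np

def crop_detections(frame, predictions):
    """Return one image crop per row [x1, y1, x2, y2, confidence, class_id]."""
    height, width = frame.shape[:2]
    crops = []
    for x1, y1, x2, y2, _confidence, _class_id in predictions:
        # Clamp the box to the image bounds and convert to integer pixel indices.
        left = int(max(0, min(width, x1)))
        top = int(max(0, min(height, y1)))
        right = int(max(0, min(width, x2)))
        bottom = int(max(0, min(height, y2)))
        if right > left and bottom > top:  # skip boxes that end up empty
            crops.append(frame[top:bottom, left:right].copy())
    return crops

# Synthetic stand-ins for a real frame and real detector output.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
predictions = np.array([
    [100.0, 50.0, 200.0, 250.0, 0.93, 0.0],
    [600.0, 400.0, 700.0, 500.0, 0.88, 2.0],  # partly outside the frame
], dtype=np.float32)

crops = crop_detections(frame, predictions)
print([crop.shape for crop in crops])
```

Clamping before slicing keeps boxes that extend past the image edge from producing empty or wrapped-around crops.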
Features
- Works in real time even with many object categories present.
- Works with all versions of python3 (tested with python3 only).
- Highly portable, being a compiled Python module.
- Minimal dependencies (all dependencies are standard DLLs).
- The only expected Python dependencies are numpy and opencv (for reading and displaying images).
Limitations
- High speed is obtained at the cost of accuracy, so the model may not be suitable where high accuracy is a must.
Benchmarks
| Architecture | OS | Time* (ms) |
|---|---|---|
| Intel i5-8300H CPU @ 2.30GHz | Windows | < 20 |
| NVIDIA GTX 1050 | Windows | ~ 8 |
- *Time measured in the Python runtime, averaged over 100 loops.
- *Time measured with inputs up to HD resolution with a number of objects present.
- (All inputs are resized to the fixed input size specified by the model before further processing.)
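A measurement of this kind can be approximated with a small timing helper. The sketch below is only an assumption about how such an average might be taken: `average_latency_ms` is a hypothetical helper, the warm-up runs are an addition not mentioned above, and a dummy workload stands in for the detector call.

```python
import time

def average_latency_ms(fn, loops=100, warmup=5):
    """Call fn repeatedly and return the mean wall-clock time per call in ms."""
    for _ in range(warmup):
        fn()  # warm-up calls are excluded from the measurement
    start = time.perf_counter()
    for _ in range(loops):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / loops

def dummy_workload():
    # Stand-in for a detector call such as od.detect_object(frame).
    return sum(i * i for i in range(1000))

latency_ms = average_latency_ms(dummy_workload, loops=100)
print(f"average latency: {latency_ms:.3f} ms")
```

With the module installed and a model loaded, the same helper could be called as `average_latency_ms(lambda: od.detect_object(frame))`.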
Usage
import numpy as np
import cv2
frame = cv2.imread("./test.jpg")  # read the frame/image, format: [BGR UINT8]
# import object_detection_python_module as od  # for CPU
import object_detection_cuda_python_module as od  # for GPU/CUDA
od.load_model("./objectDetection.bin")  # initialize the model and load weights
# detect objects; each returned row is [x1, y1, x2, y2, confidence, class_id]
obj_bboxes = od.detect_object(frame, conf_threshold=0.85, nms_threshold=0.45)
# display: draw one green rectangle per detected object
for count in range(len(obj_bboxes)):
    x1, y1, x2, y2, confidence, class_id = obj_bboxes[count]
    cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 1)
cv2.imshow("frame", frame)
cv2.waitKey(0)
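When several categories are needed from a single call, the returned array can also be filtered after the fact. The following is a minimal sketch, assuming the `[x1, y1, x2, y2, confidence, class_id]` row layout documented under Functions; `select_class` is a hypothetical helper and the predictions are synthetic, since the sketch does not call the module itself.

```python
import numpy as np

def select_class(predictions, class_id, min_confidence=0.0):
    """Keep only rows with the given class_id and at least min_confidence."""
    predictions = np.asarray(predictions, dtype=np.float32).reshape(-1, 6)
    mask = (predictions[:, 5] == class_id) & (predictions[:, 4] >= min_confidence)
    return predictions[mask]

# Synthetic detector output: two objects of class 0 and one of class 2.
predictions = np.array([
    [10.0, 20.0, 110.0, 220.0, 0.95, 0.0],
    [50.0, 60.0, 150.0, 160.0, 0.86, 2.0],
    [200.0, 40.0, 300.0, 240.0, 0.88, 0.0],
], dtype=np.float32)

persons = select_class(predictions, class_id=0, min_confidence=0.9)
print(persons.shape)
```

The `reshape(-1, 6)` call also lets an empty result pass through as a `[0, 6]` array.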
Resources:
- MobileNet: https://arxiv.org/abs/1704.04861
- YOLO: https://arxiv.org/abs/1506.02640
How to install
Once the SDK has been downloaded into a local directory, follow these steps:
- `cd` into the directory, i.e. to the root of the directory.
- Make sure all the requirements have been fulfilled as stated in requirements.
- On Windows, run the following command at the root of the directory: `pip install .`
- On Linux, run the following command at the root of the directory: `pip install .`
  - Copy `libobject_detection_cuda_python_module_cuda.dll` into the system's shared library path[^execution].

[^execution]: Since it is a shared library, on Linux copy this file to either `/usr/lib/` or `/usr/local/lib64/`, depending on the Linux flavour, so that the linker/loader can find it.
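After installation, a quick check that Python can locate the module and its stated dependencies can save some debugging. This is a sketch using only the standard library; `is_module_available` is a hypothetical helper. It only confirms that the module can be located, not that its shared library loads, so a real `import` remains the final test.

```python
import importlib.util

def is_module_available(name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(name) is not None

# Module names taken from the usage example in this document.
for name in ("numpy", "cv2", "object_detection_cuda_python_module"):
    status = "found" if is_module_available(name) else "missing"
    print(f"{name}: {status}")
```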
Functions
def detect_object(frame, conf_threshold=0.85, nms_threshold=0.45, class_index=-1, max_count=50)
Detect objects from the COCO dataset categories in a given frame. Expects UINT8 data in BGR format, generally resulting from `cv2.imread` or `cv2.VideoCapture()` based sources.
Inputs:
- frame: [Height, Width, 3] numpy array containing UINT8 data in BGR format. RGB data may also work, but BGR is expected.
- conf_threshold: float, confidence threshold; only objects with confidence above this threshold are returned.
- nms_threshold: float, threshold used to suppress overlapping bboxes during Non-Maximum Suppression.
- class_index: float, a value in the interval [0, 79] to detect only objects of a specific category. For example, a value of 0 detects only PERSON in the given frame.
- max_count: int, maximum number of objects that can be detected in a single frame/image.
Returns:
- predictions: numpy array [?, 6] of float32 data, where each row has the format [x1, y1, x2, y2, confidence, class_id]; (x1, y1) and (x2, y2) are the top-left and bottom-right corners, i.e. (left, top, right, bottom).

def load_model(weightsPath)
Initialize the model and load weights from the path specified by the weightsPath argument.
Inputs:
- weightsPath: str, /path/to/weights, path to the file to load weights from, generally with a .bin extension.
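To make the role of `nms_threshold` in `detect_object` more concrete, here is a minimal NumPy sketch of greedy, class-agnostic Non-Maximum Suppression based on intersection over union (IoU). It is an illustration under assumptions, not the module's internal implementation: `iou` and `greedy_nms` are hypothetical helpers, and whether the module suppresses per class or across classes is not stated in this document.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box [x1, y1, x2, y2] and an array of boxes [N, 4]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    union = np.maximum(area + areas - inter, 1e-9)  # guard against zero area
    return inter / union

def greedy_nms(predictions, nms_threshold=0.45):
    """Keep the most confident box, drop boxes overlapping it too much, repeat."""
    predictions = np.asarray(predictions, dtype=np.float32).reshape(-1, 6)
    order = np.argsort(-predictions[:, 4])  # indices by descending confidence
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(best)
        overlaps = iou(predictions[best, :4], predictions[rest, :4])
        order = rest[overlaps <= nms_threshold]  # survivors of this round
    return predictions[keep]

# Synthetic candidates: the second box overlaps the first heavily (IoU about 0.68).
candidates = np.array([
    [0.0, 0.0, 10.0, 10.0, 0.9, 0.0],
    [1.0, 1.0, 11.0, 11.0, 0.8, 0.0],
    [50.0, 50.0, 60.0, 60.0, 0.7, 0.0],
], dtype=np.float32)

kept = greedy_nms(candidates, nms_threshold=0.45)
print(kept[:, 4])
```

In this sketch a lower `nms_threshold` removes more overlapping boxes, while a higher one lets near-duplicates through.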