Package object_detection_python_module
ObjectDetection Model:
Overview
The ObjectDetection model is designed to detect objects belonging to the categories present in the COCO dataset.
The architecture uses MobileNetV2 as the backbone and the YOLO architecture for anchor and bounding-box generation.
For any image, the model predicts bounding boxes for all detected objects in image-space coordinates.
The model handles all preprocessing and postprocessing steps, so the user can pass imageData from any source, which makes it a plug-and-play model.
Its high speed makes it useful where the object detector is the first step in a pipeline, such as pose detection, or for fast image labelling of objects already present in the dataset on which the object detector was trained.
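As an illustration of the pipeline use case above, the sketch below turns detections into clipped integer crop boxes that a later stage (for example a pose detector) could consume. The helper name `person_crop_boxes` and its parameters are hypothetical and not part of this module. It assumes rows in the `[x1,y1,x2,y2,confidence,class_id]` format and that class id 0 is PERSON, as described in the Functions section.

```python
def person_crop_boxes(predictions, frame_height, frame_width, person_class_id=0):
    """Return integer (left, top, right, bottom) boxes for one class.

    Hypothetical helper, not part of the module. `predictions` is any
    iterable of rows [x1, y1, x2, y2, confidence, class_id].
    """
    boxes = []
    for x1, y1, x2, y2, confidence, class_id in predictions:
        if int(class_id) != person_class_id:
            continue  # keep only the requested category
        # Clip to the frame so the box is safe to use as an array slice.
        left = max(0, min(int(x1), frame_width))
        top = max(0, min(int(y1), frame_height))
        right = max(0, min(int(x2), frame_width))
        bottom = max(0, min(int(y2), frame_height))
        if right > left and bottom > top:  # drop empty boxes
            boxes.append((left, top, right, bottom))
    return boxes
```

With a frame loaded through `cv2.imread`, the crops could then be taken as `frame[top:bottom, left:right]` for each box returned by `person_crop_boxes(obj_bboxes, *frame.shape[:2])`.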
Features
- Works in real time, even with many object categories present.
- Works with all versions of Python 3 (tested with Python 3 only).
- Highly portable, being a compiled Python module.
- Minimal dependencies (all dependencies are standard DLLs).
- The only expected Python dependencies are numpy and opencv (for reading and displaying images).
Limitations
- High speed is obtained at the cost of accuracy, so the model may not be suitable where high accuracy is a must.
Benchmarks
Hardware | OS | Time* |
---|---|---|
Intel i5-8300H CPU @ 2.30GHz | Windows | < 20 ms |
NVIDIA GTX 1050 | Windows | ~ 8 ms |
- *Time measured in the Python runtime, averaged over 100 loops.
- *Time measured with inputs up to HD resolution with a number of objects present.
- (All resolutions are resized to the fixed input size specified by the model before further processing.)
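The averaging described in the notes above can be reproduced with a small timing harness. This is a minimal sketch, not the script used to produce the table: the function name `average_latency_ms` is hypothetical, and the warm-up calls are an assumption added so that one-time initialization does not skew the average.

```python
import time


def average_latency_ms(fn, *args, loops=100, warmup=5, **kwargs):
    """Average wall-clock time of fn(*args, **kwargs) in milliseconds.

    Hypothetical helper. The warm-up phase is an assumption; the
    benchmark notes only mention averaging over 100 loops.
    """
    for _ in range(warmup):
        fn(*args, **kwargs)  # untimed warm-up calls
    start = time.perf_counter()
    for _ in range(loops):
        fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / loops
```

Once a model is loaded, a call such as `average_latency_ms(od.detect_object, frame, conf_threshold=0.85, nms_threshold=0.45)` would give a figure comparable in spirit to the table, though results depend on hardware and input.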
Usage
import numpy as np
import cv2
import object_detection_python_module as od  # For CPU
# import object_detection_cuda_python_module as od  # For GPU/CUDA

frame = cv2.imread("./test.jpg")  # read the frame/image, format: BGR uint8
od.load_model("./objectDetection.bin")  # initialize the model and load weights
obj_bboxes = od.detect_object(frame, conf_threshold=0.85, nms_threshold=0.45)  # detect objects

# display: each row is [x1, y1, x2, y2, confidence, class_id]
for x1, y1, x2, y2, confidence, class_id in obj_bboxes:
    cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 1)
cv2.imshow("frame", frame)
cv2.waitKey(0)
Resources:
- MobileNet: https://arxiv.org/abs/1704.04861
- YOLO: https://arxiv.org/abs/1506.02640
How to install
Once the SDK has been downloaded into a local directory, follow these steps:
- cd into the directory, i.e. the root of the SDK directory.
- Make sure all the requirements have been fulfilled as stated in requirements.
- On Windows, run the following command at the root of the directory: pip install .
- On Linux, run the following command at the root of the directory: pip install .
Functions
def detect_object(frame, conf_threshold=0.85, nms_threshold=0.45, class_index=-1, max_count=50)
Detect objects from the COCO dataset categories in a given frame. Expects UINT8 data in BGR format, as generally produced by cv2.imread or cv2.VideoCapture() based sources.
Inputs:
- frame: [Height,Width,3] numpy array containing UINT8 data in BGR format. RGB data may also work, but BGR is expected.
- conf_threshold: float, confidence threshold; only objects with confidence above this threshold are returned.
- nms_threshold: float, threshold used to suppress overlapping bboxes during Non-Maximum Suppression.
- class_index: float, a value in the interval [0, 79] used to detect only objects of a specific category; for example, a value of 0 detects only PERSON in the given frame.
- max_count: int, maximum number of objects that can be detected in a single frame/image.
Returns:
- predictions: numpy array [?,6] of float32 data, where each row has the format [x1,y1,x2,y2,confidence,class_id]; (x1,y1) is the top-left corner and (x2,y2) is the bottom-right corner.
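To make the roles of conf_threshold, nms_threshold and max_count more concrete, here is a minimal sketch of greedy Non-Maximum Suppression over rows in the `[x1,y1,x2,y2,confidence,class_id]` format. It is an illustration of the general technique only: the module is compiled and its internal implementation is not documented here, and the names `iou` and `greedy_nms` are hypothetical. The sketch is class-agnostic, which is also an assumption.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    ax1, ay1, ax2, ay2 = box_a[:4]
    bx1, by1, bx2, by2 = box_b[:4]
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(rows, conf_threshold=0.85, nms_threshold=0.45, max_count=50):
    """Illustrative greedy NMS; not the module's actual implementation."""
    # Keep only rows above the confidence threshold, most confident first.
    candidates = sorted(
        (row for row in rows if row[4] > conf_threshold),
        key=lambda row: row[4],
        reverse=True,
    )
    kept = []
    for row in candidates:
        if len(kept) >= max_count:
            break
        # Accept a box only if it does not overlap a kept box too much.
        if all(iou(row, other) <= nms_threshold for other in kept):
            kept.append(row)
    return kept
```

In this sketch a lower nms_threshold suppresses more overlapping boxes, while a higher conf_threshold discards more low-confidence candidates before suppression starts.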
def load_model(weightsPath)
Initialize the model and load weights from the path specified by the weightsPath argument.
Inputs:
- weightsPath: str, /path/to/weights; path of the file to load weights from, generally with a .bin extension.