Object Detection

The object detection API locates and classifies 80 different kinds of objects in a single image.

To use this API, you need to enable the detection API when starting DeepStack

Starting DeepStack

Run the command below as it applies to the version you have installed

bash
docker run -e VISION-DETECTION=True -v localstorage:/datastore -p 80:5000 deepquestai/deepstack

Basic Parameters

-e VISION-DETECTION=True This enables the object detection API.

-v localstorage:/datastore This specifies the local volume where DeepStack will store all data.

-p 80:5000 This makes DeepStack accessible via port 80 of the machine.

Example

../_images/family-and-dog.jpg
python
import requests

image_data = open("test-image3.jpg","rb").read()

response = requests.post("http://localhost:80/v1/vision/detection",files={"image":image_data}).json()

for object in response["predictions"]:
    print(object["label"])

print(response)

Response

text
dog
person
person
{'predictions': [{'x_max': 819, 'x_min': 633, 'y_min': 354, 'confidence': 99, 'label': 'dog', 'y_max': 546}, {'x_max': 601, 'x_min': 440, 'y_min': 116, 'confidence': 99, 'label': 'person', 'y_max': 516}, {'x_max': 445, 'x_min': 295, 'y_min': 84, 'confidence': 99, 'label': 'person', 'y_max': 514}], 'success': True}

We can use the coordinates returned to extract the objects

python
import requests
from PIL import Image

image_data = open("test-image3.jpg","rb").read()
image = Image.open("test-image3.jpg").convert("RGB")

response = requests.post("http://localhost:80/v1/vision/detection",files={"image":image_data}).json()
i = 0
for object in response["predictions"]:

    label = object["label"]
    y_max = int(object["y_max"])
    y_min = int(object["y_min"])
    x_max = int(object["x_max"])
    x_min = int(object["x_min"])
    cropped = image.crop((x_min,y_min,x_max,y_max))
    cropped.save("image{}_{}.jpg".format(i,label))

    i += 1
../_images/dog.jpg
../_images/man.jpg
../_images/woman.jpg

Setting Minimum Confidence

By default, the minimum confidence for detecting objects is 0.45. The confidence ranges between 0 and 1. If the confidence level for an object falls below the min_confidence, no object is detected.

The min_confidence parameter allows you to increase or reduce the minimum confidence.

We lower the confidence allowed below.

python
import requests

image_data = open("test-image3.jpg","rb").read()

response = requests.post("http://localhost:80/v1/vision/detection",
files={"image":image_data},data={"min_confidence":0.30}).json()

CLASSES

The following are the classes of objects DeepStack can detect in images

text
person,   bicycle,   car,   motorcycle,   airplane,
bus,   train,   truck,   boat,   traffic light,   fire hydrant,   stop_sign,
parking meter,   bench,   bird,   cat,   dog,   horse,   sheep,   cow,   elephant,
bear,   zebra, giraffe,   backpack,   umbrella,   handbag,   tie,   suitcase,
frisbee,   skis,   snowboard, sports ball,   kite,   baseball bat,   baseball glove,
skateboard,   surfboard,   tennis racket, bottle,   wine glass,   cup,   fork,
knife,   spoon,   bowl,   banana,   apple,   sandwich,   orange, broccoli,   carrot,
hot dog,   pizza,   donot,   cake,   chair,   couch,   potted plant,   bed, dining table,
toilet,   tv,   laptop,   mouse,   remote,   keyboard,   cell phone,   microwave,
oven,   toaster,   sink,   refrigerator,   book,   clock,   vase,   scissors,   teddy bear,
hair dryer, toothbrush.

Performance

DeepStack offers three modes allowing you to tradeoff speed for performance. During startup, you can specify performance mode to be , High , Medium and Low.

The default mode is Medium.

You can specify a different mode during startup as seen below as seen below

bash
docker run -e VISION-DETECTION=True -e MODE=High -v localstorage:/datastore -p 80:5000 deepquestai/deepstack

Speed Modes are not available on the Raspberry PI Version