Setting Up Object Detection on Linux

This guide provides step-by-step instructions for setting up an object detection system on Ubuntu using a pre-trained model from PyTorch's torchvision library. Object detection identifies and locates objects in an image; the script in this guide marks each detected object with a bounding box, a label, and a confidence score.

1. Install System Prerequisites

Make sure your Ubuntu system is updated and install the necessary dependencies. Open a terminal and run the following commands:

sudo apt update
sudo apt upgrade
sudo apt install python3 python3-pip git
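
To confirm that the tools installed above are available, a short check along these lines can help. This is a minimal sketch, not part of the original setup; the helper name check_prerequisites is hypothetical, and it assumes the Ubuntu packages expose the commands python3, pip3, and git.

```python
import shutil
import sys


def check_prerequisites(tools=("python3", "pip3", "git")):
    """Return a dict mapping each tool name to its path on PATH, or None if missing."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    print("Python version:", ".".join(map(str, sys.version_info[:3])))
    for tool, path in check_prerequisites().items():
        print(f"{tool}: {path or 'NOT FOUND'}")
```

Any tool reported as NOT FOUND suggests the apt install step did not complete.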
    

2. Install PyTorch and Torchvision

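Install PyTorch and Torchvision for object detection. The command below installs the default build from PyPI, which on Linux typically includes CUDA support and falls back to the CPU when no compatible GPU is present. For a CPU-only build or a specific CUDA version, use the install selector on pytorch.org to generate the matching command. The script in this guide does not use torchaudio, so that package can be omitted: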

pip install torch torchvision torchaudio
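
After installation, it can be useful to confirm that PyTorch imports and whether it can see a GPU. The following is a minimal sketch; the function name describe_torch_install is hypothetical, while torch.__version__ and torch.cuda.is_available() are standard PyTorch attributes.

```python
def describe_torch_install():
    """Report whether PyTorch imports, its version, and whether CUDA is usable."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in the environment running this check.
        return {"installed": False, "version": None, "cuda": False}
    return {
        "installed": True,
        "version": torch.__version__,
        "cuda": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch_install())
```

If "cuda" is reported as False, inference still works but runs on the CPU.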
    

3. Install Additional Libraries

Install the following libraries to handle image processing and display:

pip install opencv-python matplotlib
    

OpenCV will handle image loading and manipulation, and Matplotlib will visualize the results.
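
A quick way to confirm that every library from steps 2 and 3 is visible to the interpreter is sketched below. The helper name missing_modules is hypothetical. Note that the opencv-python package is imported under the name cv2.

```python
import importlib.util

# Import names used by the detection script (opencv-python is imported as cv2).
REQUIRED_MODULES = ("torch", "torchvision", "cv2", "matplotlib")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

An empty result means all four libraries can be located; anything listed should be reinstalled with pip.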

4. Download Pre-trained Object Detection Model

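PyTorch provides pre-trained object detection models such as Faster R-CNN. These models are trained on the COCO dataset and can be used for inference without further training. The import below only makes the model constructor available; the weights themselves are downloaded when the model is first created (see step 6):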

from torchvision.models.detection import fasterrcnn_resnet50_fpn
    

5. Create Python Script for Object Detection

Create a new Python script, object_detection.py, to load the pre-trained Faster R-CNN model and perform object detection:

nano object_detection.py
    

Paste the following code into the file:

import torch
import torchvision
import cv2
import matplotlib.pyplot as plt
from torchvision.transforms import functional as F

# Load the pre-trained Faster R-CNN model
def load_model():
    # weights="DEFAULT" replaces the deprecated pretrained=True (torchvision >= 0.13)
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()  # Set the model to evaluation mode
    return model

# Preprocess the image
def preprocess_image(image_path):
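    image = cv2.imread(image_path)
    if image is None:  # cv2.imread returns None instead of raising on failure
        raise FileNotFoundError(f"Could not read image: {image_path}")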
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Convert BGR to RGB
    image_tensor = F.to_tensor(image).unsqueeze(0)  # Convert to tensor and add batch dimension
    return image, image_tensor

# Perform object detection
def detect_objects(model, image_tensor):
    with torch.no_grad():
        detections = model(image_tensor)
    return detections[0]

# Draw bounding boxes and labels
def draw_boxes(detections, image, threshold=0.5):
    for idx, box in enumerate(detections['boxes']):
        score = detections['scores'][idx].item()
        if score > threshold:
            x1, y1, x2, y2 = map(int, box)
            cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
            label = f"{detections['labels'][idx].item()}: {score:.2f}"
            # Clamp the text position so labels near the top edge stay inside the image
            cv2.putText(image, label, (x1, max(y1 - 10, 15)), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    plt.imshow(image)
    plt.show()

# Main function
if __name__ == "__main__":
    model = load_model()
    
    image_path = "path/to/your/image.jpg"  # Replace with your image path
    image, image_tensor = preprocess_image(image_path)
    
    detections = detect_objects(model, image_tensor)
    
    draw_boxes(detections, image)
    

This script performs the following steps:

  • Loads a pre-trained Faster R-CNN model from PyTorch's torchvision library.
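  • Preprocesses the input image by converting it from OpenCV's BGR order to RGB and then to a float tensor with a batch dimension. The model resizes and normalizes the image internally.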
  • Performs object detection on the image and returns bounding boxes, labels, and confidence scores.
  • Draws bounding boxes and displays the results using Matplotlib.
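
The script draws each label as a numeric COCO category ID. A readable name can be looked up from a list of category names, as in the minimal sketch below. The function names label_name and load_coco_categories are hypothetical; the sketch assumes torchvision 0.13 or newer, where the weight metadata exposes the names under meta["categories"].

```python
def label_name(label_id, categories):
    """Map a numeric label ID to a readable name, falling back to the ID itself."""
    if 0 <= label_id < len(categories):
        return categories[label_id]
    return str(label_id)


def load_coco_categories():
    """Fetch category names from torchvision's weight metadata (torchvision >= 0.13)."""
    # Imported lazily so label_name can be used without torchvision installed.
    from torchvision.models.detection import FasterRCNN_ResNet50_FPN_Weights

    return FasterRCNN_ResNet50_FPN_Weights.DEFAULT.meta["categories"]


# Usage inside draw_boxes (assumes categories = load_coco_categories() was called once):
#     name = label_name(detections['labels'][idx].item(), categories)
#     label = f"{name}: {score:.2f}"
```

The fallback to the numeric ID keeps the drawing code working even if a label falls outside the list.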

6. Download Pre-trained Model Weights

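The first time you run the script, the pre-trained model weights are downloaded automatically when the model is created. Ensure you have an active internet connection during the first run. Later runs reuse the cached copy and do not need to download the weights again.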
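
To see where the downloaded weights are stored, the sketch below mirrors PyTorch's default cache rule using only the standard library. The function name default_checkpoint_dir is hypothetical, and the rule shown (TORCH_HOME, then XDG_CACHE_HOME, then ~/.cache) is an assumption based on PyTorch's documented defaults; torch.hub.get_dir() is the authoritative source when PyTorch is installed.

```python
import os


def default_checkpoint_dir(env=None):
    """Mirror PyTorch's default rule for where downloaded weights are cached."""
    env = os.environ if env is None else env
    cache_home = env.get("XDG_CACHE_HOME", "~/.cache")
    torch_home = env.get("TORCH_HOME", os.path.join(cache_home, "torch"))
    return os.path.join(os.path.expanduser(torch_home), "hub", "checkpoints")


if __name__ == "__main__":
    print("Weights are expected under:", default_checkpoint_dir())
```

Deleting the cached file from that directory forces a fresh download on the next run.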

7. Run the Object Detection Script

To execute the script, run the following command in your terminal:

python3 object_detection.py
    

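Before running the script, replace path/to/your/image.jpg in the script with the path to the image you want to analyze. The script then displays the image with bounding boxes around the detected objects. Each box is labeled with a numeric COCO category ID followed by its confidence score.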
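
As an alternative to editing the script for each image, the path could be passed on the command line. The following is a minimal sketch using argparse; the function name parse_args and the --threshold option are illustrative assumptions, not part of the original script.

```python
import argparse


def parse_args(argv=None):
    """Parse the image path and an optional confidence threshold."""
    parser = argparse.ArgumentParser(description="Run object detection on one image.")
    parser.add_argument("image", help="path to the input image")
    parser.add_argument("--threshold", type=float, default=0.5,
                        help="minimum confidence score for a box to be drawn")
    return parser.parse_args(argv)


# Usage in the main block of object_detection.py:
#     args = parse_args()
#     image, image_tensor = preprocess_image(args.image)
#     draw_boxes(detect_objects(model, image_tensor), image, threshold=args.threshold)
```

With this change the script would be run as: python3 object_detection.py photo.jpg --threshold 0.7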

8. Adjust Confidence Threshold

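The default confidence threshold is 0.5, meaning that only detections with a confidence score above 0.5 are drawn. A higher threshold shows fewer, more certain detections; a lower one shows more boxes, including more false positives. To change it, pass a different threshold value in the call to draw_boxes in the main block: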

draw_boxes(detections, image, threshold=0.7)  # Set threshold to 0.7
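
The effect of the threshold can be illustrated on plain Python lists. This is a minimal sketch with made-up values; the function name filter_detections is hypothetical, and with real model output the tensors would first be converted using .tolist().

```python
def filter_detections(boxes, labels, scores, threshold=0.5):
    """Keep only detections whose score is strictly above the threshold."""
    return [
        (box, label, score)
        for box, label, score in zip(boxes, labels, scores)
        if score > threshold
    ]


if __name__ == "__main__":
    # Illustrative values only, not real model output.
    boxes = [[10, 20, 110, 220], [5, 5, 50, 50], [0, 0, 30, 30]]
    labels = [1, 18, 3]
    scores = [0.92, 0.64, 0.31]
    print(len(filter_detections(boxes, labels, scores, threshold=0.5)))  # 2
    print(len(filter_detections(boxes, labels, scores, threshold=0.7)))  # 1
```

The comparison is strict, matching the score > threshold test in draw_boxes.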
    

9. Troubleshooting

If you encounter issues, check the following:

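  • Ensure all required Python libraries are installed in the same Python environment that runs the script, so that pip and python3 refer to the same interpreter.
  • Ensure the image path is correct and that the image exists. OpenCV's cv2.imread returns None instead of raising an error when it cannot read a file.
  • Ensure that your system meets the hardware and software requirements for PyTorch and OpenCV. If no window appears, check that a graphical display is available, since plt.show() needs one.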
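
Two of the checks above can be automated with the standard library. The following is a minimal sketch; the function name diagnose is hypothetical, and treating the DISPLAY and WAYLAND_DISPLAY environment variables as a sign of a graphical session is a heuristic assumption.

```python
import os
from pathlib import Path


def diagnose(image_path, env=None):
    """Return a list of likely problems; an empty list means none were found."""
    env = os.environ if env is None else env
    problems = []
    path = Path(image_path)
    if not path.is_file():
        problems.append(f"image not found: {path}")
    if not env.get("DISPLAY") and not env.get("WAYLAND_DISPLAY"):
        problems.append("no display detected; plt.show() may not open a window")
    return problems


if __name__ == "__main__":
    for problem in diagnose("path/to/your/image.jpg"):
        print("Problem:", problem)
```

On a machine without a display, one option is to replace plt.show() with plt.savefig("output.png") and open the saved file elsewhere.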

10. Conclusion

You have successfully set up an Object Detection system on Ubuntu using a pre-trained Faster R-CNN model. This setup allows you to detect objects in images and display the results with bounding boxes and labels.