Setting Up Object Detection on Linux
This guide walks through setting up an object detection system on Ubuntu using a pre-trained model from PyTorch's torchvision library. Object detection identifies and locates objects in an image, typically marking each one with a bounding box.
1. Install System Prerequisites
Make sure your Ubuntu system is updated and install the necessary dependencies. Open a terminal and run the following commands:
sudo apt update
sudo apt upgrade
sudo apt install python3 python3-pip git
2. Install PyTorch and Torchvision
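Install PyTorch and Torchvision. On Linux, the default pip packages are typically built with CUDA support and fall back to the CPU when no compatible GPU is present:
pip install torch torchvision
If you do not have a GPU, the smaller CPU-only build can be installed from the PyTorch package index instead:
pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
The install selector on pytorch.org lists the exact command for a specific CUDA version.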
3. Install Additional Libraries
Install the following libraries to handle image processing and display:
pip install opencv-python matplotlib
OpenCV will handle image loading and manipulation, and Matplotlib will visualize the results.
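To confirm that the packages from steps 2 and 3 are visible to your Python interpreter, a small check along the following lines may help. This is only a sketch: the helper name check_packages is made up for this example, and it tests whether each module can be located, not whether it works correctly.

```python
import importlib.util

# Import names of the packages installed in steps 2 and 3.
# Note that the opencv-python package is imported as "cv2".
REQUIRED = ["torch", "torchvision", "cv2", "matplotlib"]


def check_packages(names):
    """Return a dict mapping each module name to True if it can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(REQUIRED).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Any package reported as MISSING can be reinstalled with the matching pip command above.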
4. Download Pre-trained Object Detection Model
PyTorch provides pre-trained object detection models like Faster R-CNN. These models are trained on the COCO dataset and can be easily used for inference:
from torchvision.models.detection import fasterrcnn_resnet50_fpn
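The model reports each detection's class as an integer ID rather than a name. A lookup along these lines could turn IDs into readable labels. The list below is only the first few entries of the COCO category list; in recent torchvision releases the full list should be available as FasterRCNN_ResNet50_FPN_Weights.DEFAULT.meta["categories"]. The helper name label_name is hypothetical.

```python
# First entries of the COCO category list; index 0 is the background class.
# Excerpt for illustration only: use the full list from the weights metadata in practice.
COCO_EXCERPT = ["__background__", "person", "bicycle", "car"]


def label_name(label_id, categories):
    """Map a numeric label ID to its category name, falling back to the ID itself."""
    if 0 <= label_id < len(categories):
        return categories[label_id]
    return str(label_id)


print(label_name(1, COCO_EXCERPT))   # person
print(label_name(3, COCO_EXCERPT))   # car
print(label_name(50, COCO_EXCERPT))  # 50 (outside the excerpt)
```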
5. Create Python Script for Object Detection
Create a new Python script, object_detection.py, to load the pre-trained Faster R-CNN model and perform object detection:
nano object_detection.py
Paste the following code into the file:
import cv2
import matplotlib.pyplot as plt
import torch
import torchvision
from torchvision.transforms import functional as F


# Load the pre-trained Faster R-CNN model
def load_model():
    # torchvision 0.13 and later select weights via `weights`;
    # older releases used pretrained=True instead
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()  # Set the model to evaluation mode
    return model


# Preprocess the image
def preprocess_image(image_path):
    image = cv2.imread(image_path)
    if image is None:  # cv2.imread returns None instead of raising on an unreadable path
        raise FileNotFoundError(f"Could not read image: {image_path}")
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Convert BGR to RGB
    image_tensor = F.to_tensor(image).unsqueeze(0)  # Convert to a float tensor and add batch dimension
    return image, image_tensor


# Perform object detection
def detect_objects(model, image_tensor):
    with torch.no_grad():  # Gradients are not needed for inference
        detections = model(image_tensor)
    return detections[0]  # Results for the single image in the batch


# Draw bounding boxes and labels
def draw_boxes(detections, image, threshold=0.5):
    for idx, box in enumerate(detections['boxes']):
        score = detections['scores'][idx].item()
        if score > threshold:
            x1, y1, x2, y2 = map(int, box)
            cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
            # The label is the numeric COCO class ID followed by the confidence score
            label = f"{detections['labels'][idx].item()}: {score:.2f}"
            # Keep the text inside the image when the box touches the top edge
            cv2.putText(image, label, (x1, max(y1 - 10, 15)), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
    plt.imshow(image)
    plt.axis("off")
    plt.show()


# Main function
if __name__ == "__main__":
    model = load_model()
    image_path = "path/to/your/image.jpg"  # Replace with your image path
    image, image_tensor = preprocess_image(image_path)
    detections = detect_objects(model, image_tensor)
    draw_boxes(detections, image)
This script performs the following steps:
- Loads a pre-trained Faster R-CNN model from PyTorch's torchvision library.
- Preprocesses the input image by converting it from BGR to RGB and into a tensor with a batch dimension; the model handles resizing and normalization internally.
- Performs object detection on the image and returns bounding boxes, labels, and confidence scores.
- Draws bounding boxes and displays the results using Matplotlib.
6. Download Pre-trained Model Weights
The first time you run the script, the pre-trained model weights will be automatically downloaded. Ensure you have an active internet connection during the first run.
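To check whether the weights have already been downloaded, you can look in PyTorch's cache directory. The sketch below assumes the usual layout, in which weights are stored under ~/.cache/torch/hub/checkpoints, or under $TORCH_HOME/hub/checkpoints when that variable is set. The function names are hypothetical, and torch.hub.get_dir() remains the authoritative way to find the hub directory.

```python
import os
from pathlib import Path


def checkpoint_dir(env=None):
    """Guess the directory where PyTorch caches downloaded weights (assumed layout)."""
    env = os.environ if env is None else env
    cache_root = env.get("XDG_CACHE_HOME", "~/.cache")
    torch_home = env.get("TORCH_HOME", os.path.join(cache_root, "torch"))
    return Path(torch_home).expanduser() / "hub" / "checkpoints"


def cached_weights(env=None):
    """Return the names of weight files already in the cache, if any."""
    directory = checkpoint_dir(env)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("*.pth"))


if __name__ == "__main__":
    print("Cache directory:", checkpoint_dir())
    print("Cached weights:", cached_weights() or "none yet")
```

If the list is empty, the next run of the script will need an internet connection.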
7. Run the Object Detection Script
To execute the script, run the following command in your terminal:
python3 object_detection.py
Before running it, replace path/to/your/image.jpg in the script with the path to the image you want to analyze. The script then displays the image with bounding boxes around the detected objects.
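As an alternative to editing the path inside the script, the image path could be passed on the command line. The following is a minimal sketch using argparse from the standard library; the function name build_parser and the --threshold option are assumptions made for this example and are not part of the script above.

```python
import argparse


def build_parser():
    """Build a parser for the image path and an optional confidence threshold."""
    parser = argparse.ArgumentParser(description="Run object detection on an image.")
    parser.add_argument("image_path", help="path to the input image")
    parser.add_argument("--threshold", type=float, default=0.5,
                        help="minimum confidence score to draw (default: 0.5)")
    return parser


# An explicit argument list is parsed here for demonstration;
# in the script, call build_parser().parse_args() with no arguments to read sys.argv.
args = build_parser().parse_args(["photo.jpg", "--threshold", "0.7"])
print(args.image_path, args.threshold)  # photo.jpg 0.7
```

With this in place, the main block could use args.image_path and args.threshold instead of hard-coded values.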
8. Adjust Confidence Threshold
The default confidence threshold is 0.5, so only detections with a confidence score above 0.5 are drawn. To change it, pass a different threshold to draw_boxes in the main block:
draw_boxes(detections, image, threshold=0.7) # Set threshold to 0.7
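To see what the threshold does independently of drawing, the filtering step can be sketched on its own. The helper filter_detections is hypothetical, and the sample values are made up; plain lists stand in for the tensors the model returns, although the same loop should also work on tensors.

```python
def filter_detections(detections, threshold=0.5):
    """Return (box, label, score) tuples whose score exceeds the threshold."""
    kept = []
    for box, label, score in zip(detections["boxes"], detections["labels"], detections["scores"]):
        score = float(score)
        if score > threshold:
            # Truncate coordinates to integers, as draw_boxes does before drawing
            kept.append(([int(v) for v in box], int(label), score))
    return kept


# Made-up detections in the same layout the model returns.
sample = {
    "boxes": [[10.2, 20.7, 110.0, 220.5], [5.0, 5.0, 50.0, 50.0]],
    "labels": [1, 3],
    "scores": [0.92, 0.41],
}
print(filter_detections(sample, threshold=0.5))  # [([10, 20, 110, 220], 1, 0.92)]
```

A higher threshold yields fewer but more reliable boxes; a lower one shows more candidates at the cost of more false positives.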
9. Troubleshooting
If you encounter issues, check the following:
- Ensure all required Python libraries are installed correctly.
- Ensure the image path is correct and that the image exists.
- Ensure that your system meets the hardware and software requirements for PyTorch and OpenCV.
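Because cv2.imread returns None rather than raising an error when a file cannot be read, a wrong path tends to surface later as a confusing failure. A check like the following, run before loading the image, may make the cause clearer. The helper name validate_image_path and the list of extensions are assumptions for this sketch.

```python
from pathlib import Path

# Extensions chosen for illustration; OpenCV supports more formats than these.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def validate_image_path(image_path):
    """Return the path as a Path object, or raise a descriptive error."""
    path = Path(image_path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"No such image file: {path}")
    if path.suffix.lower() not in IMAGE_SUFFIXES:
        raise ValueError(f"Unexpected file extension: {path.suffix or '(none)'}")
    return path


if __name__ == "__main__":
    try:
        validate_image_path("path/to/your/image.jpg")
    except (FileNotFoundError, ValueError) as err:
        print("Problem with image path:", err)
```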
10. Conclusion
You have successfully set up an Object Detection system on Ubuntu using a pre-trained Faster R-CNN model. This setup allows you to detect objects in images and display the results with bounding boxes and labels.