
Real-Time Detection of Potholes, Speed Signs, Traffic Lights, Pedestrians, and Crosswalks Using YOLOv8 and DeepStream on Jetson Nano with CARLA Simulator

Category: Computer Vision | Posted on 2025-02-06 13:14:19



Introduction 

In the ever-evolving landscape of road safety and autonomous vehicles, real-time object detection plays a crucial role in identifying critical road elements. For this project, we have focused on building a robust detection system capable of identifying potholes, speed signs, traffic signals, pedestrians, and crosswalks in real-time. These elements are vital for ensuring safer navigation and enhancing decision-making for self-driving systems. 

To achieve this, we utilized YOLOv8, a state-of-the-art object detection model renowned for its speed and accuracy, and NVIDIA DeepStream on Jetson Nano, a high-performance framework designed for real-time video analytics. This combination provides an efficient solution for deploying a detection system that meets the demands of real-world scenarios. 

In this blog, we will share the entire process: training the YOLOv8 model, converting it to ONNX format, and configuring NVIDIA DeepStream for deployment. We will also explain how the system was tested using RTSP streams from the CARLA Simulator to simulate real-world driving conditions. This guide is designed to help anyone interested in building similar detection pipelines or deploying YOLO models with DeepStream.

 

Training YOLOv8 Model 

For training the YOLOv8 model, we prepared the dataset with two main folders: train and valid. Each folder contained images in .jpg format and corresponding label files in .txt format. We also created a YAML file specifying the paths to the image directories and the class names with their corresponding IDs.
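For reference, a minimal dataset YAML along the lines we describe might look like the sketch below; the directory layout and class names here are placeholders rather than our exact dataset definition:

# data.yaml -- illustrative paths and class names
train: dataset/train/images
val: dataset/valid/images

nc: 6   # number of classes, matching the num-detected-classes set later in DeepStream
names: ['pothole', 'speed_sign', 'red_light', 'green_light', 'person', 'crosswalk']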

Once the dataset was ready, we used a training script to train the model for 160 epochs, with an image size of 640. After the training process, the model's best weights were saved as best.pt in the runs/ directory, ready for conversion into ONNX format for deployment. 
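With the Ultralytics Python API, the training step reduces to a few lines; the sketch below assumes a pretrained YOLOv8 checkpoint as the starting point (the exact variant is not critical):

from ultralytics import YOLO

# Fine-tune a pretrained YOLOv8 checkpoint on the custom dataset.
model = YOLO('yolov8n.pt')                      # starting checkpoint is an assumption
model.train(data='data.yaml', epochs=160, imgsz=640)
# The best weights are written under runs/ (e.g. runs/detect/train/weights/best.pt).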

Setting Up NVIDIA DeepStream on Jetson Nano with JetPack SDK 

For deploying the YOLOv8 model in a real-time environment, we chose to run the setup on the NVIDIA Jetson Nano. This low-power device is capable of accelerating deep learning models through its GPU, making it ideal for real-time inference tasks in edge applications such as autonomous vehicles or smart city monitoring systems. Here's how we set up NVIDIA DeepStream and YOLOv8 on the Jetson Nano with the JetPack SDK: 

Overview of Jetson Nano 

The Jetson Nano is an affordable and compact device that brings GPU acceleration to the edge. With its 128-core Maxwell GPU, it provides the processing power needed to run deep learning models in real time. We utilized the JetPack SDK, which comes pre-configured with Ubuntu 18.04, CUDA, TensorRT, and other essential libraries optimized for the Jetson platform. This SDK enabled easy installation and configuration of DeepStream and deep learning models on the device.

Installing JetPack SDK on Jetson Nano 

We used the JetPack SDK to install all necessary components on the Jetson Nano. JetPack simplifies the process of setting up the device by bundling the Ubuntu OS, CUDA, TensorRT, and DeepStream SDK, specifically tailored for NVIDIA's Jetson platform. 

Steps for installation: 

1. Download JetPack SDK: We downloaded the JetPack SDK from NVIDIA’s official website. JetPack includes everything needed to run deep learning models on Jetson devices, including Ubuntu 18.04, CUDA, TensorRT, and DeepStream. 

2. Flash the microSD Card: We used Balena Etcher to flash the JetPack image onto a microSD card (minimum 32GB recommended). The flashing process sets up Ubuntu 18.04 and prepares the Jetson Nano for development. 

3. Boot the Jetson Nano: After flashing, we inserted the microSD card into the Jetson Nano and powered it on. The system boots directly into Ubuntu 18.04, and the JetPack SDK initializes, setting up the necessary environment for GPU-accelerated development. 

Installing DeepStream on Jetson Nano 

Once the JetPack SDK was installed, it provided everything needed to set up DeepStream. DeepStream is NVIDIA's framework for high-performance video analytics, which makes it well suited to deploying real-time object detection models like YOLOv8.

Steps to install DeepStream: 

1. Install DeepStream SDK: DeepStream is integrated into JetPack, but we still had to install additional DeepStream-specific components. We followed the steps in the DeepStream SDK documentation to install the necessary packages.

2. Test DeepStream: After installation, we verified the setup by running sample applications provided in the DeepStream SDK. This ensured that DeepStream was functioning correctly and could be used to run inference tasks. 

Setting Up DeepStream for YOLOv8 on Jetson Nano 

With DeepStream installed, the next step was configuring it to work with the YOLOv8 model. This step is particularly important as Ultralytics provides a detailed guide for setting up YOLOv8 on Jetson Nano using DeepStream. 

Key configuration steps: 

1. Convert YOLOv8 Model to ONNX Format:

We used the provided Python script inside the DeepStream YOLOv8 setup folder to convert the trained YOLOv8 model from .pt to .onnx format. The conversion command included the batch size of 1 and the simplification option for optimization:

python3 export_yoloV8.py -w optimized_best.pt --batch 1 --simplify 

--batch 1: Optimizes the model for real-time inference with a batch size of 1, which suits single-stream deployments on edge devices like the Jetson Nano.

--simplify: Simplifies the exported ONNX graph, reducing model size and improving inference speed.

2. Configure DeepStream for YOLOv8 

Once our YOLOv8 model was converted to ONNX, we updated the DeepStream configuration files to seamlessly link the model, labels, and other parameters for deployment.  

Key Configuration Changes 

Here’s a breakdown of the critical updates we made: 

1. Updating config_infer_primary_yoloV8.txt 

This file contains all the details for loading and running the YOLOv8 model. The key changes were: 

Model Files:

onnx-file: Specified the path to the ONNX model (trafficlight_person.onnx).
model-engine-file: Referenced the TensorRT engine (model_b1_gpu0_fp32.engine).
labelfile-path: Linked the label file (trafficlight_person_label.txt).

Classes and Detection Parameters:

num-detected-classes: Set to 6, matching the number of classes in the model.
nms-iou-threshold: Set to 0.45 for non-maximum suppression.
pre-cluster-threshold: Set to 0.25 for filtering low-confidence detections.

Custom Parsing Logic:

custom-lib-path: Included the path to the YOLO custom library (libnvdsinfer_custom_impl_Yolo.so).
parse-bbox-func-name: Configured for YOLO parsing with NvDsInferParseYolo.
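Putting those changes together, the relevant parts of config_infer_primary_yoloV8.txt look roughly like the abbreviated sketch below; a real file carries additional keys (gpu-id, network-mode, batch-size, and so on), and only the values discussed above are shown:

[property]
onnx-file=trafficlight_person.onnx
model-engine-file=model_b1_gpu0_fp32.engine
labelfile-path=trafficlight_person_label.txt
num-detected-classes=6
custom-lib-path=libnvdsinfer_custom_impl_Yolo.so
parse-bbox-func-name=NvDsInferParseYolo

[class-attrs-all]
nms-iou-threshold=0.45
pre-cluster-threshold=0.25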

2. Updating deepstream_app_config.txt

This file governs the entire DeepStream pipeline, from video input to display output. Here's what we configured (an abbreviated sketch of these groups follows the list):

RTSP Source: Defined the video input as an RTSP stream (rtsp://192.168.1.37:8554/carla_stream) with a resolution of 1280x720 at 30 FPS.

Inference Settings: Linked the updated YOLOv8 configuration (config_infer_primary_yoloV8.txt) in the [primary-gie] section.

On-Screen Display (OSD): Enabled bounding boxes, text annotations, and custom styles for better visualization.

Output Sink: Configured a display sink (type=2) to render the processed video in real time, with frame synchronization disabled for smoother playback.

Stream Multiplexer: Set the batch size to 1 and matched the dimensions to the input stream (1280x720).
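Under those settings, the corresponding groups look roughly as follows; keys not shown here follow the defaults from the DeepStream sample configurations:

[source0]
enable=1
type=4                 # 4 = RTSP source
uri=rtsp://192.168.1.37:8554/carla_stream
num-sources=1

[streammux]
batch-size=1
width=1280
height=720

[primary-gie]
enable=1
config-file=config_infer_primary_yoloV8.txt

[osd]
enable=1

[sink0]
enable=1
type=2                 # 2 = on-screen EGL display sink
sync=0                 # frame synchronization disabled for smoother playback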


Pipeline Flow

Here's how the pipeline operates:

1. Input: The RTSP source streams video into the pipeline.
2. Preprocessing: The Stream Multiplexer (streammux) prepares the video for inference.
3. Inference: The Primary GIE processes the video using the YOLOv8 model for object detection.
4. Visualization: The On-Screen Display overlays bounding boxes and labels.
5. Output: The Sink renders the final output video in real time.

    Streaming CARLA RGB Camera Feed to DeepStream Pipeline Using RTSP 

In this section, we explain how we set up an RTSP stream from CARLA's RGB camera feed and fed it into the DeepStream pipeline for real-time processing. Below, we break the process down into key steps for clarity.

    1. Connecting to the CARLA Simulation 

    The first step involves connecting to the CARLA server and accessing its simulation world. This connection allows us to interact with the environment, spawn vehicles, and attach sensors. Setting a reasonable timeout ensures that communication with the server is reliable, especially when working with large or complex simulations. 
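A minimal connection sketch with the CARLA Python API looks like this; the host, port, and timeout values are assumptions to adjust for your setup:

import carla

# Connect to the CARLA server and fetch the simulation world.
client = carla.Client('localhost', 2000)   # host/port of the machine running CARLA (assumed)
client.set_timeout(10.0)                   # generous timeout for large or slow-loading maps
world = client.get_world()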

    2. Setting Up an RTSP Server 

    We used GStreamer to create an RTSP server that streams video from CARLA’s RGB camera. The server encodes the camera feed into the H.264 format, which is a widely used video compression standard. 

    The RTSP server processes frames from the camera and streams them over the network, making them accessible via an RTSP URL. The stream is designed to be low-latency, ensuring it’s suitable for real-time applications like DeepStream. 
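One way to build such a server is with GStreamer's RTSP server bindings. The sketch below is not our exact script; the element chain and encoder settings are assumptions chosen for low latency. It exposes an appsrc-fed H.264 stream at rtsp://<host>:8554/carla_stream:

import gi
gi.require_version('Gst', '1.0')
gi.require_version('GstRtspServer', '1.0')
from gi.repository import Gst, GstRtspServer, GLib

Gst.init(None)

class CarlaRtspFactory(GstRtspServer.RTSPMediaFactory):
    """Feeds raw BGR frames from CARLA into an H.264 RTSP stream."""

    def __init__(self, width=1280, height=720, fps=30):
        super().__init__()
        self.appsrc = None
        # appsrc receives frames pushed from the camera callback, x264enc encodes
        # them with low-latency settings, and rtph264pay packetizes them for RTSP clients.
        self.launch_string = (
            'appsrc name=source is-live=true block=true format=GST_FORMAT_TIME '
            'caps=video/x-raw,format=BGR,width=%d,height=%d,framerate=%d/1 '
            '! videoconvert ! x264enc tune=zerolatency speed-preset=ultrafast '
            '! rtph264pay config-interval=1 name=pay0 pt=96' % (width, height, fps)
        )

    def do_create_element(self, url):
        return Gst.parse_launch(self.launch_string)

    def do_configure(self, rtsp_media):
        # Keep a handle to appsrc so the CARLA camera callback can push buffers into it.
        self.appsrc = rtsp_media.get_element().get_child_by_name('source')

server = GstRtspServer.RTSPServer()
server.set_service('8554')
factory = CarlaRtspFactory()
factory.set_shared(True)
server.get_mount_points().add_factory('/carla_stream', factory)
server.attach(None)
# A GLib.MainLoop (usually run on a background thread) keeps the server serving clients.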

    3. Adding a Vehicle to the Simulation 

    Once the world is initialized, we selected a vehicle blueprint from CARLA’s library and spawned it at a specific location in the simulation. This vehicle serves as the dynamic platform for attaching the RGB camera. 
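In code, spawning the vehicle is only a few lines; the blueprint filter and the random spawn point below are illustrative choices:

import random

blueprint_library = world.get_blueprint_library()
vehicle_bp = blueprint_library.filter('vehicle.tesla.model3')[0]   # any vehicle blueprint works
spawn_point = random.choice(world.get_map().get_spawn_points())    # or a fixed carla.Transform
vehicle = world.spawn_actor(vehicle_bp, spawn_point)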

    4. Configuring and Attaching the RGB Camera 

    We configured the RGB camera to capture high-resolution video with a wide field of view, ensuring good coverage of the surroundings. The camera was then attached to the vehicle at a suitable position (e.g., on the roof or near the windshield) to provide an optimal perspective. 
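A configuration along these lines matches the 1280x720, 30 FPS stream used later in the pipeline; the field of view and mounting offsets are assumed values:

camera_bp = blueprint_library.find('sensor.camera.rgb')
camera_bp.set_attribute('image_size_x', '1280')
camera_bp.set_attribute('image_size_y', '720')
camera_bp.set_attribute('fov', '110')                   # wide field of view (assumed)
camera_bp.set_attribute('sensor_tick', str(1.0 / 30))   # roughly 30 FPS

# Mount the camera above the hood, looking forward along the vehicle's x-axis.
camera_transform = carla.Transform(carla.Location(x=1.5, z=2.0))
camera = world.spawn_actor(camera_bp, camera_transform, attach_to=vehicle)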

    5. Enabling Autopilot for the Vehicle 

    To simulate realistic driving scenarios, we enabled autopilot mode for the vehicle. This feature allows the vehicle to navigate through the simulation autonomously, capturing dynamic scenes that can be streamed in real-time. 
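Enabling it is a single call on the spawned actor:

# Hand control of the vehicle to CARLA's built-in Traffic Manager.
vehicle.set_autopilot(True)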

    6. Capturing and Processing Camera Frames 

    The camera captures frames in real-time, and we processed these frames to prepare them for streaming. The raw image data was converted into a format compatible with GStreamer, ensuring smooth integration with the RTSP pipeline. 
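A minimal callback along these lines converts CARLA's BGRA frames to BGR and pushes them into the appsrc element held by the RTSP factory sketched earlier; buffer timestamping is simplified here:

import numpy as np

def on_camera_frame(image, appsrc, fps=30):
    if appsrc is None:
        return  # no RTSP client connected yet, nothing to feed
    # CARLA delivers BGRA bytes; reshape and drop the alpha channel to match
    # the BGR caps declared on the appsrc element.
    frame = np.frombuffer(image.raw_data, dtype=np.uint8)
    frame = frame.reshape((image.height, image.width, 4))[:, :, :3]
    data = frame.tobytes()

    buf = Gst.Buffer.new_allocate(None, len(data), None)
    buf.fill(0, data)
    buf.duration = Gst.SECOND // fps
    appsrc.emit('push-buffer', buf)

camera.listen(lambda image: on_camera_frame(image, factory.appsrc))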

    7. Starting the RTSP Stream 

    The RTSP server was then started, making the video stream available over the network. The stream can be accessed via an RTSP URL, allowing integration with DeepStream or other applications that process video streams. 

    8. Cleaning Up Resources 

    Finally, when stopping the script, we ensured the autopilot mode was disabled and the camera feed was stopped. This step prevents any unnecessary resource usage or background processes. 
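The teardown amounts to a few calls on the spawned actors:

# Release simulation resources when the script stops.
vehicle.set_autopilot(False)
camera.stop()        # stop delivering frames to the listen() callback
camera.destroy()
vehicle.destroy()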

Overview of the Workflow

CARLA Simulation: Provides a realistic environment with vehicles and sensors.
RGB Camera: Captures high-quality video from the vehicle's perspective.
RTSP Server: Streams the video feed over the network using the H.264 format.
RTSP URL: Makes the stream accessible for integration with DeepStream or other processing pipelines.

    This setup allows us to stream CARLA’s RGB camera feed efficiently and use it in real-time applications. The RTSP pipeline ensures low-latency streaming, making it ideal for tasks like object detection or scene analysis with DeepStream. 

    Testing with RTSP Stream from CARLA Simulator 

    Now that the RTSP stream from CARLA is set up, the next step is to test it using NVIDIA DeepStream on the Jetson Nano. The process is simple and involves running the simulation, starting the RTSP stream, and using DeepStream for inference. 

    Step 1: Start the Simulation in CARLA 

    First, launch the CARLA simulator and press Play to start the simulation. Once the simulation is running, execute the Python script provided earlier: 

    python carla_rtsp.py 

This will:

Spawn a vehicle in the simulation.
Attach a camera sensor to the vehicle.
Enable autopilot for the vehicle.
Start an RTSP stream at rtsp://192.168.1.37:8554/carla_stream.

Ensure the script is running without errors before proceeding.

    Step 2: Run Inference with DeepStream 

    On the Jetson Nano, open a terminal and execute the following command to start DeepStream with the RTSP stream as input: 

    deepstream-app -c deepstream_app_config.txt

    The deepstream_app_config.txt file should already include the IP address of the PC running the CARLA simulator, which we configured earlier. This command initializes the inference pipeline and starts processing the RTSP stream. 

    We saw the video feed from the CARLA RGB camera on our screen, displaying real-time object detection results. 

In the next section, we share images of the results we achieved with this setup.

    Results 

After successfully setting up the RTSP stream and running the DeepStream pipeline, we obtained real-time object detection results from the CARLA simulator. Below are two images showcasing the detection output:

1. Crosswalk Detection

[Image: crosswalk detection in the CARLA simulation]

This image demonstrates the model's ability to detect crosswalks in the CARLA simulation environment, highlighting its potential for traffic scene analysis.

2. Red Light Detection

[Image: red traffic light detection in the CARLA simulation]

    Here, the model accurately detects a red traffic light, showcasing its utility in autonomous driving scenarios for traffic signal recognition. 

    These results validate the seamless integration of CARLA simulation, RTSP streaming, and NVIDIA DeepStream SDK for real-time object detection. 

    Conclusion 

In this blog, we demonstrated how to set up an RTSP stream from the CARLA simulator's RGB camera feed and integrate it with the DeepStream pipeline for real-time object detection. We covered the steps from configuring the CARLA simulation to running the DeepStream inference pipeline on the Jetson Nano.

    The results highlight the effectiveness of combining CARLA’s simulation with NVIDIA’s DeepStream SDK for real-time detection, offering a solid foundation for autonomous driving research. This setup can be expanded for additional detection tasks or sensors, enabling further development in autonomous systems. 

     


For inquiries regarding development with NVIDIA Jetson, GStreamer, Openpilot, and CARLA, please contact us at info@inthings.tech today.

     
