Pytorch-TensorRT-Detection

PyTorch - TensorRT Custom Image Processing

Welcome to my instructional guide for inference and realtime DNN vision library for NVIDIA Jetson Nano/TX1/TX2/Xavier NX/AGX Xavier/AGX Orin.

This repo uses NVIDIA TensorRT for efficiently deploying neural networks onto the embedded Jetson platform, improving performance and power efficiency using graph optimizations, kernel fusion, and FP16/INT8 precision.

Vision primitives, such as imageNet for image recognition, detectNet for object detection, segNet for semantic segmentation, and poseNet for pose estimation inherit from the shared tensorNet object. Examples are provided for streaming from live camera feed and processing images. See the API Reference section for detailed reference documentation of the C++ and Python libraries.

Follow the Hello AI World tutorial for running inference and transfer learning onboard your Jetson, including collecting your own datasets and training your own models. It covers image classification, object detection, semantic segmentation, pose estimation, and mono depth.

Table of Contents

>   JetPack 5.0 is now supported, along with Jetson AGX Orin.
>   Try the new Pose Estimation and Mono Depth tutorials!
>   See the Change Log for the latest updates and new features.

Hello AI World

Hello AI World can be run completely onboard your Jetson, including inferencing with TensorRT and transfer learning with PyTorch. The inference portion of Hello AI World - which includes coding your own image classification and object detection applications for Python or C++, and live camera demos - can be run on your Jetson in roughly two hours or less, while transfer learning is best left to leave running overnight.

System Setup

Inference

Training

Appendix

API Reference

Below are links to reference documentation for the C++ and Python libraries from the repo:

jetson-inference

  C++ Python
Image Recognition imageNet imageNet
Object Detection detectNet detectNet
Segmentation segNet segNet
Pose Estimation poseNet poseNet
Monocular Depth depthNet depthNet

jetson-utils

These libraries are able to be used in external projects by linking to libjetson-inference and libjetson-utils.

Code Examples

Introductory code walkthroughs of using the library are covered during these steps of the Hello AI World tutorial:

Additional C++ and Python samples for running the networks on static images and live camera streams can be found here:

  C++ Python
   Image Recognition imagenet.cpp imagenet.py
   Object Detection detectnet.cpp detectnet.py
   Segmentation segnet.cpp segnet.py
   Pose Estimation posenet.cpp posenet.py
   Monocular Depth depthnet.cpp depthnet.py

note: for working with numpy arrays, see Converting to Numpy Arrays and Converting from Numpy Arrays

These examples will automatically be compiled while Building the Project from Source, and are able to run the pre-trained models listed below in addition to custom models provided by the user. Launch each example with --help for usage info.

Pre-Trained Models

The project comes with a number of pre-trained models that are available through the Model Downloader tool:

Image Recognition

Network CLI argument NetworkType enum
AlexNet alexnet ALEXNET
GoogleNet googlenet GOOGLENET
GoogleNet-12 googlenet-12 GOOGLENET_12
ResNet-18 resnet-18 RESNET_18
ResNet-50 resnet-50 RESNET_50
ResNet-101 resnet-101 RESNET_101
ResNet-152 resnet-152 RESNET_152
VGG-16 vgg-16 VGG-16
VGG-19 vgg-19 VGG-19
Inception-v4 inception-v4 INCEPTION_V4

Object Detection

Network CLI argument NetworkType enum Object classes
SSD-Mobilenet-v1 ssd-mobilenet-v1 SSD_MOBILENET_V1 91 (COCO classes)
SSD-Mobilenet-v2 ssd-mobilenet-v2 SSD_MOBILENET_V2 91 (COCO classes)
SSD-Inception-v2 ssd-inception-v2 SSD_INCEPTION_V2 91 (COCO classes)
DetectNet-COCO-Dog coco-dog COCO_DOG dogs
DetectNet-COCO-Bottle coco-bottle COCO_BOTTLE bottles
DetectNet-COCO-Chair coco-chair COCO_CHAIR chairs
DetectNet-COCO-Airplane coco-airplane COCO_AIRPLANE airplanes
ped-100 pednet PEDNET pedestrians
multiped-500 multiped PEDNET_MULTI pedestrians, luggage
facenet-120 facenet FACENET faces

Semantic Segmentation

Dataset Resolution CLI Argument Accuracy Jetson Nano Jetson Xavier
Cityscapes 512x256 fcn-resnet18-cityscapes-512x256 83.3% 48 FPS 480 FPS
Cityscapes 1024x512 fcn-resnet18-cityscapes-1024x512 87.3% 12 FPS 175 FPS
Cityscapes 2048x1024 fcn-resnet18-cityscapes-2048x1024 89.6% 3 FPS 47 FPS
DeepScene 576x320 fcn-resnet18-deepscene-576x320 96.4% 26 FPS 360 FPS
DeepScene 864x480 fcn-resnet18-deepscene-864x480 96.9% 14 FPS 190 FPS
Multi-Human 512x320 fcn-resnet18-mhp-512x320 86.5% 34 FPS 370 FPS
Multi-Human 640x360 fcn-resnet18-mhp-512x320 87.1% 23 FPS 325 FPS
Pascal VOC 320x320 fcn-resnet18-voc-320x320 85.9% 45 FPS 508 FPS
Pascal VOC 512x320 fcn-resnet18-voc-512x320 88.5% 34 FPS 375 FPS
SUN RGB-D 512x400 fcn-resnet18-sun-512x400 64.3% 28 FPS 340 FPS
SUN RGB-D 640x512 fcn-resnet18-sun-640x512 65.1% 17 FPS 224 FPS
Legacy Segmentation Models | Network | CLI Argument | NetworkType enum | Classes | | ------------------------|---------------------------------|---------------------------------|---------| | Cityscapes (2048x2048) | `fcn-alexnet-cityscapes-hd` | `FCN_ALEXNET_CITYSCAPES_HD` | 21 | | Cityscapes (1024x1024) | `fcn-alexnet-cityscapes-sd` | `FCN_ALEXNET_CITYSCAPES_SD` | 21 | | Pascal VOC (500x356) | `fcn-alexnet-pascal-voc` | `FCN_ALEXNET_PASCAL_VOC` | 21 | | Synthia (CVPR16) | `fcn-alexnet-synthia-cvpr` | `FCN_ALEXNET_SYNTHIA_CVPR` | 14 | | Synthia (Summer-HD) | `fcn-alexnet-synthia-summer-hd` | `FCN_ALEXNET_SYNTHIA_SUMMER_HD` | 14 | | Synthia (Summer-SD) | `fcn-alexnet-synthia-summer-sd` | `FCN_ALEXNET_SYNTHIA_SUMMER_SD` | 14 | | Aerial-FPV (1280x720) | `fcn-alexnet-aerial-fpv-720p` | `FCN_ALEXNET_AERIAL_FPV_720p` | 2 |

Pose Estimation

Model CLI argument NetworkType enum Keypoints
Pose-ResNet18-Body resnet18-body RESNET18_BODY 18
Pose-ResNet18-Hand resnet18-hand RESNET18_HAND 21
Pose-DenseNet121-Body densenet121-body DENSENET121_BODY 18

The Transfer Learning with PyTorch section of the tutorial speaks from the perspective of running PyTorch onboard Jetson for training DNNs, however the same PyTorch code can be used on a PC, server, or cloud instance with an NVIDIA discrete GPU for faster training.

Extra Resources

In this area, links and resources for deep learning are listed: