Pytorch-TensorRT-Detection

PyTorch - TensorRT Custom Image Processing

Welcome to my instructional guide to the inference and real-time DNN vision library for NVIDIA Jetson Nano/TX1/TX2/Xavier NX/AGX Xavier/AGX Orin.

This repo uses NVIDIA TensorRT for efficiently deploying neural networks onto the embedded Jetson platform, improving performance and power efficiency using graph optimizations, kernel fusion, and FP16/INT8 precision.

Vision primitives, such as imageNet for image recognition, detectNet for object detection, segNet for semantic segmentation, and poseNet for pose estimation, inherit from the shared tensorNet object. Examples are provided for streaming from a live camera feed and for processing images. See the API Reference section for detailed reference documentation of the C++ and Python libraries.
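As a rough illustration of how these primitives fit together, the sketch below runs detectNet on a live camera feed from Python. It assumes the upstream jetson-inference Python bindings (`jetson.inference` and `jetson.utils`) are installed on the Jetson. The camera URI `csi://0`, the chosen model, and the helper names `describe_detection` and `run_detection` are illustrative choices rather than anything this guide prescribes.

```python
def describe_detection(label, confidence, left, top, right, bottom):
    """Format one detection as a single human-readable line."""
    return "{} ({:.0%}) at [{:.0f}, {:.0f}, {:.0f}, {:.0f}]".format(
        label, confidence, left, top, right, bottom)


def run_detection(network="ssd-mobilenet-v2", source="csi://0",
                  sink="display://0", threshold=0.5):
    """Detect objects on a video stream until the output stops streaming."""
    # Imported lazily so this file can still be imported off-device.
    # Assumption: the bindings are exposed as jetson.inference / jetson.utils.
    import jetson.inference
    import jetson.utils

    net = jetson.inference.detectNet(network, threshold=threshold)
    camera = jetson.utils.videoSource(source)
    display = jetson.utils.videoOutput(sink)

    while display.IsStreaming():
        img = camera.Capture()
        detections = net.Detect(img)  # by default this also overlays boxes on img
        for d in detections:
            print(describe_detection(net.GetClassDesc(d.ClassID), d.Confidence,
                                     d.Left, d.Top, d.Right, d.Bottom))
        display.Render(img)
        display.SetStatus("{} | {:.0f} FPS".format(network, net.GetNetworkFPS()))
```

Calling `run_detection()` on a Jetson with a CSI camera attached should open a display window. Passing a file path as `source` processes a still image or video instead.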

Follow the Hello AI World tutorial for running inference and transfer learning onboard your Jetson, including collecting your own datasets and training your own models. It covers image classification, object detection, semantic segmentation, pose estimation, and mono depth.

Hello AI World
Video Walkthroughs
API Reference
Code Examples
Pre-Trained Models
System Requirements
Change Log

> JetPack 5.0 is now supported, along with Jetson AGX Orin.
> Try the new Pose Estimation and Mono Depth tutorials!
> See the Change Log for the latest updates and new features.

Hello AI World

Hello AI World can be run completely onboard your Jetson, including inferencing with TensorRT and transfer learning with PyTorch. The inference portion of Hello AI World, which includes coding your own image classification and object detection applications in Python or C++ and running live camera demos, can be completed on your Jetson in roughly two hours or less, while transfer learning is best left running overnight.

System Setup

Inference

Training

Transfer Learning with PyTorch
Classification/Recognition (ResNet-18)
Object Detection (SSD-Mobilenet)
- Re-training SSD-Mobilenet
- Collecting your own Detection Datasets

Appendix

API Reference

Below are links to reference documentation for the C++ and Python libraries from the repo:

jetson-inference

	C++	Python
Image Recognition	`imageNet`	`imageNet`
Object Detection	`detectNet`	`detectNet`
Segmentation	`segNet`	`segNet`
Pose Estimation	`poseNet`	`poseNet`
Monocular Depth	`depthNet`	`depthNet`

jetson-utils

C++
Python

These libraries can be used in external projects by linking to libjetson-inference and libjetson-utils.

Code Examples

Introductory code walkthroughs of using the library are covered during these steps of the Hello AI World tutorial:

Additional C++ and Python samples for running the networks on static images and live camera streams can be found here:

	C++	Python
Image Recognition	`imagenet.cpp`	`imagenet.py`
Object Detection	`detectnet.cpp`	`detectnet.py`
Segmentation	`segnet.cpp`	`segnet.py`
Pose Estimation	`posenet.cpp`	`posenet.py`
Monocular Depth	`depthnet.cpp`	`depthnet.py`

Note: for working with NumPy arrays, see Converting to Numpy Arrays and Converting from Numpy Arrays.

These examples are compiled automatically when Building the Project from Source, and can run the pre-trained models listed below as well as custom models provided by the user. Launch each example with --help for usage info.
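A minimal sketch of launching one of these samples from Python is shown below. It assumes the samples accept a `--network=<CLI argument>` flag followed by optional input and output URIs, which is my understanding of the upstream samples, so confirm it with `--help`. The helper name `build_example_command` is hypothetical.

```python
def build_example_command(script, network=None, input_uri=None,
                          output_uri=None, show_help=False):
    """Build the argument list for one of the sample programs."""
    cmd = [script]
    if show_help:
        # --help prints usage info and ignores the other options.
        return cmd + ["--help"]
    if network:
        # Assumption: the model is selected with --network=<CLI argument>.
        cmd.append("--network=" + network)
    if input_uri:
        cmd.append(input_uri)
    if output_uri:
        cmd.append(output_uri)
    return cmd


cmd = build_example_command("detectnet.py", network="ssd-mobilenet-v2",
                            input_uri="/dev/video0")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a Jetson where the samples are installed and on the `PATH`.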

Pre-Trained Models

The project comes with a number of pre-trained models that are available through the Model Downloader tool:

Image Recognition

Network	CLI argument	NetworkType enum
AlexNet	`alexnet`	`ALEXNET`
GoogleNet	`googlenet`	`GOOGLENET`
GoogleNet-12	`googlenet-12`	`GOOGLENET_12`
ResNet-18	`resnet-18`	`RESNET_18`
ResNet-50	`resnet-50`	`RESNET_50`
ResNet-101	`resnet-101`	`RESNET_101`
ResNet-152	`resnet-152`	`RESNET_152`
VGG-16	`vgg-16`	`VGG_16`
VGG-19	`vgg-19`	`VGG_19`
Inception-v4	`inception-v4`	`INCEPTION_V4`

Object Detection

Network	CLI argument	NetworkType enum	Object classes
SSD-Mobilenet-v1	`ssd-mobilenet-v1`	`SSD_MOBILENET_V1`	91 (COCO classes)
SSD-Mobilenet-v2	`ssd-mobilenet-v2`	`SSD_MOBILENET_V2`	91 (COCO classes)
SSD-Inception-v2	`ssd-inception-v2`	`SSD_INCEPTION_V2`	91 (COCO classes)
DetectNet-COCO-Dog	`coco-dog`	`COCO_DOG`	dogs
DetectNet-COCO-Bottle	`coco-bottle`	`COCO_BOTTLE`	bottles
DetectNet-COCO-Chair	`coco-chair`	`COCO_CHAIR`	chairs
DetectNet-COCO-Airplane	`coco-airplane`	`COCO_AIRPLANE`	airplanes
ped-100	`pednet`	`PEDNET`	pedestrians
multiped-500	`multiped`	`PEDNET_MULTI`	pedestrians, luggage
facenet-120	`facenet`	`FACENET`	faces

Semantic Segmentation

Dataset	Resolution	CLI Argument	Accuracy	Jetson Nano	Jetson Xavier
Cityscapes	512x256	`fcn-resnet18-cityscapes-512x256`	83.3%	48 FPS	480 FPS
Cityscapes	1024x512	`fcn-resnet18-cityscapes-1024x512`	87.3%	12 FPS	175 FPS
Cityscapes	2048x1024	`fcn-resnet18-cityscapes-2048x1024`	89.6%	3 FPS	47 FPS
DeepScene	576x320	`fcn-resnet18-deepscene-576x320`	96.4%	26 FPS	360 FPS
DeepScene	864x480	`fcn-resnet18-deepscene-864x480`	96.9%	14 FPS	190 FPS
Multi-Human	512x320	`fcn-resnet18-mhp-512x320`	86.5%	34 FPS	370 FPS
Multi-Human	640x360	`fcn-resnet18-mhp-640x360`	87.1%	23 FPS	325 FPS
Pascal VOC	320x320	`fcn-resnet18-voc-320x320`	85.9%	45 FPS	508 FPS
Pascal VOC	512x320	`fcn-resnet18-voc-512x320`	88.5%	34 FPS	375 FPS
SUN RGB-D	512x400	`fcn-resnet18-sun-512x400`	64.3%	28 FPS	340 FPS
SUN RGB-D	640x512	`fcn-resnet18-sun-640x512`	65.1%	17 FPS	224 FPS

If the resolution is omitted from the CLI argument, the lowest resolution model is loaded
Accuracy indicates the pixel classification accuracy across the model’s validation dataset
Performance is measured for GPU FP16 mode with JetPack 4.2.1, nvpmodel 0 (MAX-N)
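The first note can be illustrated with a small lookup. This is only a sketch of the documented rule, not the library's actual implementation: it lists a subset of the models from the table above, and the names `SEGMENTATION_MODELS`, `pixel_count`, and `resolve_model` are hypothetical.

```python
# Subset of the CLI arguments from the segmentation table.
SEGMENTATION_MODELS = [
    "fcn-resnet18-cityscapes-512x256",
    "fcn-resnet18-cityscapes-1024x512",
    "fcn-resnet18-cityscapes-2048x1024",
    "fcn-resnet18-deepscene-576x320",
    "fcn-resnet18-deepscene-864x480",
    "fcn-resnet18-voc-320x320",
    "fcn-resnet18-voc-512x320",
]


def pixel_count(model_name):
    """Return width * height parsed from the trailing WxH suffix."""
    width, height = model_name.rsplit("-", 1)[1].split("x")
    return int(width) * int(height)


def resolve_model(cli_argument):
    """Map a CLI argument, with or without a resolution, to a full model name."""
    if cli_argument in SEGMENTATION_MODELS:
        return cli_argument
    # No resolution given: collect every model that shares this prefix.
    candidates = [m for m in SEGMENTATION_MODELS
                  if m.rsplit("-", 1)[0] == cli_argument]
    if not candidates:
        raise ValueError("unknown segmentation model: " + cli_argument)
    # Pick the lowest-resolution variant, as the note describes.
    return min(candidates, key=pixel_count)


print(resolve_model("fcn-resnet18-voc"))         # fcn-resnet18-voc-320x320
print(resolve_model("fcn-resnet18-cityscapes"))  # fcn-resnet18-cityscapes-512x256
```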

Legacy Segmentation Models

| Network | CLI Argument | NetworkType enum | Classes |
| ------------------------|---------------------------------|---------------------------------|---------|
| Cityscapes (2048x2048) | `fcn-alexnet-cityscapes-hd` | `FCN_ALEXNET_CITYSCAPES_HD` | 21 |
| Cityscapes (1024x1024) | `fcn-alexnet-cityscapes-sd` | `FCN_ALEXNET_CITYSCAPES_SD` | 21 |
| Pascal VOC (500x356) | `fcn-alexnet-pascal-voc` | `FCN_ALEXNET_PASCAL_VOC` | 21 |
| Synthia (CVPR16) | `fcn-alexnet-synthia-cvpr` | `FCN_ALEXNET_SYNTHIA_CVPR` | 14 |
| Synthia (Summer-HD) | `fcn-alexnet-synthia-summer-hd` | `FCN_ALEXNET_SYNTHIA_SUMMER_HD` | 14 |
| Synthia (Summer-SD) | `fcn-alexnet-synthia-summer-sd` | `FCN_ALEXNET_SYNTHIA_SUMMER_SD` | 14 |
| Aerial-FPV (1280x720) | `fcn-alexnet-aerial-fpv-720p` | `FCN_ALEXNET_AERIAL_FPV_720p` | 2 |

Pose Estimation

Model	CLI argument	NetworkType enum	Keypoints
Pose-ResNet18-Body	`resnet18-body`	`RESNET18_BODY`	18
Pose-ResNet18-Hand	`resnet18-hand`	`RESNET18_HAND`	21
Pose-DenseNet121-Body	`densenet121-body`	`DENSENET121_BODY`	18

Recommended System Requirements

Jetson Nano Developer Kit with JetPack 4.2 or newer (Ubuntu 18.04 aarch64).
Jetson Nano 2GB Developer Kit with JetPack 4.4.1 or newer (Ubuntu 18.04 aarch64).
Jetson Xavier NX Developer Kit with JetPack 4.4 or newer (Ubuntu 18.04 aarch64).
Jetson AGX Xavier Developer Kit with JetPack 4.0 or newer (Ubuntu 18.04 aarch64).
Jetson TX2 Developer Kit with JetPack 3.0 or newer (Ubuntu 16.04 aarch64).
Jetson TX1 Developer Kit with JetPack 2.3 or newer (Ubuntu 16.04 aarch64).

The Transfer Learning with PyTorch section of the tutorial is written from the perspective of running PyTorch onboard Jetson for training DNNs. However, the same PyTorch code can be used on a PC, server, or cloud instance with an NVIDIA discrete GPU for faster training.

Extra Resources

Further links and resources for deep learning:

ros_deep_learning - TensorRT inference ROS nodes
NVIDIA AI IoT - NVIDIA Jetson GitHub repositories
Jetson eLinux Wiki - Jetson eLinux Wiki
