Welcome to my instructional guide for inference and realtime DNN vision library for NVIDIA Jetson Nano/TX1/TX2/Xavier NX/AGX Xavier/AGX Orin.
This repo uses NVIDIA TensorRT for efficiently deploying neural networks onto the embedded Jetson platform, improving performance and power efficiency using graph optimizations, kernel fusion, and FP16/INT8 precision.
Vision primitives, such as imageNet
for image recognition, detectNet
for object detection, segNet
for semantic segmentation, and poseNet
for pose estimation inherit from the shared tensorNet
object. Examples are provided for streaming from live camera feed and processing images. See the API Reference section for detailed reference documentation of the C++ and Python libraries.
Follow the Hello AI World tutorial for running inference and transfer learning onboard your Jetson, including collecting your own datasets and training your own models. It covers image classification, object detection, semantic segmentation, pose estimation, and mono depth.
> JetPack 5.0 is now supported, along with Jetson AGX Orin.
> Try the new Pose Estimation and Mono Depth tutorials!
> See the Change Log for the latest updates and new features.
Hello AI World can be run completely onboard your Jetson, including inferencing with TensorRT and transfer learning with PyTorch. The inference portion of Hello AI World - which includes coding your own image classification and object detection applications for Python or C++, and live camera demos - can be run on your Jetson in roughly two hours or less, while transfer learning is best left to leave running overnight.
Below are links to reference documentation for the C++ and Python libraries from the repo:
C++ | Python | |
---|---|---|
Image Recognition | imageNet |
imageNet |
Object Detection | detectNet |
detectNet |
Segmentation | segNet |
segNet |
Pose Estimation | poseNet |
poseNet |
Monocular Depth | depthNet |
depthNet |
These libraries are able to be used in external projects by linking to libjetson-inference
and libjetson-utils
.
Introductory code walkthroughs of using the library are covered during these steps of the Hello AI World tutorial:
Additional C++ and Python samples for running the networks on static images and live camera streams can be found here:
C++ | Python | |
---|---|---|
Image Recognition | imagenet.cpp |
imagenet.py |
Object Detection | detectnet.cpp |
detectnet.py |
Segmentation | segnet.cpp |
segnet.py |
Pose Estimation | posenet.cpp |
posenet.py |
Monocular Depth | depthnet.cpp |
depthnet.py |
note: for working with numpy arrays, see Converting to Numpy Arrays and Converting from Numpy Arrays
These examples will automatically be compiled while Building the Project from Source, and are able to run the pre-trained models listed below in addition to custom models provided by the user. Launch each example with --help
for usage info.
The project comes with a number of pre-trained models that are available through the Model Downloader tool:
Network | CLI argument | NetworkType enum |
---|---|---|
AlexNet | alexnet |
ALEXNET |
GoogleNet | googlenet |
GOOGLENET |
GoogleNet-12 | googlenet-12 |
GOOGLENET_12 |
ResNet-18 | resnet-18 |
RESNET_18 |
ResNet-50 | resnet-50 |
RESNET_50 |
ResNet-101 | resnet-101 |
RESNET_101 |
ResNet-152 | resnet-152 |
RESNET_152 |
VGG-16 | vgg-16 |
VGG-16 |
VGG-19 | vgg-19 |
VGG-19 |
Inception-v4 | inception-v4 |
INCEPTION_V4 |
Network | CLI argument | NetworkType enum | Object classes |
---|---|---|---|
SSD-Mobilenet-v1 | ssd-mobilenet-v1 |
SSD_MOBILENET_V1 |
91 (COCO classes) |
SSD-Mobilenet-v2 | ssd-mobilenet-v2 |
SSD_MOBILENET_V2 |
91 (COCO classes) |
SSD-Inception-v2 | ssd-inception-v2 |
SSD_INCEPTION_V2 |
91 (COCO classes) |
DetectNet-COCO-Dog | coco-dog |
COCO_DOG |
dogs |
DetectNet-COCO-Bottle | coco-bottle |
COCO_BOTTLE |
bottles |
DetectNet-COCO-Chair | coco-chair |
COCO_CHAIR |
chairs |
DetectNet-COCO-Airplane | coco-airplane |
COCO_AIRPLANE |
airplanes |
ped-100 | pednet |
PEDNET |
pedestrians |
multiped-500 | multiped |
PEDNET_MULTI |
pedestrians, luggage |
facenet-120 | facenet |
FACENET |
faces |
Dataset | Resolution | CLI Argument | Accuracy | Jetson Nano | Jetson Xavier |
---|---|---|---|---|---|
Cityscapes | 512x256 | fcn-resnet18-cityscapes-512x256 |
83.3% | 48 FPS | 480 FPS |
Cityscapes | 1024x512 | fcn-resnet18-cityscapes-1024x512 |
87.3% | 12 FPS | 175 FPS |
Cityscapes | 2048x1024 | fcn-resnet18-cityscapes-2048x1024 |
89.6% | 3 FPS | 47 FPS |
DeepScene | 576x320 | fcn-resnet18-deepscene-576x320 |
96.4% | 26 FPS | 360 FPS |
DeepScene | 864x480 | fcn-resnet18-deepscene-864x480 |
96.9% | 14 FPS | 190 FPS |
Multi-Human | 512x320 | fcn-resnet18-mhp-512x320 |
86.5% | 34 FPS | 370 FPS |
Multi-Human | 640x360 | fcn-resnet18-mhp-512x320 |
87.1% | 23 FPS | 325 FPS |
Pascal VOC | 320x320 | fcn-resnet18-voc-320x320 |
85.9% | 45 FPS | 508 FPS |
Pascal VOC | 512x320 | fcn-resnet18-voc-512x320 |
88.5% | 34 FPS | 375 FPS |
SUN RGB-D | 512x400 | fcn-resnet18-sun-512x400 |
64.3% | 28 FPS | 340 FPS |
SUN RGB-D | 640x512 | fcn-resnet18-sun-640x512 |
65.1% | 17 FPS | 224 FPS |
nvpmodel 0
(MAX-N)Model | CLI argument | NetworkType enum | Keypoints |
---|---|---|---|
Pose-ResNet18-Body | resnet18-body |
RESNET18_BODY |
18 |
Pose-ResNet18-Hand | resnet18-hand |
RESNET18_HAND |
21 |
Pose-DenseNet121-Body | densenet121-body |
DENSENET121_BODY |
18 |
The Transfer Learning with PyTorch section of the tutorial speaks from the perspective of running PyTorch onboard Jetson for training DNNs, however the same PyTorch code can be used on a PC, server, or cloud instance with an NVIDIA discrete GPU for faster training.
In this area, links and resources for deep learning are listed: