License Plate Recognition

Description
Model to recognize characters from the image crop of a License Plate.
Publisher
NVIDIA
Latest Version
trainable_v1.0
Modified
July 24, 2023
Size
221.06 MB

License Plate Recognition (LPRNet) Model Card

Model Overview

The model described in this card is a license plate recognition network (LPRNet) that recognizes the characters in license plates from cropped RGB license plate images. Two pretrained LPRNet models are provided: one trained on an NVIDIA-owned US license plate dataset and one trained on a Chinese license plate dataset.

Model Architecture

This model is a sequence classification model with a ResNet backbone. It takes the image as network input and produces a sequence as output.

Training

The training algorithm optimizes the network to minimize the connectionist temporal classification (CTC) loss between the ground-truth character sequence of a license plate and the predicted character sequence. The license plate is then decoded from the model's sequence output using best path decoding (greedy decoding).
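Best path decoding can be sketched in a few lines: take the highest-scoring class at each time step, collapse consecutive repeats, and drop the CTC blank. The sketch below is illustrative only. The function name, the toy character list, and the choice of the last class index as the blank are assumptions, not details taken from the LPRNet implementation.

```python
def ctc_greedy_decode(scores, characters, blank_id=None):
    """Best path (greedy) CTC decoding.

    scores: list of time steps, each a list of per-class scores.
    characters: class i maps to characters[i].
    blank_id: index of the CTC blank class. Assumed here to be
              len(characters), i.e. the last class, when not given.
    """
    if blank_id is None:
        blank_id = len(characters)
    decoded = []
    previous = None
    for step in scores:
        # Highest-scoring class at this time step.
        class_id = max(range(len(step)), key=step.__getitem__)
        # Emit a character only when the class changes and is not blank.
        if class_id != previous and class_id != blank_id:
            decoded.append(characters[class_id])
        previous = class_id
    return "".join(decoded)


# Toy example: 3 characters plus a blank at index 3 (hypothetical values).
characters = ["A", "B", "1"]
scores = [
    [0.9, 0.0, 0.0, 0.1],  # A
    [0.8, 0.1, 0.0, 0.1],  # A again, collapsed into the previous A
    [0.1, 0.0, 0.0, 0.9],  # blank, separates the two A's
    [0.7, 0.1, 0.1, 0.1],  # A, a new character after the blank
    [0.0, 0.1, 0.8, 0.1],  # 1
    [0.0, 0.9, 0.0, 0.1],  # B
]
print(ctc_greedy_decode(scores, characters))  # AA1B
```

The blank between the second and fourth time steps is what lets the decoder keep a doubled character such as "AA" instead of collapsing it into one.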

Training Data

The LPRNet model for US license plates was trained on a proprietary dataset of over 310,000 US license plate images. The images were taken at various angles and under various illumination conditions, and were collected from the dash camera and side camera of a vehicle. The license plates in the images were then labeled and cropped. The US dataset statistics:

  • Character distribution:

    character count
    0 100688
    1 117499
    2 98599
    3 111220
    4 127387
    5 148325
    6 175541
    7 231298
    8 105170
    9 111234
    A 36350
    B 33677
    C 40292
    D 39447
    E 36787
    F 34734
    G 40474
    H 38751
    I 12645
    J 34155
    K 37397
    L 40900
    M 36544
    N 40431
    P 38198
    Q 11086
    R 40899
    S 38820
    T 41155
    U 45471
    V 35998
    W 38096
    X 37468
    Y 34454
    Z 31963
  • Illumination: sunny, cloudy, rainy, bright, dim.

  • Locations of dataset collection: US roads and parking lots, mainly in California.

  • Camera mounting location: mainly the dash camera and the side camera in cars.

  • Camera angles: Assume the camera sensor is at the origin of the camera coordinate system. The X-axis is horizontal and points to the right, the Y-axis is vertical and points up, and the Z-axis points outward. In this coordinate system, license plates in the following positions were chosen:

    • Roll: within -30 degrees to +30 degrees
    • Pitch: within -30 degrees to +30 degrees
    • Yaw: within -15 degrees to +15 degrees
    • Distance to license plate: any distance at which the license plate in the image is larger than 16x16 pixels
  • License plate image shapes (height and width in pixels):

                                  min   max    avg
    height                        17    1924   54
    width                         35    3896   109
    aspect-ratio (width/height)   0.9   3.6    2.0

Some sample images (before cropping the license plates) can be found in the output annotated images section of the LPD model card.

The LPRNet model for Chinese license plates was trained on the public CCPD (Chinese City Parking Dataset) dataset with 100,000 images. Details of this dataset can be found in "Towards end-to-end license plate detection and recognition: A large dataset and baseline" (ECCV 2018).

Data Format

The dataset must be organized in the following layout.

/Dataset_01
    /images
        0000.jpg
        0001.jpg
        0002.jpg
        ...
        ...
        ...
        N.jpg
    /labels 
        0000.txt
        0001.txt
        0002.txt
        ...
        ...
        ...
        N.txt
/characters_list.txt

Each cropped license plate image has a corresponding label text file that contains a single line with the characters of that license plate. The characters_list.txt file lists all the characters found in the license plate dataset, one character per line.
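A layout like this can be sanity-checked before training. The following is a minimal sketch, assuming that each image in images/ is paired with the label file in labels/ that has the same file stem. The helper names are hypothetical, and the demo builds a throwaway dataset with a placeholder image file rather than reading real data.

```python
import tempfile
from pathlib import Path


def load_characters(characters_file):
    """Read characters_list.txt: one character per line."""
    lines = Path(characters_file).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def load_pairs(dataset_dir):
    """Pair each image with the label text that shares its file stem."""
    dataset_dir = Path(dataset_dir)
    pairs = []
    for image_path in sorted((dataset_dir / "images").glob("*.jpg")):
        label_path = dataset_dir / "labels" / (image_path.stem + ".txt")
        if not label_path.is_file():
            raise FileNotFoundError(f"missing label for {image_path.name}")
        pairs.append((image_path, label_path.read_text(encoding="utf-8").strip()))
    return pairs


def unknown_characters(pairs, characters):
    """Return label characters that are absent from the characters list."""
    allowed = set(characters)
    return sorted({ch for _, label in pairs for ch in label if ch not in allowed})


# Demo on a temporary dataset (file contents are made up for illustration).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "Dataset_01" / "images").mkdir(parents=True)
    (root / "Dataset_01" / "labels").mkdir(parents=True)
    (root / "Dataset_01" / "images" / "0000.jpg").write_bytes(b"")  # placeholder, not a real image
    (root / "Dataset_01" / "labels" / "0000.txt").write_text("7ABC123\n", encoding="utf-8")
    (root / "characters_list.txt").write_text("\n".join("0123456789ABC") + "\n", encoding="utf-8")

    characters = load_characters(root / "characters_list.txt")
    pairs = load_pairs(root / "Dataset_01")
    labels = [label for _, label in pairs]
    missing = unknown_characters(pairs, characters)

print(labels, missing)  # ['7ABC123'] []
```

A non-empty result from unknown_characters would indicate labels that use characters not listed in characters_list.txt.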

Performance

Evaluation Data

The evaluation dataset for the US LPRNet contains 1951 images, obtained in the same way as the training dataset. The images were picked manually from the raw images to be diverse in angle, illumination, and sharpness.

The evaluation dataset for the Chinese LPRNet is the validation split of the Chinese City Parking Dataset (CCPD), which contains 99996 images.

Methodology and KPI

The key performance indicator (KPI) is the accuracy of license plate recognition. A recognition counts as accurate only if all the characters in a license plate are recognized correctly. The KPIs for the evaluation data are reported below.
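This whole-plate metric is an exact-match accuracy: a prediction with even one wrong character counts as a miss. A minimal sketch follows; the function name and the sample plate strings are made up for illustration.

```python
def plate_accuracy(predictions, ground_truths):
    """Fraction of plates whose predicted string equals the ground truth exactly."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have the same length")
    if not ground_truths:
        return 0.0
    correct = sum(pred == truth for pred, truth in zip(predictions, ground_truths))
    return correct / len(ground_truths)


# Hypothetical example: the second plate differs in its last character only,
# yet it counts as fully wrong.
predictions = ["7ABC123", "8XYZ999", "6LMN45"]
ground_truths = ["7ABC123", "8XYZ998", "6LMN45"]
accuracy = plate_accuracy(predictions, ground_truths)
print(f"{accuracy:.2%}")  # 66.67%
```

Because partial matches earn no credit, this metric is stricter than per-character accuracy.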

model                           dataset                   accuracy
us_lprnet_baseline18_unpruned   NVIDIA LPR eval dataset   97.49%
ch_lprnet_baseline18_unpruned   CCPD_base_val             99.67%

Real-time Inference Performance

Inference uses FP16 precision. Inference performance was measured with trtexec on Jetson Nano, Xavier NX, AGX Xavier, and NVIDIA T4 GPU. The Jetson devices were run in the Max-N configuration for maximum system performance. The figures below reflect inference-only performance; end-to-end performance with streaming video data may vary slightly depending on the application.

Device              precision   batch_size   FPS
Jetson Nano         FP16        32           16
Jetson Xavier NX    FP16        32           600
Jetson AGX Xavier   FP16        64           1021
T4                  FP16        128          3821
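To relate these throughput figures to latency, the arithmetic below converts FPS into average time per image and per batch. It assumes that FPS counts images (not batches) per second, which the table does not state explicitly; the helper names are hypothetical.

```python
def per_image_ms(fps):
    """Average time per image in milliseconds, assuming FPS is images per second."""
    return 1000.0 / fps


def per_batch_ms(batch_size, fps):
    """Average time to process one full batch in milliseconds."""
    return batch_size * per_image_ms(fps)


# (device, batch_size, FPS) taken from two rows of the table above.
rows = [("Jetson Nano", 32, 16), ("T4", 128, 3821)]
for device, batch_size, fps in rows:
    print(f"{device}: {per_image_ms(fps):.2f} ms/image, "
          f"{per_batch_ms(batch_size, fps):.1f} ms/batch")
```

Under that assumption, the Jetson Nano row works out to 62.5 ms per image, or 2 seconds for a batch of 32.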

How to use this model

This model must be used with NVIDIA hardware and software. On the hardware side, the model can run on any NVIDIA GPU, including NVIDIA Jetson devices. On the software side, the model can only be used with the Transfer Learning Toolkit (TLT, since renamed TAO Toolkit), DeepStream SDK, or TensorRT.

The primary intended use case for this model is recognizing the license plate from a cropped RGB license plate image.

There are two models provided:

  • us_lprnet_baseline18
  • ch_lprnet_baseline18

They are intended for training and fine-tuning using the Transfer Learning Toolkit and the user's own dataset of license plates from the United States of America or China. High-fidelity models can be trained for new use cases. The Jupyter notebook available as part of the TLT container can be used for re-training.

These models are also intended for easy deployment to the edge using DeepStream SDK or TensorRT. They accept input tensors of dimension 3x48x96 and output the predicted sequence of character IDs. DeepStream provides facilities to create efficient video analytics pipelines that capture, decode, and pre-process the data before running inference.

The models are encrypted and can be decrypted with the following key:

  • Model load key: nvidia_tlt

Please make sure to use this as the key for all TLT commands that require a model load key.

Input

RGB images of 3 x 48 x 96 (C x H x W)

Output

Sequence of character IDs. (A DeepStream post-processing plugin is needed to obtain the final license plate.)

Instructions to use the model with TLT

To use these models as pretrained weights for transfer learning, use the snippet below as a template for the lpr_config component of the experiment spec file when training an LPRNet model. For more information on the experiment spec file, refer to the Transfer Learning Toolkit User Guide.

lpr_config {
  hidden_units: 512
  max_label_length: 8
  arch: "baseline"
  nlayers: 18 
}

Instructions to deploy the model with DeepStream

To create an end-to-end video analytics application, deploy this model with DeepStream SDK. DeepStream SDK is a streaming analytics toolkit that accelerates building AI-based video analytics applications. DeepStream supports direct integration of this model into the DeepStream sample app.

To deploy this model with DeepStream 5.1, please follow the instructions in this repository.

Limitations

Restricted usage in different regions:

The NVIDIA LPRNet model for the US is trained on license plates collected in California, so it is not expected to reach the same level of accuracy on license plates from other states.

The NVIDIA LPRNet model for China is trained on license plates collected in Anhui province.

In general, to get better accuracy in a region other than those covered by the pretraining datasets (US-California and China-Anhui), more data from that region is needed to fine-tune the pretrained model with TAO Toolkit.

Truncated license plates images

NVIDIA LPRNet models may not work well on truncated license plates in which the character shapes are incomplete. The accuracy of the LPRNet models depends on the quality of the license plate detection.

The license plate's angle

The NVIDIA LPRNet model for the US is trained on nearly horizontal license plates. If the angle between the license plate and the horizontal is larger than 30 degrees, its characters may not be recognized.

Model versions:

  • trainable_v1.0 - Pre-trained models for US and China license plates.
  • deployable_v1.0 - Models for US and China license plates deployable to DeepStream.

Reference

Citations

  • Graves, A., Fernández, S., Gomez, F., Schmidhuber, J.: Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks. In: ICML (2006)
  • He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition. In: CVPR (2016)

Using TAO Pre-trained Models

License

License to use these models is covered by the Model EULA. By downloading the unpruned or pruned version of the model, you accept the terms and conditions of these licenses.

Ethical AI

The NVIDIA LPRNet model recognizes license plates.

NVIDIA’s platforms and application frameworks enable developers to build a wide array of AI applications. Consider potential algorithmic bias when choosing or creating the models being deployed. Work with the model’s developer to ensure that it meets the requirements for the relevant industry and use case; that the necessary instruction and documentation are provided to understand error rates, confidence intervals, and results; and that the model is being used under the conditions and in the manner intended.