NGC Catalog Collections - GTC 2020 Demo

This notebook accompanies the NGC GTC Fall 2020 Demo and walks you through the following steps:

  1. Using the NGC CLI to work with models and resources from the NGC & Private Registry
  2. Transfer learning with the Transfer Learning Toolkit (TLT)
  3. Deploying the retrained model via EGX Manager

We'll go into loads more detail about each of these steps in the rest of this notebook! Let's jump right into it.

NGC Catalog Collections

All the assets we're using from the NGC Catalog in this notebook have been curated into this Collection. Collections in the NGC Catalog give you a single entry point for AI content, whatever you're trying to build. We let you focus on building the apps and we put the tools in one place.

TLT and Face Mask Detection

In this demo we'll be using TLT to train a face mask detector, based on this example from the Transfer Learning Toolkit team at NVIDIA. Once we've retrained our model we'll be able to take a live stream from a camera and determine if people are wearing appropriate PPE.

[Screenshot: face mask detection demo]

The NGC CLI

To start with we're going to be using the NGC CLI to get models and resources from the NGC Catalog. Because we're using the TLT container, the NGC CLI is already installed, which makes things really easy.

However, if you're interested in the steps needed to install the CLI for yourself take a look at our documentation.
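
If you do need to install it yourself, here's a minimal sketch of the usual approach; treat it as illustrative only and grab the exact download URL and version from the docs:

# Illustrative only: download the CLI bundle, unzip it, and put the ngc binary on your PATH.
wget -O ngccli_linux.zip <download URL from the NGC CLI docs>
unzip ngccli_linux.zip -d ngc-cli
chmod u+x ngc-cli/ngc
export PATH="$PWD/ngc-cli:$PATH"
ngc --version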

Ok! So if the CLI is already installed, you may be wondering why we're taking the time to tell you all about it.

Well, in this walkthrough we'll be using the NGC Private Registry to store and manage our retrained models as part of a deployment pipeline. We know that sounds complicated, but it really isn't. We promise that by the end of the notebook it'll feel like second nature to you.

If you're not sure what a private registry is, please take a look at our blog where we explain more. You can even book a demo from one of our team.

If you're not interested in using the private registry, that's absolutely no problem: simply skip the next code cell and carry on! The demo will still work, but you won't be able to complete the last few steps and create a deployment pipeline.

In order to configure the CLI and get access to our private registry we'll need to authenticate with NGC. This proves we are who we say we are and enables us to use a whole bunch of extra tools.

If you don't know what your API key is, you can generate a new one here.

Let's do this! In the code cell below we run the NGC CLI setup. This allows us to supply an API key for authentication, set our context (org/team) and determine what output type we'd like from the CLI.

  1. Valid API Key - This allows us to authenticate with NGC. Follow the steps above to generate a key if you don't already have one.
  2. Output Format - What output format would we like the CLI to return? This can be ASCII, CSV or JSON.
  3. ORG Context - NGC Private Registry has orgs and teams which allow you to manage an entire enterprise building AI. You'll need to specify the org here.
  4. TEAM Context - If you're not using teams that's fine! Simply say no-team. Otherwise supply your team context here.

We've provided example values to help you get started. Update the values below to configure the CLI:

In [1]:
%env NGC_KEY=c3JsNzBpaDQ1bmtqMDJkZGF2cnVxdXFpNnA6ODgzYjkzYTQtMzg1MC00MTM0LWI1OTEtNWJlNGZkNmEzODZh
Out[1]:
env: NGC_KEY=c3JsNzBpaDQ1bmtqMDJkZGF2cnVxdXFpNnA6ODgzYjkzYTQtMzg1MC00MTM0LWI1OTEtNWJlNGZkNmEzODZh
In [2]:
%%bash

# Call the NGC CLI configurator.
# Note: the lines after the `ngc config set` command are fed verbatim to its
# interactive prompts (API key, output format, org, team), so they must be literal values.
ngc config set
c3JsNzBpaDQ1bmtqMDJkZGF2cnVxdXFpNnA6ODgzYjkzYTQtMzg1MC00MTM0LWI1OTEtNWJlNGZkNmEzODZh
ascii
ngcvideos
no-team
Out[2]:
Enter API key [no-apikey]. Choices: [<VALID_APIKEY>, 'no-apikey']: Enter CLI output format type [ascii]. Choices: [ascii, csv, json]: Enter org [no-org]. Choices: ['ea-jarvis', 'ea-jarvis-megatron', 'ea-ngcdemos', 'egxdefault', 'ngc-partner-tools', 'ngcmodelstage', 'ngcvideos', 'nks_demo', 'nv-computer-vision', 'nv-uxresearch', 'nvidian', 'nvstaging']: Enter team [no-team]. Choices: ['no-team']: Successfully saved NGC configuration to /root/.ngc/config

We just ran the CLI configurator and supplied the options we set above. That's the NGC CLI setup complete! If you want to verify the process completed successfully, run the code cell below:

In [3]:
!cat ~/.ngc/config
Out[3]:
;WARNING - This is a machine generated file.  Do not edit manually.
;WARNING - To update local config settings, see "ngc config set -h" 
[CURRENT]
apikey = c3JsNzBpaDQ1bmtqMDJkZGF2cnVxdXFpNnA6ODgzYjkzYTQtMzg1MC00MTM0LWI1OTEtNWJlNGZkNmEzODZh
format_type = ascii
org = ngcvideos

Using TLT to train a face-mask detector

So if you skipped the NGC CLI step, this is where you'll pick things back up! Specifically, we'll be following this example, which walks us through the end-to-end process of retraining models with TLT.

Step 1 - Setup Environment Variables

When using the purpose-built pretrained models from the NGC Catalog, please make sure to set the $KEY environment variable to the key mentioned in the model overview. Failing to do so can lead to errors when trying to load them as pretrained models.

Note: Please make sure to remove any stray artifacts/files from the $USER_EXPERIMENT_DIR or $DATA_DOWNLOAD_DIR paths mentioned below that may have been generated by previous experiments. Leftover checkpoint files etc. may interfere with creating a training graph for a new experiment.

  • KEY - key required for the model
  • USER_EXPERIMENT_DIR - directory for storing the results of the user's experiments and pretrained models
  • DATA_DOWNLOAD_DIR - repository for downloaded data
  • SPECS_DIR - location for any specs files
In [4]:
# Setting up env variables for cleaner command line commands.

%env KEY=tlt_encode
%env USER_EXPERIMENT_DIR=/workspace/detectnet_v2 
%env DATA_DOWNLOAD_DIR=/workspace/data           
%env SPECS_DIR=/workspace/detectnet_v2/specs
Out[4]:
env: KEY=tlt_encode
env: USER_EXPERIMENT_DIR=/workspace/detectnet_v2
env: DATA_DOWNLOAD_DIR=/workspace/data
env: SPECS_DIR=/workspace/detectnet_v2/specs

Step 2 - Get Dataset Preprocessing Scripts From GitHub

As TLT requires the datasets to be in the KITTI format we need to do some conversion! Let's grab the helper scripts from the TLT GitHub page. Run the code cell below:

In [5]:
!git clone https://github.com/NVIDIA-AI-IOT/face-mask-detection.git
Out[5]:
Cloning into 'face-mask-detection'...
remote: Enumerating objects: 91, done.
remote: Counting objects: 100% (91/91), done.
remote: Compressing objects: 100% (63/63), done.
remote: Total 91 (delta 49), reused 62 (delta 25), pack-reused 0
Unpacking objects: 100% (91/91), done.
Checking connectivity... done.

Now we have the helper scripts on our machine! Let's install the dependencies:

In [6]:
# Reinstall pip, since the version bundled with this Ubuntu image is currently broken.
!curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
!python3 get-pip.py --force-reinstall

# Install dependencies 
!cd face-mask-detection && python3 -m pip install -r requirements.txt
Out[6]:
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1841k  100 1841k    0     0  4700k      0 --:--:-- --:--:-- --:--:-- 4709k
DEPRECATION: Python 3.5 reached the end of its life on September 13th, 2020. Please upgrade your Python as Python 3.5 is no longer maintained. pip 21.0 will drop support for Python 3.5 in January 2021. pip 21.0 will remove support for this functionality.
Collecting pip
  Downloading pip-20.2.3-py2.py3-none-any.whl (1.5 MB)
     |████████████████████████████████| 1.5 MB 3.3 MB/s eta 0:00:01
[?25hCollecting setuptools
  Downloading setuptools-50.3.0-py3-none-any.whl (785 kB)
     |████████████████████████████████| 785 kB 21.4 MB/s eta 0:00:01
[?25hInstalling collected packages: pip, setuptools
Successfully installed pip-20.2.3 setuptools-50.3.0
DEPRECATION: Python 3.5 reached the end of its life on September 13th, 2020. Please upgrade your Python as Python 3.5 is no longer maintained. pip 21.0 will drop support for Python 3.5 in January 2021. pip 21.0 will remove support for this functionality.
Collecting numpy
  Downloading numpy-1.18.5-cp35-cp35m-manylinux1_x86_64.whl (19.9 MB)
     |████████████████████████████████| 19.9 MB 3.3 MB/s eta 0:00:01
[?25hCollecting scipy
  Downloading scipy-1.4.1-cp35-cp35m-manylinux1_x86_64.whl (26.0 MB)
     |████████████████████████████████| 26.0 MB 69.0 MB/s eta 0:00:01
[?25hCollecting Pillow
  Downloading Pillow-7.2.0-cp35-cp35m-manylinux1_x86_64.whl (2.2 MB)
     |████████████████████████████████| 2.2 MB 72.3 MB/s eta 0:00:01
[?25hInstalling collected packages: numpy, scipy, Pillow
Successfully installed Pillow-7.2.0 numpy-1.18.5 scipy-1.4.1

Step 3 - Download The Datasets

In this experiment we will be using four different datasets, grouped into two categories:

  1. Faces with a mask: Kaggle Medical Mask Dataset, MAFA Dataset
  2. Faces without a mask: FDDB Dataset, Wider Face Dataset

While we can't download the dataset(s) for you, we can tell you the exact format we need the datasets to be in for this experiment to work. Take a look at this data tree to see what structure we need.
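
As a rough sketch (matching the directory listing you'll see later in this notebook), we assume a single downloaded_data folder with one sub-folder per dataset:

downloaded_data/
├── FDDB Dataset
├── Kaggle Medical Mask Dataset
├── MAFA Dataset
└── Wider Face Dataset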

Assuming you've downloaded the required datasets to your machine in the required structure, the following code cell can be modified to copy them to your GPU cluster:

scp -r <downloaded_data> </home/workspace>

In [7]:
!scp -r -i key-pair.pem ./downloaded_data ubuntu@<system>:/home/ubuntu

Depending on how you've configured/run Docker, the above scp may have placed the files in a shared location that can be accessed by both Docker and your notebook instance. You can verify that's the case by running the code cell below. You should see the four directories in the required structure.

If you don't see the files listed out, don't worry. You just need to copy them from your remote host to the container itself. Some docker cp magic should get those files into the TLT container where we can process them. Take a look at the example below:

docker cp ./downloaded_data <container>:/downloaded_data

In [8]:
# List out the download data
!ls downloaded_data
Out[8]:
FDDB Dataset  Kaggle Medical Mask Dataset  MAFA Dataset  Wider Face Dataset

Step 4 - Prepare the Dataset

To use the datasets we've just downloaded with TLT, we need to convert them to the KITTI format. The GitHub repository we downloaded earlier provides helper scripts to convert each of the four datasets.

Here's the template command we'll need to execute:

python3 data2kitti.py --kaggle-dataset-path <kaggle dataset absolute directory path> \
                      --mafa-dataset-path <mafa dataset absolute directory path> \
                      --fddb-dataset-path <FDDB dataset absolute directory path> \
                      --widerface-dataset-path <widerface dataset absolute directory path> \
                      --kitti-base-path <output directory for storing KITTI formatted annotations> \
                      --category-limit <category limit for masked and no-mask faces> \
                      --tlt-input-dims_width <tlt input width> \
                      --tlt-input-dims_height <tlt input height> \
                      --train <flag for generating the training dataset>

First - let's set up those absolute paths. If you followed the structure from the SCP command above you shouldn't need to change anything in the next code cell. The TLT_INPUT_DIMS properties are model-dependent. We've used the default values for DetectNet-V2 (used in this example). If you'd like to see the values for yourself, you can view the documentation here. Simply search for the model architecture you need and update the values accordingly.

DetectNet_v2

  • Input size: C * W * H (where C = 1 or 3, W >= 960, H >= 544, and W, H are multiples of 16)
  • Image format: JPG, JPEG, PNG
  • Label format: KITTI detection
In [9]:
# Env variables for the KITTI datasets (the * wildcards stand in for the spaces in the directory names)

%env KAGGLE_DATASET_PATH=/workspace/downloaded_data/Kaggle*Medical*Mask*Dataset
%env MAFA_DATASET_PATH=/workspace/downloaded_data/MAFA*Dataset
%env FDDB_DATASET_PATH=/workspace/downloaded_data/FDDB*Dataset
%env WIDERFACE_DATASET_PATH=/workspace/downloaded_data/Wider*Face*Dataset
%env KITTI_BASE_PATH=/workspace/converted_datasets
%env CATEGORY_LIMIT=6000
%env TLT_INPUT_DIMS_WIDTH=960
%env TLT_INPUT_DIMS_HEIGHT=544
Out[9]:
env: KAGGLE_DATASET_PATH=/workspace/downloaded_data/Kaggle*Medical*Mask*Dataset
env: MAFA_DATASET_PATH=/workspace/downloaded_data/MAFA*Dataset
env: FDDB_DATASET_PATH=/workspace/downloaded_data/FDDB*Dataset
env: WIDERFACE_DATASET_PATH=/workspace/downloaded_data/Wider*Face*Dataset
env: KITTI_BASE_PATH=/workspace/converted_datasets
env: CATEGORY_LIMIT=6000
env: TLT_INPUT_DIMS_WIDTH=960
env: TLT_INPUT_DIMS_HEIGHT=544

Now let's convert the datasets to KITTI format:

In [10]:
!python3 /workspace/face-mask-detection/data2kitti.py --kaggle-dataset-path $KAGGLE_DATASET_PATH \
                         --mafa-dataset-path $MAFA_DATASET_PATH \
                         --fddb-dataset-path $FDDB_DATASET_PATH \
                         --widerface-dataset-path $WIDERFACE_DATASET_PATH \
                         --kitti-base-path $KITTI_BASE_PATH \
                         --category-limit $CATEGORY_LIMIT \
                         --tlt-input-dims_width $TLT_INPUT_DIMS_WIDTH\
                         --tlt-input-dims_height $TLT_INPUT_DIMS_HEIGHT \
                         --train 
Out[10]:
Kaggle Dataset: Total Mask faces: 4154 and No-Mask faces:790
Total Mask Labelled:4154 and No-Mask Labelled:790
Directory Already Exists
Directory Already Exists
/workspace/face-mask-detection/data_utils/mafa2kitti.py:51: RuntimeWarning: overflow encountered in ubyte_scalars
  bbox = [_bbox_label[0], _bbox_label[1], _bbox_label[0]+_bbox_label[2], _bbox_label[1]+_bbox_label[3]]
MAFA Dataset: Total Mask faces: 1846 and No-Mask faces:232
Total Mask Labelled:6000 and No-Mask Labelled:1022
Directory Already Exists
Directory Already Exists
FDDB Dataset: Mask Labelled:0 and No-Mask Labelled:2845
Total Mask Labelled:6000 and No-Mask Labelled:3867
WideFace: Total Mask Labelled:0 and No-Mask Labelled:2134
----------------------------
Final: Total Mask Labelled:6000
Total No-Mask Labelled:6001
----------------------------

Step 5 - Prepare TFRecords From KITTI Format Datasets

With our data processed we now need to generate the TFRecords - the final step before we begin working with our models and training.

Run the code cells below to:

  1. Update the TFRecords spec file to take in our KITTI format dataset
  2. Create TFRecords using tlt-dataset-convert

First let's move the template spec files we downloaded from GitHub to our Spec directory:

In [11]:
!mkdir detectnet_v2
!mkdir $SPECS_DIR
!mv face-mask-detection/tlt_specs/* $SPECS_DIR
!ls $SPECS_DIR
Out[11]:
detectnet_v2_inference_kitti_etlt.txt
detectnet_v2_inference_kitti_tlt.txt
detectnet_v2_retrain_resnet18_kitti.txt
detectnet_v2_tfrecords_kitti_trainval.txt
detectnet_v2_tfrecords_kitti_val.txt
detectnet_v2_train_resnet18_kitti.txt

Now we need to edit the $SPECS_DIR/detectnet_v2_tfrecords_kitti_trainval.txt file to point to our converted images.

Run the code cell below to update the file:

In [12]:
%env KITTI_CONFIG=kitti_config {\
  root_directory_path: "/workspace/converted_datasets/train/"\
  image_dir_name: "images"\
  label_dir_name: "labels"\
  image_extension: ".jpg"\
  partition_mode: "random"\
  num_partitions: 2\
  val_split: 20\
  num_shards: 10 }

!echo $KITTI_CONFIG > $SPECS_DIR/detectnet_v2_tfrecords_kitti_trainval.txt
Out[12]:
env: KITTI_CONFIG=kitti_config {   root_directory_path: "/workspace/converted_datasets/train/"   image_dir_name: "images"   label_dir_name: "labels"   image_extension: ".jpg"   partition_mode: "random"   num_partitions: 2   val_split: 20   num_shards: 10 }
In [13]:
print("TFRecords conversion spec file for KITTI training")
!cat $SPECS_DIR/detectnet_v2_tfrecords_kitti_trainval.txt
Out[13]:
TFRecords conversion spec file for KITTI training
kitti_config { root_directory_path: "/workspace/converted_datasets/train/" image_dir_name: "images" label_dir_name: "labels" image_extension: ".jpg" partition_mode: "random" num_partitions: 2 val_split: 20 num_shards: 10 }
In [14]:
# Convert the KITTI format dataset into TFRecords, written to the tfrecords dump directory.
print("Converting TFRecords for KITTI trainval dataset")
!tlt-dataset-convert -d $SPECS_DIR/detectnet_v2_tfrecords_kitti_trainval.txt \
                     -o $DATA_DOWNLOAD_DIR/tfrecords/kitti_trainval/
Out[14]:
Converting TFRecords for KITTI trainval dataset
2020-09-24 16:59:48.419264: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
Using TensorFlow backend.
2020-09-24 16:59:55,266 - iva.detectnet_v2.dataio.build_converter - INFO - Instantiating a kitti converter
2020-09-24 16:59:55,277 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Num images in
Train: 2827	Val: 706
2020-09-24 16:59:55,277 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2020-09-24 16:59:55,280 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 0
WARNING:tensorflow:From /home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataio/dataset_converter_lib.py:142: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.

2020-09-24 16:59:55,280 - tensorflow - WARNING - From /home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataio/dataset_converter_lib.py:142: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.

/usr/local/lib/python3.6/dist-packages/iva/detectnet_v2/dataio/kitti_converter_lib.py:273: VisibleDeprecationWarning: Reading unicode strings without specifying the encoding argument is deprecated. Set the encoding, use None for the system default.
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/403_mask.txt. 
Coordinates: x1 = 667, x2 = 506, y1: 181, y2: 341
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/73_mask.txt. 
Coordinates: x1 = 858, x2 = 640, y1: 136, y2: 354
Skipping this object
Top left coordinate must be less than bottom right.Error in object 1 of label_file /workspace/converted_datasets/train/labels/73_mask.txt. 
Coordinates: x1 = 418, x2 = 173, y1: 162, y2: 359
Skipping this object
2020-09-24 16:59:55,376 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 1
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/93_mask.txt. 
Coordinates: x1 = 534, x2 = 464, y1: 129, y2: 194
Skipping this object
Top left coordinate must be less than bottom right.Error in object 1 of label_file /workspace/converted_datasets/train/labels/93_mask.txt. 
Coordinates: x1 = 654, x2 = 612, y1: 199, y2: 255
Skipping this object
Top left coordinate must be less than bottom right.Error in object 2 of label_file /workspace/converted_datasets/train/labels/93_mask.txt. 
Coordinates: x1 = 332, x2 = 264, y1: 158, y2: 235
Skipping this object
2020-09-24 16:59:55,449 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 2
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/train_00000650.txt. 
Coordinates: x1 = 412, x2 = 50, y1: 133, y2: 509
Skipping this object
2020-09-24 16:59:55,521 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 3
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/141_mask.txt. 
Coordinates: x1 = 565, x2 = 378, y1: 102, y2: 321
Skipping this object
2020-09-24 16:59:55,600 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 4
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/113_mask.txt. 
Coordinates: x1 = 535, x2 = 384, y1: 164, y2: 355
Skipping this object
2020-09-24 16:59:55,678 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 5
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/227_mask.txt. 
Coordinates: x1 = 755, x2 = 535, y1: 139, y2: 408
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/428_mask.txt. 
Coordinates: x1 = 583, x2 = 185, y1: 128, y2: 514
Skipping this object
2020-09-24 16:59:55,750 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 6
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/74_mask.txt. 
Coordinates: x1 = 574, x2 = 326, y1: 118, y2: 374
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/25_mask.txt. 
Coordinates: x1 = 424, x2 = 218, y1: 219, y2: 420
Skipping this object
Top left coordinate must be less than bottom right.Error in object 1 of label_file /workspace/converted_datasets/train/labels/25_mask.txt. 
Coordinates: x1 = 778, x2 = 572, y1: 141, y2: 364
Skipping this object
2020-09-24 16:59:55,828 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 7
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/208_mask.txt. 
Coordinates: x1 = 655, x2 = 119, y1: 71, y2: 544
Skipping this object
2020-09-24 16:59:55,900 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 8
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/350_mask.txt. 
Coordinates: x1 = 732, x2 = 525, y1: 83, y2: 264
Skipping this object
2020-09-24 16:59:55,974 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 9
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/117_mask.txt. 
Coordinates: x1 = 640, x2 = 435, y1: 108, y2: 341
Skipping this object
2020-09-24 16:59:56,056 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
b'no-mask': 749
b'mask': 947

2020-09-24 16:59:56,056 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 0
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/85_mask.txt. 
Coordinates: x1 = 862, x2 = 645, y1: 82, y2: 292
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/165_mask.txt. 
Coordinates: x1 = 360, x2 = 249, y1: 194, y2: 325
Skipping this object
Top left coordinate must be less than bottom right.Error in object 1 of label_file /workspace/converted_datasets/train/labels/165_mask.txt. 
Coordinates: x1 = 714, x2 = 584, y1: 108, y2: 257
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/230_mask.txt. 
Coordinates: x1 = 357, x2 = 175, y1: 127, y2: 342
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/train_00000526.txt. 
Coordinates: x1 = 560, x2 = 31, y1: 161, y2: 397
Skipping this object
2020-09-24 16:59:56,356 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 1
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/325_mask.txt. 
Coordinates: x1 = 729, x2 = 215, y1: 168, y2: 444
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/train_00000304.txt. 
Coordinates: x1 = 434, x2 = 60, y1: 96, y2: 464
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/train_00000572.txt. 
Coordinates: x1 = 639, x2 = 57, y1: 117, y2: 307
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/140_mask.txt. 
Coordinates: x1 = 683, x2 = 508, y1: 138, y2: 377
Skipping this object
2020-09-24 16:59:56,660 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 2
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/420_mask.txt. 
Coordinates: x1 = 639, x2 = 318, y1: 232, y2: 506
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/76_mask.txt. 
Coordinates: x1 = 543, x2 = 425, y1: 170, y2: 321
Skipping this object
Top left coordinate must be less than bottom right.Error in object 1 of label_file /workspace/converted_datasets/train/labels/76_mask.txt. 
Coordinates: x1 = 364, x2 = 202, y1: 95, y2: 250
Skipping this object
2020-09-24 16:59:56,962 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 3
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/114_mask.txt. 
Coordinates: x1 = 739, x2 = 478, y1: 90, y2: 388
Skipping this object
2020-09-24 16:59:57,269 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 4
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/396_mask.txt. 
Coordinates: x1 = 635, x2 = 330, y1: 185, y2: 508
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/train_00005401.txt. 
Coordinates: x1 = 340, x2 = 72, y1: 175, y2: 360
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/109_mask.txt. 
Coordinates: x1 = 818, x2 = 586, y1: 139, y2: 413
Skipping this object
2020-09-24 16:59:57,566 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 5
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/55_mask.txt. 
Coordinates: x1 = 363, x2 = 348, y1: 68, y2: 82
Skipping this object
Top left coordinate must be less than bottom right.Error in object 1 of label_file /workspace/converted_datasets/train/labels/55_mask.txt. 
Coordinates: x1 = 566, x2 = 530, y1: 133, y2: 176
Skipping this object
Top left coordinate must be less than bottom right.Error in object 2 of label_file /workspace/converted_datasets/train/labels/55_mask.txt. 
Coordinates: x1 = 462, x2 = 420, y1: 127, y2: 172
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/335_mask.txt. 
Coordinates: x1 = 716, x2 = 316, y1: 100, y2: 279
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/5_mask.txt. 
Coordinates: x1 = 408, x2 = 283, y1: 151, y2: 311
Skipping this object
Top left coordinate must be less than bottom right.Error in object 1 of label_file /workspace/converted_datasets/train/labels/5_mask.txt. 
Coordinates: x1 = 742, x2 = 667, y1: 194, y2: 271
Skipping this object
Top left coordinate must be less than bottom right.Error in object 2 of label_file /workspace/converted_datasets/train/labels/5_mask.txt. 
Coordinates: x1 = 899, x2 = 791, y1: 273, y2: 389
Skipping this object
Top left coordinate must be less than bottom right.Error in object 3 of label_file /workspace/converted_datasets/train/labels/5_mask.txt. 
Coordinates: x1 = 770, x2 = 704, y1: 112, y2: 180
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/136_mask.txt. 
Coordinates: x1 = 751, x2 = 537, y1: 77, y2: 307
Skipping this object
2020-09-24 16:59:57,872 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 6
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/57_mask.txt. 
Coordinates: x1 = 604, x2 = 499, y1: 81, y2: 214
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/249_mask.txt. 
Coordinates: x1 = 697, x2 = 342, y1: 66, y2: 233
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/61_mask.txt. 
Coordinates: x1 = 633, x2 = 441, y1: 183, y2: 338
Skipping this object
2020-09-24 16:59:58,173 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 7
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/493_mask.txt. 
Coordinates: x1 = 725, x2 = 385, y1: 71, y2: 371
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/56_mask.txt. 
Coordinates: x1 = 357, x2 = 306, y1: 241, y2: 295
Skipping this object
Top left coordinate must be less than bottom right.Error in object 1 of label_file /workspace/converted_datasets/train/labels/56_mask.txt. 
Coordinates: x1 = 162, x2 = 100, y1: 261, y2: 310
Skipping this object
Top left coordinate must be less than bottom right.Error in object 2 of label_file /workspace/converted_datasets/train/labels/56_mask.txt. 
Coordinates: x1 = 487, x2 = 412, y1: 178, y2: 259
Skipping this object
Top left coordinate must be less than bottom right.Error in object 3 of label_file /workspace/converted_datasets/train/labels/56_mask.txt. 
Coordinates: x1 = 97, x2 = 69, y1: 217, y2: 244
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/59_mask.txt. 
Coordinates: x1 = 664, x2 = 532, y1: 167, y2: 328
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/103_mask.txt. 
Coordinates: x1 = 423, x2 = 216, y1: 122, y2: 324
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/226_mask.txt. 
Coordinates: x1 = 907, x2 = 422, y1: 139, y2: 544
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/123_mask.txt. 
Coordinates: x1 = 749, x2 = 516, y1: 169, y2: 411
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/331_mask.txt. 
Coordinates: x1 = 833, x2 = 545, y1: 162, y2: 472
Skipping this object
2020-09-24 16:59:58,481 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 8
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/142_mask.txt. 
Coordinates: x1 = 586, x2 = 468, y1: 228, y2: 353
Skipping this object
Top left coordinate must be less than bottom right.Error in object 1 of label_file /workspace/converted_datasets/train/labels/142_mask.txt. 
Coordinates: x1 = 759, x2 = 655, y1: 134, y2: 255
Skipping this object
Top left coordinate must be less than bottom right.Error in object 2 of label_file /workspace/converted_datasets/train/labels/142_mask.txt. 
Coordinates: x1 = 237, x2 = 127, y1: 282, y2: 416
Skipping this object
Top left coordinate must be less than bottom right.Error in object 3 of label_file /workspace/converted_datasets/train/labels/142_mask.txt. 
Coordinates: x1 = 419, x2 = 323, y1: 119, y2: 239
Skipping this object
Top left coordinate must be less than bottom right.Error in object 4 of label_file /workspace/converted_datasets/train/labels/142_mask.txt. 
Coordinates: x1 = 154, x2 = 51, y1: 157, y2: 271
Skipping this object
Top left coordinate must be less than bottom right.Error in object 5 of label_file /workspace/converted_datasets/train/labels/142_mask.txt. 
Coordinates: x1 = 929, x2 = 827, y1: 278, y2: 399
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/63_mask.txt. 
Coordinates: x1 = 669, x2 = 440, y1: 131, y2: 354
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/445_mask.txt. 
Coordinates: x1 = 741, x2 = 451, y1: 103, y2: 392
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/418_mask.txt. 
Coordinates: x1 = 726, x2 = 343, y1: 114, y2: 329
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/346_mask.txt. 
Coordinates: x1 = 573, x2 = 343, y1: 117, y2: 447
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/58_mask.txt. 
Coordinates: x1 = 509, x2 = 443, y1: 136, y2: 205
Skipping this object
Top left coordinate must be less than bottom right.Error in object 1 of label_file /workspace/converted_datasets/train/labels/58_mask.txt. 
Coordinates: x1 = 247, x2 = 169, y1: 45, y2: 123
Skipping this object
Top left coordinate must be less than bottom right.Error in object 2 of label_file /workspace/converted_datasets/train/labels/58_mask.txt. 
Coordinates: x1 = 814, x2 = 735, y1: 82, y2: 170
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/469_mask.txt. 
Coordinates: x1 = 732, x2 = 226, y1: 116, y2: 504
Skipping this object
2020-09-24 16:59:58,779 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 9
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/7_mask.txt. 
Coordinates: x1 = 773, x2 = 482, y1: 85, y2: 386
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/154_mask.txt. 
Coordinates: x1 = 766, x2 = 573, y1: 84, y2: 323
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/82_mask.txt. 
Coordinates: x1 = 884, x2 = 828, y1: 229, y2: 302
Skipping this object
Top left coordinate must be less than bottom right.Error in object 1 of label_file /workspace/converted_datasets/train/labels/82_mask.txt. 
Coordinates: x1 = 466, x2 = 288, y1: 113, y2: 359
Skipping this object
Top left coordinate must be less than bottom right.Error in object 2 of label_file /workspace/converted_datasets/train/labels/82_mask.txt. 
Coordinates: x1 = 676, x2 = 578, y1: 227, y2: 361
Skipping this object
Top left coordinate must be less than bottom right.Error in object 3 of label_file /workspace/converted_datasets/train/labels/82_mask.txt. 
Coordinates: x1 = 574, x2 = 530, y1: 229, y2: 300
Skipping this object
Top left coordinate must be less than bottom right.Error in object 4 of label_file /workspace/converted_datasets/train/labels/82_mask.txt. 
Coordinates: x1 = 90, x2 = 62, y1: 216, y2: 252
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/83_mask.txt. 
Coordinates: x1 = 813, x2 = 675, y1: 160, y2: 276
Skipping this object
Top left coordinate must be less than bottom right.Error in object 1 of label_file /workspace/converted_datasets/train/labels/83_mask.txt. 
Coordinates: x1 = 514, x2 = 459, y1: 191, y2: 253
Skipping this object
Top left coordinate must be less than bottom right.Error in object 2 of label_file /workspace/converted_datasets/train/labels/83_mask.txt. 
Coordinates: x1 = 270, x2 = 141, y1: 183, y2: 292
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/72_mask.txt. 
Coordinates: x1 = 152, x2 = 98, y1: 133, y2: 192
Skipping this object
Top left coordinate must be less than bottom right.Error in object 1 of label_file /workspace/converted_datasets/train/labels/72_mask.txt. 
Coordinates: x1 = 530, x2 = 468, y1: 131, y2: 205
Skipping this object
Top left coordinate must be less than bottom right.Error in object 2 of label_file /workspace/converted_datasets/train/labels/72_mask.txt. 
Coordinates: x1 = 451, x2 = 408, y1: 145, y2: 200
Skipping this object
Top left coordinate must be less than bottom right.Error in object 3 of label_file /workspace/converted_datasets/train/labels/72_mask.txt. 
Coordinates: x1 = 278, x2 = 224, y1: 170, y2: 231
Skipping this object
Top left coordinate must be less than bottom right.Error in object 4 of label_file /workspace/converted_datasets/train/labels/72_mask.txt. 
Coordinates: x1 = 793, x2 = 721, y1: 129, y2: 214
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/68_mask.txt. 
Coordinates: x1 = 848, x2 = 783, y1: 78, y2: 155
Skipping this object
Top left coordinate must be less than bottom right.Error in object 1 of label_file /workspace/converted_datasets/train/labels/68_mask.txt. 
Coordinates: x1 = 243, x2 = 159, y1: 85, y2: 178
Skipping this object
Top left coordinate must be less than bottom right.Error in object 2 of label_file /workspace/converted_datasets/train/labels/68_mask.txt. 
Coordinates: x1 = 636, x2 = 560, y1: 38, y2: 133
Skipping this object
Top left coordinate must be less than bottom right.Error in object 3 of label_file /workspace/converted_datasets/train/labels/68_mask.txt. 
Coordinates: x1 = 527, x2 = 462, y1: 93, y2: 162
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/66_mask.txt. 
Coordinates: x1 = 465, x2 = 236, y1: 20, y2: 304
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/converted_datasets/train/labels/8_mask.txt. 
Coordinates: x1 = 743, x2 = 685, y1: 192, y2: 260
Skipping this object
Top left coordinate must be less than bottom right.Error in object 1 of label_file /workspace/converted_datasets/train/labels/8_mask.txt. 
Coordinates: x1 = 194, x2 = 128, y1: 124, y2: 192
Skipping this object
Top left coordinate must be less than bottom right.Error in object 2 of label_file /workspace/converted_datasets/train/labels/8_mask.txt. 
Coordinates: x1 = 575, x2 = 511, y1: 161, y2: 225
Skipping this object
Top left coordinate must be less than bottom right.Error in object 3 of label_file /workspace/converted_datasets/train/labels/8_mask.txt. 
Coordinates: x1 = 435, x2 = 380, y1: 125, y2: 177
Skipping this object
Top left coordinate must be less than bottom right.Error in object 4 of label_file /workspace/converted_datasets/train/labels/8_mask.txt. 
Coordinates: x1 = 888, x2 = 830, y1: 154, y2: 218
Skipping this object
Top left coordinate must be less than bottom right.Error in object 5 of label_file /workspace/converted_datasets/train/labels/8_mask.txt. 
Coordinates: x1 = 223, x2 = 197, y1: 165, y2: 198
Skipping this object
Top left coordinate must be less than bottom right.Error in object 6 of label_file /workspace/converted_datasets/train/labels/8_mask.txt. 
Coordinates: x1 = 355, x2 = 289, y1: 124, y2: 182
Skipping this object
2020-09-24 16:59:59,084 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
b'no-mask': 3005
b'mask': 4008

2020-09-24 16:59:59,084 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Cumulative object statistics
2020-09-24 16:59:59,084 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
b'no-mask': 3754
b'mask': 4955

2020-09-24 16:59:59,085 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Class map. 
Label in GT: Label in tfrecords file 
b'No-Mask': b'no-mask'
b'Mask': b'mask'
For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.

2020-09-24 16:59:59,085 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Tfrecords generation complete.
In [15]:
!ls -rlt $DATA_DOWNLOAD_DIR/tfrecords/kitti_trainval/
Out[15]:
total 2448
-rw-r--r-- 1 root root  48642 Sep 24 16:59 -fold-000-of-002-shard-00000-of-00010
-rw-r--r-- 1 root root  48250 Sep 24 16:59 -fold-000-of-002-shard-00001-of-00010
-rw-r--r-- 1 root root  46971 Sep 24 16:59 -fold-000-of-002-shard-00002-of-00010
-rw-r--r-- 1 root root  51023 Sep 24 16:59 -fold-000-of-002-shard-00003-of-00010
-rw-r--r-- 1 root root  50996 Sep 24 16:59 -fold-000-of-002-shard-00004-of-00010
-rw-r--r-- 1 root root  45608 Sep 24 16:59 -fold-000-of-002-shard-00005-of-00010
-rw-r--r-- 1 root root  50902 Sep 24 16:59 -fold-000-of-002-shard-00006-of-00010
-rw-r--r-- 1 root root  46945 Sep 24 16:59 -fold-000-of-002-shard-00007-of-00010
-rw-r--r-- 1 root root  48181 Sep 24 16:59 -fold-000-of-002-shard-00008-of-00010
-rw-r--r-- 1 root root  52662 Sep 24 16:59 -fold-000-of-002-shard-00009-of-00010
-rw-r--r-- 1 root root 190986 Sep 24 16:59 -fold-001-of-002-shard-00000-of-00010
-rw-r--r-- 1 root root 201998 Sep 24 16:59 -fold-001-of-002-shard-00001-of-00010
-rw-r--r-- 1 root root 195912 Sep 24 16:59 -fold-001-of-002-shard-00002-of-00010
-rw-r--r-- 1 root root 206971 Sep 24 16:59 -fold-001-of-002-shard-00003-of-00010
-rw-r--r-- 1 root root 193771 Sep 24 16:59 -fold-001-of-002-shard-00004-of-00010
-rw-r--r-- 1 root root 200812 Sep 24 16:59 -fold-001-of-002-shard-00005-of-00010
-rw-r--r-- 1 root root 192832 Sep 24 16:59 -fold-001-of-002-shard-00006-of-00010
-rw-r--r-- 1 root root 198145 Sep 24 16:59 -fold-001-of-002-shard-00007-of-00010
-rw-r--r-- 1 root root 192654 Sep 24 16:59 -fold-001-of-002-shard-00008-of-00010
-rw-r--r-- 1 root root 200646 Sep 24 16:59 -fold-001-of-002-shard-00009-of-00010

Step 6 - Download Pre-Trained Model

We're nearly there! Ready to retrain using TLT and get our face-mask detector model! Let's start by heading to the NGC Catalog and grabbing the latest Object Detection model.

You can view all of the TLT Object Detection models on the NGC Catalog here.

Our DetectNet models can be seen here.

Note

For DetectNet_v2, the input is expected to be 0-1 normalized with input channels in RGB order. Therefore, for optimum results please download models with *_detectnet_v2 in their name string. All other models expect input preprocessing with mean subtraction and input channels in BGR order. Thus, using them as pretrained weights may result in suboptimal performance.

In [16]:
# List available detectnet models in NGC
!ngc registry model list nvidia/tlt_pretrained_detectnet_v2:*
Out[16]:
+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| Versi | Accur | Epoch | Batch | GPU   | Memor | File  | Statu | Creat |
| on    | acy   | s     | Size  | Model | y Foo | Size  | s     | ed    |
|       |       |       |       |       | tprin |       |       | Date  |
|       |       |       |       |       | t     |       |       |       |
+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| resne | 79.5  | 80    | 1     | V100  | 163.6 | 163.5 | UPLOA | Aug   |
| t34   |       |       |       |       |       | 5 MB  | D_COM | 03,   |
|       |       |       |       |       |       |       | PLETE | 2020  |
| resne | 79.2  | 80    | 1     | V100  | 38.3  | 38.34 | UPLOA | Apr   |
| t10   |       |       |       |       |       | MB    | D_COM | 29,   |
|       |       |       |       |       |       |       | PLETE | 2020  |
| resne | 79.0  | 80    | 1     | V100  | 89.0  | 89.02 | UPLOA | Apr   |
| t18   |       |       |       |       |       | MB    | D_COM | 29,   |
|       |       |       |       |       |       |       | PLETE | 2020  |
| resne | 82.7  | 80    | 1     | V100  | 294.5 | 294.5 | UPLOA | Apr   |
| t50   |       |       |       |       |       | 3 MB  | D_COM | 29,   |
|       |       |       |       |       |       |       | PLETE | 2020  |
| vgg16 | 82.2  | 80    | 1     | V100  | 113.2 | 113.2 | UPLOA | Apr   |
|       |       |       |       |       |       | MB    | D_COM | 29,   |
|       |       |       |       |       |       |       | PLETE | 2020  |
| vgg19 | 82.6  | 80    | 1     | V100  | 153.8 | 153.7 | UPLOA | Apr   |
|       |       |       |       |       |       | 7 MB  | D_COM | 29,   |
|       |       |       |       |       |       |       | PLETE | 2020  |
| mobil | 79.5  | 80    | 1     | V100  | 13.4  | 13.37 | UPLOA | Apr   |
| enet_ |       |       |       |       |       | MB    | D_COM | 29,   |
| v1    |       |       |       |       |       |       | PLETE | 2020  |
| mobil | 77.5  | 80    | 1     | V100  | 5.1   | 5.1   | UPLOA | Apr   |
| enet_ |       |       |       |       |       | MB    | D_COM | 29,   |
| v2    |       |       |       |       |       |       | PLETE | 2020  |
| googl | 82.2  | 80    | 1     | V100  | 47.7  | 47.74 | UPLOA | Apr   |
| enet  |       |       |       |       |       | MB    | D_COM | 29,   |
|       |       |       |       |       |       |       | PLETE | 2020  |
| squee | 65.67 | 80    | 1     | V100  | 6.5   | 6.46  | UPLOA | Apr   |
| zenet |       |       |       |       |       | MB    | D_COM | 29,   |
|       |       |       |       |       |       |       | PLETE | 2020  |
| darkn | 76.44 | 80    | 1     | V100  | 467.3 | 467.3 | UPLOA | Apr   |
| et53  |       |       |       |       |       | 2 MB  | D_COM | 29,   |
|       |       |       |       |       |       |       | PLETE | 2020  |
| darkn | 77.52 | 80    | 1     | V100  | 229.1 | 229.1 | UPLOA | Apr   |
| et19  |       |       |       |       |       | 5 MB  | D_COM | 29,   |
|       |       |       |       |       |       |       | PLETE | 2020  |
+-------+-------+-------+-------+-------+-------+-------+-------+-------+
In [17]:
# Create the target destination to download the model.
!mkdir -p $USER_EXPERIMENT_DIR/pretrained_resnet18/
In [18]:
# Download the pretrained model from NGC
!ngc registry model download-version nvidia/tlt_pretrained_detectnet_v2:resnet18 \
    --dest $USER_EXPERIMENT_DIR/pretrained_resnet18
Out[18]:
Downloaded 82.28 MB in 7s, Download speed: 11.74 MB/s               
----------------------------------------------------
Transfer id: tlt_pretrained_detectnet_v2_vresnet18 Download status: Completed.
Downloaded local path: /workspace/detectnet_v2/pretrained_resnet18/tlt_pretrained_detectnet_v2_vresnet18
Total files downloaded: 1 
Total downloaded size: 82.28 MB
Started at: 2020-09-24 17:00:40.111703
Completed at: 2020-09-24 17:00:47.122209
Duration taken: 7s
----------------------------------------------------
In [19]:
!ls -rlt $USER_EXPERIMENT_DIR/pretrained_resnet18/tlt_pretrained_detectnet_v2_vresnet18
Out[19]:
total 91160
-rw------- 1 root root 93345248 Sep 24 17:00 resnet18.hdf5

Step 7 - Provide Training Specs File

We must supply a training specification file to TLT in order to start retraining our model. You can see that the specs file contains the following information:

  • TFRecords for the training datasets
    • In order to use the newly generated TFRecords, update the dataset_config parameter in the spec file
    • Update the fold number to use for evaluation. In case of random data split, please use fold 0 only
    • For sequence-wise split, you may use any fold generated from the dataset convert tool
  • Pre-trained models
  • Augmentation parameters for on-the-fly data augmentation
  • Other training (hyper-)parameters such as batch size, number of epochs, learning rate etc.

We need to update our training spec file to point to the model weights we just downloaded from the NGC Catalog.

Let's change the following parameters by editing the file /workspace/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt directly (or use the optional sketch just after this list):

  1. tfrecords_path = "/workspace/data/tfrecords/kitti_trainval/*"
  2. image_directory_path = "/workspace/converted_datasets/train"
  3. pretrained_model_file = "/workspace/detectnet_v2/pretrained_resnet18/tlt_pretrained_detectnet_v2_vresnet18/resnet18.hdf5"
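
If you'd rather script those edits than open the file by hand, here's a minimal sed sketch assuming the paths above (optional: the spec shown in the next cell already carries these values):

%%bash
# Optional, quick-and-dirty: point the training spec at our TFRecords, images and downloaded weights.
# Double-check the result with the cat cell below.
SPEC=$SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt
sed -i 's|tfrecords_path: ".*"|tfrecords_path: "/workspace/data/tfrecords/kitti_trainval/*"|' $SPEC
sed -i 's|image_directory_path: ".*"|image_directory_path: "/workspace/converted_datasets/train"|' $SPEC
sed -i 's|pretrained_model_file: ".*"|pretrained_model_file: "/workspace/detectnet_v2/pretrained_resnet18/tlt_pretrained_detectnet_v2_vresnet18/resnet18.hdf5"|' $SPEC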

Let's take a look at the contents of our specs file by running the code cell below:

In [20]:
!cat $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt
Out[20]:
random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/workspace/data/tfrecords/kitti_trainval/*"
    image_directory_path: "/workspace/converted_datasets/train"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "mask"
    value: "mask"
  }
  target_class_mapping {
    key: "no-mask"
    value: "no-mask"
  }
  validation_fold: 0
  #validation_data_source: {
    #tfrecords_path: "/home/data/tfrecords/kitti_val/*"
    #image_directory_path: "/home/data/test"
  #}
}
augmentation_config {
  preprocessing {
    output_image_width: 960
    output_image_height: 544
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    vflip_probability: 0.0
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.10000000149
    contrast_center: 0.5
  }
}
postprocessing_config {
  target_class_config {
    key: "mask"
    value {
      clustering_config {
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.20000000298
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "no-mask"
    value {
      clustering_config {
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.15000000596
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
}
model_config {
  pretrained_model_file: "/workspace/detectnet_v2/pretrained_resnet18/tlt_pretrained_detectnet_v2_vresnet18/resnet18.hdf5"
  num_layers: 18
  use_batch_norm: true
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  training_precision {
    backend_floatx: FLOAT32
  }
  arch: "resnet"
}
evaluation_config {
  validation_period_during_training: 10
  first_validation_epoch: 10
  minimum_detection_ground_truth_overlap {
    key: "mask"
    value: 0.5
  }
  minimum_detection_ground_truth_overlap {
    key: "no-mask"
    value: 0.5
  }
  evaluation_box_config {
    key: "mask"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "no-mask"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  average_precision_mode: INTEGRATE
}
cost_function_config {
  target_classes {
    name: "mask"
    class_weight: 1.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  target_classes {
    name: "no-mask"
    class_weight: 8.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 1.0
    }
  }
  enable_autoweighting: true
  max_objective_weight: 0.999899983406
  min_objective_weight: 9.99999974738e-05
}
training_config {
  batch_size_per_gpu: 24
  num_epochs: 120
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-06
      max_learning_rate: 5e-04
      soft_start: 0.10000000149
      annealing: 0.699999988079
    }
  }
  regularizer {
    type: L1
    weight: 3.00000002618e-09
  }
  optimizer {
    adam {
      epsilon: 9.99999993923e-09
      beta1: 0.899999976158
      beta2: 0.999000012875
    }
  }
  cost_scaling {
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
  checkpoint_interval: 10
}
bbox_rasterizer_config {
  target_class_config {
    key: "mask"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.40000000596
      cov_radius_y: 0.40000000596
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "no-mask"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.400000154972
}

Step 8 - Run TLT Training

We're finally here!!!!

It's time to train our model.

We need to provide the following information to TLT to start our experiment:

  • Training Specification File (viewed above)
  • Output directory location for the retrained models
  • The model's key we specified earlier
  • The network type we're training (detectnet_v2)

Note: The training may take hours to complete. Also, the remainder of the notebook assumes that training was done in single-GPU mode.
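
The cell below kicks off single-GPU training. If you do have more GPUs available, tlt-train also accepts a --gpus argument; a sketch of a two-GPU run (check the TLT documentation for your version) would look like this:

!tlt-train detectnet_v2 -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
                        -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                        -k $KEY \
                        -n resnet18_detector \
                        --gpus 2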

In [21]:
!tlt-train detectnet_v2 -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
                        -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                        -k $KEY \
                        -n resnet18_detector
Out[21]:
Using TensorFlow backend.
2020-09-24 17:02:19.084587: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
--------------------------------------------------------------------------
[[60279,1],0]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:

Module: OpenFabrics (openib)
  Host: 28071f6116a0

Another transport will be used instead, although this may result in
lower performance.

NOTE: You can disable this warning by setting the MCA parameter
btl_base_warn_component_unused to 0.
--------------------------------------------------------------------------
2020-09-24 17:02:23.617020: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-09-24 17:02:23.660704: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 17:02:23.661689: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 17:02:23.661736: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 17:02:23.661809: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 17:02:23.700268: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 17:02:23.714969: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 17:02:23.806550: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 17:02:23.878853: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 17:02:23.879044: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 17:02:23.879219: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 17:02:23.880282: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 17:02:23.881209: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 17:02:23.882085: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 17:02:27.823575: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-24 17:02:27.823635: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-09-24 17:02:27.823654: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-09-24 17:02:27.823930: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 17:02:27.825006: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 17:02:27.826035: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 17:02:27.826964: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14636 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)
2020-09-24 17:02:27,829 [INFO] iva.detectnet_v2.scripts.train: Loading experiment spec at /workspace/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt.
2020-09-24 17:02:27,831 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt
2020-09-24 17:02:28,184 [INFO] iva.detectnet_v2.scripts.train: Cannot iterate over exactly 2827 samples with a batch size of 24; each epoch will therefore take one extra step.
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 3, 544, 960)  0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 64, 272, 480) 9472        input_1[0][0]                    
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 64, 272, 480) 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 64, 272, 480) 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (None, 64, 136, 240) 36928       activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 64, 136, 240) 256         block_1a_conv_1[0][0]            
__________________________________________________________________________________________________
block_1a_relu_1 (Activation)    (None, 64, 136, 240) 0           block_1a_bn_1[0][0]              
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (None, 64, 136, 240) 36928       block_1a_relu_1[0][0]            
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, 136, 240) 4160        activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 64, 136, 240) 256         block_1a_conv_2[0][0]            
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 64, 136, 240) 256         block_1a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_1 (Add)                     (None, 64, 136, 240) 0           block_1a_bn_2[0][0]              
                                                                 block_1a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_1a_relu (Activation)      (None, 64, 136, 240) 0           add_1[0][0]                      
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (None, 64, 136, 240) 36928       block_1a_relu[0][0]              
__________________________________________________________________________________________________
block_1b_bn_1 (BatchNormalizati (None, 64, 136, 240) 256         block_1b_conv_1[0][0]            
__________________________________________________________________________________________________
block_1b_relu_1 (Activation)    (None, 64, 136, 240) 0           block_1b_bn_1[0][0]              
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (None, 64, 136, 240) 36928       block_1b_relu_1[0][0]            
__________________________________________________________________________________________________
block_1b_bn_2 (BatchNormalizati (None, 64, 136, 240) 256         block_1b_conv_2[0][0]            
__________________________________________________________________________________________________
add_2 (Add)                     (None, 64, 136, 240) 0           block_1b_bn_2[0][0]              
                                                                 block_1a_relu[0][0]              
__________________________________________________________________________________________________
block_1b_relu (Activation)      (None, 64, 136, 240) 0           add_2[0][0]                      
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (None, 128, 68, 120) 73856       block_1b_relu[0][0]              
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 128, 68, 120) 512         block_2a_conv_1[0][0]            
__________________________________________________________________________________________________
block_2a_relu_1 (Activation)    (None, 128, 68, 120) 0           block_2a_bn_1[0][0]              
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (None, 128, 68, 120) 147584      block_2a_relu_1[0][0]            
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, 68, 120) 8320        block_1b_relu[0][0]              
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 128, 68, 120) 512         block_2a_conv_2[0][0]            
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 128, 68, 120) 512         block_2a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_3 (Add)                     (None, 128, 68, 120) 0           block_2a_bn_2[0][0]              
                                                                 block_2a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_2a_relu (Activation)      (None, 128, 68, 120) 0           add_3[0][0]                      
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (None, 128, 68, 120) 147584      block_2a_relu[0][0]              
__________________________________________________________________________________________________
block_2b_bn_1 (BatchNormalizati (None, 128, 68, 120) 512         block_2b_conv_1[0][0]            
__________________________________________________________________________________________________
block_2b_relu_1 (Activation)    (None, 128, 68, 120) 0           block_2b_bn_1[0][0]              
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (None, 128, 68, 120) 147584      block_2b_relu_1[0][0]            
__________________________________________________________________________________________________
block_2b_bn_2 (BatchNormalizati (None, 128, 68, 120) 512         block_2b_conv_2[0][0]            
__________________________________________________________________________________________________
add_4 (Add)                     (None, 128, 68, 120) 0           block_2b_bn_2[0][0]              
                                                                 block_2a_relu[0][0]              
__________________________________________________________________________________________________
block_2b_relu (Activation)      (None, 128, 68, 120) 0           add_4[0][0]                      
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (None, 256, 34, 60)  295168      block_2b_relu[0][0]              
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 256, 34, 60)  1024        block_3a_conv_1[0][0]            
__________________________________________________________________________________________________
block_3a_relu_1 (Activation)    (None, 256, 34, 60)  0           block_3a_bn_1[0][0]              
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (None, 256, 34, 60)  590080      block_3a_relu_1[0][0]            
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 34, 60)  33024       block_2b_relu[0][0]              
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 256, 34, 60)  1024        block_3a_conv_2[0][0]            
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 256, 34, 60)  1024        block_3a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_5 (Add)                     (None, 256, 34, 60)  0           block_3a_bn_2[0][0]              
                                                                 block_3a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_3a_relu (Activation)      (None, 256, 34, 60)  0           add_5[0][0]                      
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (None, 256, 34, 60)  590080      block_3a_relu[0][0]              
__________________________________________________________________________________________________
block_3b_bn_1 (BatchNormalizati (None, 256, 34, 60)  1024        block_3b_conv_1[0][0]            
__________________________________________________________________________________________________
block_3b_relu_1 (Activation)    (None, 256, 34, 60)  0           block_3b_bn_1[0][0]              
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (None, 256, 34, 60)  590080      block_3b_relu_1[0][0]            
__________________________________________________________________________________________________
block_3b_bn_2 (BatchNormalizati (None, 256, 34, 60)  1024        block_3b_conv_2[0][0]            
__________________________________________________________________________________________________
add_6 (Add)                     (None, 256, 34, 60)  0           block_3b_bn_2[0][0]              
                                                                 block_3a_relu[0][0]              
__________________________________________________________________________________________________
block_3b_relu (Activation)      (None, 256, 34, 60)  0           add_6[0][0]                      
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D)        (None, 512, 34, 60)  1180160     block_3b_relu[0][0]              
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (None, 512, 34, 60)  2048        block_4a_conv_1[0][0]            
__________________________________________________________________________________________________
block_4a_relu_1 (Activation)    (None, 512, 34, 60)  0           block_4a_bn_1[0][0]              
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D)        (None, 512, 34, 60)  2359808     block_4a_relu_1[0][0]            
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 512, 34, 60)  131584      block_3b_relu[0][0]              
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (None, 512, 34, 60)  2048        block_4a_conv_2[0][0]            
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (None, 512, 34, 60)  2048        block_4a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_7 (Add)                     (None, 512, 34, 60)  0           block_4a_bn_2[0][0]              
                                                                 block_4a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_4a_relu (Activation)      (None, 512, 34, 60)  0           add_7[0][0]                      
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D)        (None, 512, 34, 60)  2359808     block_4a_relu[0][0]              
__________________________________________________________________________________________________
block_4b_bn_1 (BatchNormalizati (None, 512, 34, 60)  2048        block_4b_conv_1[0][0]            
__________________________________________________________________________________________________
block_4b_relu_1 (Activation)    (None, 512, 34, 60)  0           block_4b_bn_1[0][0]              
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D)        (None, 512, 34, 60)  2359808     block_4b_relu_1[0][0]            
__________________________________________________________________________________________________
block_4b_bn_2 (BatchNormalizati (None, 512, 34, 60)  2048        block_4b_conv_2[0][0]            
__________________________________________________________________________________________________
add_8 (Add)                     (None, 512, 34, 60)  0           block_4b_bn_2[0][0]              
                                                                 block_4a_relu[0][0]              
__________________________________________________________________________________________________
block_4b_relu (Activation)      (None, 512, 34, 60)  0           add_8[0][0]                      
__________________________________________________________________________________________________
output_bbox (Conv2D)            (None, 8, 34, 60)    4104        block_4b_relu[0][0]              
__________________________________________________________________________________________________
output_cov (Conv2D)             (None, 2, 34, 60)    1026        block_4b_relu[0][0]              
==================================================================================================
Total params: 11,200,458
Trainable params: 11,190,730
Non-trainable params: 9,728
__________________________________________________________________________________________________
2020-09-24 17:02:38,635 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2020-09-24 17:02:38,635 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2020-09-24 17:02:38,635 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2020-09-24 17:02:38,635 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 8, io threads: 16, compute threads: 8, buffered batches: 4
2020-09-24 17:02:38,635 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 2827, number of sources: 1, batch size per gpu: 24, steps: 118
2020-09-24 17:02:38,757 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2020-09-24 17:02:38.801419: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 17:02:38.802415: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 17:02:38.802460: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 17:02:38.802515: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 17:02:38.802566: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 17:02:38.802613: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 17:02:38.802659: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 17:02:38.802705: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 17:02:38.802746: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 17:02:38.802843: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 17:02:38.803804: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 17:02:38.804689: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 17:02:39,095 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: shuffle: True - shard 0 of 1
2020-09-24 17:02:39,103 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: sampling 1 datasets with weights:
2020-09-24 17:02:39,103 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: source: 0 weight: 1.000000
2020-09-24 17:02:39,774 [INFO] iva.detectnet_v2.scripts.train: Found 2827 samples in training set
2020-09-24 17:02:42,832 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2020-09-24 17:02:42,832 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2020-09-24 17:02:42,833 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2020-09-24 17:02:42,833 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 8, io threads: 16, compute threads: 8, buffered batches: 4
2020-09-24 17:02:42,833 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 706, number of sources: 1, batch size per gpu: 24, steps: 30
2020-09-24 17:02:42,871 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2020-09-24 17:02:43,191 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: shuffle: False - shard 0 of 1
2020-09-24 17:02:43,198 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: sampling 1 datasets with weights:
2020-09-24 17:02:43,198 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: source: 0 weight: 1.000000
2020-09-24 17:02:43,636 [INFO] iva.detectnet_v2.scripts.train: Found 706 samples in validation set
2020-09-24 17:02:47.320385: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 17:02:47.321350: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 17:02:47.321404: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 17:02:47.321462: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 17:02:47.321513: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 17:02:47.321553: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 17:02:47.321592: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 17:02:47.321631: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 17:02:47.321666: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 17:02:47.321763: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 17:02:47.322751: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 17:02:47.323636: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 17:02:47.877989: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-24 17:02:47.878065: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-09-24 17:02:47.878082: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-09-24 17:02:47.878370: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 17:02:47.879372: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 17:02:47.880277: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14636 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)
2020-09-24 17:03:17.093301: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 17:03:18.468223: I tensorflow/core/kernels/cuda_solvers.cc:159] Creating CudaSolver handles for stream 0x85f3690
2020-09-24 17:03:18.468401: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 17:03:19.535393: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 17:03:19.573023: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 17:03:25,062 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 0/120: loss: 0.09433 Time taken: 0:00:00 ETA: 0:00:00
2020-09-24 17:03:25,062 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 2.355
2020-09-24 17:03:38,533 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 25.354
2020-09-24 17:03:46,819 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.414
2020-09-24 17:03:54,937 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.916
2020-09-24 17:04:03,148 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.077
2020-09-24 17:04:09,687 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 1/120: loss: 0.00244 Time taken: 0:00:54.483825 ETA: 1:48:03.575143
2020-09-24 17:04:11,668 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 70.421
2020-09-24 17:04:19,920 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.713
2020-09-24 17:04:28,133 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.057
2020-09-24 17:04:36,351 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.021
2020-09-24 17:04:44,554 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.142
2020-09-24 17:04:48,456 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 2/120: loss: 0.00255 Time taken: 0:00:38.779485 ETA: 1:16:15.979172
2020-09-24 17:04:52,708 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.592
2020-09-24 17:05:00,946 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.828
2020-09-24 17:05:09,212 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.595
2020-09-24 17:05:17,447 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.861
2020-09-24 17:05:25,643 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.208
2020-09-24 17:05:27,288 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 3/120: loss: 0.00206 Time taken: 0:00:38.833195 ETA: 1:15:43.483840
2020-09-24 17:05:33,938 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.340
2020-09-24 17:05:42,148 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.082
2020-09-24 17:05:50,407 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.651
2020-09-24 17:05:58,730 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.089
2020-09-24 17:06:06,334 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 4/120: loss: 0.00176 Time taken: 0:00:39.047256 ETA: 1:15:29.481668
2020-09-24 17:06:06,988 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.665
2020-09-24 17:06:15,259 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.547
2020-09-24 17:06:23,558 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.298
2020-09-24 17:06:31,811 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.699
2020-09-24 17:06:40,177 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 71.728
2020-09-24 17:06:45,480 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 5/120: loss: 0.00150 Time taken: 0:00:39.133967 ETA: 1:15:00.406196
2020-09-24 17:06:48,470 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.347
2020-09-24 17:06:56,782 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.190
2020-09-24 17:07:05,095 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.176
2020-09-24 17:07:13,312 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.024
2020-09-24 17:07:21,660 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 71.878
2020-09-24 17:07:24,628 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 6/120: loss: 0.00133 Time taken: 0:00:39.154360 ETA: 1:14:23.597019
2020-09-24 17:07:29,969 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.216
2020-09-24 17:07:38,236 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.576
2020-09-24 17:07:46,431 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.221
2020-09-24 17:07:54,634 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.149
2020-09-24 17:08:02,910 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.500
2020-09-24 17:08:03,591 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 7/120: loss: 0.00162 Time taken: 0:00:38.943508 ETA: 1:13:20.616421
2020-09-24 17:08:11,211 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.285
2020-09-24 17:08:19,414 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.147
2020-09-24 17:08:27,685 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.543
2020-09-24 17:08:35,893 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.100
2020-09-24 17:08:42,469 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 8/120: loss: 0.00115 Time taken: 0:00:38.899752 ETA: 1:12:36.772213
2020-09-24 17:08:44,099 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.122
2020-09-24 17:08:52,352 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.707
2020-09-24 17:09:00,602 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.727
2020-09-24 17:09:08,853 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.724
2020-09-24 17:09:17,074 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.986
2020-09-24 17:09:21,306 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 9/120: loss: 0.00163 Time taken: 0:00:38.831959 ETA: 1:11:50.347476
2020-09-24 17:09:25,254 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.346
2020-09-24 17:09:33,523 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.566
2020-09-24 17:09:41,816 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.356
2020-09-24 17:09:50,105 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.383
2020-09-24 17:09:58,341 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.854
2020-09-24 17:10:04,074 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 17:11:29,618 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 8.55s/step
2020-09-24 17:12:53,633 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 8.40s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 548038/548038 [00:27<00:00, 20288.62it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 80085/80085 [00:05<00:00, 15328.87it/s]
Epoch 10/120
=========================

Validation cost: 0.000846
Mean average_precision (in %): 27.2559

class name      average precision (in %)
------------  --------------------------
mask                            0.682597
no-mask                        53.8293

Median Inference Time: 0.005846
2020-09-24 17:14:52,092 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 10/120: loss: 0.00144 Time taken: 0:05:30.767063 ETA: 10:06:24.376945
2020-09-24 17:14:58,277 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 2.000
2020-09-24 17:15:06,511 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.873
2020-09-24 17:15:14,756 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.776
2020-09-24 17:15:23,040 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.435
2020-09-24 17:15:30,991 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 11/120: loss: 0.00107 Time taken: 0:00:38.891130 ETA: 1:10:39.133193
2020-09-24 17:15:31,316 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.500
2020-09-24 17:15:39,537 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.985
2020-09-24 17:15:47,807 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.555
2020-09-24 17:15:56,054 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.753
2020-09-24 17:16:04,215 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.526
2020-09-24 17:16:09,795 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 12/120: loss: 0.00100 Time taken: 0:00:38.819295 ETA: 1:09:52.483852
2020-09-24 17:16:12,453 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.834
2020-09-24 17:16:20,679 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.945
2020-09-24 17:16:28,891 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.069
2020-09-24 17:16:37,096 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.129
2020-09-24 17:16:45,377 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.456
2020-09-24 17:16:48,688 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 13/120: loss: 0.00108 Time taken: 0:00:38.899411 ETA: 1:09:22.236999
2020-09-24 17:16:53,654 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.492
2020-09-24 17:17:01,878 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.958
2020-09-24 17:17:10,104 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.948
2020-09-24 17:17:18,329 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.950
2020-09-24 17:17:26,637 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.221
2020-09-24 17:17:27,605 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 14/120: loss: 0.00092 Time taken: 0:00:38.919037 ETA: 1:08:45.417933
2020-09-24 17:17:34,886 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.742
2020-09-24 17:17:43,168 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.448
2020-09-24 17:17:51,416 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.750
2020-09-24 17:17:59,585 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.445
2020-09-24 17:18:06,620 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 15/120: loss: 0.00108 Time taken: 0:00:39.005614 ETA: 1:08:15.589449
2020-09-24 17:18:07,952 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 71.714
2020-09-24 17:18:16,262 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.209
2020-09-24 17:18:24,504 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.800
2020-09-24 17:18:32,738 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.873
2020-09-24 17:18:41,049 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.196
2020-09-24 17:18:45,708 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 16/120: loss: 0.00093 Time taken: 0:00:39.087231 ETA: 1:07:45.072065
2020-09-24 17:18:49,369 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.114
2020-09-24 17:18:57,578 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.096
2020-09-24 17:19:05,792 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.046
2020-09-24 17:19:14,025 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.883
2020-09-24 17:19:22,346 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.110
2020-09-24 17:19:24,703 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 17/120: loss: 0.00107 Time taken: 0:00:38.990195 ETA: 1:06:55.990089
2020-09-24 17:19:30,686 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 71.941
2020-09-24 17:19:38,947 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.637
2020-09-24 17:19:47,193 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.765
2020-09-24 17:19:55,434 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.813
2020-09-24 17:20:03,682 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 18/120: loss: 0.00090 Time taken: 0:00:38.979744 ETA: 1:06:15.933884
2020-09-24 17:20:03,682 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.740
2020-09-24 17:20:11,946 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.611
2020-09-24 17:20:20,203 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.668
2020-09-24 17:20:28,449 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.764
2020-09-24 17:20:36,698 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.742
2020-09-24 17:20:42,627 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 19/120: loss: 0.00102 Time taken: 0:00:38.945780 ETA: 1:05:33.523736
2020-09-24 17:20:44,911 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.053
2020-09-24 17:20:53,184 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.529
2020-09-24 17:21:01,408 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.960
2020-09-24 17:21:09,575 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.472
2020-09-24 17:21:17,763 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.284
2020-09-24 17:21:25,228 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 17:21:53,679 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 2.85s/step
2020-09-24 17:22:22,208 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 2.85s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 36421/36421 [00:01<00:00, 21329.48it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 9594/9594 [00:00<00:00, 13567.82it/s]
Epoch 20/120
=========================

Validation cost: 0.000397
Mean average_precision (in %): 38.4669

class name      average precision (in %)
------------  --------------------------
mask                             8.25485
no-mask                         68.679

Median Inference Time: 0.005963
2020-09-24 17:22:51,241 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 20/120: loss: 0.00100 Time taken: 0:02:08.616228 ETA: 3:34:21.622834
2020-09-24 17:22:55,841 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 6.118
2020-09-24 17:23:04,019 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.368
2020-09-24 17:23:12,251 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.892
2020-09-24 17:23:20,446 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.217
2020-09-24 17:23:28,616 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.441
2020-09-24 17:23:29,951 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 21/120: loss: 0.00111 Time taken: 0:00:38.701308 ETA: 1:03:51.429493
2020-09-24 17:23:36,862 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.769
2020-09-24 17:23:45,127 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.599
2020-09-24 17:23:53,284 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.561
2020-09-24 17:24:01,513 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.916
2020-09-24 17:24:08,733 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 22/120: loss: 0.00100 Time taken: 0:00:38.788082 ETA: 1:03:21.232001
2020-09-24 17:24:09,714 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.160
2020-09-24 17:24:17,919 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.132
2020-09-24 17:24:26,205 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.411
2020-09-24 17:24:34,438 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.882
2020-09-24 17:24:42,682 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.782
2020-09-24 17:24:47,590 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 23/120: loss: 0.00100 Time taken: 0:00:38.869050 ETA: 1:02:50.297876
2020-09-24 17:24:50,894 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.066
2020-09-24 17:24:59,188 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.347
2020-09-24 17:25:07,473 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.423
2020-09-24 17:25:15,646 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.416
2020-09-24 17:25:23,904 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.661
2020-09-24 17:25:26,601 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 24/120: loss: 0.00084 Time taken: 0:00:38.991041 ETA: 1:02:23.139908
2020-09-24 17:25:32,281 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 71.625
2020-09-24 17:25:40,502 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.989
2020-09-24 17:25:48,743 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.808
2020-09-24 17:25:57,050 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.233
2020-09-24 17:26:05,301 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.723
2020-09-24 17:26:05,621 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 25/120: loss: 0.00109 Time taken: 0:00:39.034534 ETA: 1:01:48.280705
2020-09-24 17:26:13,557 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.684
2020-09-24 17:26:21,773 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.029
2020-09-24 17:26:30,018 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.773
2020-09-24 17:26:38,243 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.956
2020-09-24 17:26:44,478 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 26/120: loss: 0.00090 Time taken: 0:00:38.841996 ETA: 1:00:51.147642
2020-09-24 17:26:46,497 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.693
2020-09-24 17:26:54,716 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.006
2020-09-24 17:27:02,964 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.743
2020-09-24 17:27:11,204 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.820
2020-09-24 17:27:19,447 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.792
2020-09-24 17:27:23,389 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 27/120: loss: 0.00076 Time taken: 0:00:38.914922 ETA: 1:00:19.087724
2020-09-24 17:27:27,637 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.259
2020-09-24 17:27:35,860 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.974
2020-09-24 17:27:44,096 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.851
2020-09-24 17:27:52,336 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.824
2020-09-24 17:28:00,615 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.471
2020-09-24 17:28:02,269 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 28/120: loss: 0.00086 Time taken: 0:00:38.874354 ETA: 0:59:36.440557
2020-09-24 17:28:08,812 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.199
2020-09-24 17:28:17,051 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.826
2020-09-24 17:28:25,325 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.518
2020-09-24 17:28:33,555 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.911
2020-09-24 17:28:41,155 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 29/120: loss: 0.00087 Time taken: 0:00:38.887003 ETA: 0:58:58.717311
2020-09-24 17:28:41,828 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.528
2020-09-24 17:28:50,092 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.607
2020-09-24 17:28:58,358 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.585
2020-09-24 17:29:06,688 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.034
2020-09-24 17:29:14,963 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.514
2020-09-24 17:29:24,024 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 17:29:29,397 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.54s/step
2020-09-24 17:29:34,825 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.54s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 6711/6711 [00:00<00:00, 20595.50it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 4132/4132 [00:00<00:00, 13392.21it/s]
Epoch 30/120
=========================

Validation cost: 0.000351
Mean average_precision (in %): 57.5522

class name      average precision (in %)
------------  --------------------------
mask                             42.0818
no-mask                          73.0226

Median Inference Time: 0.005855
2020-09-24 17:29:41,025 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 30/120: loss: 0.00098 Time taken: 0:00:59.868751 ETA: 1:29:48.187573
2020-09-24 17:29:44,003 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 20.661
2020-09-24 17:29:52,225 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.981
2020-09-24 17:30:00,453 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.920
2020-09-24 17:30:08,715 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.626
2020-09-24 17:30:16,913 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.189
2020-09-24 17:30:19,885 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 31/120: loss: 0.00082 Time taken: 0:00:38.848903 ETA: 0:57:37.552341
2020-09-24 17:30:25,156 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.800
2020-09-24 17:30:33,462 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.233
2020-09-24 17:30:41,706 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.790
2020-09-24 17:30:49,877 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.432
2020-09-24 17:30:58,153 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.504
2020-09-24 17:30:58,809 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 32/120: loss: 0.00073 Time taken: 0:00:38.935244 ETA: 0:57:06.301458
2020-09-24 17:31:06,415 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.619
2020-09-24 17:31:14,710 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.337
2020-09-24 17:31:22,995 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.428
2020-09-24 17:31:31,192 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.192
2020-09-24 17:31:37,831 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 33/120: loss: 0.00086 Time taken: 0:00:39.021965 ETA: 0:56:34.910937
2020-09-24 17:31:39,498 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.240
2020-09-24 17:31:47,721 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.970
2020-09-24 17:31:55,942 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.992
2020-09-24 17:32:04,225 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.440
2020-09-24 17:32:12,477 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.714
2020-09-24 17:32:16,758 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 34/120: loss: 0.00075 Time taken: 0:00:38.930300 ETA: 0:55:48.005841
2020-09-24 17:32:20,746 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.563
2020-09-24 17:32:29,059 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.176
2020-09-24 17:32:37,292 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.885
2020-09-24 17:32:45,459 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.467
2020-09-24 17:32:53,701 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.795
2020-09-24 17:32:55,689 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 35/120: loss: 0.00086 Time taken: 0:00:38.937077 ETA: 0:55:09.651549
2020-09-24 17:33:01,912 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.080
2020-09-24 17:33:10,162 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.725
2020-09-24 17:33:18,430 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.576
2020-09-24 17:33:26,659 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.912
2020-09-24 17:33:34,572 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 36/120: loss: 0.00087 Time taken: 0:00:38.868328 ETA: 0:54:24.939560
2020-09-24 17:33:34,906 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.760
2020-09-24 17:33:43,173 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.586
2020-09-24 17:33:51,577 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 71.395
2020-09-24 17:33:59,841 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.602
2020-09-24 17:34:08,094 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.708
2020-09-24 17:34:13,640 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 37/120: loss: 0.00091 Time taken: 0:00:39.078203 ETA: 0:54:03.490885
2020-09-24 17:34:16,305 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.071
2020-09-24 17:34:24,557 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.721
2020-09-24 17:34:32,727 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.437
2020-09-24 17:34:40,931 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.143
2020-09-24 17:34:49,206 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.506
2020-09-24 17:34:52,541 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 38/120: loss: 0.00080 Time taken: 0:00:38.893529 ETA: 0:53:09.269373
2020-09-24 17:34:57,477 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.548
2020-09-24 17:35:05,753 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.502
2020-09-24 17:35:13,991 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.833
2020-09-24 17:35:22,148 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.564
2020-09-24 17:35:30,399 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.720
2020-09-24 17:35:31,408 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 39/120: loss: 0.00080 Time taken: 0:00:38.867282 ETA: 0:52:28.249854
2020-09-24 17:35:38,651 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.711
2020-09-24 17:35:46,851 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.175
2020-09-24 17:35:55,050 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.183
2020-09-24 17:36:03,271 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.987
2020-09-24 17:36:14,011 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 17:36:18,659 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.46s/step
2020-09-24 17:36:23,358 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.47s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 5552/5552 [00:00<00:00, 16943.34it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 4020/4020 [00:00<00:00, 13200.59it/s]
Epoch 40/120
=========================

Validation cost: 0.000319
Mean average_precision (in %): 65.4060

class name      average precision (in %)
------------  --------------------------
mask                             54.132
no-mask                          76.6801

Median Inference Time: 0.005800
2020-09-24 17:36:29,144 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 40/120: loss: 0.00073 Time taken: 0:00:57.739409 ETA: 1:16:59.152756
2020-09-24 17:36:30,472 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 22.058
2020-09-24 17:36:38,662 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.265
2020-09-24 17:36:46,965 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.265
2020-09-24 17:36:55,174 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.092
2020-09-24 17:37:03,447 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.525
2020-09-24 17:37:08,046 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 41/120: loss: 0.00069 Time taken: 0:00:38.897078 ETA: 0:51:12.869146
2020-09-24 17:37:11,702 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.688
2020-09-24 17:37:19,929 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.934
2020-09-24 17:37:28,119 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.264
2020-09-24 17:37:36,324 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.129
2020-09-24 17:37:44,526 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.153
2020-09-24 17:37:46,868 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 42/120: loss: 0.00071 Time taken: 0:00:38.820514 ETA: 0:50:28.000126
2020-09-24 17:37:52,848 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.100
2020-09-24 17:38:01,081 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.886
2020-09-24 17:38:09,307 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.940
2020-09-24 17:38:17,585 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.481
2020-09-24 17:38:25,822 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 43/120: loss: 0.00077 Time taken: 0:00:38.948008 ETA: 0:49:58.996621
2020-09-24 17:38:25,823 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.842
2020-09-24 17:38:34,115 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.362
2020-09-24 17:38:42,309 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.227
2020-09-24 17:38:50,618 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.214
2020-09-24 17:38:58,860 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.798
2020-09-24 17:39:04,859 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 44/120: loss: 0.00068 Time taken: 0:00:39.037521 ETA: 0:49:26.851605
2020-09-24 17:39:07,167 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.233
2020-09-24 17:39:15,377 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.083
2020-09-24 17:39:23,675 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.308
2020-09-24 17:39:31,847 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.423
2020-09-24 17:39:40,046 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.184
2020-09-24 17:39:43,658 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 45/120: loss: 0.00063 Time taken: 0:00:38.811208 ETA: 0:48:30.840565
2020-09-24 17:39:48,252 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.123
2020-09-24 17:39:56,472 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.992
2020-09-24 17:40:04,671 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.182
2020-09-24 17:40:12,856 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.310
2020-09-24 17:40:21,118 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.621
2020-09-24 17:40:22,442 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 46/120: loss: 0.00082 Time taken: 0:00:38.773553 ETA: 0:47:49.242897
2020-09-24 17:40:29,287 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.456
2020-09-24 17:40:37,498 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.073
2020-09-24 17:40:45,677 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.366
2020-09-24 17:40:53,908 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.891
2020-09-24 17:41:01,205 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 47/120: loss: 0.00066 Time taken: 0:00:38.753217 ETA: 0:47:08.984857
2020-09-24 17:41:02,216 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.228
2020-09-24 17:41:10,451 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.856
2020-09-24 17:41:18,706 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.687
2020-09-24 17:41:27,005 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.307
2020-09-24 17:41:35,254 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.732
2020-09-24 17:41:40,212 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 48/120: loss: 0.00067 Time taken: 0:00:39.017455 ETA: 0:46:49.256767
2020-09-24 17:41:43,566 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.189
2020-09-24 17:41:51,814 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.749
2020-09-24 17:42:00,069 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.688
2020-09-24 17:42:08,219 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.625
2020-09-24 17:42:16,431 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.063
2020-09-24 17:42:19,074 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 49/120: loss: 0.00082 Time taken: 0:00:38.853370 ETA: 0:45:58.589250
2020-09-24 17:42:24,709 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.488
2020-09-24 17:42:32,990 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.452
2020-09-24 17:42:41,166 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.389
2020-09-24 17:42:49,459 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.354
2020-09-24 17:43:01,816 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 17:43:06,325 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.45s/step
2020-09-24 17:43:10,653 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.43s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 5115/5115 [00:00<00:00, 17699.37it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 3583/3583 [00:00<00:00, 13268.07it/s]
Epoch 50/120
=========================

Validation cost: 0.000328
Mean average_precision (in %): 69.7012

class name      average precision (in %)
------------  --------------------------
mask                             60.0365
no-mask                          79.3659

Median Inference Time: 0.005853
2020-09-24 17:43:15,922 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 22.674
2020-09-24 17:43:16,268 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 50/120: loss: 0.00066 Time taken: 0:00:57.181024 ETA: 1:06:42.671652
2020-09-24 17:43:24,148 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.956
2020-09-24 17:43:32,294 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.655
2020-09-24 17:43:40,556 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.630
2020-09-24 17:43:48,822 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.587
2020-09-24 17:43:55,074 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 51/120: loss: 0.00064 Time taken: 0:00:38.826349 ETA: 0:44:39.018066
2020-09-24 17:43:57,031 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.095
2020-09-24 17:44:05,262 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.899
2020-09-24 17:44:13,487 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.950
2020-09-24 17:44:21,793 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.239
2020-09-24 17:44:30,000 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.110
2020-09-24 17:44:33,930 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 52/120: loss: 0.00060 Time taken: 0:00:38.860963 ETA: 0:44:02.545475
2020-09-24 17:44:38,226 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.941
2020-09-24 17:44:46,443 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.030
2020-09-24 17:44:54,740 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.312
2020-09-24 17:45:03,028 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.400
2020-09-24 17:45:11,217 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.269
2020-09-24 17:45:12,879 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 53/120: loss: 0.00064 Time taken: 0:00:38.928754 ETA: 0:43:28.226540
2020-09-24 17:45:19,459 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.800
2020-09-24 17:45:27,725 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.592
2020-09-24 17:45:35,961 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.851
2020-09-24 17:45:44,283 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.108
2020-09-24 17:45:51,891 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 54/120: loss: 0.00060 Time taken: 0:00:39.012784 ETA: 0:42:54.843760
2020-09-24 17:45:52,551 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.572
2020-09-24 17:46:00,796 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.771
2020-09-24 17:46:09,068 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.537
2020-09-24 17:46:17,400 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.012
2020-09-24 17:46:25,669 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.567
2020-09-24 17:46:30,927 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 55/120: loss: 0.00059 Time taken: 0:00:39.050829 ETA: 0:42:18.303865
2020-09-24 17:46:33,898 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.918
2020-09-24 17:46:42,078 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.346
2020-09-24 17:46:50,307 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.919
2020-09-24 17:46:58,572 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.594
2020-09-24 17:47:06,802 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.913
2020-09-24 17:47:09,807 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 56/120: loss: 0.00060 Time taken: 0:00:38.878263 ETA: 0:41:28.208817
2020-09-24 17:47:15,139 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 71.968
2020-09-24 17:47:23,353 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.049
2020-09-24 17:47:31,607 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.696
2020-09-24 17:47:39,787 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.347
2020-09-24 17:47:48,024 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.846
2020-09-24 17:47:48,666 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 57/120: loss: 0.00076 Time taken: 0:00:38.858710 ETA: 0:40:48.098718
2020-09-24 17:47:56,281 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.669
2020-09-24 17:48:04,494 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.061
2020-09-24 17:48:12,726 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.887
2020-09-24 17:48:20,932 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.124
2020-09-24 17:48:27,561 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 58/120: loss: 0.00070 Time taken: 0:00:38.878722 ETA: 0:40:10.480791
2020-09-24 17:48:29,221 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.389
2020-09-24 17:48:37,506 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.422
2020-09-24 17:48:45,697 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.256
2020-09-24 17:48:53,897 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.169
2020-09-24 17:49:02,140 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.797
2020-09-24 17:49:06,432 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 59/120: loss: 0.00067 Time taken: 0:00:38.875403 ETA: 0:39:31.399608
2020-09-24 17:49:10,359 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.006
2020-09-24 17:49:18,604 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.769
2020-09-24 17:49:26,874 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.554
2020-09-24 17:49:35,183 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.218
2020-09-24 17:49:43,479 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.322
2020-09-24 17:49:49,256 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 17:49:53,257 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.40s/step
2020-09-24 17:49:57,075 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.38s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 3912/3912 [00:00<00:00, 17266.92it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 2352/2352 [00:00<00:00, 12761.67it/s]
Epoch 60/120
=========================

Validation cost: 0.000359
Mean average_precision (in %): 64.0102

class name      average precision (in %)
------------  --------------------------
mask                             56.3707
no-mask                          71.6497

Median Inference Time: 0.005851
2020-09-24 17:50:02,008 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 60/120: loss: 0.00057 Time taken: 0:00:55.557955 ETA: 0:55:33.477287
2020-09-24 17:50:08,242 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 24.231
2020-09-24 17:50:16,404 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.513
2020-09-24 17:50:24,621 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.020
2020-09-24 17:50:32,843 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.979
2020-09-24 17:50:40,831 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 61/120: loss: 0.00059 Time taken: 0:00:38.854311 ETA: 0:38:12.404320
2020-09-24 17:50:41,161 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.132
2020-09-24 17:50:49,427 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.597
2020-09-24 17:50:57,628 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.161
2020-09-24 17:51:05,884 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.673
2020-09-24 17:51:14,056 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.428
2020-09-24 17:51:19,631 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 62/120: loss: 0.00067 Time taken: 0:00:38.788584 ETA: 0:37:29.737899
2020-09-24 17:51:22,261 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.126
2020-09-24 17:51:30,565 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.262
2020-09-24 17:51:38,795 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.907
2020-09-24 17:51:47,118 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.095
2020-09-24 17:51:55,404 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.411
2020-09-24 17:51:58,714 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 63/120: loss: 0.00049 Time taken: 0:00:39.083600 ETA: 0:37:07.765203
2020-09-24 17:52:03,608 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.141
2020-09-24 17:52:11,766 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.551
2020-09-24 17:52:20,016 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.727
2020-09-24 17:52:28,357 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 71.934
2020-09-24 17:52:36,673 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.159
2020-09-24 17:52:37,664 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 64/120: loss: 0.00053 Time taken: 0:00:38.947006 ETA: 0:36:21.032349
2020-09-24 17:52:44,915 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.799
2020-09-24 17:52:53,229 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.166
2020-09-24 17:53:01,461 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.895
2020-09-24 17:53:09,752 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.371
2020-09-24 17:53:16,793 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 65/120: loss: 0.00062 Time taken: 0:00:39.108348 ETA: 0:35:50.959134
2020-09-24 17:53:18,127 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 71.644
2020-09-24 17:53:26,328 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.163
2020-09-24 17:53:34,551 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.967
2020-09-24 17:53:42,797 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.768
2020-09-24 17:53:51,070 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.523
2020-09-24 17:53:55,665 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 66/120: loss: 0.00052 Time taken: 0:00:38.897042 ETA: 0:35:00.440257
2020-09-24 17:53:59,296 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.945
2020-09-24 17:54:07,576 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.464
2020-09-24 17:54:15,835 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.655
2020-09-24 17:54:24,058 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.966
2020-09-24 17:54:32,254 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.208
2020-09-24 17:54:34,566 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 67/120: loss: 0.00057 Time taken: 0:00:38.894204 ETA: 0:34:21.392794
2020-09-24 17:54:40,440 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.304
2020-09-24 17:54:48,707 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.583
2020-09-24 17:54:56,977 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.551
2020-09-24 17:55:05,207 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.911
2020-09-24 17:55:13,441 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 68/120: loss: 0.00051 Time taken: 0:00:38.879704 ETA: 0:33:41.744583
2020-09-24 17:55:13,442 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.863
2020-09-24 17:55:21,713 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.543
2020-09-24 17:55:29,839 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.841
2020-09-24 17:55:38,063 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.961
2020-09-24 17:55:46,363 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.289
2020-09-24 17:55:52,273 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 69/120: loss: 0.00050 Time taken: 0:00:38.794140 ETA: 0:32:58.501157
2020-09-24 17:55:54,601 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.833
2020-09-24 17:56:02,808 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.111
2020-09-24 17:56:11,051 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.792
2020-09-24 17:56:19,263 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.072
2020-09-24 17:56:27,478 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.038
2020-09-24 17:56:34,933 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 17:56:38,512 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.36s/step
2020-09-24 17:56:41,885 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.34s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 2723/2723 [00:00<00:00, 16195.26it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 2388/2388 [00:00<00:00, 13698.95it/s]
Epoch 70/120
=========================

Validation cost: 0.000361
Mean average_precision (in %): 76.2096

class name      average precision (in %)
------------  --------------------------
mask                             74.1373
no-mask                          78.2819

Median Inference Time: 0.005919
2020-09-24 17:56:46,346 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 70/120: loss: 0.00050 Time taken: 0:00:54.099905 ETA: 0:45:04.995227
2020-09-24 17:56:50,932 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 25.582
2020-09-24 17:56:59,150 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.011
2020-09-24 17:57:07,405 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.683
2020-09-24 17:57:15,612 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.115
2020-09-24 17:57:23,846 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.869
2020-09-24 17:57:25,182 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 71/120: loss: 0.00049 Time taken: 0:00:38.830585 ETA: 0:31:42.698642
2020-09-24 17:57:32,142 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.333
2020-09-24 17:57:40,362 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.993
2020-09-24 17:57:48,594 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.893
2020-09-24 17:57:56,870 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.502
2020-09-24 17:58:04,169 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 72/120: loss: 0.00041 Time taken: 0:00:38.981803 ETA: 0:31:11.126564
2020-09-24 17:58:05,153 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.440
2020-09-24 17:58:13,408 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.681
2020-09-24 17:58:21,631 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.969
2020-09-24 17:58:29,886 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.685
2020-09-24 17:58:38,079 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.236
2020-09-24 17:58:43,065 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 73/120: loss: 0.00049 Time taken: 0:00:38.899410 ETA: 0:30:28.272259
2020-09-24 17:58:46,374 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.342
2020-09-24 17:58:54,541 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.470
2020-09-24 17:59:02,789 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.747
2020-09-24 17:59:10,986 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.201
2020-09-24 17:59:19,174 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.279
2020-09-24 17:59:21,842 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 74/120: loss: 0.00050 Time taken: 0:00:38.777786 ETA: 0:29:43.778179
2020-09-24 17:59:27,434 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.642
2020-09-24 17:59:35,711 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.492
2020-09-24 17:59:43,883 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.422
2020-09-24 17:59:52,150 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.587
2020-09-24 18:00:00,330 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.351
2020-09-24 18:00:00,654 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 75/120: loss: 0.00042 Time taken: 0:00:38.820855 ETA: 0:29:06.938460
2020-09-24 18:00:08,563 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.889
2020-09-24 18:00:16,749 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.297
2020-09-24 18:00:24,974 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.948
2020-09-24 18:00:33,246 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.541
2020-09-24 18:00:39,560 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 76/120: loss: 0.00045 Time taken: 0:00:38.888927 ETA: 0:28:31.112777
2020-09-24 18:00:41,531 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.425
2020-09-24 18:00:49,795 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.607
2020-09-24 18:00:58,090 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.329
2020-09-24 18:01:06,276 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.299
2020-09-24 18:01:14,535 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.654
2020-09-24 18:01:18,503 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 77/120: loss: 0.00039 Time taken: 0:00:38.960826 ETA: 0:27:55.315535
2020-09-24 18:01:22,754 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.006
2020-09-24 18:01:31,044 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.377
2020-09-24 18:01:39,334 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.376
2020-09-24 18:01:47,590 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.678
2020-09-24 18:01:55,832 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.806
2020-09-24 18:01:57,470 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 78/120: loss: 0.00042 Time taken: 0:00:38.951020 ETA: 0:27:15.942830
2020-09-24 18:02:04,053 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.986
2020-09-24 18:02:12,317 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.606
2020-09-24 18:02:20,659 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 71.923
2020-09-24 18:02:28,866 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.111
2020-09-24 18:02:36,422 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 79/120: loss: 0.00041 Time taken: 0:00:38.958478 ETA: 0:26:37.297607
2020-09-24 18:02:37,069 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.149
2020-09-24 18:02:45,272 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.151
2020-09-24 18:02:53,536 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.606
2020-09-24 18:03:01,798 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.626
2020-09-24 18:03:10,006 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.099
2020-09-24 18:03:19,119 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 18:03:23,385 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.43s/step
2020-09-24 18:03:27,314 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.39s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 4739/4739 [00:00<00:00, 14859.13it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 3681/3681 [00:00<00:00, 13849.29it/s]
Epoch 80/120
=========================

Validation cost: 0.000328
Mean average_precision (in %): 75.3696

class name      average precision (in %)
------------  --------------------------
mask                             74.7356
no-mask                          76.0036

Median Inference Time: 0.005992
2020-09-24 18:03:32,538 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 80/120: loss: 0.00052 Time taken: 0:00:56.114087 ETA: 0:37:24.563494
2020-09-24 18:03:35,454 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 23.578
2020-09-24 18:03:43,725 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.550
2020-09-24 18:03:51,931 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.117
2020-09-24 18:04:00,112 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.350
2020-09-24 18:04:08,401 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.385
2020-09-24 18:04:11,348 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 81/120: loss: 0.00048 Time taken: 0:00:38.818861 ETA: 0:25:13.935579
2020-09-24 18:04:16,639 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.832
2020-09-24 18:04:24,887 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.754
2020-09-24 18:04:33,175 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.394
2020-09-24 18:04:41,346 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.431
2020-09-24 18:04:49,596 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.737
2020-09-24 18:04:50,286 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 82/120: loss: 0.00068 Time taken: 0:00:38.913115 ETA: 0:24:38.698380
2020-09-24 18:04:57,821 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.944
2020-09-24 18:05:06,186 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 71.737
2020-09-24 18:05:14,435 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.733
2020-09-24 18:05:22,666 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.902
2020-09-24 18:05:29,317 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 83/120: loss: 0.00049 Time taken: 0:00:39.026641 ETA: 0:24:03.985704
2020-09-24 18:05:30,996 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.028
2020-09-24 18:05:39,230 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.879
2020-09-24 18:05:47,478 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.743
2020-09-24 18:05:55,717 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.832
2020-09-24 18:06:03,966 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.733
2020-09-24 18:06:08,265 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 84/120: loss: 0.00051 Time taken: 0:00:38.979883 ETA: 0:23:23.275778
2020-09-24 18:06:12,210 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.781
2020-09-24 18:06:20,371 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.525
2020-09-24 18:06:28,598 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.934
2020-09-24 18:06:36,911 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.180
2020-09-24 18:06:45,173 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.622
2020-09-24 18:06:47,172 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 85/120: loss: 0.00043 Time taken: 0:00:38.882149 ETA: 0:22:40.875214
2020-09-24 18:06:53,436 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.621
2020-09-24 18:07:01,631 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.217
2020-09-24 18:07:09,882 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.722
2020-09-24 18:07:18,147 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.600
2020-09-24 18:07:26,054 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 86/120: loss: 0.00041 Time taken: 0:00:38.897759 ETA: 0:22:02.523805
2020-09-24 18:07:26,378 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.892
2020-09-24 18:07:34,594 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.035
2020-09-24 18:07:42,820 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.941
2020-09-24 18:07:51,097 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.490
2020-09-24 18:07:59,285 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.285
2020-09-24 18:08:04,927 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 87/120: loss: 0.00039 Time taken: 0:00:38.867141 ETA: 0:21:22.615653
2020-09-24 18:08:07,570 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.419
2020-09-24 18:08:15,840 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.554
2020-09-24 18:08:24,055 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.041
2020-09-24 18:08:32,329 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.520
2020-09-24 18:08:40,563 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.871
2020-09-24 18:08:43,879 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 88/120: loss: 0.00036 Time taken: 0:00:38.945960 ETA: 0:20:46.270721
2020-09-24 18:08:48,826 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.618
2020-09-24 18:08:57,052 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.938
2020-09-24 18:09:05,343 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.371
2020-09-24 18:09:13,555 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.071
2020-09-24 18:09:21,713 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.547
2020-09-24 18:09:22,733 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 89/120: loss: 0.00033 Time taken: 0:00:38.843846 ETA: 0:20:04.159214
2020-09-24 18:09:30,105 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 71.504
2020-09-24 18:09:38,412 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.228
2020-09-24 18:09:46,620 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.101
2020-09-24 18:09:54,893 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.533
2020-09-24 18:10:05,748 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 18:10:09,209 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.35s/step
2020-09-24 18:10:12,418 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.32s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 2149/2149 [00:00<00:00, 13514.51it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 1671/1671 [00:00<00:00, 12577.15it/s]
Epoch 90/120
=========================

Validation cost: 0.000260
Mean average_precision (in %): 83.0188

class name      average precision (in %)
------------  --------------------------
mask                             82.6008
no-mask                          83.4369

Median Inference Time: 0.005855
2020-09-24 18:10:16,399 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 90/120: loss: 0.00037 Time taken: 0:00:53.677426 ETA: 0:26:50.322769
2020-09-24 18:10:17,711 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 26.295
2020-09-24 18:10:25,968 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.668
2020-09-24 18:10:34,207 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.831
2020-09-24 18:10:42,471 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.607
2020-09-24 18:10:50,832 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 71.763
2020-09-24 18:10:55,431 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 91/120: loss: 0.00034 Time taken: 0:00:39.027700 ETA: 0:18:51.803305
2020-09-24 18:10:59,073 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.803
2020-09-24 18:11:07,301 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.929
2020-09-24 18:11:15,530 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.913
2020-09-24 18:11:23,830 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.293
2020-09-24 18:11:32,079 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.739
2020-09-24 18:11:34,413 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 92/120: loss: 0.00034 Time taken: 0:00:38.966872 ETA: 0:18:11.072402
2020-09-24 18:11:40,329 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.733
2020-09-24 18:11:48,498 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.448
2020-09-24 18:11:56,715 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.023
2020-09-24 18:12:04,973 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.658
2020-09-24 18:12:13,236 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 93/120: loss: 0.00034 Time taken: 0:00:38.840823 ETA: 0:17:28.702232
2020-09-24 18:12:13,236 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.616
2020-09-24 18:12:21,449 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.055
2020-09-24 18:12:29,710 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.637
2020-09-24 18:12:37,890 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.350
2020-09-24 18:12:46,133 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.793
2020-09-24 18:12:52,065 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 94/120: loss: 0.00027 Time taken: 0:00:38.834870 ETA: 0:16:49.706623
2020-09-24 18:12:54,384 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.725
2020-09-24 18:13:02,630 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.760
2020-09-24 18:13:10,922 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.360
2020-09-24 18:13:19,245 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.093
2020-09-24 18:13:27,506 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.634
2020-09-24 18:13:31,156 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 95/120: loss: 0.00033 Time taken: 0:00:39.078854 ETA: 0:16:16.971346
2020-09-24 18:13:35,840 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.003
2020-09-24 18:13:43,984 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.672
2020-09-24 18:13:52,197 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.055
2020-09-24 18:14:00,452 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.687
2020-09-24 18:14:08,708 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.679
2020-09-24 18:14:10,058 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 96/120: loss: 0.00029 Time taken: 0:00:38.893043 ETA: 0:15:33.433027
2020-09-24 18:14:16,971 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.614
2020-09-24 18:14:25,255 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.438
2020-09-24 18:14:33,532 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.490
2020-09-24 18:14:41,799 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.583
2020-09-24 18:14:49,075 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 97/120: loss: 0.00029 Time taken: 0:00:39.032681 ETA: 0:14:57.751663
2020-09-24 18:14:50,039 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.817
2020-09-24 18:14:58,266 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.935
2020-09-24 18:15:06,491 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.945
2020-09-24 18:15:14,757 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.593
2020-09-24 18:15:22,926 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.452
2020-09-24 18:15:27,903 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 98/120: loss: 0.00033 Time taken: 0:00:38.808059 ETA: 0:14:13.777297
2020-09-24 18:15:31,199 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.527
2020-09-24 18:15:39,431 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.892
2020-09-24 18:15:47,616 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.302
2020-09-24 18:15:55,867 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.720
2020-09-24 18:16:04,120 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.704
2020-09-24 18:16:06,732 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 99/120: loss: 0.00036 Time taken: 0:00:38.850206 ETA: 0:13:35.854334
2020-09-24 18:16:12,326 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.124
2020-09-24 18:16:20,602 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.501
2020-09-24 18:16:28,753 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.615
2020-09-24 18:16:36,925 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.422
2020-09-24 18:16:49,264 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 18:16:52,875 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.36s/step
2020-09-24 18:16:55,996 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.31s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 2185/2185 [00:00<00:00, 13183.24it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 1548/1548 [00:00<00:00, 13041.88it/s]
Epoch 100/120
=========================

Validation cost: 0.000255
Mean average_precision (in %): 84.2315

class name      average precision (in %)
------------  --------------------------
mask                             86.3382
no-mask                          82.1248

Median Inference Time: 0.005855
2020-09-24 18:16:59,687 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 26.361
2020-09-24 18:17:00,018 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 100/120: loss: 0.00032 Time taken: 0:00:53.276767 ETA: 0:17:45.535336
2020-09-24 18:17:07,869 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.345
2020-09-24 18:17:16,077 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.106
2020-09-24 18:17:24,350 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.528
2020-09-24 18:17:32,633 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.439
2020-09-24 18:17:38,872 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 101/120: loss: 0.00026 Time taken: 0:00:38.859047 ETA: 0:12:18.321896
2020-09-24 18:17:40,865 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.885
2020-09-24 18:17:49,092 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.937
2020-09-24 18:17:57,378 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.417
2020-09-24 18:18:05,617 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.822
2020-09-24 18:18:13,838 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.985
2020-09-24 18:18:17,765 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 102/120: loss: 0.00026 Time taken: 0:00:38.878018 ETA: 0:11:39.804327
2020-09-24 18:18:22,060 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.983
2020-09-24 18:18:30,270 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.083
2020-09-24 18:18:38,569 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.297
2020-09-24 18:18:46,861 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.362
2020-09-24 18:18:55,172 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.202
2020-09-24 18:18:56,826 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 103/120: loss: 0.00028 Time taken: 0:00:39.063923 ETA: 0:11:04.086697
2020-09-24 18:19:03,456 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.433
2020-09-24 18:19:11,639 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.322
2020-09-24 18:19:19,820 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.342
2020-09-24 18:19:28,101 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.465
2020-09-24 18:19:35,634 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 104/120: loss: 0.00030 Time taken: 0:00:38.801921 ETA: 0:10:20.830734
2020-09-24 18:19:36,281 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.346
2020-09-24 18:19:44,498 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.021
2020-09-24 18:19:52,749 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.725
2020-09-24 18:20:01,024 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.508
2020-09-24 18:20:09,256 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.895
2020-09-24 18:20:14,545 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 105/120: loss: 0.00029 Time taken: 0:00:38.915925 ETA: 0:09:43.738872
2020-09-24 18:20:17,543 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.404
2020-09-24 18:20:25,811 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.570
2020-09-24 18:20:34,088 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.491
2020-09-24 18:20:42,282 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.232
2020-09-24 18:20:50,543 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.627
2020-09-24 18:20:53,485 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 106/120: loss: 0.00027 Time taken: 0:00:38.937352 ETA: 0:09:05.122931
2020-09-24 18:20:58,754 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.079
2020-09-24 18:21:06,996 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.800
2020-09-24 18:21:15,181 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.309
2020-09-24 18:21:23,302 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.882
2020-09-24 18:21:31,494 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.245
2020-09-24 18:21:32,145 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 107/120: loss: 0.00026 Time taken: 0:00:38.674176 ETA: 0:08:22.764291
2020-09-24 18:21:39,741 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.761
2020-09-24 18:21:47,988 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.755
2020-09-24 18:21:56,200 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.063
2020-09-24 18:22:04,410 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.090
2020-09-24 18:22:11,028 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 108/120: loss: 0.00029 Time taken: 0:00:38.864082 ETA: 0:07:46.368979
2020-09-24 18:22:12,664 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.698
2020-09-24 18:22:20,832 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.452
2020-09-24 18:22:29,026 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.228
2020-09-24 18:22:37,198 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.426
2020-09-24 18:22:45,430 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.891
2020-09-24 18:22:49,699 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 109/120: loss: 0.00023 Time taken: 0:00:38.679803 ETA: 0:07:05.477837
2020-09-24 18:22:53,660 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.905
2020-09-24 18:23:01,862 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.161
2020-09-24 18:23:10,097 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.859
2020-09-24 18:23:18,244 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.646
2020-09-24 18:23:26,466 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.984
2020-09-24 18:23:32,254 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 18:23:35,797 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.35s/step
2020-09-24 18:23:38,878 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.31s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 2115/2115 [00:00<00:00, 13057.48it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 1473/1473 [00:00<00:00, 13169.53it/s]
Epoch 110/120
=========================

Validation cost: 0.000255
Mean average_precision (in %): 84.2377

class name      average precision (in %)
------------  --------------------------
mask                             86.6117
no-mask                          81.8638

Median Inference Time: 0.005806
2020-09-24 18:23:42,802 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 110/120: loss: 0.00030 Time taken: 0:00:53.113684 ETA: 0:08:51.136839
2020-09-24 18:23:49,049 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 26.569
2020-09-24 18:23:57,257 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.097
2020-09-24 18:24:05,516 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.653
2020-09-24 18:24:13,719 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.151
2020-09-24 18:24:21,735 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 111/120: loss: 0.00025 Time taken: 0:00:38.931450 ETA: 0:05:50.383049
2020-09-24 18:24:22,095 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 71.631
2020-09-24 18:24:30,297 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.159
2020-09-24 18:24:38,548 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.723
2020-09-24 18:24:46,842 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.343
2020-09-24 18:24:55,188 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 71.891
2020-09-24 18:25:00,738 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 112/120: loss: 0.00026 Time taken: 0:00:38.993617 ETA: 0:05:11.948938
2020-09-24 18:25:03,389 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.170
2020-09-24 18:25:11,567 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.367
2020-09-24 18:25:19,831 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.604
2020-09-24 18:25:28,071 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.825
2020-09-24 18:25:36,356 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.420
2020-09-24 18:25:39,650 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 113/120: loss: 0.00020 Time taken: 0:00:38.896214 ETA: 0:04:32.273495
2020-09-24 18:25:44,630 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.517
2020-09-24 18:25:52,830 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.174
2020-09-24 18:26:01,043 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.063
2020-09-24 18:26:09,283 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.817
2020-09-24 18:26:17,474 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.253
2020-09-24 18:26:18,460 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 114/120: loss: 0.00025 Time taken: 0:00:38.829857 ETA: 0:03:52.979140
2020-09-24 18:26:25,719 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.775
2020-09-24 18:26:33,861 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.695
2020-09-24 18:26:42,089 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.924
2020-09-24 18:26:50,305 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.034
2020-09-24 18:26:57,250 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 115/120: loss: 0.00024 Time taken: 0:00:38.780357 ETA: 0:03:13.901787
2020-09-24 18:26:58,590 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.421
2020-09-24 18:27:06,918 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.047
2020-09-24 18:27:15,153 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.863
2020-09-24 18:27:23,416 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.618
2020-09-24 18:27:31,651 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.858
2020-09-24 18:27:36,303 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 116/120: loss: 0.00026 Time taken: 0:00:39.051358 ETA: 0:02:36.205434
2020-09-24 18:27:39,959 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.227
2020-09-24 18:27:48,195 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.856
2020-09-24 18:27:56,398 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.143
2020-09-24 18:28:04,635 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.844
2020-09-24 18:28:12,901 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.586
2020-09-24 18:28:15,192 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 117/120: loss: 0.00029 Time taken: 0:00:38.892842 ETA: 0:01:56.678525
2020-09-24 18:28:21,115 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.053
2020-09-24 18:28:29,377 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.630
2020-09-24 18:28:37,667 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.379
2020-09-24 18:28:45,975 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.220
2020-09-24 18:28:54,270 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 118/120: loss: 0.00024 Time taken: 0:00:39.080097 ETA: 0:01:18.160194
2020-09-24 18:28:54,270 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.333
2020-09-24 18:29:02,530 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.638
2020-09-24 18:29:10,828 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.317
2020-09-24 18:29:19,074 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.759
2020-09-24 18:29:27,272 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.197
2020-09-24 18:29:33,221 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 119/120: loss: 0.00029 Time taken: 0:00:38.949450 ETA: 0:00:38.949450
2020-09-24 18:29:35,561 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.387
2020-09-24 18:29:43,789 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.919
2020-09-24 18:29:51,968 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 73.361
2020-09-24 18:30:00,282 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.175
2020-09-24 18:30:08,608 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.064
2020-09-24 18:30:15,973 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 18:30:19,519 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.35s/step
2020-09-24 18:30:22,636 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.31s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 2140/2140 [00:00<00:00, 13313.59it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 1511/1511 [00:00<00:00, 13359.68it/s]
Epoch 120/120
=========================

Validation cost: 0.000251
Mean average_precision (in %): 84.2905

class name      average precision (in %)
------------  --------------------------
mask                             86.5548
no-mask                          82.0262

Median Inference Time: 0.005813
2020-09-24 18:30:26,863 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 72.064
Time taken to run iva.detectnet_v2.scripts.train:main: 1:28:03.683443.

Once training is complete, you'll be able to see the output of the transfer learning for each epoch by running the code cell below:

In [22]:
print('Model for each epoch:')
print('---------------------')
!ls -lh $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights
Out[22]:
Model for each epoch:
---------------------
total 43M
-rw-r--r-- 1 root root 43M Sep 24 18:30 resnet18_detector.tlt
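
If you'd like a quick visual of how training progressed, here's an optional sketch (not part of the original notebook) that plots the mean average precision reported at each evaluation interval in the training log above. The epoch numbers and mAP values are transcribed from that log output, and matplotlib is assumed to be available in the TLT container.

# Optional: plot the validation mAP reported every 10 epochs during training.
# Values transcribed from the training log output above.
import matplotlib.pyplot as plt

epochs = [40, 50, 60, 70, 80, 90, 100, 110, 120]
mean_ap = [65.41, 69.70, 64.01, 76.21, 75.37, 83.02, 84.23, 84.24, 84.29]

plt.plot(epochs, mean_ap, marker='o')
plt.xlabel('Epoch')
plt.ylabel('Mean average precision (%)')
plt.title('Validation mAP during transfer learning')
plt.grid(True)
plt.show()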

Step 9 - Evaluate The Trained Model

Run the code cell below to evaluate the trained model's performance! How good is our face mask detector?

  • -e - The experiment spec file we used for training
  • -m - The path to the (re)trained model
  • -k - The model key we specified earlier
In [23]:
!tlt-evaluate detectnet_v2 -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
                           -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.tlt \
                           -k $KEY
Out[23]:
Using TensorFlow backend.
2020-09-24 18:32:17.102917: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 18:32:20,355 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt
2020-09-24 18:32:22.151613: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-09-24 18:32:22.203782: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:32:22.204743: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 18:32:22.204784: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 18:32:22.204850: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 18:32:22.206200: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 18:32:22.206578: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 18:32:22.208342: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 18:32:22.209661: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 18:32:22.209743: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 18:32:22.209857: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:32:22.210885: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:32:22.211811: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 18:32:22.211858: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 18:32:23.048904: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-24 18:32:23.048965: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-09-24 18:32:23.048986: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-09-24 18:32:23.049237: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:32:23.050266: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:32:23.051239: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:32:23.052085: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14636 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)
2020-09-24 18:32:24,283 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2020-09-24 18:32:24,284 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2020-09-24 18:32:24,284 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2020-09-24 18:32:24,284 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 8, io threads: 16, compute threads: 8, buffered batches: 4
2020-09-24 18:32:24,284 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 706, number of sources: 1, batch size per gpu: 24, steps: 30
2020-09-24 18:32:24,397 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2020-09-24 18:32:24.438994: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:32:24.439441: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 18:32:24.439485: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 18:32:24.439535: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 18:32:24.439578: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 18:32:24.439618: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 18:32:24.439659: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 18:32:24.439699: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 18:32:24.439740: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 18:32:24.439835: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:32:24.440265: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:32:24.440624: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 18:32:24,827 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: shuffle: False - shard 0 of 1
2020-09-24 18:32:24,834 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: sampling 1 datasets with weights:
2020-09-24 18:32:24,835 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: source: 0 weight: 1.000000
2020-09-24 18:32:25,290 [INFO] iva.detectnet_v2.evaluation.build_evaluator: Found 706 samples in validation set
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 3, 544, 960)  0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 64, 272, 480) 9472        input_1[0][0]                    
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 64, 272, 480) 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 64, 272, 480) 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (None, 64, 136, 240) 36928       activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 64, 136, 240) 256         block_1a_conv_1[0][0]            
__________________________________________________________________________________________________
block_1a_relu_1 (Activation)    (None, 64, 136, 240) 0           block_1a_bn_1[0][0]              
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (None, 64, 136, 240) 36928       block_1a_relu_1[0][0]            
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, 136, 240) 4160        activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 64, 136, 240) 256         block_1a_conv_2[0][0]            
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 64, 136, 240) 256         block_1a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_1 (Add)                     (None, 64, 136, 240) 0           block_1a_bn_2[0][0]              
                                                                 block_1a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_1a_relu (Activation)      (None, 64, 136, 240) 0           add_1[0][0]                      
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (None, 64, 136, 240) 36928       block_1a_relu[0][0]              
__________________________________________________________________________________________________
block_1b_bn_1 (BatchNormalizati (None, 64, 136, 240) 256         block_1b_conv_1[0][0]            
__________________________________________________________________________________________________
block_1b_relu_1 (Activation)    (None, 64, 136, 240) 0           block_1b_bn_1[0][0]              
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (None, 64, 136, 240) 36928       block_1b_relu_1[0][0]            
__________________________________________________________________________________________________
block_1b_bn_2 (BatchNormalizati (None, 64, 136, 240) 256         block_1b_conv_2[0][0]            
__________________________________________________________________________________________________
add_2 (Add)                     (None, 64, 136, 240) 0           block_1b_bn_2[0][0]              
                                                                 block_1a_relu[0][0]              
__________________________________________________________________________________________________
block_1b_relu (Activation)      (None, 64, 136, 240) 0           add_2[0][0]                      
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (None, 128, 68, 120) 73856       block_1b_relu[0][0]              
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 128, 68, 120) 512         block_2a_conv_1[0][0]            
__________________________________________________________________________________________________
block_2a_relu_1 (Activation)    (None, 128, 68, 120) 0           block_2a_bn_1[0][0]              
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (None, 128, 68, 120) 147584      block_2a_relu_1[0][0]            
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, 68, 120) 8320        block_1b_relu[0][0]              
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 128, 68, 120) 512         block_2a_conv_2[0][0]            
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 128, 68, 120) 512         block_2a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_3 (Add)                     (None, 128, 68, 120) 0           block_2a_bn_2[0][0]              
                                                                 block_2a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_2a_relu (Activation)      (None, 128, 68, 120) 0           add_3[0][0]                      
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (None, 128, 68, 120) 147584      block_2a_relu[0][0]              
__________________________________________________________________________________________________
block_2b_bn_1 (BatchNormalizati (None, 128, 68, 120) 512         block_2b_conv_1[0][0]            
__________________________________________________________________________________________________
block_2b_relu_1 (Activation)    (None, 128, 68, 120) 0           block_2b_bn_1[0][0]              
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (None, 128, 68, 120) 147584      block_2b_relu_1[0][0]            
__________________________________________________________________________________________________
block_2b_bn_2 (BatchNormalizati (None, 128, 68, 120) 512         block_2b_conv_2[0][0]            
__________________________________________________________________________________________________
add_4 (Add)                     (None, 128, 68, 120) 0           block_2b_bn_2[0][0]              
                                                                 block_2a_relu[0][0]              
__________________________________________________________________________________________________
block_2b_relu (Activation)      (None, 128, 68, 120) 0           add_4[0][0]                      
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (None, 256, 34, 60)  295168      block_2b_relu[0][0]              
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 256, 34, 60)  1024        block_3a_conv_1[0][0]            
__________________________________________________________________________________________________
block_3a_relu_1 (Activation)    (None, 256, 34, 60)  0           block_3a_bn_1[0][0]              
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (None, 256, 34, 60)  590080      block_3a_relu_1[0][0]            
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 34, 60)  33024       block_2b_relu[0][0]              
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 256, 34, 60)  1024        block_3a_conv_2[0][0]            
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 256, 34, 60)  1024        block_3a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_5 (Add)                     (None, 256, 34, 60)  0           block_3a_bn_2[0][0]              
                                                                 block_3a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_3a_relu (Activation)      (None, 256, 34, 60)  0           add_5[0][0]                      
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (None, 256, 34, 60)  590080      block_3a_relu[0][0]              
__________________________________________________________________________________________________
block_3b_bn_1 (BatchNormalizati (None, 256, 34, 60)  1024        block_3b_conv_1[0][0]            
__________________________________________________________________________________________________
block_3b_relu_1 (Activation)    (None, 256, 34, 60)  0           block_3b_bn_1[0][0]              
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (None, 256, 34, 60)  590080      block_3b_relu_1[0][0]            
__________________________________________________________________________________________________
block_3b_bn_2 (BatchNormalizati (None, 256, 34, 60)  1024        block_3b_conv_2[0][0]            
__________________________________________________________________________________________________
add_6 (Add)                     (None, 256, 34, 60)  0           block_3b_bn_2[0][0]              
                                                                 block_3a_relu[0][0]              
__________________________________________________________________________________________________
block_3b_relu (Activation)      (None, 256, 34, 60)  0           add_6[0][0]                      
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D)        (None, 512, 34, 60)  1180160     block_3b_relu[0][0]              
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (None, 512, 34, 60)  2048        block_4a_conv_1[0][0]            
__________________________________________________________________________________________________
block_4a_relu_1 (Activation)    (None, 512, 34, 60)  0           block_4a_bn_1[0][0]              
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D)        (None, 512, 34, 60)  2359808     block_4a_relu_1[0][0]            
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 512, 34, 60)  131584      block_3b_relu[0][0]              
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (None, 512, 34, 60)  2048        block_4a_conv_2[0][0]            
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (None, 512, 34, 60)  2048        block_4a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_7 (Add)                     (None, 512, 34, 60)  0           block_4a_bn_2[0][0]              
                                                                 block_4a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_4a_relu (Activation)      (None, 512, 34, 60)  0           add_7[0][0]                      
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D)        (None, 512, 34, 60)  2359808     block_4a_relu[0][0]              
__________________________________________________________________________________________________
block_4b_bn_1 (BatchNormalizati (None, 512, 34, 60)  2048        block_4b_conv_1[0][0]            
__________________________________________________________________________________________________
block_4b_relu_1 (Activation)    (None, 512, 34, 60)  0           block_4b_bn_1[0][0]              
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D)        (None, 512, 34, 60)  2359808     block_4b_relu_1[0][0]            
__________________________________________________________________________________________________
block_4b_bn_2 (BatchNormalizati (None, 512, 34, 60)  2048        block_4b_conv_2[0][0]            
__________________________________________________________________________________________________
add_8 (Add)                     (None, 512, 34, 60)  0           block_4b_bn_2[0][0]              
                                                                 block_4a_relu[0][0]              
__________________________________________________________________________________________________
block_4b_relu (Activation)      (None, 512, 34, 60)  0           add_8[0][0]                      
__________________________________________________________________________________________________
output_bbox (Conv2D)            (None, 8, 34, 60)    4104        block_4b_relu[0][0]              
__________________________________________________________________________________________________
output_cov (Conv2D)             (None, 2, 34, 60)    1026        block_4b_relu[0][0]              
==================================================================================================
Total params: 11,200,458
Trainable params: 11,190,730
Non-trainable params: 9,728
__________________________________________________________________________________________________
2020-09-24 18:32:26.851020: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:32:26.851461: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 18:32:26.851519: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 18:32:26.851591: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 18:32:26.851633: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 18:32:26.851673: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 18:32:26.851713: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 18:32:26.851753: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 18:32:26.851795: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 18:32:26.851891: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:32:26.852325: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:32:26.852688: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 18:32:26.854072: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-24 18:32:26.854104: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-09-24 18:32:26.854118: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-09-24 18:32:26.854261: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:32:26.854715: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:32:26.855095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14636 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)
2020-09-24 18:32:28,625 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 30, 0.00s/step
2020-09-24 18:32:29.564407: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 18:32:29.587445: I tensorflow/core/kernels/cuda_solvers.cc:159] Creating CudaSolver handles for stream 0x887a280
2020-09-24 18:32:29.587749: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 18:32:30.495204: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 18:32:30.732476: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 18:32:35,959 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 30, 0.73s/step
2020-09-24 18:32:39,290 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 30, 0.33s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 2184/2184 [00:00<00:00, 13266.41it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 1582/1582 [00:00<00:00, 12866.21it/s]

Validation cost: 0.001531
Mean average_precision (in %): 84.4518

class name      average precision (in %)
------------  --------------------------
mask                             86.629
no-mask                          82.2747

Median Inference Time: 0.005855
2020-09-24 18:32:43,172 [INFO] iva.detectnet_v2.scripts.evaluate: Evaluation complete.
Time taken to run iva.detectnet_v2.scripts.evaluate:main: 0:00:22.819600.
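
If you're planning to wire this into a deployment pipeline later on, it can be handy to gate on the evaluation result automatically. The snippet below is just an illustrative sketch (the eval.log capture and the 80% bar are our own assumptions, not part of the demo): it pulls the "Mean average_precision" line out of a saved tlt-evaluate log and checks it against a minimum value.

import re, sys

def check_map(log_path, min_map=80.0):
    # Look for the "Mean average_precision (in %): 84.4518" style line
    # that tlt-evaluate prints at the end of its run.
    with open(log_path) as f:
        match = re.search(r"Mean average_precision \(in %\):\s*([\d.]+)", f.read())
    if not match:
        sys.exit("Could not find a mAP line in the log")
    map_value = float(match.group(1))
    print(f"mAP = {map_value:.2f}% (minimum required: {min_map}%)")
    return map_value >= min_map

# Example usage, assuming the output of the cell above was captured to eval.log:
# if not check_map("eval.log"):
#     sys.exit("Model is below the accuracy bar - don't promote it to the registry")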

Step 10 - Prune The Trained Model

Pruning our model is a really important step. Removing unnecessary weights makes the model smaller and faster at inference, which is imperative for our streaming video use case. The parameters are as follows:

  • -m - The trained model we'd like to prune
  • -eq - Equalization criterion (applicable to ResNets and MobileNets)
  • -pth - Threshold for pruning (0.01 is a good starting point for detectnet_v2 models; this demo uses a more aggressive 0.8)
  • -k - The key used to save/load the model
  • -o - Output directory for the pruned model

The code cell below will prune our model:

In [24]:
# Create an output directory if it doesn't exist.
!mkdir -p $USER_EXPERIMENT_DIR/experiment_dir_pruned
In [25]:
# Prune all the layers
!tlt-prune -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.tlt \
           -o $USER_EXPERIMENT_DIR/experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.tlt \
           -eq union \
           -pth 0.8 \
           -k $KEY
Out[25]:
2020-09-24 18:33:07.980539: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
Using TensorFlow backend.
2020-09-24 18:33:13.039763: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-09-24 18:33:13.085207: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:33:13.086204: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 18:33:13.086248: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 18:33:13.087768: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 18:33:13.089040: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 18:33:13.089401: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 18:33:13.091168: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 18:33:13.092472: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 18:33:13.096605: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 18:33:13.096730: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:33:13.097725: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:33:13.098664: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 18:33:13.098717: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 18:33:13.621359: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-24 18:33:13.621419: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-09-24 18:33:13.621440: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-09-24 18:33:13.621679: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:33:13.622709: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:33:13.623682: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:33:13.624632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14803 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)
2020-09-24 18:33:14,552 [INFO] modulus.pruning.pruning: Exploring graph for retainable indices
2020-09-24 18:33:15,152 [INFO] modulus.pruning.pruning: Pruning model and appending pruned nodes to new graph
2020-09-24 18:33:31,426 [INFO] iva.common.magnet_prune: Pruning ratio (pruned model / original model): 0.024069194313303975

Now let's take a look at the output of the pruning:

In [26]:
!ls -rlt $USER_EXPERIMENT_DIR/experiment_dir_pruned/
Out[26]:
total 1260
-rw-r--r-- 1 root root 1286720 Sep 24 18:33 resnet18_nopool_bn_detectnet_v2_pruned.tlt
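
The pruning ratio reported above (roughly 0.024) tells us the pruned model keeps only a small fraction of the original parameters. If you'd like to see that reflected on disk, a quick sketch like the one below (assuming the default /workspace/detectnet_v2 experiment directory used throughout this notebook) compares the two .tlt files:

import os

# Directories follow the layout used earlier in this notebook.
base = os.environ.get("USER_EXPERIMENT_DIR", "/workspace/detectnet_v2")
unpruned = os.path.join(base, "experiment_dir_unpruned/weights/resnet18_detector.tlt")
pruned = os.path.join(base, "experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.tlt")

unpruned_mb = os.path.getsize(unpruned) / 1e6
pruned_mb = os.path.getsize(pruned) / 1e6
print(f"Unpruned: {unpruned_mb:.1f} MB")
print(f"Pruned:   {pruned_mb:.1f} MB ({pruned_mb / unpruned_mb:.1%} of the original)")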

Step 11 - Retrain The Pruned Model

So our model is pruned but we're not quite done yet...

We need to retrain the pruned network to recover any accuracy lost during pruning. To do this we update the retraining experiment file: set load_graph to true in the model_config, and point pretrained_model_file at the pruned model so it is used as the pretrained weights.

Note: if the retrained model shows a drop in mAP, the original model may have been pruned a little too aggressively. Try reducing the pruning threshold in the previous step (so fewer weights are removed) and retrain with the new pruned model.

Let's change the following parameters by editing the file /workspace/detectnet_v2/specs/detectnet_v2_retrain_resnet18_kitti.txt directly (or script the edits as sketched after the list):

  1. tfrecords_path = "/workspace/data/tfrecords/kitti_trainval/*"
  2. image_directory_path = "/workspace/converted_datasets/train"
  3. pretrained_model_file = "/workspace/detectnet_v2/experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.tlt"
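
If you'd rather script these three edits than open the spec in an editor, a minimal sketch along these lines would do it (the regular expressions are our own illustration; the paths are the ones listed above):

import re

# Path to the retraining spec shipped with this notebook.
spec_path = "/workspace/detectnet_v2/specs/detectnet_v2_retrain_resnet18_kitti.txt"

# The three edits described above, expressed as line-level substitutions.
edits = [
    (r"(?m)^(\s*)tfrecords_path:.*$",
     r'\1tfrecords_path: "/workspace/data/tfrecords/kitti_trainval/*"'),
    (r"(?m)^(\s*)image_directory_path:.*$",
     r'\1image_directory_path: "/workspace/converted_datasets/train"'),
    (r"(?m)^(\s*)pretrained_model_file:.*$",
     r'\1pretrained_model_file: "/workspace/detectnet_v2/experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.tlt"'),
]

with open(spec_path) as f:
    spec = f.read()
for pattern, replacement in edits:
    spec = re.sub(pattern, replacement, spec)
with open(spec_path, "w") as f:
    f.write(spec)
print("Updated", spec_path)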

Once completed, the experiment file should look like this:

In [27]:
# Print the retrain experiment file.
# Note: the experiment file has been updated to use the
# newly pruned model as the pretrained weights, and the
# load_graph option is set to true.
!cat $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt
Out[27]:
random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/workspace/data/tfrecords/kitti_trainval/"
    image_directory_path: "/workspace/converted_datasets/train"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "mask"
    value: "mask"
  }
  target_class_mapping {
    key: "no-mask"
    value: "no-mask"
  }
  validation_fold: 0
  #validation_data_source: {
    #tfrecords_path: "/home/data/tfrecords/kitti_val/*"
    #image_directory_path: "/home/data/test"
  #}
}
augmentation_config {
  preprocessing {
    output_image_width: 960
    output_image_height: 544
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.10000000149
    contrast_center: 0.5
  }
}
postprocessing_config {
  target_class_config {
    key: "mask"
    value {
      clustering_config {
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.20000000298
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "no-mask"
    value {
      clustering_config {
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.15000000596
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
}
model_config {
  pretrained_model_file: "/workspace/detectnet_v2/experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.tlt"
  num_layers: 18
  use_batch_norm: true
  load_graph: true
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  training_precision {
    backend_floatx: FLOAT32
  }
  arch: "resnet"
}
evaluation_config {
  validation_period_during_training: 10
  first_validation_epoch: 10
  minimum_detection_ground_truth_overlap {
    key: "mask"
    value: 0.5
  }
  minimum_detection_ground_truth_overlap {
    key: "no-mask"
    value: 0.5
  }
  evaluation_box_config {
    key: "mask"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "no-mask"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  average_precision_mode: INTEGRATE
}
cost_function_config {
  target_classes {
    name: "mask"
    class_weight: 1.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  target_classes {
    name: "no-mask"
    class_weight: 8.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 1.0
    }
  }
  enable_autoweighting: true
  max_objective_weight: 0.999899983406
  min_objective_weight: 9.99999974738e-05
}
training_config {
  batch_size_per_gpu: 24
  num_epochs: 120
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-06
      max_learning_rate: 5e-04
      soft_start: 0.10000000149
      annealing: 0.699999988079
    }
  }
  regularizer {
    type: L1
    weight: 3.00000002618e-09
  }
  optimizer {
    adam {
      epsilon: 9.99999993923e-09
      beta1: 0.899999976158
      beta2: 0.999000012875
    }
  }
  cost_scaling {
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
  checkpoint_interval: 10
}
bbox_rasterizer_config {
  target_class_config {
    key: "mask"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.40000000596
      cov_radius_y: 0.40000000596
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "no-mask"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.400000154972
}
In [28]:
# Retrain using the pruned model as the pretrained weights
!tlt-train detectnet_v2 -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
                        -r $USER_EXPERIMENT_DIR/experiment_dir_retrain \
                        -k $KEY \
                        -n resnet18_detector_pruned
Out[28]:
Using TensorFlow backend.
2020-09-24 18:35:57.500155: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
--------------------------------------------------------------------------
[[60030,1],0]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:

Module: OpenFabrics (openib)
  Host: 28071f6116a0

Another transport will be used instead, although this may result in
lower performance.

NOTE: You can disable this warning by setting the MCA parameter
btl_base_warn_component_unused to 0.
--------------------------------------------------------------------------
2020-09-24 18:36:00.721338: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-09-24 18:36:00.743077: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:36:00.744037: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 18:36:00.744077: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 18:36:00.744140: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 18:36:00.745474: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 18:36:00.745860: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 18:36:00.747661: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 18:36:00.748986: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 18:36:00.749063: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 18:36:00.749180: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:36:00.750212: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:36:00.751127: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 18:36:00.751180: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 18:36:01.590723: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-24 18:36:01.590780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-09-24 18:36:01.590803: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-09-24 18:36:01.591067: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:36:01.592078: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:36:01.593053: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:36:01.593979: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14636 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)
2020-09-24 18:36:01,594 [INFO] iva.detectnet_v2.scripts.train: Loading experiment spec at /workspace/detectnet_v2/specs/detectnet_v2_retrain_resnet18_kitti.txt.
2020-09-24 18:36:01,596 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/detectnet_v2/specs/detectnet_v2_retrain_resnet18_kitti.txt
2020-09-24 18:36:01,930 [INFO] iva.detectnet_v2.scripts.train: Cannot iterate over exactly 2827 samples with a batch size of 24; each epoch will therefore take one extra step.
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 3, 544, 960)  0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 16, 272, 480) 2368        input_1[0][0]                    
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 16, 272, 480) 64          conv1[0][0]                      
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 16, 272, 480) 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (None, 16, 136, 240) 2320        activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 16, 136, 240) 64          block_1a_conv_1[0][0]            
__________________________________________________________________________________________________
block_1a_relu_1 (Activation)    (None, 16, 136, 240) 0           block_1a_bn_1[0][0]              
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (None, 48, 136, 240) 6960        block_1a_relu_1[0][0]            
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 48, 136, 240) 816         activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 48, 136, 240) 192         block_1a_conv_2[0][0]            
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 48, 136, 240) 192         block_1a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_1 (Add)                     (None, 48, 136, 240) 0           block_1a_bn_2[0][0]              
                                                                 block_1a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_1a_relu (Activation)      (None, 48, 136, 240) 0           add_1[0][0]                      
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (None, 16, 136, 240) 6928        block_1a_relu[0][0]              
__________________________________________________________________________________________________
block_1b_bn_1 (BatchNormalizati (None, 16, 136, 240) 64          block_1b_conv_1[0][0]            
__________________________________________________________________________________________________
block_1b_relu_1 (Activation)    (None, 16, 136, 240) 0           block_1b_bn_1[0][0]              
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (None, 48, 136, 240) 6960        block_1b_relu_1[0][0]            
__________________________________________________________________________________________________
block_1b_bn_2 (BatchNormalizati (None, 48, 136, 240) 192         block_1b_conv_2[0][0]            
__________________________________________________________________________________________________
add_2 (Add)                     (None, 48, 136, 240) 0           block_1b_bn_2[0][0]              
                                                                 block_1a_relu[0][0]              
__________________________________________________________________________________________________
block_1b_relu (Activation)      (None, 48, 136, 240) 0           add_2[0][0]                      
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (None, 24, 68, 120)  10392       block_1b_relu[0][0]              
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 24, 68, 120)  96          block_2a_conv_1[0][0]            
__________________________________________________________________________________________________
block_2a_relu_1 (Activation)    (None, 24, 68, 120)  0           block_2a_bn_1[0][0]              
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (None, 104, 68, 120) 22568       block_2a_relu_1[0][0]            
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 104, 68, 120) 5096        block_1b_relu[0][0]              
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 104, 68, 120) 416         block_2a_conv_2[0][0]            
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 104, 68, 120) 416         block_2a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_3 (Add)                     (None, 104, 68, 120) 0           block_2a_bn_2[0][0]              
                                                                 block_2a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_2a_relu (Activation)      (None, 104, 68, 120) 0           add_3[0][0]                      
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (None, 24, 68, 120)  22488       block_2a_relu[0][0]              
__________________________________________________________________________________________________
block_2b_bn_1 (BatchNormalizati (None, 24, 68, 120)  96          block_2b_conv_1[0][0]            
__________________________________________________________________________________________________
block_2b_relu_1 (Activation)    (None, 24, 68, 120)  0           block_2b_bn_1[0][0]              
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (None, 104, 68, 120) 22568       block_2b_relu_1[0][0]            
__________________________________________________________________________________________________
block_2b_bn_2 (BatchNormalizati (None, 104, 68, 120) 416         block_2b_conv_2[0][0]            
__________________________________________________________________________________________________
add_4 (Add)                     (None, 104, 68, 120) 0           block_2b_bn_2[0][0]              
                                                                 block_2a_relu[0][0]              
__________________________________________________________________________________________________
block_2b_relu (Activation)      (None, 104, 68, 120) 0           add_4[0][0]                      
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (None, 16, 34, 60)   14992       block_2b_relu[0][0]              
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 16, 34, 60)   64          block_3a_conv_1[0][0]            
__________________________________________________________________________________________________
block_3a_relu_1 (Activation)    (None, 16, 34, 60)   0           block_3a_bn_1[0][0]              
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (None, 136, 34, 60)  19720       block_3a_relu_1[0][0]            
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 136, 34, 60)  14280       block_2b_relu[0][0]              
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 136, 34, 60)  544         block_3a_conv_2[0][0]            
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 136, 34, 60)  544         block_3a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_5 (Add)                     (None, 136, 34, 60)  0           block_3a_bn_2[0][0]              
                                                                 block_3a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_3a_relu (Activation)      (None, 136, 34, 60)  0           add_5[0][0]                      
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (None, 24, 34, 60)   29400       block_3a_relu[0][0]              
__________________________________________________________________________________________________
block_3b_bn_1 (BatchNormalizati (None, 24, 34, 60)   96          block_3b_conv_1[0][0]            
__________________________________________________________________________________________________
block_3b_relu_1 (Activation)    (None, 24, 34, 60)   0           block_3b_bn_1[0][0]              
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (None, 136, 34, 60)  29512       block_3b_relu_1[0][0]            
__________________________________________________________________________________________________
block_3b_bn_2 (BatchNormalizati (None, 136, 34, 60)  544         block_3b_conv_2[0][0]            
__________________________________________________________________________________________________
add_6 (Add)                     (None, 136, 34, 60)  0           block_3b_bn_2[0][0]              
                                                                 block_3a_relu[0][0]              
__________________________________________________________________________________________________
block_3b_relu (Activation)      (None, 136, 34, 60)  0           add_6[0][0]                      
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D)        (None, 16, 34, 60)   19600       block_3b_relu[0][0]              
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (None, 16, 34, 60)   64          block_4a_conv_1[0][0]            
__________________________________________________________________________________________________
block_4a_relu_1 (Activation)    (None, 16, 34, 60)   0           block_4a_bn_1[0][0]              
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D)        (None, 48, 34, 60)   6960        block_4a_relu_1[0][0]            
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 48, 34, 60)   6576        block_3b_relu[0][0]              
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (None, 48, 34, 60)   192         block_4a_conv_2[0][0]            
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (None, 48, 34, 60)   192         block_4a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_7 (Add)                     (None, 48, 34, 60)   0           block_4a_bn_2[0][0]              
                                                                 block_4a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_4a_relu (Activation)      (None, 48, 34, 60)   0           add_7[0][0]                      
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D)        (None, 16, 34, 60)   6928        block_4a_relu[0][0]              
__________________________________________________________________________________________________
block_4b_bn_1 (BatchNormalizati (None, 16, 34, 60)   64          block_4b_conv_1[0][0]            
__________________________________________________________________________________________________
block_4b_relu_1 (Activation)    (None, 16, 34, 60)   0           block_4b_bn_1[0][0]              
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D)        (None, 48, 34, 60)   6960        block_4b_relu_1[0][0]            
__________________________________________________________________________________________________
block_4b_bn_2 (BatchNormalizati (None, 48, 34, 60)   192         block_4b_conv_2[0][0]            
__________________________________________________________________________________________________
add_8 (Add)                     (None, 48, 34, 60)   0           block_4b_bn_2[0][0]              
                                                                 block_4a_relu[0][0]              
__________________________________________________________________________________________________
block_4b_relu (Activation)      (None, 48, 34, 60)   0           add_8[0][0]                      
__________________________________________________________________________________________________
output_bbox (Conv2D)            (None, 8, 34, 60)    392         block_4b_relu[0][0]              
__________________________________________________________________________________________________
output_cov (Conv2D)             (None, 2, 34, 60)    98          block_4b_relu[0][0]              
==================================================================================================
Total params: 269,586
Trainable params: 267,234
Non-trainable params: 2,352
__________________________________________________________________________________________________
2020-09-24 18:36:05,565 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2020-09-24 18:36:05,565 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2020-09-24 18:36:05,566 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2020-09-24 18:36:05,566 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 8, io threads: 16, compute threads: 8, buffered batches: 4
2020-09-24 18:36:05,566 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 2827, number of sources: 1, batch size per gpu: 24, steps: 118
2020-09-24 18:36:05,698 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2020-09-24 18:36:05.741176: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:36:05.742197: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 18:36:05.742249: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 18:36:05.742301: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 18:36:05.742353: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 18:36:05.742403: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 18:36:05.742453: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 18:36:05.742501: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 18:36:05.742547: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 18:36:05.742656: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:36:05.743658: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:36:05.744561: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 18:36:06,049 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: shuffle: True - shard 0 of 1
2020-09-24 18:36:06,057 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: sampling 1 datasets with weights:
2020-09-24 18:36:06,057 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: source: 0 weight: 1.000000
2020-09-24 18:36:06,740 [INFO] iva.detectnet_v2.scripts.train: Found 2827 samples in training set
2020-09-24 18:36:09,932 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2020-09-24 18:36:09,933 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2020-09-24 18:36:09,933 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2020-09-24 18:36:09,933 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 8, io threads: 16, compute threads: 8, buffered batches: 4
2020-09-24 18:36:09,933 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 706, number of sources: 1, batch size per gpu: 24, steps: 30
2020-09-24 18:36:09,982 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2020-09-24 18:36:10,317 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: shuffle: False - shard 0 of 1
2020-09-24 18:36:10,324 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: sampling 1 datasets with weights:
2020-09-24 18:36:10,325 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: source: 0 weight: 1.000000
2020-09-24 18:36:10,784 [INFO] iva.detectnet_v2.scripts.train: Found 706 samples in validation set
2020-09-24 18:36:14.600049: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:36:14.601043: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 18:36:14.601103: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 18:36:14.601179: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 18:36:14.601243: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 18:36:14.601293: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 18:36:14.601343: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 18:36:14.601392: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 18:36:14.601436: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 18:36:14.601537: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:36:14.602544: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:36:14.603440: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 18:36:14.604835: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-24 18:36:14.604870: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-09-24 18:36:14.604886: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-09-24 18:36:14.605035: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:36:14.606043: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 18:36:14.607063: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14636 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)
2020-09-24 18:36:40.087811: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 18:36:40.578595: I tensorflow/core/kernels/cuda_solvers.cc:159] Creating CudaSolver handles for stream 0x6a158f0
2020-09-24 18:36:40.578909: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 18:36:41.112260: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 18:36:41.149319: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 18:36:43,873 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 0/120: loss: 0.03920 Time taken: 0:00:00 ETA: 0:00:00
2020-09-24 18:36:43,873 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 4.178
2020-09-24 18:36:52,010 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 43.223
2020-09-24 18:36:55,578 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.143
2020-09-24 18:36:59,312 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.726
2020-09-24 18:37:02,901 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.180
2020-09-24 18:37:06,047 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 1/120: loss: 0.00129 Time taken: 0:00:27.742774 ETA: 0:55:01.390136
2020-09-24 18:37:06,969 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 147.501
2020-09-24 18:37:10,585 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.924
2020-09-24 18:37:14,299 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.593
2020-09-24 18:37:18,127 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 156.755
2020-09-24 18:37:21,866 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.456
2020-09-24 18:37:23,673 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 2/120: loss: 0.00138 Time taken: 0:00:17.643318 ETA: 0:34:41.911489
2020-09-24 18:37:25,572 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.928
2020-09-24 18:37:29,311 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.514
2020-09-24 18:37:32,869 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.642
2020-09-24 18:37:36,510 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.789
2020-09-24 18:37:40,246 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.613
2020-09-24 18:37:40,973 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 3/120: loss: 0.00135 Time taken: 0:00:17.318547 ETA: 0:33:46.269972
2020-09-24 18:37:43,806 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.561
2020-09-24 18:37:47,406 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.691
2020-09-24 18:37:50,844 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 174.532
2020-09-24 18:37:54,514 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.531
2020-09-24 18:37:57,845 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 4/120: loss: 0.00118 Time taken: 0:00:16.852611 ETA: 0:32:34.902884
2020-09-24 18:37:58,165 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.361
2020-09-24 18:38:01,728 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.407
2020-09-24 18:38:05,374 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.566
2020-09-24 18:38:08,952 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.707
2020-09-24 18:38:12,576 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.595
2020-09-24 18:38:14,828 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 5/120: loss: 0.00118 Time taken: 0:00:17.002347 ETA: 0:32:35.269959
2020-09-24 18:38:16,132 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.736
2020-09-24 18:38:19,689 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.698
2020-09-24 18:38:23,458 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.198
2020-09-24 18:38:27,070 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.136
2020-09-24 18:38:30,739 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.539
2020-09-24 18:38:32,020 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 6/120: loss: 0.00094 Time taken: 0:00:17.190248 ETA: 0:32:39.688246
2020-09-24 18:38:34,278 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.590
2020-09-24 18:38:37,982 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.983
2020-09-24 18:38:41,661 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.126
2020-09-24 18:38:45,286 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.507
2020-09-24 18:38:49,014 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.980
2020-09-24 18:38:49,361 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 7/120: loss: 0.00092 Time taken: 0:00:17.305124 ETA: 0:32:35.478963
2020-09-24 18:38:52,613 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.713
2020-09-24 18:38:56,430 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 157.214
2020-09-24 18:39:00,028 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.762
2020-09-24 18:39:03,679 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.369
2020-09-24 18:39:06,761 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 8/120: loss: 0.00092 Time taken: 0:00:17.437567 ETA: 0:32:33.007504
2020-09-24 18:39:07,508 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 156.722
2020-09-24 18:39:11,183 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.276
2020-09-24 18:39:14,808 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.534
2020-09-24 18:39:18,492 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.864
2020-09-24 18:39:22,162 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.495
2020-09-24 18:39:24,111 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 9/120: loss: 0.00067 Time taken: 0:00:17.329070 ETA: 0:32:03.526807
2020-09-24 18:39:25,858 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.348
2020-09-24 18:39:29,461 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.536
2020-09-24 18:39:33,147 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.816
2020-09-24 18:39:36,865 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.370
2020-09-24 18:39:40,541 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.266
2020-09-24 18:39:42,472 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 18:42:24,754 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 16.23s/step
2020-09-24 18:45:06,392 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 16.16s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 783166/783166 [00:42<00:00, 18374.49it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 1009644/1009644 [00:50<00:00, 20130.86it/s]
Epoch 10/120
=========================

Validation cost: 0.000762
Mean average_precision (in %): 14.3511

class name      average precision (in %)
------------  --------------------------
mask                            0.484533
no-mask                        28.2176

Median Inference Time: 0.004388
2020-09-24 18:49:31,984 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 10/120: loss: 0.00061 Time taken: 0:10:07.889806 ETA: 18:34:27.878690
2020-09-24 18:49:34,749 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 1.010
2020-09-24 18:49:38,444 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.431
2020-09-24 18:49:42,123 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.091
2020-09-24 18:49:45,646 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.316
2020-09-24 18:49:49,274 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 11/120: loss: 0.00067 Time taken: 0:00:17.250518 ETA: 0:31:20.306419
2020-09-24 18:49:49,412 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.325
2020-09-24 18:49:53,017 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.469
2020-09-24 18:49:56,570 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.882
2020-09-24 18:50:00,218 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.489
2020-09-24 18:50:03,880 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.875
2020-09-24 18:50:06,334 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 12/120: loss: 0.00053 Time taken: 0:00:17.087909 ETA: 0:30:45.494119
2020-09-24 18:50:07,467 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.275
2020-09-24 18:50:11,098 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.263
2020-09-24 18:50:14,647 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.069
2020-09-24 18:50:18,222 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.837
2020-09-24 18:50:21,907 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.848
2020-09-24 18:50:23,412 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 13/120: loss: 0.00048 Time taken: 0:00:17.058797 ETA: 0:30:25.291292
2020-09-24 18:50:25,562 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.171
2020-09-24 18:50:29,310 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.092
2020-09-24 18:50:32,976 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.717
2020-09-24 18:50:36,605 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.318
2020-09-24 18:50:40,240 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.088
2020-09-24 18:50:40,720 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 14/120: loss: 0.00065 Time taken: 0:00:17.331649 ETA: 0:30:37.154750
2020-09-24 18:50:44,060 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 157.082
2020-09-24 18:50:47,842 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.658
2020-09-24 18:50:51,397 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.807
2020-09-24 18:50:55,088 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.559
2020-09-24 18:50:58,026 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 15/120: loss: 0.00043 Time taken: 0:00:17.290431 ETA: 0:30:15.495307
2020-09-24 18:50:58,596 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 171.062
2020-09-24 18:51:02,286 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.650
2020-09-24 18:51:05,964 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.116
2020-09-24 18:51:09,665 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.158
2020-09-24 18:51:13,241 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.781
2020-09-24 18:51:15,342 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 16/120: loss: 0.00054 Time taken: 0:00:17.305713 ETA: 0:29:59.794195
2020-09-24 18:51:16,908 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.641
2020-09-24 18:51:20,533 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.514
2020-09-24 18:51:24,064 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.938
2020-09-24 18:51:27,822 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.675
2020-09-24 18:51:31,363 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.474
2020-09-24 18:51:32,344 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 17/120: loss: 0.00044 Time taken: 0:00:17.012848 ETA: 0:29:12.323334
2020-09-24 18:51:34,983 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.771
2020-09-24 18:51:38,590 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.343
2020-09-24 18:51:42,222 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.215
2020-09-24 18:51:45,785 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.412
2020-09-24 18:51:49,417 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 18/120: loss: 0.00050 Time taken: 0:00:17.091074 ETA: 0:29:03.289571
2020-09-24 18:51:49,418 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.204
2020-09-24 18:51:53,045 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.423
2020-09-24 18:51:56,771 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.057
2020-09-24 18:52:00,455 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.873
2020-09-24 18:52:04,040 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.357
2020-09-24 18:52:06,741 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 19/120: loss: 0.00041 Time taken: 0:00:17.307698 ETA: 0:29:08.077547
2020-09-24 18:52:07,747 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.885
2020-09-24 18:52:11,312 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.341
2020-09-24 18:52:14,880 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.130
2020-09-24 18:52:18,540 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.985
2020-09-24 18:52:22,076 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.680
2020-09-24 18:52:24,763 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 18:54:03,931 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 9.92s/step
2020-09-24 18:55:42,247 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 9.83s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 146409/146409 [00:07<00:00, 18514.53it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 23108/23108 [00:01<00:00, 16779.31it/s]
Epoch 20/120
=========================

Validation cost: 0.000456
Mean average_precision (in %): 32.3306

class name      average precision (in %)
------------  --------------------------
mask                             3.26018
no-mask                         61.401

Median Inference Time: 0.003923
2020-09-24 18:57:20,243 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 20/120: loss: 0.00039 Time taken: 0:05:13.511388 ETA: 8:42:31.138783
2020-09-24 18:57:22,179 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 1.999
2020-09-24 18:57:25,760 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.582
2020-09-24 18:57:29,461 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.125
2020-09-24 18:57:33,047 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.315
2020-09-24 18:57:36,617 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.084
2020-09-24 18:57:37,227 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 21/120: loss: 0.00038 Time taken: 0:00:16.973165 ETA: 0:28:00.343362
2020-09-24 18:57:40,133 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.675
2020-09-24 18:57:43,726 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.016
2020-09-24 18:57:47,439 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.628
2020-09-24 18:57:51,110 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.422
2020-09-24 18:57:54,422 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 22/120: loss: 0.00037 Time taken: 0:00:17.192148 ETA: 0:28:04.830548
2020-09-24 18:57:54,858 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.125
2020-09-24 18:57:58,388 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.969
2020-09-24 18:58:01,862 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 172.721
2020-09-24 18:58:05,461 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.756
2020-09-24 18:58:09,055 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.943
2020-09-24 18:58:11,247 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 23/120: loss: 0.00047 Time taken: 0:00:16.837468 ETA: 0:27:13.234364
2020-09-24 18:58:12,655 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.687
2020-09-24 18:58:16,263 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.318
2020-09-24 18:58:19,811 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.111
2020-09-24 18:58:23,341 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.983
2020-09-24 18:58:26,948 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.379
2020-09-24 18:58:28,172 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 24/120: loss: 0.00031 Time taken: 0:00:16.942528 ETA: 0:27:06.482712
2020-09-24 18:58:30,605 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.093
2020-09-24 18:58:34,240 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.099
2020-09-24 18:58:37,976 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.588
2020-09-24 18:58:41,657 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.003
2020-09-24 18:58:45,257 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.707
2020-09-24 18:58:45,403 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 25/120: loss: 0.00044 Time taken: 0:00:17.203813 ETA: 0:27:14.362242
2020-09-24 18:58:48,905 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.503
2020-09-24 18:58:52,542 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.025
2020-09-24 18:58:56,187 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.586
2020-09-24 18:58:59,771 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.447
2020-09-24 18:59:02,563 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 26/120: loss: 0.00033 Time taken: 0:00:17.181301 ETA: 0:26:55.042260
2020-09-24 18:59:03,472 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.146
2020-09-24 18:59:07,029 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.689
2020-09-24 18:59:10,675 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.576
2020-09-24 18:59:14,273 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.791
2020-09-24 18:59:17,891 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.871
2020-09-24 18:59:19,633 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 27/120: loss: 0.00038 Time taken: 0:00:17.054855 ETA: 0:26:26.101525
2020-09-24 18:59:21,550 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.989
2020-09-24 18:59:25,174 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.596
2020-09-24 18:59:28,894 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.289
2020-09-24 18:59:32,540 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.570
2020-09-24 18:59:36,170 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.308
2020-09-24 18:59:36,902 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 28/120: loss: 0.00025 Time taken: 0:00:17.276183 ETA: 0:26:29.408848
2020-09-24 18:59:39,933 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.467
2020-09-24 18:59:43,551 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.838
2020-09-24 18:59:47,243 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.545
2020-09-24 18:59:50,891 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.475
2020-09-24 18:59:54,216 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 29/120: loss: 0.00027 Time taken: 0:00:17.289040 ETA: 0:26:13.302605
2020-09-24 18:59:54,505 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.039
2020-09-24 18:59:58,172 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.676
2020-09-24 19:00:01,780 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.294
2020-09-24 19:00:05,381 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.611
2020-09-24 19:00:08,975 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.974
2020-09-24 19:00:12,328 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 19:00:52,894 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 4.06s/step
2020-09-24 19:01:38,689 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 4.58s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 20658/20658 [00:00<00:00, 23160.41it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 3628/3628 [00:00<00:00, 13181.45it/s]
Epoch 30/120
=========================

Validation cost: 0.000399
Mean average_precision (in %): 36.9906

class name      average precision (in %)
------------  --------------------------
mask                             4.71818
no-mask                         69.263

Median Inference Time: 0.004524
2020-09-24 19:02:20,877 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 30/120: loss: 0.00040 Time taken: 0:02:26.680091 ETA: 3:40:01.208224
2020-09-24 19:02:22,099 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 4.507
2020-09-24 19:02:25,654 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.778
2020-09-24 19:02:29,148 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 171.762
2020-09-24 19:02:32,776 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.385
2020-09-24 19:02:36,396 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.752
2020-09-24 19:02:37,717 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 31/120: loss: 0.00037 Time taken: 0:00:16.833377 ETA: 0:24:58.170543
2020-09-24 19:02:40,076 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.054
2020-09-24 19:02:43,696 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.780
2020-09-24 19:02:47,345 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.456
2020-09-24 19:02:50,942 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.788
2020-09-24 19:02:54,553 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.196
2020-09-24 19:02:54,865 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 32/120: loss: 0.00026 Time taken: 0:00:17.130792 ETA: 0:25:07.509687
2020-09-24 19:02:58,225 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.400
2020-09-24 19:03:01,951 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.041
2020-09-24 19:03:05,502 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.016
2020-09-24 19:03:09,220 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.370
2020-09-24 19:03:12,097 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 33/120: loss: 0.00061 Time taken: 0:00:17.271268 ETA: 0:25:02.600307
2020-09-24 19:03:12,817 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.807
2020-09-24 19:03:16,380 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.455
2020-09-24 19:03:20,184 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 157.742
2020-09-24 19:03:23,787 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.546
2020-09-24 19:03:27,418 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.257
2020-09-24 19:03:29,241 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 34/120: loss: 0.00047 Time taken: 0:00:17.126378 ETA: 0:24:32.868493
2020-09-24 19:03:30,972 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.829
2020-09-24 19:03:34,684 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.686
2020-09-24 19:03:38,351 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.635
2020-09-24 19:03:41,993 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.723
2020-09-24 19:03:45,815 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 157.034
2020-09-24 19:03:46,736 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 35/120: loss: 0.00033 Time taken: 0:00:17.475193 ETA: 0:24:45.391407
2020-09-24 19:03:49,348 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.810
2020-09-24 19:03:52,963 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.026
2020-09-24 19:03:56,591 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.400
2020-09-24 19:04:00,137 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.183
2020-09-24 19:04:03,772 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 36/120: loss: 0.00080 Time taken: 0:00:17.042989 ETA: 0:23:51.611077
2020-09-24 19:04:03,902 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.409
2020-09-24 19:04:07,528 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.460
2020-09-24 19:04:11,320 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.260
2020-09-24 19:04:15,046 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.042
2020-09-24 19:04:18,606 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.546
2020-09-24 19:04:21,180 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 37/120: loss: 0.00039 Time taken: 0:00:17.403021 ETA: 0:24:04.450731
2020-09-24 19:04:22,354 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.119
2020-09-24 19:04:26,073 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.352
2020-09-24 19:04:29,712 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.883
2020-09-24 19:04:33,383 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.447
2020-09-24 19:04:37,123 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.436
2020-09-24 19:04:38,543 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 38/120: loss: 0.00043 Time taken: 0:00:17.376115 ETA: 0:23:44.841437
2020-09-24 19:04:40,693 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.108
2020-09-24 19:04:44,383 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.599
2020-09-24 19:04:48,045 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.871
2020-09-24 19:04:51,673 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.418
2020-09-24 19:04:55,281 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.310
2020-09-24 19:04:55,700 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 39/120: loss: 0.00039 Time taken: 0:00:17.135169 ETA: 0:23:07.948691
2020-09-24 19:04:58,926 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.598
2020-09-24 19:05:02,449 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.349
2020-09-24 19:05:06,171 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.222
2020-09-24 19:05:09,799 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.398
2020-09-24 19:05:13,855 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 19:05:23,390 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.95s/step
2020-09-24 19:05:33,095 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.97s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 7427/7427 [00:00<00:00, 22384.66it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 6264/6264 [00:00<00:00, 14497.69it/s]
Epoch 40/120
=========================

Validation cost: 0.000387
Mean average_precision (in %): 38.2748

class name      average precision (in %)
------------  --------------------------
mask                             8.70811
no-mask                         67.8415

Median Inference Time: 0.004694
2020-09-24 19:05:43,260 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 40/120: loss: 0.00038 Time taken: 0:00:47.554917 ETA: 1:03:24.393349
2020-09-24 19:05:43,900 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 17.595
2020-09-24 19:05:47,493 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.016
2020-09-24 19:05:51,105 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.122
2020-09-24 19:05:54,784 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.117
2020-09-24 19:05:58,506 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.224
2020-09-24 19:06:00,527 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 41/120: loss: 0.00035 Time taken: 0:00:17.286655 ETA: 0:22:45.645741
2020-09-24 19:06:02,160 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.210
2020-09-24 19:06:05,802 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.746
2020-09-24 19:06:09,335 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.875
2020-09-24 19:06:12,978 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.718
2020-09-24 19:06:16,504 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.142
2020-09-24 19:06:17,503 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 42/120: loss: 0.00027 Time taken: 0:00:16.978847 ETA: 0:22:04.350031
2020-09-24 19:06:20,126 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.705
2020-09-24 19:06:23,629 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 171.285
2020-09-24 19:06:27,235 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.422
2020-09-24 19:06:30,860 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.514
2020-09-24 19:06:34,489 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 43/120: loss: 0.00037 Time taken: 0:00:16.986543 ETA: 0:21:47.963788
2020-09-24 19:06:34,489 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.366
2020-09-24 19:06:38,342 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 155.743
2020-09-24 19:06:41,992 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.369
2020-09-24 19:06:45,664 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.437
2020-09-24 19:06:49,316 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.289
2020-09-24 19:06:51,937 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 44/120: loss: 0.00041 Time taken: 0:00:17.443355 ETA: 0:22:05.694986
2020-09-24 19:06:53,024 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.844
2020-09-24 19:06:56,601 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.741
2020-09-24 19:07:00,298 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.320
2020-09-24 19:07:03,867 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.148
2020-09-24 19:07:07,553 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.790
2020-09-24 19:07:09,111 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 45/120: loss: 0.00027 Time taken: 0:00:17.178553 ETA: 0:21:28.391465
2020-09-24 19:07:11,204 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.339
2020-09-24 19:07:14,692 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 172.056
2020-09-24 19:07:18,288 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.878
2020-09-24 19:07:21,990 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.075
2020-09-24 19:07:25,743 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.879
2020-09-24 19:07:26,274 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 46/120: loss: 0.00024 Time taken: 0:00:17.144250 ETA: 0:21:08.674529
2020-09-24 19:07:29,256 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.828
2020-09-24 19:07:32,990 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.675
2020-09-24 19:07:36,513 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.349
2020-09-24 19:07:40,185 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.415
2020-09-24 19:07:43,358 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 47/120: loss: 0.00037 Time taken: 0:00:17.101318 ETA: 0:20:48.396223
2020-09-24 19:07:43,764 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.673
2020-09-24 19:07:47,414 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.361
2020-09-24 19:07:51,100 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.832
2020-09-24 19:07:54,818 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.406
2020-09-24 19:07:58,385 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.201
2020-09-24 19:08:00,522 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 48/120: loss: 0.00031 Time taken: 0:00:17.163529 ETA: 0:20:35.774099
2020-09-24 19:08:02,021 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.046
2020-09-24 19:08:05,641 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.752
2020-09-24 19:08:09,138 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 171.585
2020-09-24 19:08:12,920 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.664
2020-09-24 19:08:16,539 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.816
2020-09-24 19:08:17,648 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 49/120: loss: 0.00024 Time taken: 0:00:17.132588 ETA: 0:20:16.413775
2020-09-24 19:08:20,088 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.075
2020-09-24 19:08:23,711 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.610
2020-09-24 19:08:27,247 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.722
2020-09-24 19:08:30,880 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.185
2020-09-24 19:08:35,711 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 19:08:42,255 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.65s/step
2020-09-24 19:08:48,582 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.63s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 3166/3166 [00:00<00:00, 23945.50it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 6940/6940 [00:00<00:00, 14718.02it/s]
Epoch 50/120
=========================

Validation cost: 0.000392
Mean average_precision (in %): 35.9768

class name      average precision (in %)
------------  --------------------------
mask                              7.1685
no-mask                          64.785

Median Inference Time: 0.004531
2020-09-24 19:08:55,597 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 24.275
2020-09-24 19:08:55,740 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 50/120: loss: 0.00031 Time taken: 0:00:38.070783 ETA: 0:44:24.954786
2020-09-24 19:08:59,278 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.047
2020-09-24 19:09:02,982 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.976
2020-09-24 19:09:06,711 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.920
2020-09-24 19:09:10,359 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.516
2020-09-24 19:09:13,018 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 51/120: loss: 0.00034 Time taken: 0:00:17.249104 ETA: 0:19:50.188194
2020-09-24 19:09:13,919 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.554
2020-09-24 19:09:17,548 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.330
2020-09-24 19:09:21,199 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.377
2020-09-24 19:09:24,860 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.917
2020-09-24 19:09:28,570 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.708
2020-09-24 19:09:30,389 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 52/120: loss: 0.00024 Time taken: 0:00:17.380692 ETA: 0:19:41.887040
2020-09-24 19:09:32,256 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.825
2020-09-24 19:09:35,899 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.697
2020-09-24 19:09:39,519 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.765
2020-09-24 19:09:43,100 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.589
2020-09-24 19:09:46,854 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.841
2020-09-24 19:09:47,594 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 53/120: loss: 0.00026 Time taken: 0:00:17.198695 ETA: 0:19:12.312577
2020-09-24 19:09:50,575 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.251
2020-09-24 19:09:54,331 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.784
2020-09-24 19:09:58,027 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.333
2020-09-24 19:10:01,676 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.434
2020-09-24 19:10:05,138 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 54/120: loss: 0.00039 Time taken: 0:00:17.573803 ETA: 0:19:19.870995
2020-09-24 19:10:05,436 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.584
2020-09-24 19:10:09,083 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.567
2020-09-24 19:10:12,804 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.224
2020-09-24 19:10:16,507 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.071
2020-09-24 19:10:20,105 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.789
2020-09-24 19:10:22,385 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 55/120: loss: 0.00032 Time taken: 0:00:17.243935 ETA: 0:18:40.855782
2020-09-24 19:10:23,682 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.711
2020-09-24 19:10:27,252 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.127
2020-09-24 19:10:30,981 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.898
2020-09-24 19:10:34,578 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.805
2020-09-24 19:10:38,169 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.113
2020-09-24 19:10:39,554 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 56/120: loss: 0.00031 Time taken: 0:00:17.147934 ETA: 0:18:17.467758
2020-09-24 19:10:41,821 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.311
2020-09-24 19:10:45,411 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.155
2020-09-24 19:10:48,981 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.105
2020-09-24 19:10:52,700 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.333
2020-09-24 19:10:56,334 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.134
2020-09-24 19:10:56,607 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 57/120: loss: 0.00034 Time taken: 0:00:17.076460 ETA: 0:17:55.816958
2020-09-24 19:11:00,017 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.910
2020-09-24 19:11:03,629 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.113
2020-09-24 19:11:07,238 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.294
2020-09-24 19:11:10,950 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.631
2020-09-24 19:11:13,859 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 58/120: loss: 0.00020 Time taken: 0:00:17.259958 ETA: 0:17:50.117383
2020-09-24 19:11:14,514 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.378
2020-09-24 19:11:18,067 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.909
2020-09-24 19:11:21,709 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.727
2020-09-24 19:11:25,243 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.824
2020-09-24 19:11:28,811 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.167
2020-09-24 19:11:30,635 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 59/120: loss: 0.00022 Time taken: 0:00:16.758451 ETA: 0:17:02.265510
2020-09-24 19:11:32,362 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.007
2020-09-24 19:11:36,056 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.408
2020-09-24 19:11:39,700 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.668
2020-09-24 19:11:43,468 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.248
2020-09-24 19:11:47,165 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.345
2020-09-24 19:11:49,040 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 19:11:53,343 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.43s/step
2020-09-24 19:11:57,471 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.41s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 3217/3217 [00:00<00:00, 19808.44it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 4573/4573 [00:00<00:00, 12866.55it/s]
Epoch 60/120
=========================

Validation cost: 0.000319
Mean average_precision (in %): 44.6590

class name      average precision (in %)
------------  --------------------------
mask                             12.3694
no-mask                          76.9485

Median Inference Time: 0.004598
2020-09-24 19:12:02,703 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 60/120: loss: 0.00029 Time taken: 0:00:32.019598 ETA: 0:32:01.175895
2020-09-24 19:12:05,550 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 32.636
2020-09-24 19:12:09,131 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.554
2020-09-24 19:12:12,806 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.258
2020-09-24 19:12:16,513 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.884
2020-09-24 19:12:20,132 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 61/120: loss: 0.00044 Time taken: 0:00:17.497981 ETA: 0:17:12.380897
2020-09-24 19:12:20,280 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.294
2020-09-24 19:12:23,879 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.747
2020-09-24 19:12:27,613 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.683
2020-09-24 19:12:31,269 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.122
2020-09-24 19:12:35,011 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.354
2020-09-24 19:12:37,501 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 62/120: loss: 0.00035 Time taken: 0:00:17.319915 ETA: 0:16:44.555059
2020-09-24 19:12:38,693 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.984
2020-09-24 19:12:42,390 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.310
2020-09-24 19:12:46,079 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.670
2020-09-24 19:12:49,763 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.882
2020-09-24 19:12:53,279 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.667
2020-09-24 19:12:54,664 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 63/120: loss: 0.00027 Time taken: 0:00:17.211240 ETA: 0:16:21.040656
2020-09-24 19:12:56,921 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.754
2020-09-24 19:13:00,724 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 157.795
2020-09-24 19:13:04,324 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.685
2020-09-24 19:13:08,026 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.076
2020-09-24 19:13:11,608 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.533
2020-09-24 19:13:12,055 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 64/120: loss: 0.00022 Time taken: 0:00:17.379925 ETA: 0:16:13.275787
2020-09-24 19:13:15,265 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.067
2020-09-24 19:13:18,917 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.289
2020-09-24 19:13:22,650 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.766
2020-09-24 19:13:26,362 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.673
2020-09-24 19:13:29,476 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 65/120: loss: 0.00022 Time taken: 0:00:17.416542 ETA: 0:15:57.909800
2020-09-24 19:13:29,986 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.549
2020-09-24 19:13:33,616 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.342
2020-09-24 19:13:37,182 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.277
2020-09-24 19:13:40,819 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.968
2020-09-24 19:13:44,429 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.201
2020-09-24 19:13:46,445 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 66/120: loss: 0.00027 Time taken: 0:00:16.928187 ETA: 0:15:14.122105
2020-09-24 19:13:48,085 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.129
2020-09-24 19:13:51,953 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 155.141
2020-09-24 19:13:55,655 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.101
2020-09-24 19:13:59,273 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.855
2020-09-24 19:14:02,906 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.152
2020-09-24 19:14:03,928 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 67/120: loss: 0.00033 Time taken: 0:00:17.509300 ETA: 0:15:27.992874
2020-09-24 19:14:06,527 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.731
2020-09-24 19:14:10,086 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.592
2020-09-24 19:14:13,695 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.279
2020-09-24 19:14:17,432 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.588
2020-09-24 19:14:21,120 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 68/120: loss: 0.00040 Time taken: 0:00:17.181508 ETA: 0:14:53.438407
2020-09-24 19:14:21,120 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.697
2020-09-24 19:14:24,734 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.002
2020-09-24 19:14:28,360 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.503
2020-09-24 19:14:32,154 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.149
2020-09-24 19:14:35,835 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.024
2020-09-24 19:14:38,435 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 69/120: loss: 0.00032 Time taken: 0:00:17.341022 ETA: 0:14:44.392098
2020-09-24 19:14:39,465 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.305
2020-09-24 19:14:43,063 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.821
2020-09-24 19:14:46,762 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.192
2020-09-24 19:14:50,343 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.580
2020-09-24 19:14:54,055 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.671
2020-09-24 19:14:56,694 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 19:15:02,545 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.59s/step
2020-09-24 19:15:08,255 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.57s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 3305/3305 [00:00<00:00, 24772.77it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 7420/7420 [00:00<00:00, 14826.67it/s]
Epoch 70/120
=========================

Validation cost: 0.000444
Mean average_precision (in %): 30.4551

class name      average precision (in %)
------------  --------------------------
mask                             7.24163
no-mask                         53.6685

Median Inference Time: 0.004786
2020-09-24 19:15:15,054 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 70/120: loss: 0.00027 Time taken: 0:00:36.586919 ETA: 0:30:29.345965
2020-09-24 19:15:17,035 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 26.110
2020-09-24 19:15:20,699 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.746
2020-09-24 19:15:24,337 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.964
2020-09-24 19:15:28,069 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.768
2020-09-24 19:15:31,823 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.870
2020-09-24 19:15:32,419 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 71/120: loss: 0.00023 Time taken: 0:00:17.357221 ETA: 0:14:10.503812
2020-09-24 19:15:35,441 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.845
2020-09-24 19:15:39,183 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.373
2020-09-24 19:15:42,825 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.734
2020-09-24 19:15:46,539 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.566
2020-09-24 19:15:49,744 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 72/120: loss: 0.00026 Time taken: 0:00:17.342890 ETA: 0:13:52.458733
2020-09-24 19:15:50,150 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.196
2020-09-24 19:15:53,855 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.934
2020-09-24 19:15:57,560 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.977
2020-09-24 19:16:01,252 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.540
2020-09-24 19:16:04,843 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.085
2020-09-24 19:16:07,194 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 73/120: loss: 0.00021 Time taken: 0:00:17.415154 ETA: 0:13:38.512215
2020-09-24 19:16:08,627 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.580
2020-09-24 19:16:12,144 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.628
2020-09-24 19:16:15,791 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.524
2020-09-24 19:16:19,572 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.695
2020-09-24 19:16:23,189 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.897
2020-09-24 19:16:24,345 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 74/120: loss: 0.00041 Time taken: 0:00:17.188585 ETA: 0:13:10.674912
2020-09-24 19:16:26,800 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.172
2020-09-24 19:16:30,397 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.819
2020-09-24 19:16:34,020 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.639
2020-09-24 19:16:37,686 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.684
2020-09-24 19:16:41,391 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.947
2020-09-24 19:16:41,550 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 75/120: loss: 0.00035 Time taken: 0:00:17.180835 ETA: 0:12:53.137597
2020-09-24 19:16:45,059 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.626
2020-09-24 19:16:48,599 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.529
2020-09-24 19:16:52,159 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.549
2020-09-24 19:16:55,869 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.728
2020-09-24 19:16:58,601 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 76/120: loss: 0.00035 Time taken: 0:00:17.055869 ETA: 0:12:30.458241
2020-09-24 19:16:59,443 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.891
2020-09-24 19:17:03,184 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.397
2020-09-24 19:17:06,962 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.840
2020-09-24 19:17:10,752 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.339
2020-09-24 19:17:14,360 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.301
2020-09-24 19:17:16,189 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 77/120: loss: 0.00029 Time taken: 0:00:17.560616 ETA: 0:12:35.106489
2020-09-24 19:17:18,051 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.585
2020-09-24 19:17:21,823 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.082
2020-09-24 19:17:25,403 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.604
2020-09-24 19:17:28,981 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.718
2020-09-24 19:17:32,536 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.777
2020-09-24 19:17:33,289 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 78/120: loss: 0.00021 Time taken: 0:00:17.142346 ETA: 0:11:59.978518
2020-09-24 19:17:36,235 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.226
2020-09-24 19:17:40,053 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 157.152
2020-09-24 19:17:43,652 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.769
2020-09-24 19:17:47,368 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.474
2020-09-24 19:17:50,719 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 79/120: loss: 0.00023 Time taken: 0:00:17.373628 ETA: 0:11:52.318763
2020-09-24 19:17:50,980 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.101
2020-09-24 19:17:54,610 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.319
2020-09-24 19:17:58,167 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.694
2020-09-24 19:18:01,685 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.596
2020-09-24 19:18:05,330 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.635
2020-09-24 19:18:08,593 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 19:18:13,137 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.45s/step
2020-09-24 19:18:17,268 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.41s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 3994/3994 [00:00<00:00, 21161.04it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 3734/3734 [00:00<00:00, 13727.34it/s]
Epoch 80/120
=========================

Validation cost: 0.000356
Mean average_precision (in %): 44.4873

class name      average precision (in %)
------------  --------------------------
mask                             20.7635
no-mask                          68.2111

Median Inference Time: 0.004535
2020-09-24 19:18:22,559 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 80/120: loss: 0.00029 Time taken: 0:00:31.873883 ETA: 0:21:14.955311
2020-09-24 19:18:23,807 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 32.473
2020-09-24 19:18:27,376 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.106
2020-09-24 19:18:30,916 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.535
2020-09-24 19:18:34,588 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.414
2020-09-24 19:18:38,377 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.357
2020-09-24 19:18:39,675 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 81/120: loss: 0.00032 Time taken: 0:00:17.132768 ETA: 0:11:08.177939
2020-09-24 19:18:41,994 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.905
2020-09-24 19:18:45,785 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.274
2020-09-24 19:18:49,458 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.379
2020-09-24 19:18:53,090 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.220
2020-09-24 19:18:56,845 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.783
2020-09-24 19:18:57,152 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 82/120: loss: 0.00024 Time taken: 0:00:17.469684 ETA: 0:11:03.848006
2020-09-24 19:19:00,555 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.742
2020-09-24 19:19:04,258 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.029
2020-09-24 19:19:07,951 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.508
2020-09-24 19:19:11,698 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.134
2020-09-24 19:19:14,730 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 83/120: loss: 0.00031 Time taken: 0:00:17.589500 ETA: 0:10:50.811516
2020-09-24 19:19:15,462 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.433
2020-09-24 19:19:19,197 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.675
2020-09-24 19:19:22,777 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.580
2020-09-24 19:19:26,573 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.099
2020-09-24 19:19:30,182 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.246
2020-09-24 19:19:32,125 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 84/120: loss: 0.00034 Time taken: 0:00:17.375587 ETA: 0:10:25.521140
2020-09-24 19:19:33,912 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.858
2020-09-24 19:19:37,578 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.709
2020-09-24 19:19:41,251 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.378
2020-09-24 19:19:44,821 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.085
2020-09-24 19:19:48,513 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.508
2020-09-24 19:19:49,353 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 85/120: loss: 0.00025 Time taken: 0:00:17.239220 ETA: 0:10:03.372705
2020-09-24 19:19:52,108 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.954
2020-09-24 19:19:55,885 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.832
2020-09-24 19:19:59,565 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.069
2020-09-24 19:20:03,157 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.073
2020-09-24 19:20:06,523 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 86/120: loss: 0.00027 Time taken: 0:00:17.164827 ETA: 0:09:43.604130
2020-09-24 19:20:06,700 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.360
2020-09-24 19:20:10,303 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.566
2020-09-24 19:20:13,969 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.653
2020-09-24 19:20:17,650 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.027
2020-09-24 19:20:21,244 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.947
2020-09-24 19:20:23,691 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 87/120: loss: 0.00024 Time taken: 0:00:17.140959 ETA: 0:09:25.651632
2020-09-24 19:20:24,813 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.138
2020-09-24 19:20:28,511 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.279
2020-09-24 19:20:32,154 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.700
2020-09-24 19:20:35,839 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.838
2020-09-24 19:20:39,387 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.129
2020-09-24 19:20:40,888 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 88/120: loss: 0.00022 Time taken: 0:00:17.189657 ETA: 0:09:10.069016
2020-09-24 19:20:43,108 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.283
2020-09-24 19:20:46,713 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.418
2020-09-24 19:20:50,245 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.916
2020-09-24 19:20:53,838 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.017
2020-09-24 19:20:57,360 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.370
2020-09-24 19:20:57,811 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 89/120: loss: 0.00023 Time taken: 0:00:16.962471 ETA: 0:08:45.836586
2020-09-24 19:21:00,940 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.582
2020-09-24 19:21:04,578 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.938
2020-09-24 19:21:08,180 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.610
2020-09-24 19:21:11,805 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.535
2020-09-24 19:21:15,913 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 19:21:19,569 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.37s/step
2020-09-24 19:21:22,984 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.34s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 3004/3004 [00:00<00:00, 18291.09it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 2787/2787 [00:00<00:00, 13192.81it/s]
Epoch 90/120
=========================

Validation cost: 0.000278
Mean average_precision (in %): 61.3487

class name      average precision (in %)
------------  --------------------------
mask                             44.6264
no-mask                          78.071

Median Inference Time: 0.004753
2020-09-24 19:21:27,154 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 90/120: loss: 0.00019 Time taken: 0:00:29.344714 ETA: 0:14:40.341411
2020-09-24 19:21:27,725 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 37.690
2020-09-24 19:21:31,313 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.230
2020-09-24 19:21:34,943 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.315
2020-09-24 19:21:38,616 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.351
2020-09-24 19:21:42,125 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.995
2020-09-24 19:21:44,171 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 91/120: loss: 0.00024 Time taken: 0:00:16.981119 ETA: 0:08:12.452442
2020-09-24 19:21:45,751 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.494
2020-09-24 19:21:49,500 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.090
2020-09-24 19:21:53,092 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.016
2020-09-24 19:21:56,751 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.049
2020-09-24 19:22:00,443 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.511
2020-09-24 19:22:01,476 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 92/120: loss: 0.00043 Time taken: 0:00:17.331770 ETA: 0:08:05.289565
2020-09-24 19:22:04,042 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.714
2020-09-24 19:22:07,733 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.576
2020-09-24 19:22:11,324 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.128
2020-09-24 19:22:14,973 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.440
2020-09-24 19:22:18,708 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 93/120: loss: 0.00021 Time taken: 0:00:17.228131 ETA: 0:07:45.159532
2020-09-24 19:22:18,708 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.649
2020-09-24 19:22:22,411 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.039
2020-09-24 19:22:25,879 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 173.035
2020-09-24 19:22:29,554 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.272
2020-09-24 19:22:33,107 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.878
2020-09-24 19:22:35,645 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 94/120: loss: 0.00021 Time taken: 0:00:16.938921 ETA: 0:07:20.411945
2020-09-24 19:22:36,703 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.871
2020-09-24 19:22:40,351 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.487
2020-09-24 19:22:44,016 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.754
2020-09-24 19:22:47,659 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.716
2020-09-24 19:22:51,281 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.642
2020-09-24 19:22:52,959 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 95/120: loss: 0.00026 Time taken: 0:00:17.337254 ETA: 0:07:13.431345
2020-09-24 19:22:55,044 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.456
2020-09-24 19:22:58,633 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.203
2020-09-24 19:23:02,261 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.425
2020-09-24 19:23:05,788 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.094
2020-09-24 19:23:09,252 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 173.225
2020-09-24 19:23:09,828 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 96/120: loss: 0.00022 Time taken: 0:00:16.852198 ETA: 0:06:44.452755
2020-09-24 19:23:12,851 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.732
2020-09-24 19:23:16,344 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 171.788
2020-09-24 19:23:19,984 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.878
2020-09-24 19:23:23,672 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.701
2020-09-24 19:23:26,976 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 97/120: loss: 0.00017 Time taken: 0:00:17.133888 ETA: 0:06:34.079424
2020-09-24 19:23:27,414 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.340
2020-09-24 19:23:30,942 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.105
2020-09-24 19:23:34,593 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.355
2020-09-24 19:23:38,178 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.374
2020-09-24 19:23:41,919 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.422
2020-09-24 19:23:44,142 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 98/120: loss: 0.00020 Time taken: 0:00:17.152283 ETA: 0:06:17.350230
2020-09-24 19:23:45,525 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.409
2020-09-24 19:23:49,136 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.162
2020-09-24 19:23:52,770 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.141
2020-09-24 19:23:56,427 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.086
2020-09-24 19:24:00,004 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.733
2020-09-24 19:24:01,169 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 99/120: loss: 0.00026 Time taken: 0:00:17.050291 ETA: 0:05:58.056117
2020-09-24 19:24:03,600 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.871
2020-09-24 19:24:07,377 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.849
2020-09-24 19:24:11,133 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.764
2020-09-24 19:24:14,830 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.333
2020-09-24 19:24:19,625 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 19:24:23,565 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.39s/step
2020-09-24 19:24:27,022 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.35s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 3668/3668 [00:00<00:00, 17984.09it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 3228/3228 [00:00<00:00, 13721.55it/s]
Epoch 100/120
=========================

Validation cost: 0.000284
Mean average_precision (in %): 67.6944

class name      average precision (in %)
------------  --------------------------
mask                             54.1237
no-mask                          81.265

Median Inference Time: 0.004717
2020-09-24 19:24:31,458 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 36.084
2020-09-24 19:24:31,624 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 100/120: loss: 0.00015 Time taken: 0:00:30.423846 ETA: 0:10:08.476925
2020-09-24 19:24:35,090 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.271
2020-09-24 19:24:38,775 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.849
2020-09-24 19:24:42,433 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.065
2020-09-24 19:24:46,021 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.206
2020-09-24 19:24:48,722 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 101/120: loss: 0.00030 Time taken: 0:00:17.148890 ETA: 0:05:25.828910
2020-09-24 19:24:49,609 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.275
2020-09-24 19:24:53,271 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.826
2020-09-24 19:24:56,995 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.127
2020-09-24 19:25:00,577 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.526
2020-09-24 19:25:04,265 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.714
2020-09-24 19:25:06,023 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 102/120: loss: 0.00030 Time taken: 0:00:17.289413 ETA: 0:05:11.209429
2020-09-24 19:25:07,903 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.934
2020-09-24 19:25:11,588 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.835
2020-09-24 19:25:15,211 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.650
2020-09-24 19:25:18,864 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.256
2020-09-24 19:25:22,385 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.402
2020-09-24 19:25:23,121 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 103/120: loss: 0.00026 Time taken: 0:00:17.074948 ETA: 0:04:50.274109
2020-09-24 19:25:26,117 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.793
2020-09-24 19:25:29,827 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.744
2020-09-24 19:25:33,447 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.762
2020-09-24 19:25:37,169 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.219
2020-09-24 19:25:40,580 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 104/120: loss: 0.00035 Time taken: 0:00:17.471250 ETA: 0:04:39.540001
2020-09-24 19:25:40,879 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.744
2020-09-24 19:25:44,515 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.043
2020-09-24 19:25:48,189 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.317
2020-09-24 19:25:51,942 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.887
2020-09-24 19:25:55,614 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.399
2020-09-24 19:25:57,890 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 105/120: loss: 0.00030 Time taken: 0:00:17.278138 ETA: 0:04:19.172072
2020-09-24 19:25:59,275 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.921
2020-09-24 19:26:03,002 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.004
2020-09-24 19:26:06,599 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.817
2020-09-24 19:26:10,159 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.535
2020-09-24 19:26:13,745 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.358
2020-09-24 19:26:15,070 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 106/120: loss: 0.00031 Time taken: 0:00:17.199497 ETA: 0:04:00.792954
2020-09-24 19:26:17,539 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.143
2020-09-24 19:26:20,968 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 175.011
2020-09-24 19:26:24,424 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 173.632
2020-09-24 19:26:28,025 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.617
2020-09-24 19:26:31,675 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.399
2020-09-24 19:26:32,020 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 107/120: loss: 0.00018 Time taken: 0:00:16.925473 ETA: 0:03:40.031152
2020-09-24 19:26:35,378 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.083
2020-09-24 19:26:38,974 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.852
2020-09-24 19:26:42,678 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.016
2020-09-24 19:26:46,325 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.532
2020-09-24 19:26:49,373 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 108/120: loss: 0.00022 Time taken: 0:00:17.364669 ETA: 0:03:28.376032
2020-09-24 19:26:50,099 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.967
2020-09-24 19:26:53,616 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.628
2020-09-24 19:26:57,279 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.822
2020-09-24 19:27:00,789 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 170.956
2020-09-24 19:27:04,413 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.569
2020-09-24 19:27:06,316 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 109/120: loss: 0.00022 Time taken: 0:00:16.957361 ETA: 0:03:06.530966
2020-09-24 19:27:08,112 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.225
2020-09-24 19:27:11,666 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.845
2020-09-24 19:27:15,315 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.456
2020-09-24 19:27:18,815 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 171.457
2020-09-24 19:27:22,406 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.074
2020-09-24 19:27:24,288 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 19:27:28,312 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.40s/step
2020-09-24 19:27:31,823 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.35s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 3434/3434 [00:00<00:00, 18074.72it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 3044/3044 [00:00<00:00, 12740.58it/s]
Epoch 110/120
=========================

Validation cost: 0.000287
Mean average_precision (in %): 66.4332

class name      average precision (in %)
------------  --------------------------
mask                             53.7016
no-mask                          79.1648

Median Inference Time: 0.004638
2020-09-24 19:27:36,227 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 110/120: loss: 0.00021 Time taken: 0:00:29.901915 ETA: 0:04:59.019146
2020-09-24 19:27:39,022 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 36.111
2020-09-24 19:27:42,628 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.418
2020-09-24 19:27:46,232 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.503
2020-09-24 19:27:49,952 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.305
2020-09-24 19:27:53,527 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 111/120: loss: 0.00021 Time taken: 0:00:17.303051 ETA: 0:02:35.727457
2020-09-24 19:27:53,676 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.108
2020-09-24 19:27:57,321 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.656
2020-09-24 19:28:00,902 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.577
2020-09-24 19:28:04,649 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.134
2020-09-24 19:28:08,312 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.808
2020-09-24 19:28:10,793 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 112/120: loss: 0.00025 Time taken: 0:00:17.267142 ETA: 0:02:18.137136
2020-09-24 19:28:11,987 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.280
2020-09-24 19:28:15,569 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.517
2020-09-24 19:28:19,304 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.671
2020-09-24 19:28:23,041 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.578
2020-09-24 19:28:26,627 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.330
2020-09-24 19:28:28,200 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 113/120: loss: 0.00030 Time taken: 0:00:17.387783 ETA: 0:02:01.714480
2020-09-24 19:28:30,503 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 154.813
2020-09-24 19:28:34,194 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.569
2020-09-24 19:28:37,809 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.003
2020-09-24 19:28:41,590 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.721
2020-09-24 19:28:45,268 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.147
2020-09-24 19:28:45,706 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 114/120: loss: 0.00026 Time taken: 0:00:17.528213 ETA: 0:01:45.169280
2020-09-24 19:28:48,890 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.640
2020-09-24 19:28:52,663 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.053
2020-09-24 19:28:56,234 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.034
2020-09-24 19:28:59,984 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.005
2020-09-24 19:29:03,013 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 115/120: loss: 0.00028 Time taken: 0:00:17.301115 ETA: 0:01:26.505576
2020-09-24 19:29:03,565 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.592
2020-09-24 19:29:07,260 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.373
2020-09-24 19:29:10,990 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.909
2020-09-24 19:29:14,694 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 161.984
2020-09-24 19:29:18,281 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.290
2020-09-24 19:29:20,356 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 116/120: loss: 0.00033 Time taken: 0:00:17.357868 ETA: 0:01:09.431474
2020-09-24 19:29:22,014 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.754
2020-09-24 19:29:25,674 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.927
2020-09-24 19:29:29,305 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 165.267
2020-09-24 19:29:32,875 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 168.102
2020-09-24 19:29:36,481 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.372
2020-09-24 19:29:37,473 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 117/120: loss: 0.00016 Time taken: 0:00:17.106891 ETA: 0:00:51.320672
2020-09-24 19:29:40,011 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.996
2020-09-24 19:29:43,744 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.766
2020-09-24 19:29:47,321 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 167.739
2020-09-24 19:29:50,971 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 164.390
2020-09-24 19:29:54,731 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 118/120: loss: 0.00020 Time taken: 0:00:17.243589 ETA: 0:00:34.487178
2020-09-24 19:29:54,731 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.606
2020-09-24 19:29:58,271 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 169.491
2020-09-24 19:30:01,945 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.331
2020-09-24 19:30:05,677 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 160.778
2020-09-24 19:30:09,277 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 166.690
2020-09-24 19:30:12,014 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 119/120: loss: 0.00019 Time taken: 0:00:17.274985 ETA: 0:00:17.274985
2020-09-24 19:30:13,069 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.263
2020-09-24 19:30:16,749 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 163.032
2020-09-24 19:30:20,502 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 159.909
2020-09-24 19:30:24,291 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 158.348
2020-09-24 19:30:27,981 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.647
2020-09-24 19:30:30,620 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 29, 0.00s/step
2020-09-24 19:30:34,611 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 29, 0.40s/step
2020-09-24 19:30:38,148 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 29, 0.35s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 3424/3424 [00:00<00:00, 17741.34it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 3028/3028 [00:00<00:00, 13002.73it/s]
Epoch 120/120
=========================

Validation cost: 0.000285
Mean average_precision (in %): 67.7124

class name      average precision (in %)
------------  --------------------------
mask                             55.9825
no-mask                          79.4423

Median Inference Time: 0.004574
2020-09-24 19:30:42,458 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 162.647
Time taken to run iva.detectnet_v2.scripts.train:main: 0:54:42.006343.
In [29]:
# List the weights produced by retraining the pruned model.
!ls -rlt $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights
Out[29]:
total 1260
-rw-r--r-- 1 root root 1286720 Sep 24 19:30 resnet18_detector_pruned.tlt

Step 12 - Evaluate The Pruned Model

Let's evaluate the pruned and retrained model with tlt-evaluate to see how well our face mask detector performs. If you'd like to compare these numbers programmatically against the unpruned baseline, the optional sketch below shows one way to parse them out of the evaluation output; otherwise, just run the command that follows:
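
This helper is not part of the original notebook; it's a minimal sketch of how you could wrap the same tlt-evaluate call from Python and pull the mean AP and per-class AP lines out of its output, which makes it easy to compare the pruned model against the unpruned one. The function name evaluate_and_summarise and the way it reads $SPECS_DIR, $USER_EXPERIMENT_DIR and $KEY are illustrative assumptions; the actual evaluation used in this demo is the cell below.

import os
import re
import subprocess

def evaluate_and_summarise(spec, model, key):
    """Run tlt-evaluate for detectnet_v2 and return (mean_ap, {class: ap}) parsed from its output."""
    cmd = ["tlt-evaluate", "detectnet_v2", "-e", spec, "-m", model, "-k", key]
    # Merge stderr into stdout so the summary lines are captured wherever they are printed.
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            universal_newlines=True)
    out = result.stdout

    # Grab the "Mean average_precision (in %): 67.7124" line.
    mean_ap = None
    match = re.search(r"Mean average_precision \(in %\):\s*([0-9.]+)", out)
    if match:
        mean_ap = float(match.group(1))

    # Per-class rows in the summary table look like "mask    55.9825".
    per_class = {name: float(ap)
                 for name, ap in re.findall(r"^(no-mask|mask)\s+([0-9.]+)", out, flags=re.MULTILINE)}
    return mean_ap, per_class

# Hypothetical usage, mirroring the environment variables used in the cell below:
# mean_ap, per_class = evaluate_and_summarise(
#     os.path.expandvars("$SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt"),
#     os.path.expandvars("$USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt"),
#     os.environ.get("KEY", ""))
# print(mean_ap, per_class)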

In [30]:
!tlt-evaluate detectnet_v2 -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
                           -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
                           -k $KEY
Out[30]:
Using TensorFlow backend.
2020-09-24 19:46:40.000526: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 19:46:43,191 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/detectnet_v2/specs/detectnet_v2_retrain_resnet18_kitti.txt
2020-09-24 19:46:44.572831: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-09-24 19:46:44.616353: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:46:44.617335: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 19:46:44.617383: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 19:46:44.617461: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 19:46:44.618852: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 19:46:44.619251: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 19:46:44.621059: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 19:46:44.622443: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 19:46:44.622536: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 19:46:44.622663: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:46:44.623678: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:46:44.624608: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 19:46:44.624670: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 19:46:45.461405: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-24 19:46:45.461467: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-09-24 19:46:45.461490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-09-24 19:46:45.461751: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:46:45.462766: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:46:45.463723: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:46:45.464649: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14636 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)
2020-09-24 19:46:46,645 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2020-09-24 19:46:46,645 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2020-09-24 19:46:46,646 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2020-09-24 19:46:46,646 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 8, io threads: 16, compute threads: 8, buffered batches: 4
2020-09-24 19:46:46,646 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 706, number of sources: 1, batch size per gpu: 24, steps: 30
2020-09-24 19:46:46,763 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2020-09-24 19:46:46.804045: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:46:46.804490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 19:46:46.804532: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 19:46:46.804572: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 19:46:46.804610: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 19:46:46.804647: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 19:46:46.804683: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 19:46:46.804719: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 19:46:46.804756: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 19:46:46.804852: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:46:46.805287: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:46:46.805645: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 19:46:47,192 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: shuffle: False - shard 0 of 1
2020-09-24 19:46:47,200 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: sampling 1 datasets with weights:
2020-09-24 19:46:47,200 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: source: 0 weight: 1.000000
2020-09-24 19:46:47,662 [INFO] iva.detectnet_v2.evaluation.build_evaluator: Found 706 samples in validation set
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 3, 544, 960)  0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 16, 272, 480) 2368        input_1[0][0]                    
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 16, 272, 480) 64          conv1[0][0]                      
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 16, 272, 480) 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (None, 16, 136, 240) 2320        activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 16, 136, 240) 64          block_1a_conv_1[0][0]            
__________________________________________________________________________________________________
block_1a_relu_1 (Activation)    (None, 16, 136, 240) 0           block_1a_bn_1[0][0]              
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (None, 48, 136, 240) 6960        block_1a_relu_1[0][0]            
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 48, 136, 240) 816         activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 48, 136, 240) 192         block_1a_conv_2[0][0]            
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 48, 136, 240) 192         block_1a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_1 (Add)                     (None, 48, 136, 240) 0           block_1a_bn_2[0][0]              
                                                                 block_1a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_1a_relu (Activation)      (None, 48, 136, 240) 0           add_1[0][0]                      
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (None, 16, 136, 240) 6928        block_1a_relu[0][0]              
__________________________________________________________________________________________________
block_1b_bn_1 (BatchNormalizati (None, 16, 136, 240) 64          block_1b_conv_1[0][0]            
__________________________________________________________________________________________________
block_1b_relu_1 (Activation)    (None, 16, 136, 240) 0           block_1b_bn_1[0][0]              
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (None, 48, 136, 240) 6960        block_1b_relu_1[0][0]            
__________________________________________________________________________________________________
block_1b_bn_2 (BatchNormalizati (None, 48, 136, 240) 192         block_1b_conv_2[0][0]            
__________________________________________________________________________________________________
add_2 (Add)                     (None, 48, 136, 240) 0           block_1b_bn_2[0][0]              
                                                                 block_1a_relu[0][0]              
__________________________________________________________________________________________________
block_1b_relu (Activation)      (None, 48, 136, 240) 0           add_2[0][0]                      
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (None, 24, 68, 120)  10392       block_1b_relu[0][0]              
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 24, 68, 120)  96          block_2a_conv_1[0][0]            
__________________________________________________________________________________________________
block_2a_relu_1 (Activation)    (None, 24, 68, 120)  0           block_2a_bn_1[0][0]              
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (None, 104, 68, 120) 22568       block_2a_relu_1[0][0]            
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 104, 68, 120) 5096        block_1b_relu[0][0]              
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 104, 68, 120) 416         block_2a_conv_2[0][0]            
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 104, 68, 120) 416         block_2a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_3 (Add)                     (None, 104, 68, 120) 0           block_2a_bn_2[0][0]              
                                                                 block_2a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_2a_relu (Activation)      (None, 104, 68, 120) 0           add_3[0][0]                      
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (None, 24, 68, 120)  22488       block_2a_relu[0][0]              
__________________________________________________________________________________________________
block_2b_bn_1 (BatchNormalizati (None, 24, 68, 120)  96          block_2b_conv_1[0][0]            
__________________________________________________________________________________________________
block_2b_relu_1 (Activation)    (None, 24, 68, 120)  0           block_2b_bn_1[0][0]              
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (None, 104, 68, 120) 22568       block_2b_relu_1[0][0]            
__________________________________________________________________________________________________
block_2b_bn_2 (BatchNormalizati (None, 104, 68, 120) 416         block_2b_conv_2[0][0]            
__________________________________________________________________________________________________
add_4 (Add)                     (None, 104, 68, 120) 0           block_2b_bn_2[0][0]              
                                                                 block_2a_relu[0][0]              
__________________________________________________________________________________________________
block_2b_relu (Activation)      (None, 104, 68, 120) 0           add_4[0][0]                      
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (None, 16, 34, 60)   14992       block_2b_relu[0][0]              
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 16, 34, 60)   64          block_3a_conv_1[0][0]            
__________________________________________________________________________________________________
block_3a_relu_1 (Activation)    (None, 16, 34, 60)   0           block_3a_bn_1[0][0]              
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (None, 136, 34, 60)  19720       block_3a_relu_1[0][0]            
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 136, 34, 60)  14280       block_2b_relu[0][0]              
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 136, 34, 60)  544         block_3a_conv_2[0][0]            
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 136, 34, 60)  544         block_3a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_5 (Add)                     (None, 136, 34, 60)  0           block_3a_bn_2[0][0]              
                                                                 block_3a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_3a_relu (Activation)      (None, 136, 34, 60)  0           add_5[0][0]                      
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (None, 24, 34, 60)   29400       block_3a_relu[0][0]              
__________________________________________________________________________________________________
block_3b_bn_1 (BatchNormalizati (None, 24, 34, 60)   96          block_3b_conv_1[0][0]            
__________________________________________________________________________________________________
block_3b_relu_1 (Activation)    (None, 24, 34, 60)   0           block_3b_bn_1[0][0]              
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (None, 136, 34, 60)  29512       block_3b_relu_1[0][0]            
__________________________________________________________________________________________________
block_3b_bn_2 (BatchNormalizati (None, 136, 34, 60)  544         block_3b_conv_2[0][0]            
__________________________________________________________________________________________________
add_6 (Add)                     (None, 136, 34, 60)  0           block_3b_bn_2[0][0]              
                                                                 block_3a_relu[0][0]              
__________________________________________________________________________________________________
block_3b_relu (Activation)      (None, 136, 34, 60)  0           add_6[0][0]                      
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D)        (None, 16, 34, 60)   19600       block_3b_relu[0][0]              
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (None, 16, 34, 60)   64          block_4a_conv_1[0][0]            
__________________________________________________________________________________________________
block_4a_relu_1 (Activation)    (None, 16, 34, 60)   0           block_4a_bn_1[0][0]              
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D)        (None, 48, 34, 60)   6960        block_4a_relu_1[0][0]            
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 48, 34, 60)   6576        block_3b_relu[0][0]              
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (None, 48, 34, 60)   192         block_4a_conv_2[0][0]            
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (None, 48, 34, 60)   192         block_4a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_7 (Add)                     (None, 48, 34, 60)   0           block_4a_bn_2[0][0]              
                                                                 block_4a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_4a_relu (Activation)      (None, 48, 34, 60)   0           add_7[0][0]                      
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D)        (None, 16, 34, 60)   6928        block_4a_relu[0][0]              
__________________________________________________________________________________________________
block_4b_bn_1 (BatchNormalizati (None, 16, 34, 60)   64          block_4b_conv_1[0][0]            
__________________________________________________________________________________________________
block_4b_relu_1 (Activation)    (None, 16, 34, 60)   0           block_4b_bn_1[0][0]              
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D)        (None, 48, 34, 60)   6960        block_4b_relu_1[0][0]            
__________________________________________________________________________________________________
block_4b_bn_2 (BatchNormalizati (None, 48, 34, 60)   192         block_4b_conv_2[0][0]            
__________________________________________________________________________________________________
add_8 (Add)                     (None, 48, 34, 60)   0           block_4b_bn_2[0][0]              
                                                                 block_4a_relu[0][0]              
__________________________________________________________________________________________________
block_4b_relu (Activation)      (None, 48, 34, 60)   0           add_8[0][0]                      
__________________________________________________________________________________________________
output_bbox (Conv2D)            (None, 8, 34, 60)    392         block_4b_relu[0][0]              
__________________________________________________________________________________________________
output_cov (Conv2D)             (None, 2, 34, 60)    98          block_4b_relu[0][0]              
==================================================================================================
Total params: 269,586
Trainable params: 267,234
Non-trainable params: 2,352
__________________________________________________________________________________________________
2020-09-24 19:46:49.242001: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:46:49.242467: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 19:46:49.242520: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 19:46:49.242578: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 19:46:49.242616: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 19:46:49.242652: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 19:46:49.242687: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 19:46:49.242723: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 19:46:49.242758: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 19:46:49.242854: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:46:49.243282: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:46:49.243640: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 19:46:49.244995: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-24 19:46:49.245028: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-09-24 19:46:49.245048: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-09-24 19:46:49.245190: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:46:49.245641: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:46:49.246039: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14636 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)
2020-09-24 19:46:51,001 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 30, 0.00s/step
2020-09-24 19:46:51.886636: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 19:46:51.935142: I tensorflow/core/kernels/cuda_solvers.cc:159] Creating CudaSolver handles for stream 0x26494070
2020-09-24 19:46:51.935369: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 19:46:53.253158: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 19:46:54.128963: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 19:46:58,399 [INFO] iva.detectnet_v2.evaluation.evaluation: step 10 / 30, 0.74s/step
2020-09-24 19:47:02,273 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 30, 0.39s/step
Matching predictions to ground truth, class 1/2.: 100%|█| 3535/3535 [00:00<00:00, 17682.63it/s]
Matching predictions to ground truth, class 2/2.: 100%|█| 3207/3207 [00:00<00:00, 13452.25it/s]

Validation cost: 0.001419
Mean average_precision (in %): 67.9807

class name      average precision (in %)
------------  --------------------------
mask                             56.0735
no-mask                          79.8879

Median Inference Time: 0.004509
2020-09-24 19:47:06,747 [INFO] iva.detectnet_v2.scripts.evaluate: Evaluation complete.
Time taken to run iva.detectnet_v2.scripts.evaluate:main: 0:00:23.558129.

Step 13 - Export The TLT Model For Inference

The final step of the TLT workflow is to export our model for inference! This will allow us to use the mask detector with DeepStream.

We can use the tlt-export utility to export our weights. Run the code cell below:

In [31]:
!mkdir -p $USER_EXPERIMENT_DIR/experiment_dir_final_pruned

# Remove any pre-existing copy of the etlt (the export below writes to experiment_dir_final_pruned).
import os
output_file = os.path.join(os.environ['USER_EXPERIMENT_DIR'],
                           "experiment_dir_final_pruned/resnet18_detector_pruned.etlt")
if os.path.exists(output_file):
    os.remove(output_file)
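
# Export the pruned .tlt weights to an encrypted .etlt for DeepStream (-m: input model, -o: output path, -k: encryption key)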
!tlt-export detectnet_v2 \
            -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
            -o $USER_EXPERIMENT_DIR/experiment_dir_final_pruned/resnet18_detector_pruned.etlt \
            -k $KEY
Out[31]:
Using TensorFlow backend.
2020-09-24 19:47:29.146901: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 19:47:32.440343: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-09-24 19:47:32.440553: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:32.441533: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 19:47:32.441571: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 19:47:32.441628: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 19:47:32.442918: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 19:47:32.442989: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 19:47:32.444746: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 19:47:32.446081: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 19:47:32.446153: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 19:47:32.446253: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:32.447240: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:32.448137: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 19:47:32.448188: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 19:47:33.247933: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-24 19:47:33.247992: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-09-24 19:47:33.248017: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-09-24 19:47:33.248247: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:33.249240: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:33.250223: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:33.251142: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14349 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)
2020-09-24 19:47:36.563854: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:36.564831: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 19:47:36.564885: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 19:47:36.564935: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 19:47:36.564975: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 19:47:36.565007: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 19:47:36.565041: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 19:47:36.565074: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 19:47:36.565106: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 19:47:36.565195: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:36.566195: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:36.567083: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 19:47:36.567127: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-24 19:47:36.567148: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-09-24 19:47:36.567165: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-09-24 19:47:36.567291: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:36.568252: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:36.569148: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14349 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)
2020-09-24 19:47:39.332652: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:39.333633: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 19:47:39.333688: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 19:47:39.333739: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 19:47:39.333782: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 19:47:39.333815: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 19:47:39.333849: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 19:47:39.333884: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 19:47:39.333916: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 19:47:39.334031: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:39.334969: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:39.335845: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 19:47:39.335890: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-24 19:47:39.335911: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-09-24 19:47:39.335928: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-09-24 19:47:39.336057: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:39.337024: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:39.337930: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14349 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)
2020-09-24 19:47:41.770499: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:41.771512: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 19:47:41.771580: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 19:47:41.771645: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 19:47:41.771688: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 19:47:41.771720: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 19:47:41.771753: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 19:47:41.771785: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 19:47:41.771830: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 19:47:41.771950: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:41.772947: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:41.773880: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 19:47:41.774289: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:41.775228: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
2020-09-24 19:47:41.775274: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-09-24 19:47:41.775327: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-09-24 19:47:41.775367: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-09-24 19:47:41.775399: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-09-24 19:47:41.775431: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-09-24 19:47:41.775463: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-09-24 19:47:41.775494: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-24 19:47:41.775611: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:41.776604: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:41.777523: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-24 19:47:41.777575: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-24 19:47:41.777606: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-09-24 19:47:41.777628: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-09-24 19:47:41.777759: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:41.778793: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-24 19:47:41.779726: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14349 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)
NOTE: UFF has been tested with TensorFlow 1.14.0.
WARNING: The version of TensorFlow installed on this system is not guaranteed to work with UFF.
DEBUG [/usr/local/lib/python3.6/dist-packages/uff/converters/tensorflow/converter.py:96] Marking ['output_cov/Sigmoid', 'output_bbox/BiasAdd'] as outputs
In [32]:
print('Exported model:')
print('------------')
!ls -lh $USER_EXPERIMENT_DIR/experiment_dir_final_pruned
Out[32]:
Exported model:
------------
total 1.1M
-rw-r--r-- 1 root root 1.1M Sep 24 19:47 resnet18_detector_pruned.etlt

Step 14 - Prepare DeepStream Configuration Files

As we're using NVIDIA's DeepStream application to perform inference, we'll also need to update the DeepStream configuration files to match our newly trained model.

We need to update the following parameters in the file /workspace/face-mask-detection/ds_configs/config_infer_primary_masknet_gpu.txt (a scripted way to apply these edits is sketched after the list):

  • tlt-encoded-model - The relative path to your model when deployed in the DeepStream container. If you're following this notebook that's ./resnet18_detector_pruned.etlt
  • labelfile-path - The relative path to the labels file. Here that's ./labels.txt
  • input-dims - These should match the input dims set previously: 960x544 (written as 3;960;544;0 in the config)
  • model-engine - The same path as the model file but with the .engine extension
  • network-mode - The inference precision for your deployment (0=FP32, 1=INT8, 2=FP16). We're using 2 for FP16
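
If you'd rather script these edits than make them by hand, a few sed one-liners against the config file will do the job. This is only a minimal sketch using the values listed above (run it as a single shell script or a %%bash cell so the CONFIG variable carries across the lines):

CONFIG=/workspace/face-mask-detection/ds_configs/config_infer_primary_masknet_gpu.txt
# Point DeepStream at the exported model, the labels file and the engine file
sed -i 's|^tlt-encoded-model=.*|tlt-encoded-model=./resnet18_detector_pruned.etlt|' $CONFIG
sed -i 's|^labelfile-path=.*|labelfile-path=./labels.txt|' $CONFIG
sed -i 's|^model-engine-file=.*|model-engine-file=./resnet18_detector_pruned.engine|' $CONFIG
# Match the training input dimensions and run in FP16
sed -i 's|^input-dims=.*|input-dims=3;960;544;0|' $CONFIG
sed -i 's|^network-mode=.*|network-mode=2|' $CONFIG
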
In [33]:
!cat /workspace/face-mask-detection/ds_configs/config_infer_primary_masknet_gpu.txt
Out[33]:
# Copyright (c) 2020 NVIDIA Corporation.  All rights reserved.
#
# NVIDIA Corporation and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA Corporation is strictly prohibited.
[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
tlt-model-key=tlt_encode
tlt-encoded-model=./resnet18_detector_pruned.etlt
labelfile-path=./labels.txt
# GPU Engine File
model-engine-file=./resnet18_detector_pruned.engine
# DLA Engine File
# model-engine-file=/home/nvidia/detectnet_v2_models/detectnet_4K-fddb-12/resnet18_RGB960_detector_fddb_12_int8.etlt_b1_dla0_int8.engine
input-dims=3;960;544;0
uff-input-blob-name=input_1
batch-size=1
model-color-format=0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
#int8-calib-file=/mnt/8c3f68c9-a08a-400b-8c80-99c5fee26a06/detectnet_v2_models/detectnet_4K-fddb-12/calibration.bin
num-detected-classes=2
cluster-mode=1
interval=0
gie-unique-id=1
is-classifier=0
classifier-threshold=0.9
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid
[class-attrs-0]
pre-cluster-threshold=0.3
group-threshold=1
eps=0.5
#minBoxes=1
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0
[class-attrs-1]
pre-cluster-threshold=0.3
group-threshold=1
eps=0.3
#minBoxes=1
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

We don't need to make any changes to the labels file as we're going to be using the same classes as the example from GitHub.

These labels are:

  • mask - Our positive class - someone is wearing a mask
  • no-mask - Our negative class - the subject in the video is not wearing a mask
  • default - The default for the detectnet model

You can see the simple labels file (labels_masknet.txt) by running the code cell below:

In [34]:
!cat /workspace/face-mask-detection/ds_configs/labels_masknet.txt
Out[34]:
mask
no-mask
default

Now we need to get a DeepStream application config. These are available in the DeepStream container itself, but since we're deploying via Helm, we'll need to supply our own. You can grab our example from GitHub here and modify it to your use case.

There are two properties we're setting in the config:

  1. RTSP Stream Location - the URL of our video feed
  2. DeepStream Inference Config - the path to the model's inference configuration we just updated

Since we updated the config beforehand, there's nothing for us to change here.

In [35]:
!wget https://raw.githubusercontent.com/ChrisParsonsDev/ngc-gtc-content/master/demo-assets/dsconfig.txt
Out[35]:
--2020-09-24 20:42:20--  https://raw.githubusercontent.com/ChrisParsonsDev/ngc-gtc-content/master/demo-assets/dsconfig.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 199.232.64.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|199.232.64.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3344 (3.3K) [text/plain]
Saving to: ‘dsconfig.txt’

dsconfig.txt        100%[===================>]   3.27K  --.-KB/s    in 0s      

2020-09-24 20:42:20 (41.6 MB/s) - ‘dsconfig.txt’ saved [3344/3344]

That's it! We've successfully used TLT to retrain a DetectNet model to identify people wearing face masks in streaming video. Now let's use NGC and EGX Manager to create a deployment workflow.

Publishing The Model To An NGC Private Registry

Ok! So we've successfully built our new model with TLT - it's time we shared it with the world (or at least NGC).

We're going to upload the model to our NGC Private Registry so that other members of our team can take advantage of the awesome model we just built. We'll also be able to use the model we upload to NGC with DeepStream and deploy it for inference.

Let's use the NGC CLI to upload the model to our Private Registry. The process of publishing models to an NGC Private Registry goes like this:

  1. Create the model entity
  2. Create a model version
  3. Upload the model files
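
At a high level, step 1 maps to ngc registry model create, while steps 2 and 3 are handled together by a single ngc registry model upload-version call (the version is created and its files uploaded in one go). As a rough sketch, with placeholder names in angle brackets:

ngc registry model create <org>/[<team>/]<model-name> [options]
ngc registry model upload-version <org>/[<team>/]<model-name>:<version> --source <path-to-files>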

The steps below don't show you everything NGC Private Registries, and our CLI, can do. They're meant to give you a base of knowledge so you can expand on everything we're doing here and take full advantage of NGC for yourself.

If you need more information about a CLI command and want to see all the available options, you can add the --help option to see our docs at any time.

Let's try an example of that! Run the code cell below to see some of the options for models in NGC Private Registries:

In [36]:
!ngc registry model --help
Out[36]:
usage: ngc registry model [--debug] [--format_type <fmt>] [--org <name>]
                          [--team <name>] [-h]
                          {create,download-version,info,list,remove,update,upload-version}
                          ...
Model Commands
optional arguments:
  -h, --help            Show this help message and exit.
  --debug               Enable debug mode.
  --format_type <fmt>   Specify the output format type. Supported formats are:
                        ascii, csv, json. Only commands that produce tabular
                        data support csv format. Default: ascii
  --org <name>          Specify the organization name. Use "--org no-org" to
                        override other sources and specify no org. Default:
                        current configuration
  --team <name>         Specify the team name. Use "--team no-team" to
                        override other sources and specify no team. Default:
                        current configuration
model:
  {create,download-version,info,list,remove,update,upload-version}
    create              Create a model.
    download-version    Download a model version.
    info                Retrieve metadata for a model or model version.
    list                List model(s) or model version(s).
    remove              Remove a model or model version.
    update              Update a model or model version.
    upload-version      Upload a model version.

Step 1 - Creating the model entity

The first step in the process is to create the model entity. To create a model in an NGC Private Registry we need to create a card which will hold some basic information about our model. The full list of properties can be seen in our docs here. The parameters we need to set are:

  1. MODEL_NAME - The name (and URL) for our model in our private registry. We also need to specify the org/team as part of this. The structure is <ORG>/<TEAM>/<MODEL-NAME>.
  2. MODEL_DISPLAY_NAME - We don't technically need to set this, but it's displayed in the UI and it's really pretty. Any string works as a display name for your model.
  3. MODEL_APPLICATION - The deep learning application of the model. In our case that's Object Detection.
  4. MODEL_FORMAT - What format is our model? TLT!
  5. MODEL_FRAMEWORK - What framework did we use? Transfer Learning Toolkit
  6. PRECISION - What GPU precision was used in the training phase? For us that's FP32
  7. MODEL_SHORT_DESC - A short description of your model so other people in your team know what it does.

We need to provide these parameters to make our content identifiable, help other users in our team discover it and to remind us what we did when we come back to the model at a later date.

Let's set up some of the parameters (feel free to modify) in the code cell below:

In [37]:
# Model Name (org/team/modelname)
%env MODEL_NAME=ngcvideos/facemaskdetector

# Model Display Name
%env MODEL_DISPLAY_NAME="Face Mask Detector" 

# Application
%env MODEL_APPLICATION="Object Detection"

# Format
%env MODEL_FORMAT="TLT"

# Framework

%env MODEL_FRAMEWORK="Transfer Learning Toolkit"

# Precision
%env MODEL_PRECISION="FP32"

# Short Description
%env MODEL_SHORT_DESC="This model shows you the power of TLT and NGC Collections as part of the GTC Fall Demo"
Out[37]:
env: MODEL_NAME=ngcvideos/facemaskdetector
env: MODEL_DISPLAY_NAME="Face Mask Detector"
env: MODEL_APPLICATION="Object Detection"
env: MODEL_FORMAT="TLT"
env: MODEL_FRAMEWORK="Transfer Learning Toolkit"
env: MODEL_PRECISION="FP32"
env: MODEL_SHORT_DESC="This model shows you the power of TLT and NGC Collections as part of the GTC Fall Demo"

Now run the following code cell to create the entity:

In [38]:
!eval ngc registry model create $MODEL_NAME --application $MODEL_APPLICATION --framework $MODEL_FRAMEWORK \
    --format $MODEL_FORMAT --precision $MODEL_PRECISION --short-desc $MODEL_SHORT_DESC \
    --display-name $MODEL_DISPLAY_NAME
Out[38]:
Successfully created model 'ngcvideos/facemaskdetector'.
--------------------------------------------------
  Model Information
    Name: facemaskdetector
    Application: Object Detection
    Framework: Transfer Learning Toolkit
    Model Format: TLT
    Precision: FP32
    Short Description: This model shows you the power of TLT and NGC Collections as part of the GTC Fall Demo
    Display Name: Face Mask Detector
    Org: ngcvideos
    Team: 
    Built By: 
    Publisher: 
    Created Date: 2020-09-24T20:43:36.251Z
    Updated Date: 2020-09-24T20:43:36.251Z
    Labels: 
    Latest Version ID: 
    Latest Version Size (bytes): 
    Public Dataset Used: 
        Name: 
        Link: 
        License: 
    Overview: 
--------------------------------------------------

One of the major options we've skipped here is the --overview option! You'll no doubt have seen these overviews all over NGC. We allow you to provide a .md overview file which can contain everything from how to use your model to how to get help and who to contact when it goes wrong.

It's really worth taking advantage of this, and you can update it at any time via our CLI or UI.
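
As a rough sketch, adding an overview from a markdown file looks something like the line below. The overview.md file is a hypothetical example, and the flag name is an assumption on our part, so confirm it with ngc registry model update --help before running:

ngc registry model update ngcvideos/facemaskdetector --overview-filename overview.md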

Once this is completed you should see the model listed in the NGC Private Registry UI like this:

NGC Screenshot

Step 2 - Creating The Model Version

The final step in uploading our model to our NGC Private Registry is to create the version and upload the files. This follows much the same pattern as creating the model itself.

We're going to repeat this step to publish three versions of our model to NGC: 1. unpruned, 2. pruned, and 3. an exported version for DeepStream inference. That way, if we need to retrain we have the full version of the model, but if we need to deploy for inference we can just use the pruned version.

Let's start by creating the metadata. We need to specify two things:

  1. VERSION_NUMBER - The version number we'd like to create in NGC
  2. MODEL_FILE - The path to the model file we'd like to upload

If you're uploading multiple files, you can specify a path to a directory rather than individual files. There's also lots of metadata you can specify about an individual model version such as:

  1. Accuracy
  2. Epochs
  3. Batch Size
  4. Additional Resources - A hyperlink to additional code samples that can help your users
  5. Custom Metrics - Up to 36 custom key/value pairs that describe your model and how it can be used

To view the full list of options run:

ngc registry model upload-version --help
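
For example, a version upload that also records some of that metadata might look like the sketch below. The values in angle brackets are placeholders, and the metadata flag names are assumptions on our part, so double-check them against the --help output before using them (the 67.98 mAP comes from the evaluation we ran earlier):

ngc registry model upload-version <org>/<model-name>:<version> \
    --accuracy-reached 67.98 --num-epochs <epochs> --batch-size <batch-size> \
    --source <path-to-model-files>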

Let's start by publishing the unpruned model:

In [39]:
# Version Number
%env VERSION_NUMBER=maskdetect-unpruned

# Model File
%env MODEL_VERSION_SOURCE=/workspace/detectnet_v2/experiment_dir_unpruned/weights/resnet18_detector.tlt
Out[39]:
env: VERSION_NUMBER=maskdetect-unpruned
env: MODEL_VERSION_SOURCE=/workspace/detectnet_v2/experiment_dir_unpruned/weights/resnet18_detector.tlt
In [40]:
# Make a directory for the unpruned model assets
%env UNPRUNED_ASSET_DIR=/workspace/detectnet_v2/experiment_dir_final_pruned/unpruned
!mkdir $UNPRUNED_ASSET_DIR

# Copy the spec file with the inference labels to the asset dir
!cp $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt $UNPRUNED_ASSET_DIR

# Move model file
!mv $MODEL_VERSION_SOURCE $UNPRUNED_ASSET_DIR
Out[40]:
env: UNPRUNED_ASSET_DIR=/workspace/detectnet_v2/experiment_dir_final_pruned/unpruned

Now run this code cell to upload our retrained model to the NGC Private Registry:

In [41]:
!eval ngc registry model upload-version $MODEL_NAME":"$VERSION_NUMBER  --source $UNPRUNED_ASSET_DIR
Out[41]:
Uploaded 42.93 MB, 1/2 files in 2s, Avg Upload speed: 21.47 MB/s, Curr Upload Speed: 22.33 MB/s                 
----------------------------------------------------
Model ID: facemaskdetector[version=maskdetect-unpruned]
Upload status: Completed
Uploaded local path (model): /workspace/detectnet_v2/experiment_dir_final_pruned/unpruned
Total files uploaded: 2
Total transferred: 42.93 MB
Started at: 2020-09-24 20:44:44 UTC
Completed at: 2020-09-24 20:44:47 UTC
Duration taken: 3s
----------------------------------------------------

Now let's repeat that process and publish the pruned model weights to our NGC Private Registry...

In [42]:
# Version Number
%env VERSION_NUMBER=maskdetect-pruned


# Model File
%env MODEL_VERSION_SOURCE=/workspace/detectnet_v2/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt
Out[42]:
env: VERSION_NUMBER=maskdetect-pruned
env: MODEL_VERSION_SOURCE=/workspace/detectnet_v2/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt
In [43]:
# Make a directory for the pruned model assets
%env PRUNED_ASSET_DIR=/workspace/detectnet_v2/experiment_dir_final_pruned/pruned
!mkdir $PRUNED_ASSET_DIR

# Copy the retrain spec file with the inference labels to the asset dir
!cp $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt $PRUNED_ASSET_DIR

# Move model file
!mv $MODEL_VERSION_SOURCE $PRUNED_ASSET_DIR
Out[43]:
env: PRUNED_ASSET_DIR=/workspace/detectnet_v2/experiment_dir_final_pruned/pruned
In [44]:
!ngc registry model upload-version $MODEL_NAME":"$VERSION_NUMBER --source $PRUNED_ASSET_DIR
Out[44]:
----------------------------------------------------
Model ID: facemaskdetector[version=maskdetect-pruned]
Upload status: Completed
Uploaded local path (model): /workspace/detectnet_v2/experiment_dir_final_pruned/pruned
Total files uploaded: 2
Total transferred: 1.23 MB
Started at: 2020-09-24 20:45:21 UTC
Completed at: 2020-09-24 20:45:22 UTC
Duration taken: 1s
----------------------------------------------------

Finally, let's publish the exported model for inference:

In [45]:
# Version Number
%env VERSION_NUMBER=maskdetect-ds-inference


# Model File
%env MODEL_VERSION_SOURCE=/workspace/detectnet_v2/experiment_dir_final_pruned/resnet18_detector_pruned.etlt
Out[45]:
env: VERSION_NUMBER=maskdetect-ds-inference
env: MODEL_VERSION_SOURCE=/workspace/detectnet_v2/experiment_dir_final_pruned/resnet18_detector_pruned.etlt
In [46]:
# Make a directory for the DeepStream inference assets
%env INFERENCE_ASSET_DIR=/workspace/detectnet_v2/experiment_dir_final_pruned/dsinference
!mkdir $INFERENCE_ASSET_DIR

#DS Configs Path
%env DS_CONFIG_PATH=/workspace/face-mask-detection/ds_configs

# Copy the labels file to the inference asset dir (renamed to labels.txt)
!cp $DS_CONFIG_PATH/labels_masknet.txt $INFERENCE_ASSET_DIR/labels.txt

# Copy the inference config file (renamed to dsinferenceconfig.txt)
!cp $DS_CONFIG_PATH/config_infer_primary_masknet_gpu.txt $INFERENCE_ASSET_DIR/dsinferenceconfig.txt

# Copy the DeepStream app config file
!cp /workspace/dsconfig.txt $INFERENCE_ASSET_DIR

# Move model file
!mv $MODEL_VERSION_SOURCE $INFERENCE_ASSET_DIR
Out[46]:
env: INFERENCE_ASSET_DIR=/workspace/detectnet_v2/experiment_dir_final_pruned/dsinference
env: DS_CONFIG_PATH=/workspace/face-mask-detection/ds_configs
In [47]:
# List all the assets in our inference directory 
!ls $INFERENCE_ASSET_DIR
Out[47]:
dsconfig.txt  dsinferenceconfig.txt  labels.txt  resnet18_detector_pruned.etlt
In [48]:
!ngc registry model upload-version $MODEL_NAME":"$VERSION_NUMBER  --source $INFERENCE_ASSET_DIR
Out[48]:
----------------------------------------------------
Model ID: facemaskdetector[version=maskdetect-ds-inference]
Upload status: Completed
Uploaded local path (model): /workspace/detectnet_v2/experiment_dir_final_pruned/dsinference
Total files uploaded: 4
Total transferred: 1.1 MB
Started at: 2020-09-24 20:50:41 UTC
Completed at: 2020-09-24 20:50:42 UTC
Duration taken: 1s
----------------------------------------------------

Fantastic! Now all of our awesome work has been uploaded to our NGC Private Registry, where we can share it with our other team members, who can collaborate on the project or integrate the incredible face-mask-detector model into their own applications.

The model versions in NGC look like this:

NGC Model Versions

Using NGC & Helm To Package Inference Application

What Is Helm?

Helm is a package manager for Kubernetes that allows you to configure and manage your containerized application deployments. Helm charts allow DevOps engineers and system administrators to describe exactly what components an application needs to run, simplifying deployment pipelines and enabling the easy deployment and integration of new apps.

NGC has a full Helm Registry which you can use to manage your Helm charts, as well as a Helm Catalog.

We can modify the DeepStream chart in our catalog to create a packaged deployment pattern for our mask detector.

Let's get to it...

Step 1 - Download/Install Helm & Plugins

The first step is to download and install Helm in our environment. This will allow us to download and publish charts to our NGC Private Registry. There are two components we need to install:

  1. Helm - the orchestration tool itself
  2. Helm Push Plugin - to publish charts to NGC

The next code block installs the Helm CLI tools:

In [49]:
!curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
!chmod 700 get_helm.sh
!./get_helm.sh
Out[49]:
Downloading https://get.helm.sh/helm-v3.3.3-linux-amd64.tar.gz
Verifying checksum... Done.
Preparing to install helm into /usr/local/bin
helm installed into /usr/local/bin/helm
In [50]:
# Verify the installation worked..
!which helm
Out[50]:
/usr/local/bin/helm

Now let's grab the Helm Push plugin...

In [51]:
!helm plugin install https://github.com/chartmuseum/helm-push.git
Out[51]:
Downloading and installing helm-push v0.8.1 ...
https://github.com/chartmuseum/helm-push/releases/download/v0.8.1/helm-push_0.8.1_linux_amd64.tar.gz
Installed plugin: push

Step 2 - Login To NGC's Helm Registry

The NGC docs provide step-by-step instructions for configuring your Helm environment. We can use the NGC API key we set earlier for the CLI to authenticate with the Helm service too. We're going to add two Helm repos:

  1. The public catalog - so we can grab the starter DeepStream chart
  2. Our NGC Private Registry - so we can share and modify the chart to use our face mask detector model

Login to the NGC Catalog:

In [52]:
!helm repo add ngc-catalog https://helm.ngc.nvidia.com/nvidia 
Out[52]:
"ngc-catalog" has been added to your repositories
In [53]:
# Edit the repo name/url to match your private registry
!helm repo add ngc-privatereg https://helm.ngc.nvidia.com/ngcvideos --username=\$oauthtoken --password=$NGC_KEY
Out[53]:
"ngc-privatereg" has been added to your repositories
In [54]:
# Verify both repos were added successfully
!helm repo list
Out[54]:
NAME          	URL                                  
ngc-catalog   	https://helm.ngc.nvidia.com/nvidia   
ngc-privatereg	https://helm.ngc.nvidia.com/ngcvideos

Step 3 - Download DeepStream Helm Chart

We need to download the DeepStream chart from NGC so we can modify it for our mask detection use case. Let's use the Helm CLI to check it's available in the NGC Catalog (we know it is):

In [55]:
!helm search repo video-analytics-demo
Out[55]:
NAME                            	CHART VERSION	APP VERSION	DESCRIPTION                                       
ngc-catalog/video-analytics-demo	0.1.5        	1.2        	A Helm chart for Deepstream Intelligent Video A...

We now need to fetch the chart from the NGC Catalog into our local environment. The NGC UI makes this workflow really easy. We've put the command you need to helm fetch the chart right in our UI:

Helm Fetch Command

Let's run that in the code cell below:

In [56]:
!helm fetch https://helm.ngc.nvidia.com/nvidia/charts/video-analytics-demo-0.1.5.tgz

Step 4 - Modify The Chart To Use Our Model

Now let's modify the chart we downloaded from the NGC Catalog to use our mask-detector model. The process is really simple:

  1. Unzip the chart
  2. Edit the chart to point to our model
  3. Rename the chart for our private registry
  4. Package the chart

Let's start by unzipping the chart:

In [57]:
# Now unzip the chart
!tar -xvzf video-analytics-demo-0.1.5.tgz
Out[57]:
video-analytics-demo/Chart.yaml
video-analytics-demo/values.yaml
video-analytics-demo/templates/NOTES.txt
video-analytics-demo/templates/_helpers.tpl
video-analytics-demo/templates/configmap.yaml
video-analytics-demo/templates/deployment.yaml
video-analytics-demo/templates/ingress.yaml
video-analytics-demo/templates/service.yaml
video-analytics-demo/templates/tests/test-connection.yaml
video-analytics-demo/README.md
video-analytics-demo/_helmignore
video-analytics-demo/config/create_config.py
video-analytics-demo/config/play.html

We now need to edit the values.yaml file to point the chart to our face detector model.

We need to edit the following properties under the ngcModel heading:

  1. getModel - This should be the NGC CLI command to download the model from our private registry
  2. name - The name of the model
  3. fileName - The name of the model file
  4. modelConfig - The config file for our model

Then edit the camera details to point to our own video stream:

  1. camera1 - Link to the rtsp stream for our camera

Note: you can add other cameras by adding new rows such as camera2, camera3, etc.

So in our example we're updating the values.yaml file with the following properties:

  1. getModel - wget --content-disposition https://api.ngc.nvidia.com/v2/models/ngcvideos/facemaskdetector/versions/maskdetect-inference/zip -O facemaskdetector_maskdetect-inference.zip
  2. name - facemaskdetector
  3. fileName - resnet18_detector_unpruned.etlt
  4. modelConfig - /opt/nvidia/deepstream/deepstream5.0/samples/configs/tlt_pretrained_models/config_infer_primary_detectnet_v2.txt
  5. camera1 - rtsp://admin:nvidia3D!@67.161.8.179:554/Streaming/Channels/101

You can see what the completed values.yaml file looks like by executing the code cell below.

Next we're going to update the Chart.yaml file, setting the name, description and version number of our modified chart:

  1. name - The name of the packaged chart
  2. description - Short description of the chart
  3. version - SemVer version of our chart

We've updated the values to be:

  1. name - ds-face-mask-detection
  2. description - A Helm chart for DeepStream Face Mask Detection
  3. version - 1.0.0 (it's ready for release, right?)

CHART.YAML

If you want to view the full values.yaml and Chart.yaml files, run the next two code cells. Otherwise, skip ahead!

In [58]:
# View contents of values.yaml
!cat video-analytics-demo/values.yaml
Out[58]:
# Default values for video-analytics-demo
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
replicaCount: 1
image:
  repository: nvcr.io/nvidia/deepstream
  tag: 5.0-20.07-samples
  pullPolicy: IfNotPresent
  webui: anguda/ant-media:1.0
# Update the NGC Model to use in Deepstream 
ngcModel:
  #NGC Model Pruned URL from NGC
  getModel: wget --content-disposition https://api.ngc.nvidia.com/v2/models/ngcvideos/facemaskdetector/versions/maskdetect-inference/zip -O facemaskdetector_facedetect-inference.zip
  #NGC model name
  name: facemaskdetector
  # Model File Name that will use in Deepstream
  fileName: resnet18_detector_unpruned.etlt
  # Model Config that needs to update
  modelConfig: /opt/nvidia/deepstream/deepstream5.0/samples/configs/tlt_pretrained_models/detectnet_v2_labels.txt
  #Do not update the Put Model
  putModel: /opt/nvidia/deepstream/deepstream-5.0/samples/configs/tlt_pretrained_models/
imagePullSecrets:
- name: nvidia-registrykey-secret
nameOverride: ""
fullnameOverride: ""
command:
  apprunnercmd: "python"
  apprunnername: "/opt/nvidia/deepstream/create_config.py"
  appname: "deepstream-app"
  apparg: "/opt/nvidia/deepstream/deepstream-5.0/samples/configs/deepstream-app/source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt"
service:
  type: NodePort
  port: 80
  nodePort: 31113
  webuiPort: 5080
  webuinodePort: 31115
#specify camera IP as rtsp://username:password@ip
#or rtsp://ip if it has no username and password
cameras:
  camera1: rtsp://admin:nvidia3D!@67.161.8.179:554/Streaming/Channels/101
ingress:
  enabled: false
  annotations: {}
    # kubernetes.io/ingress.class: nginx
    # kubernetes.io/tls-acme: "true"
  hosts:
    - host: chart-example.local
      paths: []
  tls: []
  #  - secretName: chart-example-tls
  #    hosts:
  #      - chart-example.local
resources: {}
  # We usually recommend not to specify default resources and to leave this as a conscious
  # choice for the user. This also increases chances charts run on environments with little
  # resources, such as Minikube. If you do want to specify resources, uncomment the following
  # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
  # limits:
  #   cpu: 100m
  #   memory: 128Mi
  # requests:
  #   cpu: 100m
  #   memory: 128Mi
nodeSelector: {}
tolerations: []
affinity: {}
In [59]:
# View contents of Chart.yaml
!cat video-analytics-demo/Chart.yaml
Out[59]:
apiVersion: v1
appVersion: "1.2"
description: A Helm chart for DeepStream detection of face masks
name: ds-face-mask-detection
version: 1.0.0

Step 5 - Publish The Chart To An NGC Private Registry

Now that we've edited our chart values, we need to package the Helm chart before we can publish it to our NGC Private Registry. We do that with the helm package command:

In [60]:
!helm package video-analytics-demo
Out[60]:
Successfully packaged chart and saved it to: /workspace/ds-face-mask-detection-1.0.0.tgz

Now let's use the helm push plugin to publish the chart to our NGC Private Registry. You can view the full docs here.

Remember to update the repo name to match your private registry...

In [61]:
!helm push ds-face-mask-detection-1.0.0.tgz ngc-privatereg
Out[61]:
Pushing ds-face-mask-detection-1.0.0.tgz to ngc-privatereg...
Done.

Done is right! We can now view the chart in our NGC Private Registry, all ready for deployment...
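
If you'd like to confirm from the CLI as well, refreshing the local repo index and then searching for the chart should show the new version (assuming the ngc-privatereg repo name we added earlier):

helm repo update
helm search repo ds-face-mask-detection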

HELM-PR

Inference

We've done it! The end-to-end process of using NGC Catalog content in Collections, retraining with TLT for our custom COVID-related use case, using the NGC Private Registry to manage our models and share them with our team, packaging our application in Helm charts, and finally deploying the custom application with NVIDIA Fleet Commander. In one notebook. Easy.

Now that our application is deployed and attached to our RTSP camera stream, we can detect whether people are wearing face masks in real time:

Add New App