ImageDev

Deep learning

This group contains algorithms performing a prediction from a fully convolutional neural network.

Overview

Among machine learning methods, deep learning has proved especially valuable in many image processing tasks. Deep learning models can be trained from a set of input images and the corresponding target results, such as manual segmentations reviewed by an expert. They can then be applied to predict results automatically from previously unseen images.

Deep learning refers to neural network models that contain several layers of neurons. Each neuron combines pieces of input data, or results from other neurons, to produce a result. The combination is realized through a weighted sum, where each weight corresponds to a parameter of the model. Deep learning models can easily involve millions of such parameters.
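As an illustration of this weighted combination, the following minimal sketch (plain Python, not part of the ImageDev API; the function name and values are hypothetical) computes the output of a single neuron as a weighted sum of its inputs followed by a ReLU activation, a common non-linearity in convolutional networks:

```python
def neuron(inputs, weights, bias):
    # Weighted sum of the inputs; each weight is a learnable parameter
    # of the model, as is the bias term.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # ReLU activation: negative sums are clipped to zero.
    return max(0.0, z)

# Example: three inputs combined by one neuron.
print(neuron([0.5, -1.0, 2.0], [0.2, 0.4, 0.1], 0.05))  # -> 0.0
```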

Making predictions using a pre-trained model is a straightforward task. The deep learning prediction tools available in ImageDev can apply a variety of trained models, provided they are designed for 2D image processing tasks, such as image restoration or segmentation.

Prerequisites

To benefit from GPU optimizations when launching a prediction with ImageDev, some prerequisites must be met. If the verbose mode is activated, the standard output indicates, for each call to an ONNX command, whether the command is executed on the CPU or on the GPU.

ONNX

ImageDev prediction tools use the Open Neural Network Exchange (ONNX) runtime. ONNX is an interoperable framework enabling collaboration in the AI community. The ONNX framework provides tools for executing AI operations and a data model for representing convolutional neural networks.

ONNX models are the input of ImageDev prediction tools.

ImageDev relies on ONNX Runtime 1.8.2, which can run models with an opset version of 14 or lower.
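The opset constraint can be checked before attempting a prediction. The sketch below is illustrative, not an ImageDev API; the helper name is hypothetical. With the `onnx` Python package, a model's opset can typically be read via `onnx.load("my_model.onnx").opset_import[0].version` (assuming the default domain entry comes first):

```python
# ONNX Runtime 1.8.2, used by ImageDev, supports opset versions up to 14.
MAX_SUPPORTED_OPSET = 14

def is_supported(opset_version):
    # Returns True if a model with this opset can be run by the
    # prediction tools, False otherwise.
    return opset_version <= MAX_SUPPORTED_OPSET
```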

Model conversion

Some models can be found directly on the ONNX Model Zoo. TensorFlow and Keras models can be converted to ONNX with a Python script.

The following snippet assumes that a trained model has been created in the my_path folder. This model is composed of a weight file my_model.hdf5 and a configuration file my_model.json.
import keras
import tf2onnx
import tensorflow as tf

# Load the network architecture from the JSON configuration file
with open(my_path + 'my_model.json', 'r') as json_file:
    model_json = json_file.read()
model_keras = keras.models.model_from_json(model_json)

# Load the trained weights
model_keras.load_weights(my_path + 'my_model.hdf5')

# Define the input signature: a batch of single-channel 2D images
spec = (tf.TensorSpec((None, None, None, 1), tf.float32, name="input"),)

# Convert the Keras model to ONNX with opset 13 and save it
model_onnx, _ = tf2onnx.convert.from_keras(
    model_keras, input_signature=spec, opset=13,
    output_path=my_path + "my_model.onnx")

Pre-processing

Before performing a prediction, a set of operations can be sequentially applied to prepare the data in accordance with what the model expects and how it has been trained.

Normalization

An optional normalization can be applied to the input to map its values to the range expected by the model.

Several normalization modes are available. The normalization can be applied either individually to each image of the input batch, or globally to the whole input batch.
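As an illustration, one common normalization option (min-max scaling to [0, 1]) is sketched below with NumPy. The function and parameter names are hypothetical and do not reflect the actual ImageDev parameters; the sketch only shows the difference between per-image and per-batch normalization:

```python
import numpy as np

def min_max_normalize(batch, per_image=True):
    # batch: array of shape (N, H, W).
    if per_image:
        # Minimum and maximum computed independently for each image.
        mins = batch.min(axis=(1, 2), keepdims=True)
        maxs = batch.max(axis=(1, 2), keepdims=True)
    else:
        # Minimum and maximum computed over the whole batch.
        mins, maxs = batch.min(), batch.max()
    return (batch - mins) / (maxs - mins)
```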

Tiling

If the image to be processed is large, ImageDev can process it tile by tile to reduce GPU memory requirements. The tiling process is defined by a size parameter and an overlap. The patches sent to the prediction always have the size defined by the tileSize parameter. Using an overlap avoids incorrect predictions at the tile borders.

Using tiles presents several benefits, notably reducing the memory footprint of a prediction. Note that the size of the tiles to be processed depends on the architecture of the model: to retrieve a coherent image at the output of the expansive phase, the dimensions of each tile must be compatible with the number of downsamplings performed by the model in its contracting phase. Each component of the tile size must therefore be a multiple of $M = 2^N$, where $N$ is the number of downsampling (or, equivalently, upsampling) layers; it can be determined by counting the upsampling layers of the model. If this condition is not met, an exception is raised.

If the input dimensions are a multiple of $M$, the tiling step can be skipped by setting a tile size equal to the input image size.
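The multiple-of-$M$ constraint can be sketched as follows (an illustrative helper, not an ImageDev function), rounding a requested tile size up to the next valid value:

```python
def valid_tile_size(requested, n_levels):
    # Each tile dimension must be a multiple of M = 2^N, where N is the
    # number of downsampling levels in the model's contracting phase.
    m = 2 ** n_levels
    # Round the requested size up to the next multiple of M.
    return ((requested + m - 1) // m) * m

# Example: a U-Net-like model with 4 downsampling levels (M = 16).
print(valid_tile_size(500, 4))  # -> 512
```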

Data format

Most deep learning models expect a 4D tensor as input. ImageDev can convert the input data set to the NCHW or NHWC tensor layouts commonly used by the deep learning community. The layout expected by the model is specified with the dataFormat parameter; ImageDev then automatically converts the input data set to that layout.
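The difference between the two layouts amounts to the position of the channel axis, as the following NumPy sketch shows (illustrative helpers, not ImageDev functions):

```python
import numpy as np

def nhwc_to_nchw(batch):
    # Move the channel axis from last position (NHWC) to second (NCHW).
    return np.transpose(batch, (0, 3, 1, 2))

def nchw_to_nhwc(batch):
    # Inverse conversion: move the channel axis back to last position.
    return np.transpose(batch, (0, 2, 3, 1))

# Example: a batch of two 64x64 RGB images.
x = np.zeros((2, 64, 64, 3))           # NHWC
print(nhwc_to_nchw(x).shape)           # -> (2, 3, 64, 64)
```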