
CUDA

This category describes the ImageDev algorithms GPU-accelerated with the CUDA Toolkit.

Overview

Most ImageDev algorithms are parallelized on the CPU. Taking advantage of this acceleration is transparent to the user and managed automatically by the default implementation of each algorithm.

In addition to CPU parallelization, ImageDev also offers a collection of algorithms that are optimized for GPU computing. These algorithms are built with the NVIDIA® CUDA® Toolkit and are provided as separate implementations, prefixed with the Cuda keyword, rather than having the GPU implementations invoked automatically from the standard algorithms.

By providing both CPU and GPU-accelerated algorithms, ImageDev aims to offer users the flexibility to choose the most efficient and effective implementation based on their hardware resources and specific processing requirements.
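
As a minimal sketch of this naming convention, the fragment below contrasts a CPU implementation with its Cuda-prefixed counterpart. The algorithm names erosion3d and cudaErosion3d, their parameters, and the header and namespace used here are assumptions for illustration, not a verified part of the ImageDev API.

#include <ImageDev/ImageDev.h> // assumed umbrella header; the actual header name may differ

// Hypothetical illustration: the same operation exposed as a CPU-parallelized
// implementation and as a separate Cuda-prefixed GPU implementation.
// Names and signatures below are illustrative only.
auto resultCpu = imagedev::erosion3d( inputImage, 10 );     // CPU-parallelized version
auto resultGpu = imagedev::cudaErosion3d( inputImage, 10 ); // CUDA-accelerated version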

Prerequisites

To execute any GPU-accelerated ImageDev algorithm, you need:

Memory Management

IOLink ImageView objects can be allocated either in CPU memory or in CUDA GPU memory. While the default behavior has the advantage of being transparent and easy to use for the developer, it can be useful to explicitly manage image transfers between CPU and CUDA memory in order to optimize transfer times.

#include <iolink/cuda/CudaImageFactory.h>
#include <iolink/view/ImageViewFactory.h>

// Transfer a CPU image to CUDA memory
auto imageCuda = iolink_cuda::CudaImageFactory::copyInCudaMemory( imageCpu );
// This image can now be directly processed on the GPU, and then transferred back to CPU memory
auto imageCpuNew = iolink::ImageViewFactory::copyInMemory( imageCuda );
import iolink
import iolink_cuda

# Transfer a CPU image to CUDA memory
image_cuda = iolink_cuda.CudaImageFactory.copy_in_cuda_memory(image_cpu)
# This image can now be directly processed on the GPU, and then transferred back to CPU memory
image_cpu_new = iolink.ImageViewFactory.copy_in_memory(image_cuda)
using IOLink;
using IOLink_Cuda;

// Transfer a CPU image to CUDA memory
var imageCuda = CudaImageFactory.CopyInCudaMemory( imageCpu );
// This image can now be directly processed on the GPU, and then transferred back to CPU memory
var imageCpuNew = ImageViewFactory.CopyInMemory( imageCuda );

Tiling

If the image to be processed is large, ImageDev can process it tile by tile to reduce GPU memory requirements. The tiling process is defined by a tiling mode and a size parameter. An overlap is deduced from the selected parameters and set automatically. For example, when applying an erosion of size 10 with tiling enabled, tiles are extracted with an overlap of 10 pixels.

The tiling step can be skipped by setting the tiling mode to NONE.
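
As a rough sketch of how this could be configured, assuming hypothetical setters setCudaTilingMode and setCudaTileSize together with a TilingMode enumeration (only the NONE value is mentioned above; every other name here is an assumption rather than the documented ImageDev API):

#include <ImageDev/ImageDev.h> // assumed umbrella header; the actual header name may differ

// Hypothetical configuration calls; names are illustrative only.
// Process large images tile by tile with a given tile size:
imagedev::setCudaTilingMode( imagedev::TilingMode::TILED ); // assumed mode name
imagedev::setCudaTileSize( 512 );                           // assumed setter, tile edge length in pixels

// Skip the tiling step and process the whole image at once:
imagedev::setCudaTilingMode( imagedev::TilingMode::NONE );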





© 2025 Thermo Fisher Scientific Inc. All rights reserved.