ImageDev

CudaAdaptiveManifoldNlmFilter

Smooths an image using an advanced edge-preserving filter.

This module implements a non-local-means filtering algorithm using an adaptive-manifolds-based approach, as described by Gastal and Oliveira in Adaptive Manifolds for Real-Time High-Dimensional Filtering (2012).

The original non-local means algorithm was proposed by Buades et al. in A Non-Local Algorithm for Image Denoising (2005). Like many other denoising algorithms, it assumes that the noise in the dataset is white noise, which is a safe assumption for most datasets. The non-local means algorithm naturally preserves most features present in the image, even small and thin ones. However, it must not be confused with a feature enhancement algorithm, such as edge enhancement. If edge enhancement is required, it is suggested to run non-local means filtering first, followed by edge enhancement; this should give the best results.

To determine the new value of the current voxel, the original non-local means algorithm compares the neighborhoods of all voxels in a given search window with the neighborhood of the current voxel. The similarity between the neighborhoods determines the weight with which the value of a voxel in the search window influences the new value of the current voxel. The final weights are obtained by applying a Gaussian kernel to the similarity values.
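The weighting scheme described above can be sketched as follows for a single pixel of a 2-D image. This is an illustrative sketch of classic non-local means, not the module's CUDA implementation; the parameter `h` plays the role of the intensity standard deviation.

```python
import numpy as np

def nlm_pixel(image, y, x, kernel_radius=10, patch_radius=3, h=0.2):
    """Denoise one pixel with classic non-local means (illustrative sketch)."""
    pr, kr = patch_radius, kernel_radius
    padded = np.pad(image, pr, mode="reflect")
    # Reference patch centered on the current pixel (y, x).
    ref = padded[y:y + 2 * pr + 1, x:x + 2 * pr + 1]

    weights_sum = 0.0
    value = 0.0
    # Scan every candidate pixel inside the search window.
    for j in range(max(0, y - kr), min(image.shape[0], y + kr + 1)):
        for i in range(max(0, x - kr), min(image.shape[1], x + kr + 1)):
            patch = padded[j:j + 2 * pr + 1, i:i + 2 * pr + 1]
            dist2 = np.mean((patch - ref) ** 2)   # patch dissimilarity
            w = np.exp(-dist2 / (h * h))          # Gaussian weighting
            weights_sum += w
            value += w * image[j, i]
    return value / weights_sum
```

Because the weights are positive and normalized, the result is always a convex combination of the values in the search window.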

Even with restricted search windows, the original non-local means algorithm performs poorly due to its $O(n^3)$ complexity. Recent research has therefore focused on techniques to improve its performance. One of the most promising is high-dimensional filtering, which is based on a signal-processing approach. The neighborhood surrounding a pixel constitutes a patch. The set of all patches is mapped onto a space of $n_r + n_s$ dimensions, where $n_r$ is the number of range dimensions (i.e., the size of the neighborhood) and $n_s$ is the number of spatial dimensions ($3$ for 3D data). A blurring step is performed in this high-dimensional space, and the values of the filtered dataset are then retrieved.
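For intuition, the embedding described above can be sketched for a 2-D image ($n_s = 2$): each pixel becomes a point whose coordinates are its spatial position concatenated with its flattened neighborhood patch ($n_r$ values). This is an illustrative sketch, not the module's implementation.

```python
import numpy as np

def embed_patches(image, patch_radius=1):
    """Map each pixel of a 2-D image to a point in an (n_s + n_r)-dimensional
    space: 2 spatial coordinates plus the flattened neighborhood patch."""
    pr = patch_radius
    padded = np.pad(image, pr, mode="reflect")
    h, w = image.shape
    n_r = (2 * pr + 1) ** 2          # range dimensions: patch size
    points = np.empty((h * w, 2 + n_r))
    for y in range(h):
        for x in range(w):
            patch = padded[y:y + 2 * pr + 1, x:x + 2 * pr + 1]
            points[y * w + x, :2] = (y, x)        # spatial dimensions
            points[y * w + x, 2:] = patch.ravel() # range dimensions
    return points  # shape: (num_pixels, n_s + n_r)
```

Blurring point values in this joint space, rather than comparing patches pairwise, is what makes the signal-processing formulation amenable to acceleration.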

Although this formulation enables several optimizations, filtering the whole high-dimensional space is not cost-effective, because the original dataset covers only a very small portion of it. The adaptive-manifolds approach dramatically improves performance: it generates several manifolds within the high-dimensional space that approximate the patch set. Only the data on the manifolds is blurred, and the filtered dataset is retrieved from these manifolds. This approximation reduces the complexity of the non-local means algorithm to linear in the number of pixels and in the dimension of the space.
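Using the notation above, and writing $N$ for the number of voxels, $r_k$ for the search-window radius, $r_p$ for the patch radius, and $d$ for the number of spatial dimensions, the cost reduction stated here can be summarized as an asymptotic comparison (illustrative, not a statement about measured run times):

$$O\!\left(N \,(2r_k+1)^d \,(2r_p+1)^d\right) \;\longrightarrow\; O\!\left(N \,(n_r + n_s)\right)$$

The left side is brute-force non-local means, where every position in the search window requires a full patch comparison; the right side is the adaptive-manifolds approximation, linear in the number of pixels and in the dimensionality of the space.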

A more detailed discussion and comparison of adaptive-manifold non-local means filtering with other denoising methods is given in the technical report by Lamas-Rodriguez et al. listed below.

References

E. S. L. Gastal and M. M. Oliveira, "Adaptive Manifolds for Real-Time High-Dimensional Filtering," ACM Transactions on Graphics, 2012.
A. Buades, B. Coll, and J.-M. Morel, "A Non-Local Algorithm for Image Denoising," CVPR, 2005.
J. Lamas-Rodriguez, M. Ehlke, R. Hoffmann, and S. Zachow, "GPU-accelerated denoising of large tomographic data sets with low SNR," ZIB-Report 15-14, 2015.

See also

Function Syntax

This function returns outputImage.
// Function prototype
std::shared_ptr< iolink::ImageView >
cudaAdaptiveManifoldNlmFilter( std::shared_ptr< iolink::ImageView > inputImage,
                               double spatialStandardDeviation,
                               double intensityStandardDeviation,
                               uint32_t kernelRadius,
                               uint32_t patchRadius,
                               CudaContext::Ptr cudaContext,
                               std::shared_ptr< iolink::ImageView > outputImage = nullptr );
This function returns outputImage.
// Function prototype.
cuda_adaptive_manifold_nlm_filter(input_image: idt.ImageType,
                                  spatial_standard_deviation: float = 5,
                                  intensity_standard_deviation: float = 0.2,
                                  kernel_radius: int = 10,
                                  patch_radius: int = 3,
                                  cuda_context: Union[CudaContext, None] = None,
                                  output_image: idt.ImageType = None) -> idt.ImageType
This function returns outputImage.
// Function prototype.
public static IOLink.ImageView
CudaAdaptiveManifoldNlmFilter( IOLink.ImageView inputImage,
                               double spatialStandardDeviation = 5,
                               double intensityStandardDeviation = 0.2,
                               UInt32 kernelRadius = 10,
                               UInt32 patchRadius = 3,
                               Data.CudaContext cudaContext = null,
                               IOLink.ImageView outputImage = null );

Class Syntax

Parameters

inputImage (input)
The input image. The image type can be integer or float.
Type: Image. Supported values: Grayscale or Multispectral. Default: nullptr.

spatialStandardDeviation (input)
The spatial standard deviation. Intuitively, it controls how fast the similarity between two voxels decreases with their distance. The higher this value is, the blurrier the result is.
Type: Float64. Supported values: >=0. Default: 5.

intensityStandardDeviation (input)
The intensity/range standard deviation. Intuitively, it controls how fast the similarity between two voxels decreases depending on the intensities of their neighborhoods. It affects the blurriness of the result in much the same way as the spatial standard deviation, but the effect is not as strong.
Limitation: memory fragmentation can prevent the allocation of large CUDA memory blocks, and the computation might fail due to lack of memory. It is therefore recommended to run this computation alone.
Type: Float64. Supported values: >0. Default: 0.2.

kernelRadius (input)
The search window size. The algorithm looks for matches within this area around each point. The value represents the radius of the search window in voxels. The larger the search window is, the better the results usually are, but the size of the search window also affects the run time significantly: the larger the search window, the longer the run time. This value must be large enough for similar structures to be found within the search window; too small a value results in simple blurring of the image, because there is not enough structural information within the search window. A search window radius of 10 is usually a good choice.
Type: UInt32. Supported values: >=1. Default: 10.

patchRadius (input)
Size of the neighborhood window. The influence of each point in the search window on the base point is weighted by comparing its neighborhood window with the neighborhood window of the base point. The value represents the radius of the neighborhood in voxels and affects both the quality of the result and the run time. If this value is much smaller or much larger than the fine structures in the data, the algorithm has little or no effect. The larger the value, the longer the run time. For most use cases, a value of five is sufficient to achieve good filtering results; the maximum value of this port is therefore set to five to avoid unnecessarily long run times.
Type: UInt32. Supported values: [3, 5]. Default: 3.

cudaContext (input)
CUDA context information.
Type: CudaContext. Default: nullptr.

outputImage (output)
The output image. Its dimensions and type are forced to the same values as the input image.
Type: Image. Default: nullptr.
input_image (input)
The input image. The image type can be integer or float.
Type: image. Supported values: Grayscale or Multispectral. Default: None.

spatial_standard_deviation (input)
The spatial standard deviation. Intuitively, it controls how fast the similarity between two voxels decreases with their distance. The higher this value is, the blurrier the result is.
Type: float64. Supported values: >=0. Default: 5.

intensity_standard_deviation (input)
The intensity/range standard deviation. Intuitively, it controls how fast the similarity between two voxels decreases depending on the intensities of their neighborhoods. It affects the blurriness of the result in much the same way as the spatial standard deviation, but the effect is not as strong.
Limitation: memory fragmentation can prevent the allocation of large CUDA memory blocks, and the computation might fail due to lack of memory. It is therefore recommended to run this computation alone.
Type: float64. Supported values: >0. Default: 0.2.

kernel_radius (input)
The search window size. The algorithm looks for matches within this area around each point. The value represents the radius of the search window in voxels. The larger the search window is, the better the results usually are, but the size of the search window also affects the run time significantly: the larger the search window, the longer the run time. This value must be large enough for similar structures to be found within the search window; too small a value results in simple blurring of the image, because there is not enough structural information within the search window. A search window radius of 10 is usually a good choice.
Type: uint32. Supported values: >=1. Default: 10.

patch_radius (input)
Size of the neighborhood window. The influence of each point in the search window on the base point is weighted by comparing its neighborhood window with the neighborhood window of the base point. The value represents the radius of the neighborhood in voxels and affects both the quality of the result and the run time. If this value is much smaller or much larger than the fine structures in the data, the algorithm has little or no effect. The larger the value, the longer the run time. For most use cases, a value of five is sufficient to achieve good filtering results; the maximum value of this port is therefore set to five to avoid unnecessarily long run times.
Type: uint32. Supported values: [3, 5]. Default: 3.

cuda_context (input)
CUDA context information.
Type: cuda_context. Default: None.

output_image (output)
The output image. Its dimensions and type are forced to the same values as the input image.
Type: image. Default: None.
inputImage (input)
The input image. The image type can be integer or float.
Type: Image. Supported values: Grayscale or Multispectral. Default: null.

spatialStandardDeviation (input)
The spatial standard deviation. Intuitively, it controls how fast the similarity between two voxels decreases with their distance. The higher this value is, the blurrier the result is.
Type: Float64. Supported values: >=0. Default: 5.

intensityStandardDeviation (input)
The intensity/range standard deviation. Intuitively, it controls how fast the similarity between two voxels decreases depending on the intensities of their neighborhoods. It affects the blurriness of the result in much the same way as the spatial standard deviation, but the effect is not as strong.
Limitation: memory fragmentation can prevent the allocation of large CUDA memory blocks, and the computation might fail due to lack of memory. It is therefore recommended to run this computation alone.
Type: Float64. Supported values: >0. Default: 0.2.

kernelRadius (input)
The search window size. The algorithm looks for matches within this area around each point. The value represents the radius of the search window in voxels. The larger the search window is, the better the results usually are, but the size of the search window also affects the run time significantly: the larger the search window, the longer the run time. This value must be large enough for similar structures to be found within the search window; too small a value results in simple blurring of the image, because there is not enough structural information within the search window. A search window radius of 10 is usually a good choice.
Type: UInt32. Supported values: >=1. Default: 10.

patchRadius (input)
Size of the neighborhood window. The influence of each point in the search window on the base point is weighted by comparing its neighborhood window with the neighborhood window of the base point. The value represents the radius of the neighborhood in voxels and affects both the quality of the result and the run time. If this value is much smaller or much larger than the fine structures in the data, the algorithm has little or no effect. The larger the value, the longer the run time. For most use cases, a value of five is sufficient to achieve good filtering results; the maximum value of this port is therefore set to five to avoid unnecessarily long run times.
Type: UInt32. Supported values: [3, 5]. Default: 3.

cudaContext (input)
CUDA context information.
Type: CudaContext. Default: null.

outputImage (output)
The output image. Its dimensions and type are forced to the same values as the input image.
Type: Image. Default: null.

Object Examples

auto foam = readVipImage( std::string( IMAGEDEVDATA_IMAGES_FOLDER ) + "foam.vip" );

CudaAdaptiveManifoldNlmFilter cudaAdaptiveManifoldNlmFilterAlgo;
cudaAdaptiveManifoldNlmFilterAlgo.setInputImage( foam );
cudaAdaptiveManifoldNlmFilterAlgo.setSpatialStandardDeviation( 5.0 );
cudaAdaptiveManifoldNlmFilterAlgo.setIntensityStandardDeviation( 0.2 );
cudaAdaptiveManifoldNlmFilterAlgo.setKernelRadius( 10 );
cudaAdaptiveManifoldNlmFilterAlgo.setPatchRadius( 3 );
cudaAdaptiveManifoldNlmFilterAlgo.setCudaContext( nullptr );
cudaAdaptiveManifoldNlmFilterAlgo.execute();

std::cout << "outputImage:" << cudaAdaptiveManifoldNlmFilterAlgo.outputImage()->toString();
foam = imagedev.read_vip_image(imagedev_data.get_image_path("foam.vip"))

cuda_adaptive_manifold_nlm_filter_algo = imagedev.CudaAdaptiveManifoldNlmFilter()
cuda_adaptive_manifold_nlm_filter_algo.input_image = foam
cuda_adaptive_manifold_nlm_filter_algo.spatial_standard_deviation = 5.0
cuda_adaptive_manifold_nlm_filter_algo.intensity_standard_deviation = 0.2
cuda_adaptive_manifold_nlm_filter_algo.kernel_radius = 10
cuda_adaptive_manifold_nlm_filter_algo.patch_radius = 3
cuda_adaptive_manifold_nlm_filter_algo.cuda_context = None
cuda_adaptive_manifold_nlm_filter_algo.execute()

print("output_image:", str(cuda_adaptive_manifold_nlm_filter_algo.output_image))
ImageView foam = Data.ReadVipImage( @"Data/images/foam.vip" );

CudaAdaptiveManifoldNlmFilter cudaAdaptiveManifoldNlmFilterAlgo = new CudaAdaptiveManifoldNlmFilter
{
    inputImage = foam,
    spatialStandardDeviation = 5.0,
    intensityStandardDeviation = 0.2,
    kernelRadius = 10,
    patchRadius = 3,
    cudaContext = null
};
cudaAdaptiveManifoldNlmFilterAlgo.Execute();

Console.WriteLine( "outputImage:" + cudaAdaptiveManifoldNlmFilterAlgo.outputImage.ToString() );

Function Examples

auto foam = readVipImage( std::string( IMAGEDEVDATA_IMAGES_FOLDER ) + "foam.vip" );

auto result = cudaAdaptiveManifoldNlmFilter( foam, 5.0, 0.2, 10, 3, nullptr );

std::cout << "outputImage:" << result->toString();
foam = imagedev.read_vip_image(imagedev_data.get_image_path("foam.vip"))

result = imagedev.cuda_adaptive_manifold_nlm_filter(foam, 5.0, 0.2, 10, 3, None)

print("output_image:", str(result))
ImageView foam = Data.ReadVipImage( @"Data/images/foam.vip" );

IOLink.ImageView result = Processing.CudaAdaptiveManifoldNlmFilter( foam, 5.0, 0.2, 10, 3, null );

Console.WriteLine( "outputImage:" + result.ToString() );





© 2025 Thermo Fisher Scientific Inc. All rights reserved.