CudaAdaptiveManifoldNlmFilter
Smooths an image using an advanced edge-preserving filter.
This module implements a non-local-means filtering algorithm using an adaptive-manifolds-based approach, as described by Gastal and Oliveira in Adaptive Manifolds for Real-Time High-Dimensional Filtering (2012).
The original non-local means algorithm was proposed by Buades et al. in A Non-Local Algorithm for Image Denoising (2005). Like many other denoising algorithms, it assumes that the noise in the dataset is white noise, which is a safe assumption for most datasets. The non-local means algorithm naturally preserves most features present in the image, even small and thin ones. It must not, however, be confused with a feature enhancement algorithm, such as edge enhancement. If edge enhancement is required, running non-local means filtering first, followed by edge enhancement, should give the best results.
To determine the new value of the current voxel, the original non-local means algorithm compares the neighborhoods of all voxels in a given search window with the neighborhood of the current voxel. The similarity between the neighborhoods determines the weight with which the value of a voxel in the search window influences the new value of the current voxel. The final weights are obtained by applying a Gaussian kernel to the similarity values.
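As an illustration of this weighting scheme, the following is a minimal 2D sketch of the classic non-local means estimate for a single interior pixel. This is illustrative NumPy code, not the module's GPU implementation; the function name and parameter values are hypothetical.

```python
import numpy as np

def nlm_pixel(image, y, x, search_radius=5, patch_radius=1, h=0.2):
    # Classic non-local means estimate for one interior pixel of a
    # 2D float image: every candidate voxel in the search window is
    # weighted by a Gaussian of its patch dissimilarity to the
    # reference patch around (y, x).
    pr = patch_radius
    ref = image[y - pr:y + pr + 1, x - pr:x + pr + 1]
    weights, values = [], []
    for j in range(y - search_radius, y + search_radius + 1):
        for i in range(x - search_radius, x + search_radius + 1):
            # Skip candidates whose patch would fall off the image.
            if (j - pr < 0 or i - pr < 0 or
                    j + pr + 1 > image.shape[0] or
                    i + pr + 1 > image.shape[1]):
                continue
            patch = image[j - pr:j + pr + 1, i - pr:i + pr + 1]
            d2 = np.mean((patch - ref) ** 2)       # patch dissimilarity
            weights.append(np.exp(-d2 / (h * h)))  # Gaussian weighting
            values.append(image[j, i])
    return np.average(values, weights=weights)

rng = np.random.default_rng(0)
noisy = np.full((21, 21), 0.5) + 0.05 * rng.standard_normal((21, 21))
print(nlm_pixel(noisy, 10, 10))
```

Note the cost: even this single-pixel estimate visits every voxel of the search window and compares full patches, which is what motivates the high-dimensional reformulation described next.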
Even with restricted search windows, the original non-local means algorithm performs poorly due to its $O(n^3)$ complexity. Recent research has focused on techniques to improve the performance of this algorithm. One of the most promising is high-dimensional filtering, which is based on a signal-processing approach. The neighborhood surrounding a pixel constitutes a patch. The set of all patches is mapped onto a space of $n_r + n_s$ dimensions, where $n_r$ is the number of range dimensions (i.e., the size of the neighborhood) and $n_s$ is the number of spatial dimensions (for 3D data, $n_s = 3$). A blurring step is performed in this high-dimensional space, and the values of the filtered dataset are then retrieved.
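For a concrete sense of these dimensions, the arithmetic for a cubic patch in 3D can be sketched as follows. This is illustrative only; the exact patch shape used internally by the filter is an assumption.

```python
def patch_space_dims(patch_radius, spatial_dims=3):
    # Dimensionality n_r + n_s of the high-dimensional patch space,
    # assuming a cubic patch of side 2 * patch_radius + 1.
    side = 2 * patch_radius + 1
    n_r = side ** spatial_dims  # range dimensions (voxels per patch)
    n_s = spatial_dims          # spatial dimensions
    return n_r + n_s

print(patch_space_dims(3))  # 7**3 + 3 = 346
```

Even a modest patch radius of 3 thus yields a 346-dimensional space, which is why filtering the full space directly is impractical.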
Although this technique enables several optimizations, filtering the whole high-dimensional space is not cost-effective, as the original dataset covers only a very small portion of it. The adaptive-manifolds approach dramatically improves performance: it generates several manifolds within the high-dimensional space that approximate the patch set. Only the data on the manifolds is blurred, and the filtered dataset is retrieved from these manifolds. This approximation reduces the complexity of the non-local means algorithm to linear in the number of pixels and in the dimension of the space.
A more detailed discussion and comparison of adaptive-manifold non-local means filtering with other denoising methods is given in the following technical report:
Julian Lamas-Rodriguez, Moritz Ehlke, Rene Hoffmann, and Stefan Zachow, GPU-accelerated denoising of large tomographic data sets with low SNR, ZIB-Report 15-14, 2015.
See also
- Access to parameter description
For an introduction:
- section CUDA
- NonLocalMeansFilter3d
References
- A. Buades, B. Coll, J.-M. Morel. "A Non-Local Algorithm for Image Denoising". Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. 60-65, 2005.
- J. Lamas-Rodriguez, M. Ehlke, R. Hoffmann, S. Zachow. "GPU-accelerated denoising of large tomographic data sets with low SNR". ZIB-Report 15-14, pp. 1-65, 2015.
Function Syntax
This function returns outputImage.
// Function prototype
std::shared_ptr< iolink::ImageView >
cudaAdaptiveManifoldNlmFilter( std::shared_ptr< iolink::ImageView > inputImage,
                               double spatialStandardDeviation,
                               double intensityStandardDeviation,
                               uint32_t kernelRadius,
                               uint32_t patchRadius,
                               CudaContext::Ptr cudaContext,
                               std::shared_ptr< iolink::ImageView > outputImage = nullptr );
This function returns output_image.
# Function prototype
cuda_adaptive_manifold_nlm_filter(input_image: idt.ImageType,
                                  spatial_standard_deviation: float = 5,
                                  intensity_standard_deviation: float = 0.2,
                                  kernel_radius: int = 10,
                                  patch_radius: int = 3,
                                  cuda_context: Union[CudaContext, None] = None,
                                  output_image: idt.ImageType = None) -> idt.ImageType
This function returns outputImage.
// Function prototype
public static IOLink.ImageView
CudaAdaptiveManifoldNlmFilter( IOLink.ImageView inputImage,
                               double spatialStandardDeviation = 5,
                               double intensityStandardDeviation = 0.2,
                               UInt32 kernelRadius = 10,
                               UInt32 patchRadius = 3,
                               Data.CudaContext cudaContext = null,
                               IOLink.ImageView outputImage = null );
Class Syntax
Parameters
| Parameter Name | Description | Type | Supported Values | Default Value |
|---|---|---|---|---|
| inputImage | The input image. The image type can be integer or float. | Image | Grayscale or Multispectral | nullptr |
| spatialStandardDeviation | The spatial standard deviation. Intuitively, it controls how fast the similarity between two voxels decreases with their distance. The higher this value is, the blurrier the result is. | Float64 | >=0 | 5 |
| intensityStandardDeviation | The intensity/range standard deviation. Intuitively, it controls how fast the similarity between two voxels decreases depending on the intensities of their neighborhoods. It affects the blurriness of the result in much the same way as the spatial standard deviation, but the effect is not as strong. Limitation: the computation may require more memory than is available on the GPU, and memory fragmentation can prevent the allocation of large CUDA memory blocks; in both cases, the computation might fail due to lack of memory. The recommendation, therefore, is to run the computation alone. | Float64 | >0 | 0.2 |
| kernelRadius | The search window size. The algorithm looks for matches within this area around each point. The value represents the radius of the search window in number of voxels. The larger the search window is, the better the results usually are, but the size of the search window also affects the run time significantly: the larger the search window, the longer the run time. This value has to be large enough that similar structures can be found within the search window; values that are too small result in simple blurring of the image, because there is not enough structural information within the search window. A search window radius of 10 is usually a good choice. | UInt32 | >=1 | 10 |
| patchRadius | The size of the neighborhood window. The influence of each point in the search window on the base point is weighted by comparing the neighborhood window of this point with the neighborhood window of the base point of the search window. The value represents the radius of the neighborhood in number of voxels and affects the quality of the result as well as the run time. If this value is either much smaller or much larger than the fine structures in the data, the algorithm shows little or no effect. The larger the value, the longer the run time. For most use cases, a value of five is sufficient to achieve good filtering results; therefore, the maximum value of this port is set to five to avoid unnecessarily long run times. | UInt32 | [3, 5] | 3 |
| cudaContext | CUDA context information. | CudaContext | | nullptr |
| outputImage | The output image. Its dimensions and type are forced to the same values as those of the input image. | Image | | nullptr |
| Parameter Name | Description | Type | Supported Values | Default Value |
|---|---|---|---|---|
| input_image | The input image. The image type can be integer or float. | image | Grayscale or Multispectral | None |
| spatial_standard_deviation | The spatial standard deviation. Intuitively, it controls how fast the similarity between two voxels decreases with their distance. The higher this value is, the blurrier the result is. | float64 | >=0 | 5 |
| intensity_standard_deviation | The intensity/range standard deviation. Intuitively, it controls how fast the similarity between two voxels decreases depending on the intensities of their neighborhoods. It affects the blurriness of the result in much the same way as the spatial standard deviation, but the effect is not as strong. Limitation: the computation may require more memory than is available on the GPU, and memory fragmentation can prevent the allocation of large CUDA memory blocks; in both cases, the computation might fail due to lack of memory. The recommendation, therefore, is to run the computation alone. | float64 | >0 | 0.2 |
| kernel_radius | The search window size. The algorithm looks for matches within this area around each point. The value represents the radius of the search window in number of voxels. The larger the search window is, the better the results usually are, but the size of the search window also affects the run time significantly: the larger the search window, the longer the run time. This value has to be large enough that similar structures can be found within the search window; values that are too small result in simple blurring of the image, because there is not enough structural information within the search window. A search window radius of 10 is usually a good choice. | uint32 | >=1 | 10 |
| patch_radius | The size of the neighborhood window. The influence of each point in the search window on the base point is weighted by comparing the neighborhood window of this point with the neighborhood window of the base point of the search window. The value represents the radius of the neighborhood in number of voxels and affects the quality of the result as well as the run time. If this value is either much smaller or much larger than the fine structures in the data, the algorithm shows little or no effect. The larger the value, the longer the run time. For most use cases, a value of five is sufficient to achieve good filtering results; therefore, the maximum value of this port is set to five to avoid unnecessarily long run times. | uint32 | [3, 5] | 3 |
| cuda_context | CUDA context information. | cuda_context | | None |
| output_image | The output image. Its dimensions and type are forced to the same values as those of the input image. | image | | None |
| Parameter Name | Description | Type | Supported Values | Default Value |
|---|---|---|---|---|
| inputImage | The input image. The image type can be integer or float. | Image | Grayscale or Multispectral | null |
| spatialStandardDeviation | The spatial standard deviation. Intuitively, it controls how fast the similarity between two voxels decreases with their distance. The higher this value is, the blurrier the result is. | Float64 | >=0 | 5 |
| intensityStandardDeviation | The intensity/range standard deviation. Intuitively, it controls how fast the similarity between two voxels decreases depending on the intensities of their neighborhoods. It affects the blurriness of the result in much the same way as the spatial standard deviation, but the effect is not as strong. Limitation: the computation may require more memory than is available on the GPU, and memory fragmentation can prevent the allocation of large CUDA memory blocks; in both cases, the computation might fail due to lack of memory. The recommendation, therefore, is to run the computation alone. | Float64 | >0 | 0.2 |
| kernelRadius | The search window size. The algorithm looks for matches within this area around each point. The value represents the radius of the search window in number of voxels. The larger the search window is, the better the results usually are, but the size of the search window also affects the run time significantly: the larger the search window, the longer the run time. This value has to be large enough that similar structures can be found within the search window; values that are too small result in simple blurring of the image, because there is not enough structural information within the search window. A search window radius of 10 is usually a good choice. | UInt32 | >=1 | 10 |
| patchRadius | The size of the neighborhood window. The influence of each point in the search window on the base point is weighted by comparing the neighborhood window of this point with the neighborhood window of the base point of the search window. The value represents the radius of the neighborhood in number of voxels and affects the quality of the result as well as the run time. If this value is either much smaller or much larger than the fine structures in the data, the algorithm shows little or no effect. The larger the value, the longer the run time. For most use cases, a value of five is sufficient to achieve good filtering results; therefore, the maximum value of this port is set to five to avoid unnecessarily long run times. | UInt32 | [3, 5] | 3 |
| cudaContext | CUDA context information. | CudaContext | | null |
| outputImage | The output image. Its dimensions and type are forced to the same values as those of the input image. | Image | | null |
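To make the run-time discussion for the kernel radius above concrete: the number of candidate voxels examined per output voxel grows cubically with the search window radius. This is illustrative arithmetic, not part of the API.

```python
def search_window_voxels(kernel_radius):
    # Voxels in a cubic search window of radius kernel_radius.
    return (2 * kernel_radius + 1) ** 3

print(search_window_voxels(10))  # 21**3 = 9261
```

Doubling the radius from 10 to 20 raises the count from 9,261 to 68,921 candidates per voxel, which is why larger search windows lengthen the run time so noticeably.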
Object Examples
auto foam = readVipImage( std::string( IMAGEDEVDATA_IMAGES_FOLDER ) + "foam.vip" );

CudaAdaptiveManifoldNlmFilter cudaAdaptiveManifoldNlmFilterAlgo;
cudaAdaptiveManifoldNlmFilterAlgo.setInputImage( foam );
cudaAdaptiveManifoldNlmFilterAlgo.setSpatialStandardDeviation( 5.0 );
cudaAdaptiveManifoldNlmFilterAlgo.setIntensityStandardDeviation( 0.2 );
cudaAdaptiveManifoldNlmFilterAlgo.setKernelRadius( 10 );
cudaAdaptiveManifoldNlmFilterAlgo.setPatchRadius( 3 );
cudaAdaptiveManifoldNlmFilterAlgo.setCudaContext( nullptr );
cudaAdaptiveManifoldNlmFilterAlgo.execute();

std::cout << "outputImage:" << cudaAdaptiveManifoldNlmFilterAlgo.outputImage()->toString();
foam = imagedev.read_vip_image(imagedev_data.get_image_path("foam.vip"))

cuda_adaptive_manifold_nlm_filter_algo = imagedev.CudaAdaptiveManifoldNlmFilter()
cuda_adaptive_manifold_nlm_filter_algo.input_image = foam
cuda_adaptive_manifold_nlm_filter_algo.spatial_standard_deviation = 5.0
cuda_adaptive_manifold_nlm_filter_algo.intensity_standard_deviation = 0.2
cuda_adaptive_manifold_nlm_filter_algo.kernel_radius = 10
cuda_adaptive_manifold_nlm_filter_algo.patch_radius = 3
cuda_adaptive_manifold_nlm_filter_algo.cuda_context = None
cuda_adaptive_manifold_nlm_filter_algo.execute()

print("output_image:", str(cuda_adaptive_manifold_nlm_filter_algo.output_image))
ImageView foam = Data.ReadVipImage( @"Data/images/foam.vip" );

CudaAdaptiveManifoldNlmFilter cudaAdaptiveManifoldNlmFilterAlgo = new CudaAdaptiveManifoldNlmFilter
{
    inputImage = foam,
    spatialStandardDeviation = 5.0,
    intensityStandardDeviation = 0.2,
    kernelRadius = 10,
    patchRadius = 3,
    cudaContext = null
};
cudaAdaptiveManifoldNlmFilterAlgo.Execute();

Console.WriteLine( "outputImage:" + cudaAdaptiveManifoldNlmFilterAlgo.outputImage.ToString() );
Function Examples
auto foam = readVipImage( std::string( IMAGEDEVDATA_IMAGES_FOLDER ) + "foam.vip" );

auto result = cudaAdaptiveManifoldNlmFilter( foam, 5.0, 0.2, 10, 3, nullptr );

std::cout << "outputImage:" << result->toString();
foam = imagedev.read_vip_image(imagedev_data.get_image_path("foam.vip"))

result = imagedev.cuda_adaptive_manifold_nlm_filter(foam, 5.0, 0.2, 10, 3, None)

print("output_image:", str(result))
ImageView foam = Data.ReadVipImage( @"Data/images/foam.vip" );

IOLink.ImageView result = Processing.CudaAdaptiveManifoldNlmFilter( foam, 5.0, 0.2, 10, 3, null );

Console.WriteLine( "outputImage:" + result.ToString() );
© 2025 Thermo Fisher Scientific Inc. All rights reserved.