We present a new learning-based method for denoising of path-traced images from volumetric
medical data, including a specialized loss function, features tailored for this purpose
and a novel network architecture.
Path-traced videos of CT scans are of high value for surgery planning, medical education
and the like.
As convergence requires tracing thousands of lighting paths, fast previews with only a
few light samples exhibit a high amount of noise.
We present a learning-based method to denoise these noisy images using a convolutional neural
network.
To gain as much speedup as possible, we use images with only one sample per pixel as input, which unfortunately do not contain enough information for reconstruction on their own.
Apart from the noisy one-sample-per-pixel image itself, we therefore additionally feed several further features to our network, as is done in the surface-only case.
In the volumetric case, however, finding features is not trivial, as properties like depth and
normal are not well defined.
We opt for seven additional features, split into primary and secondary ones.
The primary features stem from the light path up to the first scatter event.
The first is the position of the first scatter event, interpreted as the position of the pixel.
The second is the gradient direction at this position, which serves as an approximate normal vector.
The third primary feature is the albedo of the first scatter event.
The fourth feature is a texture that combines multiple volumetric characteristics, namely gradient magnitude, accumulated opacity, and density of the volume, color-coded in the R, G, and B channels.
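The gradient direction used as an approximate normal (the second primary feature) can be sketched with central differences on a scalar density volume; this is an illustrative estimator, not necessarily the exact one used in the paper:

```python
import numpy as np

def approx_normal(volume, x, y, z):
    """Approximate a normal at voxel (x, y, z) as the normalized negative
    gradient of the scalar volume, estimated via central differences.
    (Illustrative sketch; the paper's exact estimator is not specified here.)"""
    g = np.array([
        volume[x + 1, y, z] - volume[x - 1, y, z],
        volume[x, y + 1, z] - volume[x, y - 1, z],
        volume[x, y, z + 1] - volume[x, y, z - 1],
    ], dtype=np.float64) * 0.5
    n = np.linalg.norm(g)
    return -g / n if n > 0 else g  # zero gradient: no well-defined normal

# A volume whose density increases along x: the normal points toward -x.
vol = np.fromfunction(lambda x, y, z: x.astype(float), (5, 5, 5))
normal = approx_normal(vol, 2, 2, 2)
```

This also illustrates why the normal is only "approximate": in fuzzy, nearly homogeneous regions the gradient magnitude approaches zero and the direction becomes unreliable.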
The secondary features deal with the further course of the path.
Here we utilize the color of the first surface scatter event, the albedo at the second scatter position, and the volumetric characteristics of the path from the first to the second scatter event.
Note that the features, just like the one sample per pixel color input, suffer from
noise due to undersampling, especially severe in the secondary features.
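Taken together, the 1 spp color and the seven auxiliary features form the network input. A minimal sketch of assembling them into primary and secondary channel stacks follows; all buffer names, shapes, and the use of random data are hypothetical placeholders for the path tracer's outputs:

```python
import numpy as np

H, W = 64, 64
rng = np.random.default_rng(0)

# Hypothetical per-pixel buffers produced by the 1 spp path tracer.
color_1spp  = rng.random((H, W, 3))  # noisy 1 spp radiance
# Primary features (up to the first scatter event):
position    = rng.random((H, W, 3))  # first-scatter position
normal      = rng.random((H, W, 3))  # gradient direction as approx. normal
albedo_1    = rng.random((H, W, 3))  # albedo at the first scatter event
volum_rgb   = rng.random((H, W, 3))  # gradient magnitude / opacity / density in R, G, B
# Secondary features (further course of the path):
surf_color  = rng.random((H, W, 3))  # color of the first surface scatter event
albedo_2    = rng.random((H, W, 3))  # albedo at the second scatter position
volum_rgb_2 = rng.random((H, W, 3))  # volumetric characteristics, first to second scatter

primary   = np.concatenate([color_1spp, position, normal, albedo_1, volum_rgb], axis=-1)
secondary = np.concatenate([surf_color, albedo_2, volum_rgb_2], axis=-1)
print(primary.shape, secondary.shape)  # (64, 64, 15) (64, 64, 9)
```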
Our basic network is a standard U-Net with skip connections.
However, to gainfully include the very noisy secondary features, we duplicate it to create a dual network, where one part computes direct effects using the primary features and the other computes indirect effects using the secondary features.
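The dual architecture can be sketched structurally as two branches whose outputs are combined; here the trained U-Net branches are replaced by simple box filters, each branch only sees the first three channels of its stack, and the additive combination is an assumption made for illustration:

```python
import numpy as np

def box_filter(img, k=3):
    """Stand-in for a trained U-Net branch: a per-channel box blur."""
    pad = k // 2
    p = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def dual_network(primary, secondary):
    """Structural sketch of the dual network: one branch handles direct
    effects from the primary features, the other indirect effects from
    the secondary ones; the combination rule is an assumption."""
    direct   = box_filter(primary[..., :3])    # branch 1: direct effects
    indirect = box_filter(secondary[..., :3])  # branch 2: indirect effects
    return direct + indirect
```

The point of the split is robustness: the much noisier secondary features only influence the indirect branch, so they cannot corrupt the reconstruction of direct lighting.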
To evaluate the output of this network, we utilize a combination of a static loss term and a relativistic discriminator, which further sharpens the image.
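The adversarial part of such an objective can be illustrated with the relativistic average discriminator formulation (Jolicoeur-Martineau, 2018); the L1 pixel term and the weight `lam` below are assumptions for illustration, not the paper's exact loss:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relativistic_d_loss(c_real, c_fake):
    """Relativistic average discriminator loss: the critic judges whether a
    real image is more realistic than the average fake, and vice versa.
    c_real / c_fake are raw (pre-sigmoid) critic scores."""
    eps = 1e-12  # numerical safety for log
    real_term = np.log(sigmoid(c_real - c_fake.mean()) + eps)
    fake_term = np.log(1.0 - sigmoid(c_fake - c_real.mean()) + eps)
    return -(real_term.mean() + fake_term.mean())

def total_loss(pred, target, c_real, c_fake, lam=0.01):
    """Combined objective: per-pixel L1 term plus a weighted adversarial
    term (both the L1 choice and lam are hypothetical)."""
    return np.abs(pred - target).mean() + lam * relativistic_d_loss(c_real, c_fake)
```

When the critic perfectly separates real from fake, the discriminator loss approaches zero; when it cannot distinguish them at all, it sits at 2·log 2.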
Moreover, we include the temporal domain by taking the preceding and subsequent frame
into account.
To be able to do so, we reproject both the color input and all features to the current
frame.
The triplets are preprocessed in a number of separate autoencoders, which complete our
network architecture.
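Reprojection of the neighboring frames can be sketched as a motion-compensated warp of the color and feature buffers; the motion-vector convention and the clamping border handling here are simplifying assumptions:

```python
import numpy as np

def reproject(prev_frame, motion, H, W):
    """Warp the previous frame's pixels (and, identically, its feature
    buffers) into the current frame. motion[y, x] holds the (dy, dx)
    offset into the previous frame that corresponds to current pixel
    (y, x); out-of-bounds lookups clamp to the nearest valid pixel."""
    ys, xs = np.mgrid[0:H, 0:W]
    src_y = np.clip(ys + motion[..., 0], 0, H - 1)
    src_x = np.clip(xs + motion[..., 1], 0, W - 1)
    return prev_frame[src_y, src_x]

# A uniform one-pixel shift to the right in the previous frame.
prev = np.arange(16.0).reshape(4, 4)
motion = np.zeros((4, 4, 2), dtype=int)
motion[..., 1] = 1
warped = reproject(prev, motion, 4, 4)
```

Aligning the preceding and subsequent frames this way is what lets the per-frame autoencoders treat the triplet as three views of the same geometry, which is the source of the temporal stability mentioned below.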
One can clearly see the improvements in the video results when adding the different components
one after another.
Note the gain in temporal stability when the reprojection stage is added to the network.
Our method creates stunning results, correctly reproducing both surface-like hard materials
with their typical specular highlights and fuzzy volumetric regions showing a high amount
of transparency.
We compare our method to the OptiX denoiser applied to the one-sample-per-pixel color inputs.
Our approach clearly outperforms it, generating images very similar to the target image.
Access: Open Access
Duration: 00:04:08 min
Recorded: 2021-11-13
Uploaded: 2021-11-13 22:26:12
Language: en-US
In this paper, we transfer machine learning techniques previously applied to denoising surface-only Monte Carlo renderings to path-traced visualizations of medical volumetric data. In the domain of medical imaging, path-traced videos have turned out to be an efficient means to visualize and understand internal structures, in particular for less experienced viewers such as students or patients. However, the computational demands for rendering high-quality path-traced videos are very high due to the large number of samples necessary for each pixel. To accelerate the process, we present a learning-based technique for denoising path-traced videos of volumetric data that effectively increases the sample count per pixel, both through spatial filtering (integrating neighboring samples) and temporal filtering (reusing samples over time). Our approach uses a set of additional features and a loss function, both specifically designed for the volumetric case. Furthermore, we present a novel network architecture tailored to our purpose and introduce reprojection of samples to improve temporal stability and reuse samples across frames. As a result, we achieve good image quality even from severely undersampled input images, as visible in the teaser image.