¹Technion – Israel Institute of Technology		 ²Princeton University
Image restoration problems are typically ill-posed in the sense that each degraded image can be restored in infinitely many valid ways. To accommodate this, many works generate a diverse set of outputs by attempting to randomly sample from the posterior distribution of natural images given the degraded input. Here we argue that this strategy is commonly of limited practical value because of the heavy tail of the posterior distribution. Consider for example inpainting a missing region of the sky in an image. Since there is a high probability that the missing region contains nothing but clouds, any set of samples from the posterior would be entirely dominated by (practically identical) completions of sky. However, arguably, presenting users with only one clear sky completion, along with several alternative solutions such as airships, birds, and balloons, would better outline the set of possibilities. In this paper, we initiate the study of meaningfully diverse image restoration. We explore several post-processing approaches that can be combined with any diverse image restoration method to yield semantically meaningful diversity. Moreover, we propose a practical approach for allowing diffusion-based image restoration methods to generate meaningfully diverse outputs, while incurring only negligible computational overhead. We conduct extensive user studies to analyze the proposed techniques, and find that the strategy of reducing similarity between outputs is significantly preferred over posterior sampling.
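To make the idea of post-processing for reduced similarity concrete, here is a minimal sketch of one possible strategy: greedy farthest-point selection over per-image feature vectors. This is an illustrative assumption, not necessarily one of the approaches studied in the paper; the function name `farthest_point_subset`, the toy 2-D features, and the choice of Euclidean distance are all hypothetical.

```python
import math


def farthest_point_subset(features, k):
    """Greedily pick k indices whose feature vectors are far from each other.

    `features` is a list of equal-length numeric vectors, one per restored
    image (hypothetical: any semantic feature extractor could produce them).
    """
    if not 0 < k <= len(features):
        raise ValueError("k must be between 1 and the number of samples")
    selected = [0]  # arbitrary starting point: the first sample
    # Distance from every sample to its nearest already-selected sample.
    min_dist = [math.dist(f, features[0]) for f in features]
    while len(selected) < k:
        # Take the sample that is farthest from everything chosen so far.
        nxt = max(range(len(features)), key=lambda i: min_dist[i])
        selected.append(nxt)
        min_dist = [
            min(d, math.dist(f, features[nxt]))
            for d, f in zip(min_dist, features)
        ]
    return selected


# Toy example: four near-duplicate "sky" samples and two outliers.
toy_features = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (0.05, 0.05), (-4, 6)]
picked = farthest_point_subset(toy_features, k=3)
```

In this toy run the selection keeps a single representative of the dense cluster and both outliers, which mirrors the "one clear sky plus alternatives" behavior described above.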
When sampling multiple times from diverse restoration models, the samples tend to repeat themselves, exhibiting only minor semantic variability. This is illustrated in the figure above, which depicts two masked images, each with 10 corresponding random samples obtained from RePaint. As can be seen, none of the 10 completions of the eye region depict glasses, and none of the 10 completions of the mouth region depict a closed mouth. Yet, when examining 100 samples from the model, it is evident that such completions are possible; they are simply rare (2 out of 100 samples). We argue that this phenomenon arises because the posterior distribution is often heavy-tailed along semantically interesting directions. Heavy-tailed distributions are characterized by a non-negligible probability of obtaining very distinct "outliers", and in the context of image restoration, these outliers often correspond to different semantic meanings. In our paper we show that cases in which the posterior is heavy-tailed are not rare, and we therefore initiate the study of meaningfully diverse image restoration, which aims to convey to the user the perceptual range of plausible solutions rather than adhere to their likelihood.
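A quick back-of-the-envelope calculation shows why 10 samples so often miss such completions. The sketch below assumes the draws are independent and takes the observed 2-out-of-100 frequency as the probability of the rare mode; both are simplifying assumptions, and the function name is hypothetical.

```python
def prob_no_rare_sample(rare_fraction: float, num_samples: int) -> float:
    """Probability that none of `num_samples` independent draws is rare."""
    return (1.0 - rare_fraction) ** num_samples


# Rare mode estimated at 2 out of 100 samples; 10 samples shown to the user.
p_miss = prob_no_rare_sample(rare_fraction=0.02, num_samples=10)
print(f"Chance that 10 samples contain no rare completion: {p_miss:.3f}")
```

Under these assumptions, roughly four out of five sets of 10 posterior samples would contain no instance of the rare completion at all.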
We focus on restoration techniques that are based on diffusion models, as they achieve state-of-the-art results. We run the diffusion process to simultaneously generate N images, all conditioned on the same degraded input image. We apply each method either as-is (left) or augmented with our diversity guidance mechanism (right). Additional information can be found in the paper.
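The text above does not spell out how the guidance works, so the following is only a toy illustration of the general idea of nudging a batch of jointly generated samples apart. It is not the paper's mechanism: the repulsion rule, the names `repulsion_step` and `mean_pairwise_distance`, the step size, and the use of short vectors in place of images are all assumptions made for illustration.

```python
import math


def repulsion_step(samples, step_size):
    """Move every sample one small step away from all the others.

    Each sample receives the sum of unit vectors pointing away from the
    other samples. All updates are computed from the old positions.
    """
    updated = []
    for i, x in enumerate(samples):
        push = [0.0] * len(x)
        for j, y in enumerate(samples):
            if i == j:
                continue
            dist = math.dist(x, y)
            if dist == 0.0:
                continue  # identical samples: direction is undefined
            for d in range(len(x)):
                push[d] += (x[d] - y[d]) / dist
        updated.append([xd + step_size * pd for xd, pd in zip(x, push)])
    return updated


def mean_pairwise_distance(samples):
    """Average Euclidean distance over all unordered pairs of samples."""
    pairs = [(a, b) for i, a in enumerate(samples) for b in samples[i + 1:]]
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


# Three nearly identical "samples" standing in for a batch of N images.
batch = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
spread_before = mean_pairwise_distance(batch)
spread_after = mean_pairwise_distance(repulsion_step(batch, step_size=0.05))
```

In a real diffusion sampler, a step like this would presumably be interleaved with the denoising updates and balanced against fidelity to the degraded input; those details are left to the paper.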
Bibtex
This webpage was originally made by Matan Kleiner with the help of Hila Manor for SinDDM and can be used as a template.
It is inspired by the template that was originally made by Phillip Isola and Richard Zhang for a colorful ECCV project; the code for the original template can be found here.