Locally Orderless Images for Optimization in Differentiable Rendering

CVPR 2025

Abstract

Problems in differentiable rendering often involve optimizing scene parameters that cause motion in image space. The gradients for such parameters tend to be sparse, leading to poor convergence. While existing methods address this sparsity through proxy gradients such as topological derivatives or Lagrangian derivatives, they make simplifying assumptions about rendering. Multi-resolution image pyramids offer an alternative approach but prove unreliable in practice. We introduce a method that uses locally orderless images, in which each pixel maps to a histogram of intensities that preserves local variations in appearance. Using an inverse rendering objective that minimizes histogram distance, our method extends support for sparsely defined image gradients and recovers optimal parameters. We validate our method on various inverse problems using both synthetic and real data.

Main Figures

Figure 1. Scale-space matching extends gradient support. Given an image (a) of a disk we recover its position θ on the horizontal axis. At stationary resolution (σ = 0), the initial and target (dotted) disks do not overlap, as shown in the corresponding 1D signals in (b). The image gradient ∂I/∂θ is sparse (orange) and is non-zero only at the boundaries of the disk (c-top). The error gradient ∂E/∂θ is zero everywhere (green) and the optimization is stuck in a local minimum. When matching at coarser scales (d), the gradients are no longer sparse (c-bottom), leading to optimal recovery.
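
The effect described in Figure 1 can be reproduced with a small 1D experiment. The sketch below is an illustration under stated assumptions, not the paper's implementation: the disk is modelled as a box signal, the scale space as a truncated Gaussian blur, the error E as a sum of squared differences, and ∂E/∂θ as a one-pixel central difference. All function names and numeric values (radius, disk positions, σ = 20) are hypothetical.

```python
import numpy as np

def disk_signal(theta, x, radius=5.0):
    """1D slice through a disk centred at theta: 1 inside, 0 outside."""
    return (np.abs(x - theta) <= radius).astype(float)

def gaussian_blur(signal, sigma):
    """Blur with a Gaussian truncated at 4 sigma; sigma = 0 returns the input."""
    if sigma == 0:
        return signal
    half = int(np.ceil(4 * sigma))
    t = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (t / sigma) ** 2)
    kernel /= kernel.sum()
    return np.convolve(signal, kernel, mode="same")

def error(theta, target_theta, x, sigma):
    """Sum of squared differences between the blurred initial and target signals."""
    diff = (gaussian_blur(disk_signal(theta, x), sigma)
            - gaussian_blur(disk_signal(target_theta, x), sigma))
    return float(np.sum(diff ** 2))

def error_gradient(theta, target_theta, x, sigma, h=1.0):
    """Central-difference stand-in for dE/dtheta (assumption: step of one pixel)."""
    return (error(theta + h, target_theta, x, sigma)
            - error(theta - h, target_theta, x, sigma)) / (2.0 * h)

x = np.arange(400, dtype=float)       # pixel coordinates
theta_init, theta_target = 150.0, 220.0  # disks do not overlap at sigma = 0

grad_sharp = error_gradient(theta_init, theta_target, x, sigma=0)
grad_coarse = error_gradient(theta_init, theta_target, x, sigma=20)

print("dE/dtheta at sigma = 0: ", grad_sharp)
print("dE/dtheta at sigma = 20:", grad_coarse)
```

In this setup the error gradient is zero at σ = 0, because shifting either disk by a pixel leaves the non-overlapping error unchanged. At σ = 20 the blurred signals overlap and the gradient is negative, so a gradient-descent step would move θ toward the target.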

Figure 2. Tonal Separation. Shown are two (a-top and a-bottom) 1D inverse problems where we recover disk positions (θ) from images (left). Image matching within σ-space measures only the errors in the mean of the intensity distributions at each scale. In inverse settings that involve multiple objects with different appearances, this approach is likely to get stuck in a local minimum (a-center-left). The α-space integration kernels are intensity-aware and treat images as sets of distinct equal-intensity isophotes (b). When images are matched in all three scale spaces, the optimization is less prone to getting stuck in local minima (a-center-right).
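
To make the idea of a per-pixel histogram concrete, the following sketch builds a 1D locally orderless image for a signal containing two objects of different intensity. It is a minimal illustration, not the paper's implementation: it assumes Gaussian kernels for both the tonal scale (β) and the extent scale (α), and the function names, bin layout, and numeric values are hypothetical.

```python
import numpy as np

def gaussian_kernel(scale):
    """Normalised Gaussian kernel truncated at 4 * scale."""
    half = int(np.ceil(4 * scale))
    t = np.arange(-half, half + 1)
    k = np.exp(-0.5 * (t / scale) ** 2)
    return k / k.sum()

def locally_orderless_1d(signal, bin_centres, beta, alpha):
    """Return per-pixel soft histograms with shape (num_bins, num_pixels)."""
    # Tonal scale: soft-assign each pixel intensity to the intensity bins.
    membership = np.exp(
        -0.5 * ((signal[None, :] - bin_centres[:, None]) / beta) ** 2
    )
    membership /= membership.sum(axis=0, keepdims=True)
    # Extent scale: pool each bin's membership map over a spatial neighbourhood.
    kernel = gaussian_kernel(alpha)
    return np.stack([np.convolve(m, kernel, mode="same") for m in membership])

# Two objects with different appearance on a black background.
signal = np.zeros(200)
signal[60:80] = 0.3    # darker object
signal[120:140] = 0.9  # brighter object

bin_centres = np.linspace(0.0, 1.0, 11)  # bins at 0.0, 0.1, ..., 1.0
hist = locally_orderless_1d(signal, bin_centres, beta=0.05, alpha=20.0)

# For comparison: the local mean at the same extent scale.
local_mean = np.convolve(signal, gaussian_kernel(20.0), mode="same")

pixel = 100  # lies between the two objects
print("local mean at pixel 100:     ", local_mean[pixel])
print("local histogram at pixel 100:", np.round(hist[:, pixel], 3))
```

At a pixel between the two objects, the local mean collapses both intensities into a single blended value, whereas the local histogram keeps separate modes near 0.3 and 0.9 alongside the background mode. This is the kind of information that matching on means alone discards.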

Figure 3. Histogram matching is less sensitive to noise. To recover the position (θ) of a circular disk from a noisy reference image (a-bottom-right), methods that match images only at their stationary resolution or in σ-scale space fail because they overlook imprecision and uncertainty in radiance measurements. Our method uses a tonal parameter (β) to account for intensity uncertainty and an extent scale-space to preserve the distribution modes at coarser scales (b), leading to optimal recovery of θ.