Neural diffusion processes from Cambridge, Google and Secondmind challenge Gaussian processes to describe rich distributions over functions
While researchers have traditionally used Gaussian processes (GPs) to specify prior and posterior distributions over functions, this approach becomes computationally expensive when scaled, is limited by the expressiveness of its covariance function, and is typically restricted to a point estimate of its hyperparameters.
A research team from the University of Cambridge, Secondmind and Google Research addresses these issues in the new paper Neural Diffusion Processes, proposing neural diffusion processes (NDPs), a framework that learns to sample from rich distributions over functions at lower computational cost and captures distributions close to the true Bayesian posterior of a conventional Gaussian process.
The paper’s lead author, Vincent Dutordoir, explains: “Bayesian inference for regression is great, but it’s often very expensive and requires making a priori modelling assumptions. What if we could train a large neural network to generate plausible posterior samples over functions? This is the premise of our neural diffusion processes.”
The team summarizes its main contributions as follows:
- We propose a new model, the neural diffusion process (NDP), which extends the use case of diffusion models to stochastic processes and is able to describe a rich distribution over functions.
- We take special care to apply symmetries and known properties of stochastic processes, including exchangeability, in the model, thus facilitating the learning process.
- We showcase the capabilities and versatility of NDPs by applying them to a range of Bayesian inference tasks, including prior and conditional sampling, regression, hyperparameter marginalization, and Bayesian optimization.
- We also present a new global optimization method using NDPs.
The proposed NDP is a denoising-diffusion-based approach that learns probabilities over functions and produces prior and conditional function samples. It allows full marginalization over GP hyperparameters while reducing the computational load compared to GPs.
The team first surveyed existing state-of-the-art generative neural network models in terms of sample quality. Based on their findings, they designed NDPs to generalize diffusion models to infinite-dimensional function spaces by indexing the random variables over which the model diffuses.
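The idea of diffusing over indexed function values can be sketched with the closed-form forward marginal of a standard DDPM-style process applied to a function evaluated at arbitrary inputs. This is an illustrative assumption about the setup, not the authors' exact parameterization; the noise schedule below is a generic linear one.

```python
import numpy as np

# Generic linear noise schedule (an assumption, not the paper's choice).
T = 100
betas = np.linspace(1e-4, 0.2, T)
alpha_bars = np.cumprod(1.0 - betas)

def forward_diffuse(y0, t, alpha_bars, rng):
    # Closed-form forward marginal q(y_t | y_0): the function values y0,
    # evaluated at arbitrary index points x, are progressively corrupted
    # toward a standard normal as t grows. These indexed values are the
    # random variables the NDP diffuses over.
    eps = rng.standard_normal(y0.shape)
    yt = np.sqrt(alpha_bars[t]) * y0 + np.sqrt(1 - alpha_bars[t]) * eps
    return yt, eps
```

A trained NDP would then reverse this process with a neural network conditioned on the inputs x, turning noise into coherent function samples.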
The researchers also introduced a novel bi-dimensional attention block to ensure equivariance over both input dimensionality and sequence ordering, allowing the model to produce samples from a stochastic process. As such, NDPs can exploit properties of stochastic processes such as exchangeability.
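The link between attention and exchangeability can be checked on a toy single-head self-attention layer (a simplified stand-in for the paper's bi-dimensional block, with hypothetical weight matrices): because there are no positional encodings over the data-point axis, permuting the points permutes the outputs the same way.

```python
import numpy as np

def self_attention(z, Wq, Wk, Wv):
    # Single-head dot-product self-attention over the data-point axis.
    # With no positional encoding, f(P z) == P f(z) for any permutation
    # matrix P (permutation equivariance), which is what lets the model
    # respect the exchangeability of a stochastic process.
    q, k, v = z @ Wq, z @ Wk, z @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
n, d = 5, 4
z = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
perm = rng.permutation(n)
out = self_attention(z, Wq, Wk, Wv)
out_perm = self_attention(z[perm], Wq, Wk, Wv)
assert np.allclose(out_perm, out[perm])  # permutation equivariance
```

The paper's block additionally attends over the input-dimension axis, giving the second equivariance mentioned above; this sketch only demonstrates the sequence-axis property.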
In their empirical study, the team assessed the proposed NDP's ability to produce high-quality conditional samples and to marginalize kernel hyperparameters, and evaluated its invariance to input dimensionality.
The results show that NDPs can capture functional distributions close to the true Bayesian posterior while reducing computational load.
The researchers note that while NDP sample quality improves with the number of diffusion steps, this also results in slower inference. They suggest that future work could explore faster inference or sample-parameterization techniques to address this issue.
The paper Neural Diffusion Processes is on arXiv.
Author: Hecate He | Editor: Michel Sarazen
We know you don’t want to miss any news or research breakthroughs. Subscribe to our popular newsletter Synced Global AI Weekly to get weekly AI updates.