That single innovation gave Poisson flow models far greater versatility, with the two extremes of D offering different benefits. When D is low, for example, the model is more robust, meaning it is more tolerant of errors in estimating the electric field. “The model can’t predict the electric field perfectly,” said Ziming Liu, another graduate student at MIT and co-author of both papers. “There’s always some deviation. But robustness means that even if your estimation error is high, you can still generate good images.” So you may not end up with the dog of your dreams, but you’ll still end up with something resembling a dog.
At the other extreme, when D is high, the neural network becomes easier to train, requiring less data to master its artistic skills. The exact reason is hard to state simply, but it comes down to the fact that with more added dimensions, the model has fewer electric fields to keep track of, and hence less data to assimilate.
The enhanced model, PFGM++, “gives you the flexibility to interpolate between those two extremes,” said Rose Yu, a computer scientist at the University of California, San Diego.
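To make that interpolation concrete, here is a minimal sketch, not the authors’ code, of how a single parameter D can control the perturbation such a model learns to undo. It assumes the perturbation kernel reported in the PFGM++ paper, p_r(x|y) proportional to (||x − y||² + r²)^(−(N+D)/2), with the paper’s alignment r = σ√D; the function name perturb is hypothetical, and the Beta-distribution step is one standard way to sample a kernel of that form.

```python
# Sketch: sample a D-dependent perturbation x ~ p_r(x | y), where
# p_r(x | y) is proportional to (||x - y||^2 + r^2)^(-(N + D)/2)
# and r = sigma * sqrt(D). Under this kernel, z = R^2 / (R^2 + r^2)
# for the radius R = ||x - y|| follows Beta(N/2, D/2), giving a
# direct sampler.
import numpy as np

def perturb(y: np.ndarray, sigma: float, D: float,
            rng: np.random.Generator) -> np.ndarray:
    N = y.shape[0]
    r = sigma * np.sqrt(D)
    z = rng.beta(N / 2.0, D / 2.0)      # z = R^2 / (R^2 + r^2)
    R = r * np.sqrt(z / (1.0 - z))      # radial distance of the perturbation
    v = rng.standard_normal(N)
    v /= np.linalg.norm(v)              # uniform direction on the unit sphere
    return y + R * v

rng = np.random.default_rng(0)
y = np.zeros(1024)                      # stand-in for a clean data point, N = 1024
for D in (2, 128, 1_000_000):
    radii = [np.linalg.norm(perturb(y, 1.0, D, rng)) for _ in range(500)]
    print(f"D={D:>9}: median radius {np.median(radii):7.1f}, max {np.max(radii):10.1f}")
```

At small D the sampled radii are heavy-tailed, the regime the article describes as forgiving of field-estimation errors; at very large D they concentrate near σ√N, the Gaussian behavior of a diffusion model.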
And somewhere within this range lies an ideal value for D that strikes the right balance between robustness and ease of training, said Xu. “One goal of future work will be to figure out a systematic way of finding that sweet spot, so we can select the best possible D for a given situation without resorting to trial and error.”
Another goal for the MIT researchers involves finding more physical processes that can provide the basis for new families of generative models. Through a project called GenPhys, the team has already identified one promising candidate: the Yukawa potential, which relates to the strong nuclear force. “It’s different from Poisson flow and diffusion models, where the number of particles is always conserved,” Liu said. “The Yukawa potential allows you to annihilate particles or split a particle into two. Such a model might, for instance, simulate biological systems where the number of cells does not have to stay the same.”
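For reference, and as a standard fact of physics rather than a detail of the MIT work: the Yukawa potential is the 1/r Coulomb form behind Poisson flow, multiplied by an exponential screening factor set by the mass m of the particle mediating the force (natural units, constants omitted):

$$ V_{\text{Coulomb}}(r) \propto \frac{1}{r}, \qquad V_{\text{Yukawa}}(r) \propto \frac{e^{-m r}}{r}. $$

The exponential cuts the interaction off beyond a range of roughly 1/m, which is why the nuclear force, unlike electrostatics, acts only over short distances.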
This may be a fruitful line of inquiry, Yu said. “It could lead to new algorithms and new generative models with potential applications extending beyond image generation.”
And PFGM++ alone has already exceeded its inventors’ original expectations. They did not realize at first that when D is set to infinity, their amped-up Poisson flow model becomes indistinguishable from a diffusion model. Liu discovered this in calculations he carried out earlier this year.
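For the mathematically inclined, the equivalence can be sketched in one line. Using the perturbation kernel from the PFGM++ paper, with N the data dimension and the noise scale σ tied to D by the paper’s alignment r = σ√D:

$$ p_r(x \mid y) \;\propto\; \Bigl(1 + \frac{\lVert x - y \rVert^2}{D\sigma^2}\Bigr)^{-\frac{N+D}{2}} \;\xrightarrow{\;D \to \infty\;}\; \exp\Bigl(-\frac{\lVert x - y \rVert^2}{2\sigma^2}\Bigr), $$

since (1 + a/D)^(−(N+D)/2) tends to e^(−a/2) as D grows. The right-hand side is exactly the Gaussian kernel that defines a diffusion model’s forward process, so in the infinite-D limit the two families coincide.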
Mert Pilanci, a computer scientist at Stanford University, considers this “unification” the most important result stemming from the MIT group’s work. “The PFGM++ paper,” he said, “reveals that both of these models are part of a broader class, [which] raises an intriguing question: Might there be other physical models for generative AI awaiting discovery, hinting at an even grander unification?”