Nvidia released a paper about a 100KB text-to-image model that trained for only 4 minutes, and the authors claim it outperforms bigger models

Key-Locked Rank One Editing for Text-to-Image Personalization

https://research.nvidia.com/labs/par/Perfusion/

They also claim that it takes only about 8 seconds to generate a variety of good images.