Tuesday, June 27, 2023

Stability AI: A Huge Array of Apps to Choose From


When I first began experimenting with AI imaging apps well over a year ago, two companies were getting the most buzz in the media - OpenAI and Stability AI.  At the time, OpenAI's DALL-E 2 was the center of attention, and I was curious to see what results I could obtain with it.  Unfortunately, I did not find the results as aesthetically pleasing as I would have liked, not to mention the drawback of being limited to 15 free credits per month, and I soon stopped using it.  In the meantime, OpenAI appeared to shift its focus to ChatGPT, the wildly successful chatbot that many treat as an alternative search engine (though that may be too limited a definition of its functionality).

When I moved on to Stability AI, I found the company had two apps to choose from at the time - Stable Diffusion and DreamStudio.  Both were AI imaging apps, but there were notable differences between the two, which Google Bard outlined fairly succinctly.  Very basically, "Stable Diffusion is an open-source generative AI model that can be used to generate images from text descriptions.  DreamStudio is a web app that uses Stable Diffusion to create images."  I soon found that Stable Diffusion would best suit my needs, especially as it was free to use while DreamStudio was a paid subscription service.

Working with Stable Diffusion, I found that the images, while more pleasing than those I had obtained with DALL-E 2, still fell short.  Aside from the inconvenience of generating characters with three arms or legs and hands whose fingers resembled strands of spaghetti, Stable Diffusion images lacked the detail needed to be fully acceptable.  They certainly could never be termed photorealistic.  Subsequent upgrades to the app failed to sufficiently correct these problems.

Meanwhile, Stability AI was moving on.  For one thing, it introduced a new generative AI model, DeepFloyd, though according to an article in TechCrunch that is more correctly the name of a research group backed by Stability AI.  According to Bard, there are three main differences between Stable Diffusion and DeepFloyd: (1) Model Architecture - Stable Diffusion is a latent diffusion model, while DeepFloyd is a pixel-based diffusion model; (2) Image Quality - Stable Diffusion is generally considered to produce more realistic images than DeepFloyd; and (3) Speed - DeepFloyd is generally faster than Stable Diffusion.  In spite of Bard's explanation, I found in actual practice that there was not much noticeable difference between images generated by Stable Diffusion and those generated by DeepFloyd.  In other words, I did not find DeepFloyd to be that great an improvement on Stable Diffusion.

In another turn of events, in March 2023 Stability AI acquired Init ML, makers of Clipdrop, which then became a wholly owned subsidiary of Stability AI.  This, of course, gave Stability AI users free access to the whole range of Clipdrop apps, which at last count numbered nine: Reimagine XL, Uncrop, Relight, Image Upscaler, Text Remover, Replace Background, Remove Background, Cleanup, and Stable Diffusion XL.

It's the last of the nine, Stable Diffusion XL, that I wish to call attention to, since Stability AI has also released a full beta version of this same app.  While both versions are free of charge to use, there are differences between them which, again according to Bard, are as follows: (1) Model Size - the full version images are larger (1.37 GB) than the Clipdrop version images (0.54 GB); (2) Image Quality - Clipdrop version images are of lower quality; and (3) Features - the full version offers more features, such as the ability to edit generated images.

So far, for convenience's sake, I've been experimenting only with the Clipdrop version of Stable Diffusion XL and have been happy to discover the app is a significant improvement over both the traditional Stable Diffusion and DeepFloyd.  The images I've obtained to date are far more realistic, even in the lower-quality Clipdrop version, than those I was able to obtain with the earlier apps when using the same text prompts.  As a bonus, one nice feature in XL is the ability to apply styles, such as Cinematic, Digital Art, and Fantasy Art, when regenerating images.

As I work more with XL, I will post sample images here.
