From the course: What Is Generative AI?

Text to image applications

- In 2022, we saw a rise in commercial image generation services. The technology behind these services is broadly referred to as text to image. You simply type words on a screen and watch the algorithms create an image based on your cue, even if your description is not very specific.

There are three main text to image generation services: Midjourney, DALL-E, and Stable Diffusion. If we were to compare these three text to image tools to operating systems, Midjourney would be macOS, because it has a closed API and a very design- and art-centric approach to the image generation process. DALL-E would be Windows, but with an open API, because the model is released by a corporation and it initially had the most capable machine learning algorithm; OpenAI values technical superiority over design and art sensibilities. And Stable Diffusion would be Linux, because it is open source and improving every day with contributions from the generative AI community. The quality of the images generated by text to image models depends both on the quality of the algorithm and on the datasets used to train it.

Now that we know the main services, let's look at three industrial applications. First is Cuebric, Hollywood's first generative AI tool, created by our company, Seyhan Lee, for streamlining the production of film backgrounds. A typical virtual production workflow uses three-dimensional world building, which involves a team of people building 3D worlds custom made for that film. It's time consuming, expensive, and requires a lot of repetitive tasks. An alternative now is to augment 2D backgrounds into 2.5D by involving generative AI in the picture creation process. The second example is Stitch Fix. When suggesting garments to help customers discover their fashion style, they use photos of real clothes alongside clothes generated with DALL-E. And finally, marketers and filmmakers use text to image models when ideating a concept for a film. They may later continue using these models to make storyboards, and even in producing the final art of their campaigns and films, just as we saw with Cuebric. A recent example from the marketing world is Martini, which used a Midjourney-generated image in its campaign. Others include Heinz and Nestlé, which used DALL-E in their campaigns, and GoFundMe, which used Stable Diffusion in its artfully illustrated film. Marketers prefer using generative AI in their creative process for two reasons: first, for its time and cost-saving efficiency, and second, for the unique look and feel that you get from text to image based tools.
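Hands-on code is outside the scope of this course, but as a rough illustration of the "type words, get an image" workflow described above, here is a minimal sketch of calling DALL-E through OpenAI's public API using their official Python SDK. The model name, prompt text, and image size are illustrative assumptions, not course material.

```python
# A minimal sketch (not from the course) of text-to-image generation
# via OpenAI's official Python SDK. Assumes the `openai` package is
# installed and OPENAI_API_KEY is set in the environment; the prompt,
# model name, and size below are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

response = client.images.generate(
    model="dall-e-3",  # example model choice
    prompt="a misty redwood forest at dawn, matte painting style",
    n=1,               # number of images to generate
    size="1024x1024",
)

# The API returns a URL (or base64 data) for each generated image.
print(response.data[0].url)
```

Stable Diffusion, by contrast, is open source, so rather than calling a hosted API it can be run locally or self-hosted with community tooling such as Hugging Face's diffusers library.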