What is Dall-E, and why are people talking about it?

Photo Credit: OpenAI
22 Jul, 2022

Two days ago, artificial intelligence research firm OpenAI expanded access to the second generation of its text-to-image AI tool, Dall-E. Now in beta, Dall-E 2 is being opened up to 1 million users who had so far been on the waitlist, giving significantly wider access to the cutting-edge AI image generation tool. Here’s why you should know about it.

What is Dall-E?

Simply put, Dall-E uses a proprietary artificial intelligence algorithm to convert instructions written in plain text into images. First unveiled by OpenAI in January last year, the algorithm uses a modified version of the research firm’s natural language generation model, GPT-3, to understand instructions in plain text. It then uses machine learning to ‘understand’ the logical link between the words, and interprets it in a visual form. The tool is among a rising crop of AI visual generators that can create high-resolution and visually realistic images without any cognitive input from humans.

How does it work?

Once you add a piece of text (the words must have some link between them, even if it is a bizarre one), Dall-E encodes this instruction to understand the words separately, and subsequently finds a logical link between them. The goal behind tools such as Dall-E is to see machines find abstract links between words, and interpret imagery based on a subjective understanding of topics rather than always an objective, training-based one. Dall-E was originally trained using text-image pairs from the internet, and its first variant used 12 billion parameters to understand and convert text into visual art. The second-generation engine uses fewer parameters (3.5 billion), but is claimed to produce a wider variety of visuals at a higher resolution.

Where is Dall-E used?

The scope of usage for AI-based image generation tools is wide. As per OpenAI, the tool has already been used by individuals with disabilities to create visual art, and it offers illustrators a way to supplement their ability to create visuals, which they can then use as a template to work on.

Are there more like it?

Yes, there are quite a few of them, although not all can be vouched for in terms of their efficacy. Among notable parallels, Google Research unveiled its own text-to-image generation tool, Imagen, in May this year. The company claimed that Imagen’s output was significantly more powerful and realistic in comparison to OpenAI’s Dall-E. There are others too, such as Replicate’s Pixray, as well as StarryAI, which is also available as a free-to-use app on mobile platforms. The app is a good example of how such AI visual generation tools work, though Dall-E’s performance is said to be significantly superior to that of free tools. Nvidia also has similar offerings, GauGAN and GauGAN2, which also use AI to generate backdrops and illustrations from plain text.

How much does it cost?

The lucky one million to get access to Dall-E 2 will get 50 credits to use the tool with in the first month. Each credit can generate four images, or three edits of an existing image (with text instructions to tweak it). After the first month, users will get 15 credits each month, and can purchase additional credits in bundles priced at $15 (about ₹1,200). Each bundle contains 115 credits, which can be used to generate up to 460 images. There is no open, unlimited usage plan for anyone right now, since Dall-E 2 is still in limited access beta.
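
To see how the credit figures quoted above translate into images, here is a rough back-of-the-envelope sketch in Python. The numbers are taken from this article; the constant and function names are made up for illustration and are not part of any OpenAI tool or API.

```python
# Credit figures as quoted in the article (beta pricing, July 2022).
# All names below are illustrative, not OpenAI identifiers.
IMAGES_PER_CREDIT = 4      # one credit generates four images
EDITS_PER_CREDIT = 3       # or three edits of an existing image
FIRST_MONTH_CREDITS = 50   # free credits in the first month
MONTHLY_CREDITS = 15       # free credits in each later month
BUNDLE_PRICE_USD = 15      # price of one paid bundle
BUNDLE_CREDITS = 115       # credits in one paid bundle


def images_from_credits(credits: int) -> int:
    """Maximum number of generated images for a given number of credits."""
    return credits * IMAGES_PER_CREDIT


def edits_from_credits(credits: int) -> int:
    """Maximum number of image edits for a given number of credits."""
    return credits * EDITS_PER_CREDIT


def bundle_cost_per_image_usd() -> float:
    """Approximate price per image if a paid bundle is spent only on generation."""
    return BUNDLE_PRICE_USD / images_from_credits(BUNDLE_CREDITS)


if __name__ == "__main__":
    print("First month, free images:", images_from_credits(FIRST_MONTH_CREDITS))
    print("Later months, free images:", images_from_credits(MONTHLY_CREDITS))
    print("Images per paid bundle:", images_from_credits(BUNDLE_CREDITS))
    print("Approx. cost per image (USD):", round(bundle_cost_per_image_usd(), 4))
```

By this arithmetic, the 50 first-month credits cover up to 200 images, the 15 monthly credits cover up to 60, and a paid bundle works out to a little over three US cents per image if it is spent entirely on generating new images.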