Gen AI attempts a Christmas Cover

Documents the trial and error involved in getting DALL-E (version 3 available in ChatGPT4 Subscription) to create the cover image for the Dec 22 2023 edition of S3T.

Peace to All, by DALL-E and Ralph Perrine

This post documents the trial and error involved in getting DALL-E (version 3, available with a ChatGPT4 subscription) to create the cover image shown above, which served as the cover for the Dec 22, 2023 edition of S3T.

Peace to All is a collaboration between a person (me) and an AI capability (DALL-E). I first attempted to prompt DALL-E to draw the entire scene as shown in the final cover image. DALL-E was unable to replicate a realistic northern lights display, and completely unable to render a peace symbol in an overhead perspective (seriously, the harder it tried, the worse it got). So I modified the prompt to ask DALL-E simply for a wilderness scene with some tundra swans and an extra-big sky. I then imported that DALL-E image into Sketchbook, where I added the atmospherics and the northern lights peace symbol by hand.

What follows are screenshots of my interactions with ChatGPT4 and DALL-E.

As you can see here, the peace symbol was missing. Some of the animals were also strange, like the little reindeer bird flying in the upper right of the 2nd picture.

The first picture shows something that looks like a Mercedes logo. Here the aurora borealis is starting to look a little more realistic, but no progress on getting the peace symbol incorporated into the northern lights.

Here I resorted to drawing a quick sketch and uploading it as part of the prompt. This seemed to help: DALL-E appears to have mimicked my drawing style somewhat.

When it displayed the images it also displayed a message stating some sort of failure (the message is no longer visible on the page). I responded in agreement:

(Side note: In the feedback I wrote above, I mention underwater scenes. In a previous exercise I'd asked DALL-E to draw an underwater view of a very large fish looking up at a boat on the surface of the water. DALL-E did not seem to know how to locate the boat on top of the surface, and kept depicting it as if it were submerged.)

At this point I switched tactics. I gave DALL-E a new prompt, asking it to focus only on the scene, without the northern lights. I planned to draw in the northern lights myself.

When I saw the results, I realized I didn't have any room to draw in the aurora borealis, so I modified the prompt:

I liked the layout of the 2nd picture and imported it into Sketchbook where I added the aurora borealis and atmospherics. The final result:

Pros and Cons

Below are the pros and cons of using DALL-E and similar capabilities, as surfaced in the exercise of creating the Christmas cover art.

First the Cons

DALL-E cannot currently lock one aspect of an image and then improve another aspect of the same image. Every image is regenerated completely. It also tends to lose specific instructions with each new iteration.
While generic scenes like "winter wilderness" can be rendered with reasonable reliability, DALL-E and similar capabilities frequently depict things that are inaccurate or impossible. Note the visually odd spacing of the swans in the first image above, as well as the numerous failed attempts to depict northern lights. Note also that, even in the final version of the cover image, some of the birds rendered by DALL-E appear to have three or more wings. Even the acceptably rendered swans all have their wings in the same (or very similar) up or down positions, as if their wings were connected by door hinges. Below is an early morning photo of swans I took several years ago. Note the different positions of their wings.

Pros:

DALL-E can save time by depicting details that would take a person a long time to draw and might not be worth that time. For example, the highly detailed forest in the scene above would have taken me hours, if not days, to render at the same level of detail and realism.
DALL-E seems to do well at atmospherics, providing lighting effects that are pleasant to look at. I have seen this consistently, whether the prompt asks for an early morning sunrise, a desert scene, starry skies, etc.
DALL-E also seems to be useful at times for offering options you hadn't thought of.

Proliferation of doers... but what about purpose?

Ultimately you as the artist have to have an objective in mind, and then decide what elements if any can be provided by a capability like DALL-E.

Generative AI represents the proliferation of "doers": agents doing things at an extremely fast rate. This fast rate of doing brings with it a higher supervisory and quality control workload than the over-hyped tech press has acknowledged.

The "doers" of Generative AI have extreme cognitive limitations:

Limited or no grasp of context, judgment, or ethics.
Large, uncatalogued, and scary gaps in awareness (for example, repeated depictions of cars with drivers, headlights, etc. facing the wrong direction).
An understanding of intent that is currently limited to the text in the prompt and whatever "content guidelines" the platform creators may have included.

Given these points, we probably need to be a bit more discerning in the way we judge the value of what Generative AI capabilities can contribute.

While DALL-E can depict a forest (as above) by apparently drawing each tree at an extremely fast rate, an experienced artist can similarly depict a forest with a few skillful strokes of a brush. Which is better or more valuable? The answer depends on what you are trying to achieve.
