Can AI replace the ad agency?

  • POVs
  • October 4, 2022
  • Innovation Lab (DK)

In less than a minute, visual AI programs such as DALL-E 2 and Midjourney can create four images from a given prompt – a solution that is both cheaper and more efficient than traditional means of creating digital art. But what happens when the technology becomes so advanced that we prefer content produced by software, not only for its speed and efficiency, but also for its creative results?

Imagine a future where artificial image generators are intuitive enough to create images for you before you even ask. Imagine you are writing a text that needs an accompanying picture. Instead of wasting time and energy finding the perfect image, the software simply follows your writing as it unfolds and produces an image based on the expressive content of your text.

Nor do you have to worry about expressing yourself poorly or unclearly – the machine knows what you mean. It outperforms human artists every time when it comes to both efficiency and quantity – and in this scenario, even quality.

In the end, you fire all your creative employees because you never use their finished products anyway. Instead, you save a lot of money by hiring the world’s most efficient content creator. You only have one, an artificial one, and that is all you need. You begin to trust it blindly, slowly becoming completely uncritical of your new ‘employee’, convinced that everything it creates is amazing. You even start doubting your own judgement whenever you don’t like the outcome.

In other words, the software has corrupted your mind and taken control of your company’s visual expression. Next week you acquire an artificial intelligence that uses 3D printing for sculptures and other physical products. The week after, you implement a musical AI that can compose songs and jingles for you. Next comes the copywriting software, and soon someone develops a robot that can make entire feature films, using visual AI to create human-like characters, so you don’t even need to hire actors for your commercials anymore.

In the end, you sit alone in your office, surrounded by robots, AI and machine learning models that work with extreme precision and efficiency. Full automation, loss of all human touch.

 

How to create visual art in seconds

For the past month we have been testing and comparing two different pieces of visual AI software: Midjourney and DALL-E 2.

We began with Midjourney and gave it a prompt with the text “A realistic photo of a luxury watch floating on water”.

In return, Midjourney provided us with the following result:

[Image: Midjourney’s four generated images]

First, it’s great to have two visual expressions to choose between – two different color combinations – which allows you to pick your preferred style and take it in the direction you want, whether that’s a cold or a warm palette. Two of the images are undeniably accurate, with an object that could be considered a watch floating on the water surface. Not to mention, the water appears in all four examples. But if the prompt was interpreted so accurately in two of the images, why are the last two so far off, either not looking like a watch or not floating on water? When a result like this unfolds, you can take it or leave it – or adjust your prompt to get a better result.

Let’s see how DALL-E 2 handles the same task:

[Image: DALL-E 2’s four generated images]

It seems that DALL-E 2 understood the prompt better, since all four pictures contain a watch floating on water. On top of that, the watch is also more realistic than the ones Midjourney provided us with. However, something is clearly off: it looks like the result you would get by merging a .png picture of a watch with a photo of a water surface and leaving it at that. As impressive as it is, it does not compare to what a professional human can produce.

 

Creating through patterns

Let’s walk through how the software works. It is presented with millions of pictures paired with descriptive captions, from which it builds algorithms through pattern recognition. If it has seen a million pictures of a car with the caption ‘car’, it will have an idea of what visually defines a car and how a car is supposed to look. It has no perception of reality – it doesn’t know what a car is – but it notices similarities between the many different pictures of cars it sees, and through pattern recognition it can generate an image of a car.

That also means that if it’s presented with a picture of a car whose caption says ‘ice cream’, it might start to think that ice cream and cars look alike, depending on how many mislabeled pictures it sees. If it gets a million pictures of cars and only one carries the caption ‘ice cream’, it will probably just ignore it. But if it gets many pictures with incorrect captions, it might get confused and misunderstand the intended connection. You could say that the software teaches itself the connection between a picture and its caption.
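The intuition in these two paragraphs can be sketched with a toy model. This is emphatically not how DALL-E 2 or Midjourney actually work (they are neural networks trained on raw pixels), but it illustrates the statistical point: if each picture is reduced to a bag of visual features, the connection between captions and features can be learned by counting co-occurrences, and a single bad label is simply outvoted by the correct ones.

```python
from collections import Counter, defaultdict

def train(dataset):
    """Count how often each visual feature co-occurs with each caption."""
    counts = defaultdict(Counter)  # caption -> feature frequencies
    for caption, features in dataset:
        counts[caption].update(features)
    return counts

def typical_features(counts, caption, top=2):
    """The features most strongly associated with a caption."""
    return [feature for feature, _ in counts[caption].most_common(top)]

# A million correctly captioned cars would behave the same; 1000 suffice here.
data = [("car", ("wheels", "headlights"))] * 1000
data += [("ice cream", ("cone", "sprinkles"))] * 1000
data += [("ice cream", ("wheels", "headlights"))]  # one mislabeled picture

model = train(data)
print(typical_features(model, "ice cream"))  # -> ['cone', 'sprinkles']
```

The single mislabeled car contributes counts of 1 against 1000, so it never surfaces in the learned association – but if incorrect captions became a sizable fraction of the data, ‘ice cream’ and ‘car’ features would start to blur together, which is exactly the confusion described above.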

It might not seem like the most difficult task for a supposedly intelligent piece of software, but it gets vastly more complicated when it must also define more abstract words like ‘beautiful’, ‘wonderful’ and ‘great’ – words whose meaning differs from person to person. One person might see a picture and call it beautiful, while another would consider it unappealing. And how will it find visual patterns for words like ‘how’, ‘after’ and ‘think’?

If you would like to investigate the AI’s perception of both ordinary and abstract concepts, both programs are quite easy to access. Midjourney is available today through Discord, while DALL-E 2 is available through OpenAI, where you’ll have to join a waitlist before you get access. OpenAI also offers text-writing software that is rather impressive – but that technology is outside the scope of this article.

 

Nothing is flawless – not even artificial intelligence

The software and the images it creates have already been widely critiqued. People say the software lacks human expression and soul. One person on Twitter wrote that “we are watching the death of artistry unfold right before our eyes” in response to a story about an AI-generated picture winning an art contest in Colorado. But is artistry as we know it coming to an end? Based on the examples above, the answer is most likely no. Let’s have a look at why, by diving into some of the software’s weaknesses and incapabilities.

To begin with, the AI bases its visual generations on the vast catalog of pictures across the internet, which means it only works as long as artists of all visual professions keep creating.

Besides, the AI cannot produce an accurate caricature of a human, because it lacks an understanding of the person it portrays. On a purely technical level, it struggles with generating anatomically correct fingers. More fundamentally, it is not competitive at depicting satirical, humoristic, and personal traits without extensive prompting. If a politician is portrayed, for example, the AI has no understanding of their political point of view, and it knows nothing about their personal characteristics, background, or comedic sense – all insights you need to make a satirical and realistic portrait of someone. So, for now, we will still have to rely on human creativity. Portraits are made even harder by the fact that these tools are intentionally designed not to generate content related to specific individuals, a restriction that protects people against content linked to their names and against the distribution of deep fakes.

For now, the software seems primarily great for inspiration. One of its strengths is that it can produce ideas and creative frameworks in seconds, some of which would take a human a lot of time and energy to come up with. It therefore makes concepting and brainstorming easier and more efficient, for instance by automatically generating mood boards. Rather than a tool to replace graphic designers, it seems to be a useful tool for graphic designers to have at their disposal – at least for the time being. Many artists are already using visual AI technology, and more will likely follow. AI graphics are great at providing artists with creative impulses, ideas, alternatives, contrasts and perspectives, rather than finished results.

Midjourney seems like the more useful software for creating visually beautiful images. It still takes a few tries to get it right, but you often end up with quite a satisfying result in terms of visual expression and creativity, and the images are often quite enjoyable. However, they are not very accurate or realistic, so if you aim for a precise depiction of your prompt, you might want to consider DALL-E 2, as its pictures are vastly more realistic – though not flawless. Where Midjourney’s results are beautiful and visually impressive, DALL-E 2’s are often clinical and exact instead.

 

The world is not going to end tomorrow

One could argue that the introduction of artificial image generators is similar to the introduction of the camera. Where painters once spent hours, or even days, creating an emotionally realistic painting, this new mechanical device could produce a photo in a matter of seconds. And once everybody could get their hands on a camera, it seemed like the death of visual art. What would become of paintings and drawings when suddenly anyone could create a photo by simply pressing a button? People believed that pictures couldn’t have human depth if they were created by a machine. Today there’s a completely different view of photos and photographers, and we recognize the art of capturing extraordinary pictures with a camera.

So, dear creative genius, it seems you will get to keep your job after all. The dystopian war between humans and machines is still way down the road, and even if we live to see it, maybe we will learn to appreciate what visual AI is bringing to the table.

If you are interested in the possibilities of AI software and want to learn more about it, you can reach out to Lasse Dam, Head of Innovation Labs, at lasse.dam@groupm.com.