Generative AI for Imagery: Positives and Pitfalls
Machine-learning algorithms are increasingly being promoted by big tech as the future of graphic design. Is this the case?
Image Credit: Maxim Berg, Unsplash
“AI” is a term that has become ubiquitous online in recent years. Since the rise of chatbots such as ChatGPT and image generators like Midjourney, public debate has increasingly focused on the boons and threats posed by these systems.
Pro-AI advocates have proclaimed that AI will grow exponentially more sophisticated, causing a huge shift in global society. Machines will be our doctors, our lawyers, our scientists, our artists, our writers. Dull busywork will be eliminated, allowing humanity to focus on leisure and creative pursuits for the sake of personal fulfilment. A new Golden Age will begin.
Conversely, anti-AI campaigners emphasise the potentially catastrophic risks of such a change. These people allege that AI will be used simply to cut mid-level jobs and remove human creativity from society, enforcing a life of manual labour and drudgery upon those who are not super-rich.
There are further worries regarding the capacity of AI systems to do what humans cannot. What happens when a sophisticated AI system with millions of scientific papers at its disposal is directed to research biological weapons? Can AI be used to produce flawless misinformation, crippling our concept of objective truth?
I presently find myself in the middle of this debate, weighing aspects of both views. Even within the field of 3D and graphic design, the presence of AI has been heavily felt in both constructive and destructive ways.
In a few short years, Midjourney has improved rapidly in its ability to generate convincing images. While the algorithms frequently make mistakes when generating human figures, the overall aesthetic of many Midjourney images has developed immensely. In particular, generators allow people with very little technical art skill to produce images that (on the surface, at least) seem remarkably competent.
However, to the trained eye, AI images often feel crude. The lighting usually doesn’t make much sense. There is an over-reliance on clichéd composition and themes, with large galleries of the images often feeling generic in a way even stock photos do not. In many cases the composition feels nonsensical, with no real centre of focus at all.
The “scraping” procedures used to “train” image-generating algorithms are also highly controversial. Billions of images are fed into algorithms so that the program can learn the structure of objects and produce thematically similar output images. Many artists allege their works have been plagiarised, used without their consent to fuel image generation machines from which others are profiting. Copyright in general is presently proving a complex nightmare when it comes to AI image generation.
Unlike a human artist bound by long-standing legal precedent, AI cannot be relied upon to honestly divulge the sources of images used for a work. This is a major reason most journals will simply not accept AI-driven content.
Copyright may prove the hardest issue to overcome with AI image generation, as it is a fundamental problem regardless of the level of sophistication of the technology. It is likely that in the coming years we will see big tech firms taken to court for serious copyright violations.
Image credit: Douglas Sanchez, Unsplash
A recent interview with OpenAI CTO Mira Murati served as a sharp illustration of these issues. When asked whether the text-to-video generator “Sora” harvests content data from YouTube videos, Murati completely shut down and pointedly refused to give a straight answer.[1]
There is also scepticism regarding the actual level of sophistication of many AI systems. The now-constant use of the term “AI” can arguably feel like the overuse of internet-related terms during the “dot-com boom” of the late 1990s.
Companies at the time were slapping “.com” on everything, no matter how mundane, gaining massive increases in stock valuation due to this embrace of the internet. This was despite, in many cases, having no real substance to their approach to the new technology. Some companies are now slapping slogans such as “Search with our AI!” on a standard search bar entry field invented in the early 2000s, leading to allegations of similar disingenuousness.
The most positive use of AI for image generation is, in my personal opinion, producing background details and automating the most tedious parts of human-driven image production. Generating seamless 3D textures or simplifying time-consuming technical procedures such as UV unwrapping seems an excellent use of the technology; human creativity is preserved, copyright is not violated, and thousands of hours of dull labour are rendered unnecessary.
Blender already features some excellent automated add-ons, such as “UV Squares”, which converts disorganised unwrapped UV meshes into manageable and ordered grids.
In summary, in a few short years we will likely have a much clearer picture of where AI technology is leading. For now, it remains a complicated and controversial topic, presenting us with both incredible potential advantages and glaring usage issues.