Photography in Plato’s Cave

In On Photography, Susan Sontag describes humankind as “lingering unregenerately in Plato’s Cave”, but argues that the insatiability of the photographic eye changes the terms of our confinement in the cave.

In many ways Sontag’s book feels more relevant now than ever. Given the meteoric growth in photographic technology and, in wealthy countries, near-universal ownership of image-making devices, it’s critical that we understand the power and importance of the photograph.

Recent advancements in technology have caused us to ask questions about the nature and value of image making, how we see the world, and even our nature itself. I believe it’s important that we address these questions head-on, or else risk being trapped forever in an ever-shrinking cave, staring out at shadows of a world we don’t care to understand.

I’ve been quite a serious hobbyist photographer for over 20 years. I’d consider myself reasonably competent: I can approach most situations with confidence.

I’ve come to know a lot about light, lenses, colour, and tone. I operate a carefully colour-managed workflow, producing prints to a high standard. I know the appropriate paper to use to suit my subject. I understand how light interacts with my sensor and my lens. I’m a halfway competent drone pilot and can make images from the air. In the darkroom I can tell you just how a photon interacting with a silver halide crystal creates a latent image, and what the size and shape of those crystals means for the warmth of the print. I can present and protect my images so that they should last for generations.

And every single bit of all of that is pointless.

Even before we look at new technologies it becomes important to ask, what’s the point? Once you’ve reached passable technical mastery, what’s the point? What’s the point of standing with my camera in the tripod holes of photographers who’ve gone before me? Lovely though it is to create another beautiful mountain top sunrise, and lovely though the print will look on my wall, how is it different from the near-identical image created by someone else yesterday?

Recent advances in generative AI, such as Stable Diffusion and DALL-E, mean we don’t even need to have experienced what we want to photograph. These technologies can capture what we’re imagining. Just describe the photograph you have in mind, and you have the image.

For example, here’s an image I made a few years ago. This is a picture of the sunrise at Mam Tor in the Peak District. I had to get up at 4am and take a short hike to be at this spot in time. It’s not an amazing image, but it makes a pretty print.

Mam Tor Sunrise

This image is by no means unique. Indeed, my tripod was firmly planted in the holes made by photographers before me, many producing far superior versions of the same image. A quick Google image search results in this:

A Google search for Mam Tor Sunrise images

Photographs are experience captured. I love my Mam Tor image. Not because it’s a great photograph, it isn’t, but because it reminds me of the experience of being on that hilltop at that moment. What does my image say, what does it add to expand or alter the terms of our confinement in Plato’s Cave?

It’s hard to see that it says much more than “I was here, at this time”. But I don’t need a photograph to do that. I can go and experience the sunrise, unencumbered by photographic equipment and the need to think about image capture. Then, when I get back, I can simply ask DALL-E to generate:

“A vivid, high quality, HDR photograph of sunrise over Mam Tor. There is a path running away from the viewer up over the ridge. A small wooden fence runs along the grass to the distance.”

Which gives me:

DALL-E's version of my Mam Tor photograph

With technology like this, which seeks to utterly devalue the photographer and remove them from the art of image making, we may well ask ourselves: why bother?

Generative AI of all kinds can, on one hand, be seen as the ultimate expression of capitalism. We’ve long since passed the point of art as commodity, so it’s perhaps only natural that we now find ourselves with the artist themselves being a person in the middle to be removed. We should not be surprised at technology that seeks to replicate and industrialise the means of production of a valuable commodity.

And yet this technology relies on humans. I saw someone recently describe sharing a great image you’ve made with DALL-E as “like sharing a Google search you’re particularly proud of”. All generative AI relies on unimaginably large sets of training data. As we’ve put more and more of our creative output online, so it’s been used to feed the machine.

Putting the very real issues of copyright and attribution aside, I believe that if we come to ascribe real value to images generated in this manner we face an urgent and existential cultural crisis.

All machine learning is by its nature conservative. It is only able to consider what has gone before. DALL-E could only make that image because of the many millions of photographers whose work OpenAI have appropriated. It does not experience the world, and despite appearances it does not create anything new. It’s a magic trick; they do it with mirrors.

So much of the breathless commentary surrounding these new technologies makes the mistake of equating software with sentience. Humans are prone to seeing patterns in things and mistaking them for personality. The conjurer has held up a huge mirror, and we’ve failed to see ourselves in it. And this is dangerous, because when we tell ourselves these things are somehow “other” we start to ascribe a value to what they create that exceeds the value we place on its inputs.

I’ve been talking about models like Stable Diffusion or DALL-E that create images, but the same applies to large language models like GPT-3. These models capture and regurgitate our culture, but they cannot be allowed to become our culture. If they do, if we come to value what they create, our culture is doomed to stagnate. This point, right now, today, is as far as humanity can come and we can go no further.

By valuing the output of these models, we are saying to ourselves we are happy to continue “lingering unregenerately in Plato’s Cave”. The images these models create are not even shadows on the wall in the cave. They are just the outlines of the shadows we’ve seen before.

Where photography has the power to alter the terms of our confinement in the cave by seeking to enlarge our notion of what’s worth looking at, generative AI seeks to narrow our confinement and limit us.

This is why we should bother. We must continue to create photographs as experience captured. Dorothea Lange said, “a camera is a device for learning how to see without a camera”. If we give up our image making to a model trained on the past, we will stop learning how to see.

Every time we press the shutter and make an exposure we’re doing so for a reason. We are capturing some new essence of our experience. Regardless of whether it’s an unthinking snapshot or a slow, considered piece, every exposure ever made by a human was made with human intention. Susan Sontag describes the camera as the “ideal arm of consciousness in its acquisitive mood”. Let’s let our consciousness be acquisitive, let’s capture new things. The decisive moment is not just capturing that elusive perfect moment of exposure, it’s capturing the experience and intent of the photographer.

Photography does redefine the terms of our confinement in Plato’s Cave, but only when it’s created by a human, when it shows us something new, as an expression of human experience.