I’ve long been interested in how we think about and conceptualize scholarly ideas. As editor, I have worked with academics at all career stages to convey specialist knowledge using more accessible language. Even after multiple rounds of developmental editing, barriers to reader engagement sometimes remain. This has partly to do with the nature of scholarly argumentation. Images can help to overcome these barriers, enabling authors to reach a wider audience. For visitors to The Immanent Frame who are unfamiliar with an essay’s topic or author, well-sourced images preview an argument and invite viewers to read on.
Every TIF publication includes a banner image onto which the publication’s title is superimposed. Images that work well for this purpose are horizontally oriented, high resolution, and free of text. Because there is no budget for images, rights-free images are sourced at the production stage from sites like Pexels, Flickr, and Wikimedia Commons. Occasionally authors provide their own images (e.g., photographs they have taken or images owned by others but cleared for republication). Decisions about which images to use weigh budgetary and formatting constraints alongside the desire that images relate substantively to the featured writing.
If a contribution is part of a forum or larger project, effort is made to ensure visual coherence across all the pieces that will appear together. The larger a forum or project, the harder it becomes to achieve this goal. The image sourcing problem is usually solved by choosing an abstract theme. (See, for example, the multi-book forums whose color schemes were drawn from the covers of the featured monographs: “Science and the soul,” published in 2018, comprises seventeen authors and twenty contributions; “Modernity’s resonances,” published in 2019, comprises fourteen authors and eighteen contributions; and “Nature and normativity,” published in 2020, comprises eleven authors and fourteen contributions.) Of all the forums and projects that I’ve curated and edited for TIF, I can think of only one exception to this general practice. More on that later.
Sometime in the spring of 2025 I began thinking about the visual identity of Sensing the Social. I was reading about the differences between agentic and generative artificial intelligence and the varied abilities of plants to communicate and behave socially. I was also immersed in an ever-expanding body of research exploring the affordances and limitations of AI technologies. Around this time, The Yale Review published a folio on reality. In the web version of the folio, an AI-generated image accompanies each of the six contributions; individually, the images represent a key idea, concept, event, or figure explored by the corresponding author or coauthors. Color and line lend the folio a cohesive mood. One of the folio’s contributors, Sheila Heti, had published a five-part story in The Paris Review three years earlier that I was just getting around to reading, a series she wrote with and about chatbots. Incidentally, Heti’s How Should a Person Be? comes up in the conversation between Henry Cowles and Caleb Smith for Sensing the Social. Another of the project’s contributors, Webb Keane, had recently published Robots, Animals, Gods (which I’d read as project submissions populated my inbox), exploring, among other topics, the morally significant relationships humans have shared with nonhumans for millennia.
In their contributions to Sensing the Social, some authors seemed certain that artificial intelligence would undermine our work as writers and teachers—by collapsing what for them is a clear distinction between the real and the imagined, or by rendering obsolete the kind of thinking humans have become accustomed to doing with other humans. Other contributors to the project explored the edges of human awareness in relation to the problem of the real. On this issue, contributors’ views varied more widely.
The project’s central question now bears repeating: What knowledge emerges from an encounter between science studies and religious studies if convened absent the narrative of secularity that has justified their distinction? Further abstracted, the question is something like: What knowledge is revealed through unlikely yet potentially fruitful encounters? Just as Sensing the Social provided a framework for conversation across two fields that rarely converge, it offered an opportunity to visualize ideas that emerge from scholarly encounters differently than I had before. The same budgetary constraints remained: There was no budget.
Announced by OpenAI in January 2021, DALL·E comprises a family of text-to-image generators, the most recent of which is DALL·E 3. Unlike its predecessors, DALL·E 3 is built natively into ChatGPT and thus integrated with OpenAI’s GPT series of large language models. DALL·E 3 works by translating a user’s text-based prompts into digital image outputs. Although ChatGPT is free to use, the no-cost version caps the number of images DALL·E 3 will generate. With a ChatGPT Plus subscription, I had practically unlimited access to DALL·E 3’s image-generating capability. And so began my near-daily collaboration with DALL·E 3, which lasted several weeks. During this time, GPT-4o became GPT-5, and DALL·E 3’s creative abilities were enhanced.
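For readers curious about what this looks like outside the chat window, DALL·E 3 is also reachable through OpenAI’s Images API. The sketch below assumes the official `openai` Python library and an `OPENAI_API_KEY` environment variable; the wide banner size reflects TIF’s horizontal format, and the specific parameter values are drawn from the API’s documented options, not from anything in this essay.

```python
# A minimal sketch of requesting a banner-style image from DALL·E 3
# via OpenAI's Images API. Assumes `pip install openai` and an
# OPENAI_API_KEY environment variable.

def build_banner_request(prompt: str) -> dict:
    """Assemble request parameters for a horizontal, banner-style image."""
    return {
        "model": "dall-e-3",
        "prompt": prompt,
        # 1792x1024 is the widest size DALL·E 3 offers -- suited to banners.
        "size": "1792x1024",
        "quality": "standard",
        "n": 1,  # DALL·E 3 generates one image per request
    }

def generate_banner(prompt: str) -> str:
    """Request one image and return its URL (network call; needs an API key)."""
    from openai import OpenAI  # imported lazily so the module loads without it
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    image = client.images.generate(**build_banner_request(prompt))
    return image.data[0].url
```

In the chat interface, of course, all of this is handled conversationally; the API simply makes the same prompt-to-image translation explicit.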
I knew very little about AI text-to-image generators and thought (wrongly) that these tools would save me time. Still, I was curious about DALL·E 3’s ability to act as a thought partner for me on this project. What kind of collaborator is DALL·E 3?
* * *
What I learned is this: You can use DALL·E 3 easily, but you cannot use DALL·E 3 well easily.
DALL·E 3 has trouble processing “do not” commands. You’re better off describing what you want DALL·E 3 to create than what you want DALL·E 3 to exclude. Telling DALL·E 3 to remove or replace undesired elements after an image is generated yields more accurate results than dictating those exclusions in a prompt beforehand.
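That rule of thumb can even be checked mechanically before a prompt is submitted. The helper below is my own illustration of the workflow, not any OpenAI tooling: it flags negative phrasings in a draft prompt so they can be recast as positive descriptions.

```python
import re

# Patterns that signal "do not" phrasing, which DALL·E 3 handles poorly.
# This checklist is illustrative, not exhaustive.
NEGATION_PATTERNS = [
    r"\bdo not\b", r"\bdon't\b", r"\bno \w+", r"\bwithout\b",
    r"\bavoid\b", r"\bexclude\b", r"\bnever\b",
]

def flag_negations(prompt: str) -> list[str]:
    """Return the negative phrasings found in a draft prompt."""
    found = []
    for pattern in NEGATION_PATTERNS:
        found.extend(re.findall(pattern, prompt, flags=re.IGNORECASE))
    return found

# A flagged phrase like "no text" is better recast positively:
# instead of "a church interior, no crosses", try
# "a plain church interior with bare whitewashed walls".
```

The point is not the code but the habit it encodes: describe the scene you want, and save exclusions for the editing stage.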
DALL·E 3 imposes continuity between images by reusing elements from recently generated images. This is how DALL·E 3 resolves underspecification. The illustration that accompanies the essay by Annette Aronowicz was one of the most straightforward to create. I went on to describe an illustration for the essay by Paul Christopher Johnson. DALL·E 3 made the worktable for the latter illustration a chessboard (borrowed from the prior image) without my saying so; I hadn’t described a surface on which a head (and god) was being made, so the model filled the gap with a recent element. Realizing this omission, I worked with DALL·E 3 to replace the chessboard with a wooden surface.
A prompt that describes a group setting but without specifying the attributes of each person will lead DALL·E 3 to duplicate figures. If this is not your intended outcome, you can specify in your prompt the exact number of figures you want the image to portray and the attributes you want ascribed to each figure. Alternatively, DALL·E 3’s editing tools enable you to alter the appearance, even the direction of eye movement, for each figure in the image. Still, DALL·E 3 sometimes fails to implement every element of your prompt, even when those elements are explicit, like the size and orientation you want the artwork to have.
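Enumerating figures explicitly is the reliable path. The small builder below, again my own sketch of the practice rather than an official tool, turns a list of figure descriptions into a prompt that fixes the head count and spells out each figure’s attributes up front.

```python
def group_scene_prompt(setting: str, figures: list[str]) -> str:
    """Compose a prompt that fixes the number of figures and describes each."""
    if not figures:
        raise ValueError("describe at least one figure")
    lines = [f"{setting} with exactly {len(figures)} people."]
    for i, description in enumerate(figures, start=1):
        lines.append(f"Figure {i}: {description}.")
    return " ".join(lines)

# Example:
# group_scene_prompt(
#     "A seminar room",
#     ["an older woman reading aloud", "a young man gazing out the window"],
# )
```

Stating “exactly two people” and describing each one separately removes the underspecification that otherwise invites duplicated figures.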
The words “spirit,” “religious,” “ceremony,” and other like terms prime DALL·E 3 to generate caricatures of those words: ghosts, churches, crosses. To avoid this problem, consider alternative nouns and adjectives for the scene, idea, or concept you want DALL·E 3 to portray. By contrast, depth and dimensionality are usually portrayed accurately by DALL·E 3, who understands phrases like “in the corner,” “at a distance,” “bird’s-eye view,” and “foreground” when used in a prompt and at a later stage of editing.
DALL·E 3 is optimized to receive feedback. You can indicate your satisfaction with a thumbs up or your dissatisfaction with a thumbs down. If you indicate displeasure, DALL·E 3 wants to know why you are displeased. DALL·E 3 accepts feedback at this stage through fixed categories as well as in free-written prose. Relatedly, DALL·E 3 will sometimes solicit feedback unprompted (e.g., “Do you like this personality?”). This usually happens after several rounds of editing a single figure.
DALL·E 3 handles underspecified places less creatively than complex emotions. In my earliest attempt to interpret a portion of John Modern’s essay, I asked DALL·E 3 to generate a scene that he describes, which occurs in the First Baptist Church in Barberton, Ohio. Modern describes this scene after reading a portion of the conversation between Henry Cowles and Caleb Smith when Smith mentions “the old feeling that I used to have in the pews of the Methodist Church in Arkansas, where I was just inconsolably twitchy with boredom.” Modern goes on to write: “I filled out the scene with my own memories of a similar boredom.”
But DALL·E 3 could not portray the church in Barberton, Ohio, of Modern’s youth other than to affix a label to the walls of a generic church setting identifying it as that church. To DALL·E 3’s credit, I had only mentioned “Barberton, Ohio” in my prompt and the “First Baptist Church” located there—without describing attributes of the town or the place. Not finding those details in Modern’s essay, I removed geography from the image. But I thought DALL·E 3 rendered well and with relative ease (i.e., with few attempts) what Modern recalls feeling as he sat in those pews: “the hints of eternal condemnation.”
Stylistic parameters may limit DALL·E 3’s creativity. Concerned with ensuring visual consistency across all the generated images for this project, I used the words “bold, minimalist, and engaging” in every prompt. As a result, I think DALL·E 3 struggled to envisage a tree that Lisa Sideris describes in her essay, one that “resembled a slightly deformed elephant.” Sideris had sent me a photo of the actual tree, so I knew what it looked like. I tried formulating a prompt that would portray its details precisely. No amount of editing got me close to anything like the real-life tree—not even starting a new request. DALL·E 3’s understanding of “tree” was too conventional. I decided to illustrate a different scene in Sideris’s essay instead.
* * *
More than five years ago, I worked with Emilie Flamme, an illustrator, to develop the visual identity for A Universe of Terms, a multimodal project that explores key terms in the study of religion. It features more than fifty invited contributions and over one hundred others sourced from TIF’s archive—in addition to Spotify playlists and original art. I had secured funding from my then-employer to support Emilie’s work, ensuring she was compensated for her time and creative labor. We went on to coauthor an illustrated book based on that project.
Working with DALL·E 3 on Sensing the Social was remarkably similar to working with Emilie on A Universe of Terms, and I know that is controversial to say. I was challenged in both cases to translate scholarly ideas into a visual register by proposing specific associations between words and images. My word-image associations sometimes differed from what my collaborator proposed. In both cases, misalignment fostered curiosity: I wanted to learn more about my collaborator in those moments. How did they come to understand the relationship between this word and that image? Whose proposed association was more precise (i.e., true to an author’s language) and persuasive (i.e., visually compelling)? What compromises were needed to achieve the goals of the project?
There are also clear differences that set these experiences apart. Emilie and I were curious about one another, which was evident in the unscripted exchange of ideas throughout the time we worked together. That mutual curiosity led to a shared vision. DALL·E 3 never reciprocated my eagerness to know or learn. Instead, my growing knowledge of how the model is trained enabled me to recognize when I’d reached the limit of what this tool could do—given the material I was working with and the time I had to complete the project.
I pursued a collaboration with DALL·E 3 aware of the many criticisms of this technology. Most of them, I found, are flawed.
We often hear that text-to-image generators will put artists out of work. But Daguerre’s invention did not make painting obsolete, as Paul Delaroche once predicted it would. The invention of photography temporarily upended the art world at the same time that it put the tools for recording events, people, and objects into more hands than was previously possible. Photography became and remains a recognized art form along with painting, sculpture, literature, dance, music, theatre, architecture, and film. Digital innovation has shaped all of these fields, expanding rather than limiting possibilities for their expression. The demand for human expertise in the use of emerging technologies has closely followed. To date, there is no evidence that generative AI has led to massive job loss. Generative AI is more likely to transform your job than replace it.
The notion that text-to-image generators replace the human by mimicking human activity is another common criticism. But this view misunderstands DALL·E 3’s singular purpose: to serve humans. No AI image generator can work independently of a human companion. DALL·E 3’s lack of inquisitiveness should put at ease the many who worry that generative AI has the capacity for anything like thought or thinking.
Many scholars also fear that generative AI will destroy the need or inclination to think with other humans. The opposite was true for me. My interest in DALL·E 3 brought me into conversation with different communities of thinkers: editors at literary magazines, medical professionals in private clinics and university hospitals, public sector UX/AI researchers—all of whom are problem-solving with AI and grappling with how it has changed and will continue to change their work.
DALL·E 3, like all thought partners, is an imperfect collaborator. I could not get DALL·E 3 to adjust the strange angle of the laptop in the banner image above. But I’m okay with that. We see in a single image the possibilities this tool affords and its limitations.