Generating Art in the Style of M. C. Escher Using Stable Diffusion XL and Midjourney
It can sometimes seem that today's image generation tools are so advanced that they can produce virtually anything, from photorealistic scenes to fantasy art. Can they also match the creativity of a 20th-century genius?
Introduction
A few weeks ago, my high school math teacher died. The farewell ceremony took place in the school hall (a nostalgic place to return to years later, not counting my frequent back-to-school dreams), and while I was contemplating the untimely death of a great person, M. C. Escher came to mind. This was not wholly accidental - Escher’s famous work combines mathematics with familiar places and objects in dreamy or unreal - and often plainly impossible - settings.
Later that day, in a somewhat existentialist mood, I realized it would be interesting to reimagine Escher’s work using text-to-image models. I decided to take the non-technical approach of simply using an off-the-shelf diffusion model - and even though you can search the internet for similar attempts, I tried to be novel and come up with some interesting prompts.
Generation with Stable Diffusion XL
If you have access to a GPU, Stable Diffusion XL lets you generate high-quality images for free. The following code, adapted from a Jupyter notebook on GitHub, can be run in an interactive notebook environment such as Google Colab or Paperspace. First, we install the required Python packages:
%pip install --quiet --upgrade diffusers transformers accelerate mediapy
Next, we define the pipeline containing the model:
import random
import sys

import mediapy as media
import torch
from diffusers import DiffusionPipeline

# Load the SDXL base model in half precision and move it to the GPU.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
).to("cuda")
For convenience, we will also define a function that produces an image according to a text prompt:
def generate(prompt, seed=None):
    # Draw a fresh seed unless the caller supplies one.
    if seed is None:
        seed = random.randint(0, sys.maxsize)
    images = pipe(
        prompt=prompt,
        output_type="pil",
        generator=torch.Generator("cuda").manual_seed(seed),
    ).images
    # Print the seed so that a result can be reproduced later.
    print(f"Prompt:\t{prompt}\nSeed:\t{seed}")
    media.show_images(images)
Finally, we can simply call this function:
generate("Impossible perspective by M. C. Escher")
As each function call uses a different random seed, every result will be different. From here on, for the sake of readability, I will show just the prompts instead of the code.
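Since the function prints the seed it used, a result you like can in principle be revisited by passing that seed back in. The following is only a minimal sketch of the idea, not part of the original notebook: `pick_seed` and `fake_sampler` are hypothetical names, and `fake_sampler` is a pure-Python stand-in for the diffusion sampler so that the example runs without a GPU.

```python
import random
import sys


def pick_seed(seed=None):
    """Return the given seed, or draw a fresh one the way generate() does."""
    if seed is None:
        seed = random.randint(0, sys.maxsize)
    return seed


def fake_sampler(seed, n=4):
    """Hypothetical stand-in for the image sampler: a seeded number stream."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


# The same seed yields the same pseudo-random stream.
first = fake_sampler(pick_seed(1234))
second = fake_sampler(pick_seed(1234))
print(first == second)  # True

# Without an explicit seed, a new one is drawn on every call.
print(pick_seed())
```

With the real pipeline, the equivalent would be a call such as `generate("Impossible perspective by M. C. Escher", seed=1234)`, where `1234` is a placeholder for whatever seed was printed earlier. I would expect this to reproduce the image only on the same hardware and library versions.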
Impossible perspective by M. C. Escher
Not quite like Escher’s artworks, but the style looks similar. I tried to recreate the artist’s famous tessellations and periodic tilings but did not manage to obtain satisfactory results, as the images were chaotic and had gaps between the individual tiles. On the other hand, prompts for architecture work well:
Impossible architecture in the style of M. C. Escher
Impossible architecture in the style of M. C. Escher
Unfortunately, the images do not include any impossible geometry or perspective, even when explicitly prompted to do so. However, it is still possible to get some interesting results, such as with surreal “mazes”:
3D maze with impossible perspective floating in space, with bridges and columns connecting the different sections, minimalist drawing by M. C. Escher
3D maze with impossible perspective floating in space, with bridges and columns connecting the different sections, minimalist drawing by M. C. Escher
3D maze with impossible perspective floating in space, with bridges connecting the different sections, minimalist woodcut by M. C. Escher
3D maze with impossible perspective floating in space, with bridges connecting the different sections, minimalist woodcut by M. C. Escher
Maze with impossible geometry, minimalist drawing by M. C. Escher
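The maze prompts above differ only in a few words (the details and the medium), so variants like these can be composed programmatically rather than typed by hand. The helper below is a hypothetical sketch, not something from the original notebook; `build_prompt` and its parameter names are my own assumptions about how the prompts are structured.

```python
from itertools import product

ARTIST = "M. C. Escher"


def build_prompt(subject, details=None, medium="minimalist drawing", artist=ARTIST):
    """Join subject, optional details and '<medium> by <artist>' with commas."""
    parts = [subject]
    if details:
        parts.append(details)
    parts.append(f"{medium} by {artist}")
    return ", ".join(parts)


subject = "3D maze with impossible perspective floating in space"
details_options = [
    "with bridges and columns connecting the different sections",
    "with bridges connecting the different sections",
]
media_options = ["minimalist drawing", "minimalist woodcut"]

# Every combination of details and medium: 2 x 2 = 4 prompts.
prompts = [
    build_prompt(subject, details, medium)
    for details, medium in product(details_options, media_options)
]
for prompt in prompts:
    print(prompt)
```

Each resulting string could then be passed to the `generate` function defined earlier, one call per prompt.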
Sometimes, the results are close to the work of Hieronymus Bosch, such as the following one:
Depiction of time as a fractal, with scenes from everyday human life, by M. C. Escher
With abstract prompts, you never get exactly what you want, but the results can be interesting nonetheless.
Mobius strip as a sea with ships travelling in one direction, minimalist drawing by M. C. Escher
Mobius strip as a sea with ships travelling in one direction, minimalist drawing by M. C. Escher
Generation with Midjourney
Midjourney is another popular image generation tool, although you usually need a subscription to generate images in larger quantities. As Midjourney is closed-source, I obtained the following results by typing the prompts into its Discord interface, using Midjourney version 5.2.
Impossible architecture by M. C. Escher
Cathedral by M. C. Escher, impossible architecture, optical paradoxes, impossible geometry
Impossible mirror, drawing by M. C. Escher, architecture, mathematics
In some domains, such as the architecture shown above, Midjourney seems to produce better results than Stable Diffusion XL, while it fails in others. I was also able to obtain interesting results with prompts requesting animals, such as fish:
Dissection of a fish, drawing by M. C. Escher, impossible geometry, unreal, symmetrical
Large funnel-shaped school of fish in the ocean, drawing by M. C. Escher, impossible geometry, unreal, unreal perspective
Large funnel-shaped school of fish in the ocean, drawing by M. C. Escher, impossible geometry, unreal, unreal perspective
Also, I was able to obtain results that at least resemble mosaics, such as the following output that concludes this section:
Mosaic of flying birds, drawing by M. C. Escher, tesselation, symmetrical, single repeating shape
Conclusion
Evidently, these results can’t approach the quality of Escher’s original work. Others have noted that M. C. Escher is more difficult for image generation models to imitate than other artists are, and it is not clear if this is because of the artist’s ingenuity, or simply because there is less training data for this artistic approach. In any case, I’m certain that text-to-image models will keep improving in the future, and I believe that unless we strictly restrict the definition of art only to works created by humans, we can look forward to some very interesting artistic experiences.
I am releasing the generated images into the public domain.