How to Illustrate Your Book with Artificial Intelligence

As the world becomes increasingly digital, more and more people are turning to artificial intelligence to illustrate their books, stories, poems, and even songs. AI can create stunning images and illustrations that can help bring your novel to life in a whole new way.

You don't have to be a professional artist or a programmer to take advantage of this technology. Anyone can use AI to create beautiful, unique artwork for their book.

AI Art Programs for Beginners

There are three illustration tools I recommend to beginners: Artbreeder, Wombo Dream, and Prose Painter. These are easily accessible and require no coding knowledge at all.

Artbreeder (artbreeder.com)

An image of a teenage girl with blonde hair and blue eyes. — A portrait of Princess Zora from *The Dream Kiss*. Made in Artbreeder.

We'll start with Artbreeder, which is a website that's available on both desktop and mobile. Artbreeder lets you generate and customize images in several categories. You can generate pictures of faces, landscapes, buildings, paintings, science fiction art, album covers, and more. This is a really excellent starting point for creating character images, landscape paintings, and other interior illustrations for your book.

Since AI-generated images cannot be copyrighted in the United States, all Artbreeder images are public domain. You can take anyone's Artbreeder-generated images and edit them as you wish. However, this also means that anyone can take your images as well. If you want to keep some images private, you'll have to sign up for a Pro account and enable private mode.

Most Artbreeder images have a square aspect ratio, which makes them perfect for posting on Instagram. You can upload and modify your own images of people and landscapes. You can't upload images in other categories like anime faces or album covers.

Wombo Dream (Apple, Google Play)

Wombo Dream is a mobile-only app that works with both Android and iOS smartphones. Unlike Artbreeder, Wombo Dream allows you to generate images from just a text prompt. You can also select a preset style, and use an initial image to inspire the AI to a greater or lesser degree.

Wombo has a very intuitive interface. You can start generating images with just a few taps. The app also has a social component, where you can share your images and browse what other people have created.

Here's what the creation screen looks like on my phone:

Wombo creates portrait images at 960×1568 pixels, similar in shape to your phone's screen. It cannot make any other aspect ratios yet. That shape is very close to the aspect ratio of most tarot cards, and it also makes a good wallpaper for your smartphone.
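If you want to check how close two shapes really are, a few lines of Python will do it. This is only a rough sketch: the 960×1568 figure comes from the paragraph above, while the tarot card size of 2.75 by 4.75 inches is my assumption about a common card size, and the function name is made up for this example.

```python
from fractions import Fraction


def aspect_ratio(width, height):
    """Return width divided by height as an exact, reduced fraction."""
    return Fraction(width, height)


# Pixel size quoted above for Wombo Dream images.
wombo = aspect_ratio(960, 1568)

# Assumed tarot card size: 2.75 x 4.75 inches, written in hundredths of an inch.
tarot = aspect_ratio(275, 475)

print(wombo, round(float(wombo), 3))  # 30/49 0.612
print(tarot, round(float(tarot), 3))  # 11/19 0.579
```

The closer the two decimals are, the less you would have to crop one shape to fit the other.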

Wombo is best at making landscapes, cityscapes, and abstract images. Its pictures of people vary in quality, from “pretty decent” to “what is THAT?!”

That said, Wombo is a great app to start with if you want to dip your toes into creating AI-generated art.

Prose Painter (prosepainter.com)

A before and after comparison of the same image edited with Prose Painter. — The image to the right has been painted with the words “diamond crystal rain.”

Prose Painter is a website that allows you to “paint” any texture over a digital image. It has a really intuitive interface and the concept is pretty simple. Upload your image, describe what you want to add to the image in the text box, add it with a brush, and then press “Start.”

Prose Painter will then add the new texture over 30 steps, and you can choose which of the 30 steps you want to use for your final image. If you don't want to use any, you can select “discard.”

This is a great way to add custom textures to your illustrations and photos. The brush is a little clunky to use, but the interface is otherwise really easy. Whether you're a professional illustrator or an amateur, this tool is worth trying out.

Other Beginner AI Image Generators:

NightCafe
starry ai
craiyon (formerly DALL-E mini)
pixray
pixelz.ai

Writing Good Prompts

Most AI art generation programs are prompt-based, meaning they generate images from text prompts. These prompts can be anything from one word to a long description. Details added in the prompt will show up, in some form, in the final product.

When writing prompts, there are a few things to keep in mind:

Be as specific as possible. The more detail you give, the more likely you are to get results that match your vision. “Portrait of a girl” will give more generic results than “portrait of a girl with green eyes and curly red hair.”
Cite styles or artists to give your work a different feeling. Be careful doing this; if you cite an artist whose work is still protected by copyright, you are on ethically shaky ground. I would recommend sticking to artists that are well in the public domain, such as Michelangelo or Vincent van Gogh. However, if you're just making images for your own interest, go wild.
Try using emojis. Most AI image generators understand them and will create quite inventive images based off just emoji prompts.
Explore prompts written by other people. Apps like Wombo will allow you to see other people's images and the prompts they use. Try copying their prompts, or slightly tweaking them, to see what results you get.
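If you end up writing lots of prompts, for example in a Colab notebook, you can put the first two tips into a tiny Python helper. This is just a sketch: build_prompt and its parameters are names I made up for illustration, not part of any tool mentioned in this article, and all it does is join strings.

```python
def build_prompt(subject, details=(), style=None):
    """Combine a subject, optional details, and an optional style into one prompt."""
    prompt = subject
    details = list(details)
    if details:
        if len(details) == 1:
            joined = details[0]
        else:
            # "a, b and c" style joining for two or more details
            joined = ", ".join(details[:-1]) + " and " + details[-1]
        prompt += " with " + joined
    if style:
        prompt += ", " + style
    return prompt


# A generic prompt versus a more specific one.
print(build_prompt("portrait of a girl"))
print(build_prompt("portrait of a girl", ["green eyes", "curly red hair"]))

# The same subject with a public domain artist cited as the style.
print(build_prompt(
    "portrait of a girl",
    ["green eyes", "curly red hair"],
    style="oil painting in the style of Vincent van Gogh",
))
```

Swapping one detail at a time and comparing the results is an easy way to see which words the generator actually responds to.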

Example Prompts

beautiful symmetrical portrait of a cyberpunk princess
octane render of a snowdonian landscape with villages, forests, and snow-capped mountains
mona lisa smile

Intermediate Programs

Image of a woman — Still frame from a video created in pytti5, a Google Colab notebook.

When you're comfortable using programs like Artbreeder and Wombo Dream, you might want to try more advanced tools to illustrate your book.

Generated art tools run the gamut. Some are expensive, while others are free, at least up to a threshold. Some produce basic images, while others are so detailed that they seem like witchcraft. Some are easy to use, while others practically require a BA in Computer Science.

Some art generation tools, like DALL-E 2 and Midjourney, are invitation-only, and it might take months to receive an invite. DALL-E 2 can't be used for commercial purposes, so unless you're publishing your book to your blog or in a free newsletter, you can forget about putting its pictures in your book.

If you want to customize your illustrations on a higher level, you'll need to learn a free program called Google Colab. Colab allows you to use a super-fast GPU, like a Tesla V100 or A100, to render AI-generated images. The interface might look a little intimidating, but you don't really have to know any code to use it. I would recommend getting started with one of these notebooks:

MindsEye – if you've never used Google Colab before, and are intimidated by the idea, start here. MindsEye links Google Colab to an easy-to-use interface.
VQGAN+CLIP – a simple notebook that will walk you through the image generation process.
Structured Dreaming – Styledreams – this notebook makes photorealistic portraits of people in any style. This notebook does not save to Google Drive, so you'll need to save any images to your computer, or risk losing them.

If you don't know where to start, open one of these notebooks, go to the drop-down menu marked “Runtime,” and select “Run all.” Then watch as the program moves from step to step to generate an image.

The artist pharmapsychotic keeps a very up-to-date list of working Colab notebooks here. Try playing around with some of these notebooks until you find one you like.

More Intermediate AI Image Generators

Midjourney (invite only)
DALL-E 2 (invite only)
Disco Diffusion on Replicate – a more user-friendly version of Disco Diffusion that you can run from your browser. Be sure to download anything made through this app; Replicate doesn't connect to Google Drive.

Slightly More Advanced Colab Notebooks

Comic Faces – a custom-trained version of Disco Diffusion that can create faces in the style of modern comic books. Perfect for PFPs and character sketches.
Pulp Sci-Fi Diffusion – a notebook that's been trained on old pulp science fiction covers. This is pretty easy to use, and great for illustrating sci-fi stories.

Advanced Image Generation

A character portrait showing a pale young woman with black hair and blue eyes staring intently at the camera. Her face has been altered to look like cave gems. — A still image from an animation made with Disco Diffusion.

If you want even better control over your AI images, you can always buy your own GPU. One of the best GPUs in the world, the Tesla A100, can be yours for a measly $5,500.

Just kidding. You can rent a GPU from vast.ai for a dollar an hour or less. You can then upload your favorite Colab notebook as a .ipynb file and run it using Vast's built-in Jupyter notebook client.

Vast's user interface is significantly less user-friendly than Google Colab's. If you're not a programmer, I recommend editing the generator in Google Colab before downloading it and uploading it to vast.ai.

Make sure that any initial images or videos you use are on the web, because vast.ai (and other GPU rental services) don't connect to Google Drive the way Colab does.

This also means you'll have to download any images, videos, etc. made on Vast. If you don't, they will be destroyed when you disconnect from the server.
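One way to make that download less painful is to zip the whole output folder into a single file first, then download the one archive from the Jupyter file browser. Here's a minimal sketch using Python's standard library. The folder name images_out and the function name are placeholders I made up; your notebook will save its images somewhere else, so check its settings.

```python
import shutil
import tempfile
from pathlib import Path


def bundle_outputs(output_dir, archive_base):
    """Zip everything in output_dir and return the path of the new archive."""
    output_dir = Path(output_dir)
    if not output_dir.is_dir():
        raise FileNotFoundError(f"no such folder: {output_dir}")
    # make_archive adds the ".zip" extension to archive_base itself.
    return shutil.make_archive(str(archive_base), "zip", root_dir=output_dir)


# Demo with a throwaway folder standing in for the notebook's real output folder.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "images_out"  # hypothetical folder name
    out.mkdir()
    (out / "frame_0001.png").write_bytes(b"placeholder")
    archive = bundle_outputs(out, Path(tmp) / "my_images")
    print(Path(archive).name)  # my_images.zip
```

On a rented server you would point bundle_outputs at the real output folder and download the resulting zip before you disconnect.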

If that all sounds like Ancient Sanskrit to you, don't worry. You can do all the AI image generation you want without ever leaving Google Colab. Renting your own GPU is more appropriate if you're running the kind of project that takes up significant time or memory.

More Advanced AI Image Generators

Disco Diffusion 5.4 – currently the most popular image generator on Colab. This is very powerful and very customizable. You can start by watching this video tutorial that walks through every single feature of Disco Diffusion, or by reading this guide.
pytti5 – short for “python text to image.” Great for creating trippy animations and still images with a pixel art look.

Copyright and AI

A lot of questions around copyright and AI have not been settled. Here are several things you can do to protect yourself and your books from any threatened lawsuits:

Don't use any copyrighted works as your initial images or videos. Stick to public domain images, or images you've created.
Be careful with sites like Unsplash and Wikimedia Commons; anyone can upload a copyrighted image to one of these databases and say it is public domain. Saying this doesn't make it so.
When writing a prompt, don't cite individual artists unless you're sure their work is well within the public domain. Be careful with citing art styles with lots of works still protected by copyright.
Don't make likenesses of celebrities or living public figures, unless you're doing so purely for your own interest. “Serial killer Lana Del Rey” shouldn't be an illustration in your story, unless you think it's fun to get sued.
Try using the AI-generated art as inspiration for your own paintings, drawings, or collages. If you're worried about copyright questions, this will somewhat limit your liability and risk.
Remember that AI-generated art cannot be copyrighted. If you want to copyright an image, make sure to edit it in a program like Procreate or Adobe Photoshop.
If you really want to be on the safe side, don't put AI-generated illustrations in books you sell commercially. You can post them on your blog, on a members' site, in an interactive fiction game, or in a bonus ebook that you offer to newsletter subscribers. You can also use them in advertisements or in videos you make for your fans. I've posted character portraits so fans can get a sense of how I imagined these characters in my head. However, I'm going to wait on offering illustrated books until I get some legal clarity about this issue.

This is not legal advice, and I am not a lawyer. If you have specific questions about copyright law and AI, please consult a lawyer in your country. Do your due diligence before publishing anything.

Race, Ethnicity, and Stereotypes

Four images generated using the prompt “Persian vampire queen.” Clockwise from top left: Structured Dreaming, Comic Faces, Midjourney and craiyon.

If you don't specify a race or ethnicity in your prompt(s), any human subjects in your image will usually look white or white-ish. The default result is usually a white woman with brown eyes and hair, although that depends a lot on what else is in your prompt. If your prompt includes “goth,” “vampire,” or “Addams family,” you're likely to get a very pale woman with black hair. Phrases like “hip hop” or “rapper” might return a picture of a black man in athletic clothes or streetwear. “Barbie” will get you a tanned white woman with blonde hair and pink lipstick. You may have to add certain tags or descriptors to your image to get specific looks.

If the AI is not returning an image that matches your vision, there are several ways to improve the result:

add information about the subject's race or ethnicity, e.g. “a british-indian university student” or “a finnish cowboy.”
specify physical features, e.g. “a brown-skinned woman with amber eyes and wavy black hair.”
use different emojis to change the subject's look. Try out the exact same prompt with different emojis and see what the result is.

These programs can sometimes use this extra information to make your image look more stereotypical. If you're using an initial image, and include information about the subject's ethnicity, e.g. “a Russian man sitting in a chair,” the AI might make him look more stereotypically Russian than he actually is. You might therefore want to include physical descriptions, e.g. “a pale man with light brown hair sitting in a chair.”

Remember that the AI does not think; it's using a very complicated algorithm to generate an image, using the information you've given it and data based on millions of images. A picture of a Persian girl might include a headscarf, an Indian girl might have a bindi, a “nerdy” student could have glasses, and a “French princess” will probably appear in 18th-century dress. Most of these AI programs will allow you to use negative prompts to specify what you don't want.

I'll give you some examples of how you can tailor your prompts to fine-tune your image in several different programs. Let's say we want to pay homage to A Girl Walks Home Alone at Night with a portrait of a vampire queen from Persia. Here are several ways we could get a decent result, at least during the first pass:

Disco Diffusion: ["persian vampire queen:2", "jewel tones", "headscarf:-1"]
pytti5: persian vampire queen:2 | jewel tones | headscarf:-1:-.95 |
Midjourney: persian vampire queen::2 jewel tones::1 headscarf::-1
Structured Dreaming: persian vampire queen #film #eternity
craiyon: persian vampire queen
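If you move between tools a lot, it can help to keep one list of phrases and weights and convert it into each tool's syntax. Here's a rough Python sketch that does this for two of the formats above. The separators are copied from my examples in this article, not from each tool's official documentation, and the function names are made up. I've left out pytti5 because its third number needs more explanation than fits here.

```python
def disco_prompts(parts):
    """Format (text, weight) pairs as a Disco Diffusion-style list of strings.

    A weight of 1 is left off, matching the "jewel tones" example above.
    """
    return [text if weight == 1 else f"{text}:{weight}" for text, weight in parts]


def midjourney_prompt(parts):
    """Format (text, weight) pairs as one Midjourney-style string with :: weights."""
    return " ".join(f"{text}::{weight}" for text, weight in parts)


# Positive weights ask for more of something; negative weights ask for less.
parts = [
    ("persian vampire queen", 2),
    ("jewel tones", 1),
    ("headscarf", -1),
]

print(disco_prompts(parts))
# ['persian vampire queen:2', 'jewel tones', 'headscarf:-1']
print(midjourney_prompt(parts))
# persian vampire queen::2 jewel tones::1 headscarf::-1
```

Keeping the phrases and weights in one place means you only have to change a number once to try the same idea in both programs.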

You can also add specific details to change the image, since one word can affect multiple aspects of the image. For example, if you want to make an image of a French princess from a world where the French Revolution never happened, you might include a phrase like “modern French princess” or “modern couture French elite” to give your image a more updated look.

Conclusion

If you're looking for a way to add an extra level of visual dynamism to your book, consider using AI art. With programs like Wombo, Artbreeder, and Disco Diffusion 5.4, you can generate stunning images that will help bring your novel to life in a whole new way.

I hope this article has given you some ideas about how to illustrate your book with artificial intelligence. If you have any questions or comments, please leave them here. And if you want to stay up-to-date with the world of bookmaking and AI, please subscribe to my newsletter for more ideas.