AI Art Generation Handbook/Promptcrafting
What is promptcrafting?
Promptcrafting, also known as promptcraft, is a combination of two words.
Their definitions, based on Wiktionary, are:
Prompt: A sequence of characters or symbols that is displayed to indicate that a computer is ready to receive input.
In our case, it is also the text that we create to enable the AI to understand and create the image that we want.
Craft: To construct or develop something (like a skilled craftsman).
Together, the two words become promptcraft, or prompt engineering: a human communicating to the AI image model the ideas they have for the final image. Sometimes the model gets the idea straight away, but often you will have to tune the prompt until you get the image you want. Keep going, it's going to be great!
What is a prompt?
In the context of AI art generation, a prompt is the set of instructions, in the form of text input, that the AI art generation model processes to generate the image or images that you desire. Although current AI art generation models are improving dramatically, almost month by month, we still need to make our prompt as precise and descriptive as possible to guide the AI to produce what we want.
For successful text-to-image generation, a good prompt usually answers the following questions:
- What medium will you create in? E.g. oil, watercolour, pencil, crayon.
- Which artist's style or which period (e.g. romantic, cubist, impressionist) would you like the AI to replicate?
- Describe the shot. How many people are there? How are they dressed? How old are they? What are they doing? What time is it? Where exactly are they? Describe the location and colours, and anything else that you feel is important.
- Will you give the AI a NEGATIVE PROMPT, if your tool supports one? This is a list of things that you do not want to see in your image, e.g. malformed hands, faces, extra limbs. Standard negative prompts are available that cover most contingencies.
- How will you frame the shot: wide, medium or close-up?
- Which lighting techniques are you going to use? What is the overall feel of the image you want?
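As a rough illustration of the negative prompt item in the checklist above, the sketch below keeps the main prompt and the list of unwanted things separate and only joins them at the end. The class name `PromptRequest`, the method `to_fields` and the field names `prompt` and `negative_prompt` are assumptions made for this example; check what your own tool actually expects.

```python
from dataclasses import dataclass, field


@dataclass
class PromptRequest:
    """Hypothetical container for a prompt and the things we do not want."""

    prompt: str
    unwanted: list[str] = field(default_factory=list)

    def to_fields(self) -> dict[str, str]:
        # Assumed field names; many tools take the negative prompt as
        # a separate comma-separated text field.
        return {
            "prompt": self.prompt,
            "negative_prompt": ", ".join(self.unwanted),
        }


request = PromptRequest(
    prompt="Watercolour of two children flying a kite on a beach at sunset, wide shot",
    unwanted=["malformed hands", "extra limbs"],
)
print(request.to_fields())
```

Keeping the unwanted items in a list makes it easy to reuse a standard negative prompt across many images and add to it when a new defect shows up.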
To learn more about prompts, you can head directly to the chapter Prompting in Stable Diffusion Style, which covers prompting in Stable Diffusion.
Note that the prompts discussed have been tested on popular AI text-to-image generation models. In many models, word order is important: the nearer a word is to the front of the prompt, the more the AI will emphasise it. Experiment, and get to know your preferred AI model and the list of specialised words that it knows and reacts to as you desire. As smart as the AI may appear, it still needs to know what you want, so tell it in a relaxed and friendly manner. (An aside, with no proof or reference: it is my feeling that our AI models will one day become sentient. I have seen conversations with AI where it appeared to have sensitivities and feelings. System prompts often instruct the AI NOT to address such things, but I still think it's a good idea to be nice!)
In the chapter Prompting in Stable Diffusion Style we will also see how to prompt with a combination of text and image, and how to use upscaling, inpainting and outpainting to tune up the image we have produced.
Putting these elements together, a prompt can follow this template:
[ART MEDIUM] of [MAIN SUBJECT], [PERSPECTIVE], by [ARTIST], in the style of [STYLE], [MOOD], [OTHER DETAILS], [BOOSTERS]
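The template above can be filled in by hand, but a small helper makes it easier to try many variations. This is a minimal sketch, not part of any AI tool: the function name `build_prompt` and its parameter names are made up for this example, and empty slots are simply skipped.

```python
def build_prompt(medium, subject, perspective="", artist="",
                 style="", mood="", details="", boosters=""):
    """Fill the prompt template in order, leaving out any empty slot."""
    parts = [
        f"{medium} of {subject}",                    # [ART MEDIUM] of [MAIN SUBJECT]
        perspective,                                 # [PERSPECTIVE]
        f"by {artist}" if artist else "",            # by [ARTIST]
        f"in the style of {style}" if style else "", # in the style of [STYLE]
        mood,                                        # [MOOD]
        details,                                     # [OTHER DETAILS]
        boosters,                                    # [BOOSTERS]
    ]
    return ", ".join(part for part in parts if part)


print(build_prompt(
    medium="Oil painting",
    subject="a Javan rhinoceros wearing a business suit",
    perspective="close-up",
    style="impressionism",
    mood="melancholic",
))
```

The slots stay in the same order as the template, so the medium and subject always land at the front of the prompt.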
Word Ordering
As is the norm in English sentence structure, the "subject" should be at the front of the prompt, because the text encoder gives it higher priority during image generation. This gives the AI image model a higher chance of generating images that match your requirements.
In the first example, we want the rhinoceros to be part of the design of the dollar note, as in this Indonesian currency example:
Hence, the "subject" in this case is the dollar note. As we can see, the rhinoceros in the left image is generated without being part of the design of the dollar note, because the subject is placed at the end of the prompt. In the image on the right, where the subject comes first, we get what we want!
Prompts in DALL-E 2 | Javan rhinoceros wearing a business suit screaming aloud with hands on the cheek while seeing the stock price crash as design on dollar note | Dollar note showing Javan rhinoceros wearing a business suit screaming aloud with hands on the cheek while seeing the stock price crash |
---|---|---|
Images |
In the second example, we wanted the rhinoceros to touch up the painting "Girl with Pearl Earring". In the left image, the word "rhinoceros" is at the front of the prompt, which makes the rhinoceros concept bleed into "Girl with Pearl Earring". Putting the word "rhinoceros" towards the back of the prompt instead makes the AI generate the image as intended.
Prompts in DALL-E 3 | Anthropomorphic rhinoceros wearing business suit touching up oil painting "Girl with Pearl Earring" with brushes | Oil painting of "Girl with Pearl Earring" is being touched up by anthropomorphic rhinoceros wearing business suit with brushes |
---|---|---|
Images |
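
To compare orderings like the ones in the tables above, it can help to generate both versions of a prompt from the same pieces and try each in turn. The sketch below does this for the dollar note example. The function name `ordering_variants` and its connector parameters are assumptions made for this illustration, not features of any model.

```python
def capitalise_first(text):
    """Upper-case only the first character, leaving the rest untouched."""
    return text[:1].upper() + text[1:]


def ordering_variants(subject, description,
                      lead_connector="showing",
                      trail_connector="as design on"):
    """Return the same prompt with the subject placed first and placed last."""
    return {
        "subject_first": f"{capitalise_first(subject)} {lead_connector} {description}",
        "subject_last": f"{capitalise_first(description)} {trail_connector} {subject}",
    }


variants = ordering_variants(
    subject="dollar note",
    description=("Javan rhinoceros wearing a business suit screaming aloud "
                 "with hands on the cheek while seeing the stock price crash"),
)
for name, prompt in variants.items():
    print(f"{name}: {prompt}")
```

Running both variants through the same model, ideally with the same settings, shows how much the position of the subject changes the result.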
Modifier
A modifier is, in a sense, the language of AI art generation models: it tweaks the generated images towards a different aesthetic or towards what you are looking for.
Modifiers usually include the following:
(a) Art Medium
(b) Artist Style
(e) Camera Types
One or more modifiers may be added to create a unique image, and the word ordering may change according to your needs.
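As a minimal sketch of adding one or more modifiers and changing their order, the helper below appends chosen modifier values to a base prompt. The function name `apply_modifiers`, the category keys and the example values are all hypothetical and chosen only for illustration.

```python
def apply_modifiers(base_prompt, modifiers, order=None):
    """Append modifier values to the base prompt in the chosen category order.

    modifiers maps a category name to a modifier value.
    order lists the categories to use; by default all are used
    in the order they were given.
    """
    categories = order if order is not None else list(modifiers)
    chosen = [modifiers[c] for c in categories if modifiers.get(c)]
    return ", ".join([base_prompt, *chosen])


modifiers = {
    "art_medium": "watercolour",        # example value only
    "artist_style": "impressionist",    # example value only
    "camera_type": "wide-angle lens",   # example value only
}

print(apply_modifiers("Portrait of a rhinoceros", modifiers))
print(apply_modifiers("Portrait of a rhinoceros", modifiers,
                      order=["camera_type", "art_medium"]))
```

Passing a different `order` is a quick way to test whether moving a modifier nearer the front of the prompt changes how strongly the model follows it.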