A downloadable asset pack


Guide to Training a Generative AI on Your Drawings

1. Introduction

Hello! Today I want to share with you a complete guide on how to use generative AI to create assets. It is aimed at other artists, whether you want to accelerate your workflow, explore interesting alternatives, work around technical limitations (I had to sell my tablet), or for any other reason. This approach has allowed me to take control of the consistency and style of my creations, and I want to share that knowledge with others.

Steps to Follow:

  • We will gather a dataset of images.
  • We will automatically obtain their descriptions using a Colab notebook.
  • We will train a LoRA on the images and descriptions to obtain a .safetensors file that we can use anywhere.

Which platform you choose to generate with your LoRA is a separate matter; you'll find many options. This guide focuses on helping you obtain the .safetensors file.

1.1 Pre-training:

Dataset Preparation:

To start, you will need to collect your own drawings. I started with 35, but even a small dataset can be useful for generating a basic model that you can iterate on and improve over time. When selecting drawings for your dataset, it is important to stay consistent about what you want to highlight. For example, my drawings of trees are plump and circular, a characteristic I wanted to emphasize.

Collection of drawings from the dataset:

  • Drawing of a plump tree
  • House drawing
  • Another drawing of a plump tree
  • Another drawing
  • Drawing of a detailed tree
  • Drawing of assets and landscapes
  • Drawing of a tree
  • Top-down city
  • Drawing of a plump tree with colors (the colors I want to emphasize)
The images can vary in size, but not by too much. I recommend standard resolutions such as 1024x1024, 768x1024, and 1024x768. Preparing the dataset can take a few hours; if you have few images, you will need to put extra work into their quality and resolution.
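If you want to normalize sizes in bulk, here is a minimal sketch using Python and Pillow; the "dataset" folder name and target size are assumptions to adapt to your own setup:

```python
# Pad every drawing onto a white 1024x1024 canvas so the dataset has a
# consistent resolution without distorting the artwork.
from pathlib import Path
from PIL import Image

TARGET = 1024

for path in Path("dataset").glob("*.png"):
    img = Image.open(path).convert("RGB")
    img.thumbnail((TARGET, TARGET))  # shrink the longest side, keep aspect ratio
    canvas = Image.new("RGB", (TARGET, TARGET), "white")
    # Center the drawing on the canvas.
    canvas.paste(img, ((TARGET - img.width) // 2, (TARGET - img.height) // 2))
    canvas.save(path)
```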

Examples and Clarifications:

Before starting, I will show some generations in this style and make a few clarifications (note that "houtline" in the prompts below is my activation word, explained later):

----------------------------------------------------------------------------------------------------

Prompt: houtline, best quality, line-up, line_art, 2d_outline, fornitures, props, magic tree, multiple_views, gemstones, sprite_sheet, white_background, simple_background, halloween_(theme)

----------------------------------------------------------------------------------------------------

Magic tree generation with the given prompt.
Furniture generation with the given prompt.

----------------------------------------------------------------------------------------------------

Prompt: houtline, best quality, line-up, line_art, 2d_outline, fornitures, props, village_fornitures, wooden, walls, wooden_structures, gemstones, sprite_sheet, white_background, simple_background, halloween_(theme)

----------------------------------------------------------------------------------------------------

Village furniture generation with the given prompt.

----------------------------------------------------------------------------------------------------

Prompt: houtline, best quality, line-up, line_art, 2d_outline, fornitures, props, church, Tombs, ghost, tinny_ghost, church_fornitures, wooden, walls, wooden_structures, gemstones, sprite_sheet, white_background, simple_background, halloween_(theme)

----------------------------------------------------------------------------------------------------

Church furniture with ghosts generation with the given prompt.

Here are generations with other models. The previous ones were made with Cheyenne, but you can use the LoRA with other models too.

CHINOOK by Aurety

CHINOOK generation.

DynaVisionXL

DynaVisionXL generation.

Cheyenne by Aurety

Cheyenne generation.

Now that I've shown the results, let's proceed with the training, but first a brief explanation:

What is a LoRA?

A LoRA (Low-Rank Adaptation) is a small, specialized type of generative AI model, trained on specific data to produce coherent and consistent results in certain tasks. To better understand the difference between a LoRA and other types of models, let's look at the following categories:

Foundation Models

  • Stable Diffusion: A general model trained on a vast amount of data. It is open source, and you can download and run it on your own machine.
  • DALL-E: Another general model trained on a vast amount of data; it is used in ChatGPT and Copilot.
  • MidJourney: A general subscription-based model.

Checkpoints

  • PonyDiffusionXL, DynaVisionXL, Juggernaut XL, AnimagineXL, AutismixXL: Models smaller than the foundation ones, but they still require a considerable amount of computing power.

LoRA

LoRAs are small models trained for specific tasks. They are used in conjunction with foundation models or checkpoints as a guide to produce coherent and consistent results in a specific task. They act like a guardrail that constrains the base model's creativity to produce more predictable and useful results in certain contexts.

It is recommended to familiarize yourself with image generation in Stable Diffusion to better understand how to use a trained LoRA.
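As a concrete illustration of how the pieces fit together, here is a minimal sketch using Hugging Face's diffusers library; the LoRA filename and prompt are placeholders, and platforms like Civitai do this for you behind the scenes:

```python
# Load an SDXL base model, then apply a trained LoRA on top of it as a
# style guide. Requires the diffusers, transformers, and torch packages.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Hypothetical filename: the .safetensors file produced by the trainer.
pipe.load_lora_weights("houtline_style.safetensors")

# The activation word goes in the prompt to trigger the trained style.
image = pipe("houtline, line_art, magic tree, white_background").images[0]
image.save("magic_tree.png")
```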

1.2 Tagging the Dataset for Training:

I use Colab because I don't have a computer capable of performing this type of training, but the free tier provides more than enough computing power for what we are going to do.

Access the Colab Notebook

Let's go step by step:

Colab Configuration

Run the cell, connect the notebook to your Drive, and name the project. This will create a nested folder structure: Loras/project_name/dataset
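On Drive, the resulting layout looks like this ("project_name" is whatever name you chose):

```
MyDrive/
└── Loras/
    └── project_name/
        └── dataset/   <- upload your drawings here
```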

Next, go to your Drive and upload your photos to the dataset folder. Now, we move directly to step number 4 of the notebook.

Step 4 of the Colab notebook

A few clarifications here: there are two vision models in this notebook. The anime one is better at extracting tags from characters, while the photography one is better with general images. The result will be something like this:

Anime: 1cat, fur, animal, blue eyes, sitting, chair, depth_of_field, indoor

Photography: A cat sitting in a chair in a bedroom.

You can choose either one. To train my style, I used photography (BLIP); for my nurse character, I used anime (Waifu Diffusion).
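Whichever you choose, the captions end up as one .txt file per image, saved next to it with the same name (the layout the trainer expects). The filenames below are hypothetical:

```
dataset/
├── tree_01.png
├── tree_01.txt    <- "tree, plump, outdoors, simple_background"       (anime-style tags)
├── house_02.png
└── house_02.txt   <- "a drawing of a small house with a round roof"   (photography/BLIP)
```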

The threshold determines the tagger's sensitivity: it is the minimum confidence a tag needs before it is kept, so a lower threshold yields more (but noisier) tags. For this step, I recommend leaving the parameters at the values the notebook suggests. The process takes about 4 minutes.
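To illustrate what the threshold does, here is a toy sketch; the tags and confidence scores are made up, not real tagger output:

```python
# Toy example: a tagger scores each candidate tag, and only tags whose
# confidence reaches the threshold are written to the caption file.
scores = {"cat": 0.98, "fur": 0.91, "blue_eyes": 0.72, "chair": 0.55,
          "depth_of_field": 0.28}
threshold = 0.35  # lower values keep more, but noisier, tags

kept = [tag for tag, score in scores.items() if score >= threshold]
print(", ".join(kept))  # cat, fur, blue_eyes, chair
```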

Finally, we add an activation word:

Activation word configuration

Where it says "hatsune miku," we make a change and use our own activation word. For my style, I used "houtline." Writing this word in the prompt at generation time will trigger all the features learned during training. Be careful not to use a word that can be confused with other tags, like "cat," "dog," "girl," or any other very general word.
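If you ever need to do this step by hand, this is roughly what the activation-word option does (a sketch; it assumes the kohya-style caption layout shown earlier, with the "dataset" folder name as a placeholder):

```python
# Prepend the activation word to every caption file so the trigger is
# learned together with the style.
from pathlib import Path

activation = "houtline"  # your own trigger word

for caption in Path("dataset").glob("*.txt"):
    tags = caption.read_text(encoding="utf-8").strip()
    if not tags.startswith(activation):
        caption.write_text(f"{activation}, {tags}", encoding="utf-8")
```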

Final configuration

Before continuing, remember to disconnect and delete the runtime.

2. Training Notebook Setup

Now we have the dataset ready and tagged for training. I use a training notebook by the same author; the link is below:

Hollowstrawberry's Lora Trainer

The first step:

Initial Colab setup

Enter the same project name you used previously, then select the base model for training. Each one has its advantages:

  • Stable Diffusion SDXL base 1.0: Suited to realistic images and quite diverse; it is the one I use to generate assets.
  • Pony Diffusion XL: Optimized for anime and NSFW content; good for generating anime-related material.
  • Animagine XL: Specialized in anime and less effective with NSFW content, but very flexible; assets also look good in Animagine.

A LoRA trained on one of these base models will have more or less influence when used with other base models. For example, training my images on Stable Diffusion SDXL base 1.0 makes it harder to generate successfully in Pony Diffusion XL. Keep this in mind. For creating assets, I use SDXL base 1.0, since Cheyenne is the best model I've used for generating them and it is SDXL-based.

Let's continue.

Base model configuration

Activation tags: If we used a trigger word in the previous notebook, we leave this at 1 and continue.

2.1 Training Configuration

The following explains the key parameters to set up the training:

  • num_repeats: Number of times each image is repeated within an epoch.
  • epochs: Number of passes the training makes over the dataset. Each epoch processes all the images (with their repeats) once.
  • batch_size: Number of images processed together in each training step. A higher batch_size can speed up training but also requires more memory.

The configuration of these parameters will affect the performance and effectiveness of the generative model training.

Let's go through the math. I always try to stay within 300 to 500 total steps.

The total is the number of images multiplied by num_repeats, divided by batch_size, and multiplied by the number of epochs:

total_steps = (images × num_repeats / batch_size) × epochs

Number of Images | num_repeats | batch_size | epochs | Total Steps
10 | 20 | 6 | 10 | 10 × 20 / 6 × 10 ≈ 333
50 | 4 | 6 | 10 | 50 × 4 / 6 × 10 ≈ 333
100 | 2 | 6 | 10 | 100 × 2 / 6 × 10 ≈ 333
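If you want to sanity-check your own numbers before launching the training, a small helper like this reproduces the arithmetic (whether your trainer rounds partial batches up may vary, so treat it as an estimate):

```python
import math

def total_steps(num_images: int, num_repeats: int,
                batch_size: int, epochs: int) -> int:
    # Each epoch sees every image num_repeats times, grouped into batches.
    steps_per_epoch = math.ceil(num_images * num_repeats / batch_size)
    return steps_per_epoch * epochs

print(total_steps(10, 20, 6, 10))  # 340, inside the 300-500 target
print(total_steps(35, 6, 6, 10))   # 350, e.g. a 35-image dataset
```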

Set the values according to your calculations and scroll down to the training section.

Configuration of train_batch_size

Go to train_batch_size and configure it; I usually set it to 6.

2.2 Optimizer

Next is the optimizer. I've only used two: AdamW8bit (for datasets with many images) and Prodigy (for datasets with few images, and my favorite for training characters).

Optimizers

Keep in mind that the notebook author recommends specific arguments for each optimizer. When you change the optimizer, change the arguments too.

Once this is done, run the notebook and the training will begin. The process takes between 1.5 and 3 hours, and it should not exceed that, since Google provides a limited amount of free compute time daily: after about 3 hours of training, the notebook will disconnect and stop wherever it has reached.

The final files will be in the output folder on Google Drive.

3. Image Generation

Here you can choose whatever you want; there are many platforms where you can upload your LoRA. I will use Civitai because I want to show a free alternative, but there are certainly many more options.

Civitai

Upload your model and fill in the form. It's not a big deal, but you have to wait for some verifications, and it's a bit tedious to redo. Check the privacy settings: there's a way to keep the model hidden for yourself (once uploaded, I recommend making it public and waiting a bit before switching it to draft mode). I've been writing this guide for several hours, so I'll leave the link to my model and run the tests with it.

Upload model

Go to your profile and choose your LoRA.

Choose model

Choose a base model, write a prompt, and generate.

Generate images

I asked for winter trees; the results are:

Winter trees

Also, some houses:

Houses

I'll leave the link to my model here and post the images in case you want to copy the prompts or anything like that. I will also upload a folder with many generations made with my model; you can simply browse it, or clean up the assets and use them.

This is the end of the guide. I'm not an expert in this, but I've run several tests and thought it would be helpful to share them with anyone involved in image generation. It should be noted that no generated result is a good final output: you will always need to repair, select, clean, and rework everything you generate. But it's a good starting point, and it can be useful for mock-ups and placeholders that you will later replace. In any case, I hope it is useful to you.

Download

Download Now (name your own price)

Click download now to get access to the following files:

generative_sampler.rar 166 MB

Comments


Wow.. very clearly written and easy to understand. I followed most of this on the first read through.


It's certainly impressive that someone can carry out a LoRA training on their first attempt. It's not rocket science either, but it helps to have someone tell you what to do and what not to do, and I'm happy to clarify it for others, as I'm committed to the democratization of knowledge.