How To Use Stable Diffusion To Create AI
Artificial intelligence chatbots, like ChatGPT, have become incredibly powerful recently – they’re all over the news! But don’t forget about AI image generators (like Stable Diffusion, DALL-E, and Midjourney). They can make virtually any image when provided with just a few words. Follow this tutorial to learn how to do this for free with no restrictions by running Stable Diffusion on your computer.
What Is Stable Diffusion?
Stable Diffusion is a free and open source text-to-image machine-learning model. Basically, it’s a program that lets you describe a picture using text, then creates the image for you. It was given billions of images and accompanying text descriptions and was taught to analyze and reconstruct them.
Stable Diffusion is not the program you use directly – think of it more like the underlying software tool that other programs use. This tutorial shows how to install a Stable Diffusion program on your computer. Note that there are many programs and websites that use Stable Diffusion, but many will charge you money and don’t give you as much control.
System Requirements
The rough guidelines for what you should aim for are as follows:
macOS: Apple Silicon (an M series chip)
Windows or Linux: NVIDIA or AMD GPU
RAM: 16GB for best results
GPU VRAM: at least 4GB
Storage: at least 15GB
Install AUTOMATIC1111 Web UI
We are using the AUTOMATIC1111 Web UI program, available on all major desktop operating systems, to access Stable Diffusion. Make sure you make note of where the “stable-diffusion-webui” directory gets downloaded.
AUTOMATIC1111 Web UI on macOS
In Terminal, install Homebrew by entering the command:
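The install command itself isn’t reproduced above; the standard one-liner published on brew.sh is typically the following (verify against the official site before running):
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"   # official Homebrew installer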
Copy the two commands for adding Homebrew to your PATH and enter them.
Quit and reopen Terminal, then enter:
brew install cmake protobuf rust python@3.10 git wget
Enter:
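The command to enter here is most likely the clone of the AUTOMATIC1111 repository; a sketch, assuming the project’s standard GitHub location:
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui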
AUTOMATIC1111 Web UI on Windows
Download the latest stable version of Python 3.10.
AUTOMATIC1111 Web UI on Linux
Open the Terminal.
Enter one of the following commands, depending on your flavor of Linux:
Debian-based, including Ubuntu:
sudo apt-get update
sudo apt install wget git python3 python3-venv
Red Hat-based:
sudo dnf install wget git python3
Arch-based:
sudo pacman -S wget git python3
Install in “/home/$(whoami)/stable-diffusion-webui/” by executing this command:
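The command isn’t shown above; a minimal sketch, assuming the webui.sh launcher is still at its usual path in the AUTOMATIC1111 repository:
wget -q https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh
bash webui.sh   # clones and sets up stable-diffusion-webui under the current directory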
Install a Model
You’ll still need to add at least one model before you can start using the Web UI.
Go to Civitai, search for the Deliberate model (the one used in this guide), and download its .safetensors file.
Move the .safetensors file you just downloaded into your “stable-diffusion-webui/models/Stable-diffusion” folder.
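For example, from a terminal (the filename below is hypothetical; substitute the model file you actually downloaded):
# hypothetical filename; replace with your downloaded model
mv ~/Downloads/deliberate_v2.safetensors ~/stable-diffusion-webui/models/Stable-diffusion/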
Run and Configure the Web UI
At this point, you’re ready to run and start using the Stable Diffusion program in your web browser.
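Launching is done by running the webui script from the install directory; it prints a local URL (by default something like http://127.0.0.1:7860 in AUTOMATIC1111) that you will use in the next step. A sketch, assuming the default install location:
cd ~/stable-diffusion-webui
./webui.sh   # first run installs dependencies, then prints a local URL such as http://127.0.0.1:7860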
Paste the link in your browser address bar and hit Enter. The Web UI website will appear.
Scroll down and check “Enable quantization in K samplers for sharper and cleaner results.”
Use txt2img to Generate Concept Images
Now comes the fun part: creating some initial images and searching for one that most closely resembles the look you want.
Go to the “txt2img” tab.
In the first prompt text box, type words describing your image separated by commas. It helps to include words describing the style of image, such as “realistic,” “detailed,” or “close-up portrait.”
In the negative prompt text box below, type keywords that you do not want your image to look like. For instance, if you’re trying to create realistic imagery, add words like “video game,” “art,” and “illustration.”
Scroll down and set “Batch size” to “4.” This will make Stable Diffusion produce four different images from your prompt.
Make the “CFG Scale” a higher value if you want Stable Diffusion to follow your prompt keywords more strictly or a lower value if you want it to be more creative. A low value (like the default of 7) usually produces images that are good quality and creative.
If you don’t like any of the images, repeat steps 1 through 5 with slight variations.
Finding the Prompts Used for Past Images
After you’ve generated a few images, it’s helpful to be able to retrieve the prompts and settings used to create an image after the fact. In AUTOMATIC1111, the “PNG Info” tab does exactly that.
Upload an image into the box. All of the prompts and other details of your image will appear on the right.
Use img2img to Generate Similar Images
You can use the img2img feature to generate new images mimicking the overall look of any base image.
On the “img2img” tab, ensure that you are using a previously generated image with the same prompts.
Set the “Denoising strength” value higher or lower to regenerate more or less of your image (0.50 regenerates 50% and 1 regenerates 100%).
Rewrite the prompts to add completely new elements to the image and adjust other settings as desired.
Use inpaint to Change Part of an Image
The inpaint feature is a powerful tool that lets you make precise spot corrections to a base image by using your mouse to “paint” over parts of an image that you want to regenerate. The parts you haven’t painted aren’t changed.
Change your prompts if you want new visual elements.
Use your mouse to paint over the part of the image you want to change.
Change the “Sampling method” to DDIM, which is recommended for inpainting.
Set the “Denoising strength,” choosing a higher value if you’re making extreme changes.
Upscale Your Image
You’ve been creating relatively small images at 512 x 512 pixels up to this point, but if you increase your image’s resolution, it also increases the level of visual detail.
Install the Ultimate SD Upscale Extension
Resize Your Image
On the “img2img” tab, ensure you are using a previously generated image with the same prompts. At the front of your prompt input, add phrases such as “4k,” “UHD,” “high res photo,” “RAW,” “closeup,” “skin pores,” and “detailed eyes” to push the output toward finer detail. At the front of your negative prompt input, add phrases such as “selfie,” “blurry,” “low res,” and “phone cam” to steer it away from those qualities.
Set your “Denoising strength” to a low value (around 0.25) and double the “Width” and “Height” values.
In the “Script” drop-down, select “Ultimate SD upscale,” then under “Upscaler,” check the “R-ESRGAN 4x+” option.
Frequently Asked Questions
What is the difference between Stable Diffusion, DALL-E, and Midjourney?
All three are AI programs that can create almost any image from a text prompt. The biggest difference is that only Stable Diffusion is completely free and open source. You can run it on your computer without paying anything, and anyone can learn from and improve the Stable Diffusion code. The fact that you need to install it yourself makes it harder to use, though.
DALL-E and Midjourney are both closed source. DALL-E can be accessed primarily via its website and offers a limited number of image generations per month before asking you to pay. Midjourney can be accessed primarily via commands on its Discord server and has different subscription tiers.
What is a model in Stable Diffusion?
A model is a file representing an AI algorithm trained on specific images and keywords. Different models are better at creating different types of images – you may have a model good at creating realistic people, another that’s good at creating 2D cartoon characters, and yet another that’s best for creating landscape paintings.
The Deliberate model we installed in this guide is a popular model that’s good for most images, but you can check out all kinds of models on websites like Civitai or Hugging Face. As long as you download a .safetensors file, you can import it to the AUTOMATIC1111 Web UI using the same instructions in this guide.
What is the difference between SafeTensor and PickleTensor?
In short, always use SafeTensor to protect your computer from security threats.
While both SafeTensor and PickleTensor are file formats used to store models for Stable Diffusion, PickleTensor is the older and less secure format. A PickleTensor model can execute arbitrary code (including malware) on your system.
Should I use the batch size or batch count setting?
You can use both. A batch is a group of images that are generated in parallel. The batch size setting controls how many images there are in a single batch. The batch count setting controls how many batches get run in a single generation; each batch runs sequentially.
If you have a batch count of 2 and a batch size of 4, you will generate two batches and a total of eight images.
How To Run Stable Diffusion 2.0 And A First Look
Stable Diffusion 2.0 has been released. Improvements include, among other things, a larger text encoder (which improves image quality) and an increased default image size of 768×768 pixels.
Unlike 1.5, an NSFW filter was applied to the training data, so you can expect the generated images to be sanitized. Images of celebrities and by commercial artists were also suppressed.
In this article, I will cover three ways to run Stable Diffusion 2.0: (1) web services, (2) local install, and (3) Google Colab.
In the second part, I will compare images generated with Stable Diffusion 1.5 and 2.0. I will share some thoughts on how 2.0 should be used and in which ways it is better than v1.
Web services
This is the easiest option. Go visit the websites below and put in your prompt.
Currently there are only limited web options available. But more should be coming in the next few weeks.
Here’s a list of websites where you can run Stable Diffusion 2.0.
The settings available range from limited to none.
Local install
Install base software
We will go through how to use Stable Diffusion 2.0 in AUTOMATIC1111 GUI. Follow the installation instruction on your respective environment.
This GUI can be installed quite easily in Windows systems. You will need a dedicated GPU card with at least 6GB VRAM to go with this option.
Download Stable Diffusion 2.0 files
After installation, you will need to download two files to use Stable Diffusion 2.0.
Download the model file (768-v-ema.ckpt)
Download the config file, rename it to 768-v-ema.yaml
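A minimal command-line sketch for both downloads, assuming the Hugging Face and Stability AI repository paths below are still current:
wget https://huggingface.co/stabilityai/stable-diffusion-2/resolve/main/768-v-ema.ckpt
wget https://raw.githubusercontent.com/Stability-AI/stablediffusion/main/configs/stable-diffusion/v2-inference-v.yaml -O 768-v-ema.yaml   # rename the config to match the model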
Put both of them in the model directory:
stable-diffusion-webui/models/Stable-diffusion
Google Colab
Stable Diffusion 2.0 is also available as a Colab notebook in the Quick Start Guide, among a few other popular models.
This is a good option if you don’t have a dedicated GPU card. You don’t need a paid account, though it helps to prevent disconnections and to get a GPU instance at busy times.
Go to Google Colab and start a new notebook.
In the Runtime menu, select “Change runtime type.” In the “Hardware accelerator” field, select GPU.
First, you will need to download the AUTOMATIC1111 repository.
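In a Colab cell, this is typically done with git; a sketch, assuming the project’s standard GitHub repository:
!git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui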
Then upgrade to Python 3.10.
!sudo apt-get update -y
!sudo apt-get install python3.10
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.7 1
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 2
!sudo apt-get install python3.10-distutils
Download the Stable Diffusion 2.0 model and config files.
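A sketch of the download cell, assuming the same file locations as in the local-install section above:
!wget https://huggingface.co/stabilityai/stable-diffusion-2/resolve/main/768-v-ema.ckpt -P stable-diffusion-webui/models/Stable-diffusion/
!wget https://raw.githubusercontent.com/Stability-AI/stablediffusion/main/configs/stable-diffusion/v2-inference-v.yaml -O stable-diffusion-webui/models/Stable-diffusion/768-v-ema.yaml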
Finally, run the GUI. You should change the username and password below.
%cd stable-diffusion-webui
!python launch.py --share --gradio-auth username:password
This step is going to take a while, so be patient. When it is done, you should see a message:
Follow the link to start the GUI.
Using Stable Diffusion 2.0
Select the Stable Diffusion 2.0 checkpoint file 768-v-ema.ckpt.
Since the model is trained on 768×768 images, make sure to set the width and height to 768. 30 steps of DPM++2M Karras sampler works well for most images.
Comparing v1 and v2 models
The first thing many people do is to compare images between v1 and v2.0.
Some care needs to be taken when doing the comparison.
Make sure to set the image size to 512×512 when using v1.4 or v1.5. They were fine-tuned with images of that size, and 768×768 will not do well.
Set the image size to 768×768 when using v2.0.
(Note: According to the announcement, v2 is designed to generate both 512×512 and 768×768 images. Although early testing seems to suggest 512×512 is not as good, it may just be some software setting issues we need to iron out.)
Don’t reuse v1 prompts
Prompts that work for v1 models may not work the same in v2. This is expected because v2 has switched to the much larger OpenCLIP ViT-H/14 text encoder (nearly six times larger than the v1 text encoder) and it is trained from scratch.
Here’s v2.0 generation with the same prompt. Not that far off but I like v1.4’s better.
Stable Diffusion 2.0 images.
That’s not to say v2.0 is not good but the prompt is optimized for v1.4.
This prompt didn’t work so well for v2.0….
Stable Diffusion 2.0 images
It generates a more realistic style, which was not what I wanted.
I don’t think v2.0 is worse. The prompt just needs to be re-optimized.
If you have to reuse v1 prompt…
You can try a prompt converter, which works by first generating an image with v1 and then interrogating it with CLIP Interrogator 2. Effectively, it gives you the prompt that the language model would use to describe the image.
The failed ink-drips prompt is
[amber heard:Ana de Armas:0.7], (touching face:1.2), shoulder, by agnes cecile, half body portrait, extremely luminous bright design, pastel colors, (ink drips:1.3), autumn lights
which is translated to
a painting of a woman with watercolors on her face, a watercolor painting by Ignacy Witkiewicz, tumblr, process art, intense watercolor, expressive beautiful painting, painted in bright water colors
The images are not quite the same as v1 but they do look better.
v2.0 images generated using the prompt translated to v2.
v1 techniques are usable
My first impression is that many of the techniques developed for v1 still work. For example, keyword blending works quite well. The effect of using celebrity names appears to be reduced, though.
Stable Diffusion 2.0 images generated with keyword blending.
Prompt building
One observation I have is that Stable Diffusion 2.0 works better with longer, more specific prompts. Note that this is also true for v1, but it seems to be even more so for v2.
To illustrate this point, below are images generated using a single-word prompt “cat”.
We got expected result using v1.5:
v1.5 images with the prompt “cat”.
Below is what I got with the same prompt using v2.0.
v2.0 images with the prompt “cat”.
They are still kind of related to cat but not quite what a user might expect.
What if we use a longer, more specific prompt?
A photo of a Russian forrest cat wearing sunglasses relaxing on a beach
v1.5 images
v2.0 images
Here’s where Stable Diffusion 2.0 shines: it generates higher-quality images in the sense that they match the prompt more closely.
This is likely the benefit of the larger language model, which increases the expressiveness of the network. 2.0 is able to understand text prompts a lot better than v1 models and allows you to design prompts with higher precision.
Summary
It’s still the early days of Stable Diffusion 2.0. We have just gotten the software running, and we are actively exploring. I will write more when I find out how to use 2.0 more effectively. Stay tuned!
Stable Diffusion Prompt: A Definitive Guide
Developing a process to build good prompts is the first step every Stable Diffusion user tackles. This article summarizes the process and techniques developed through experimentation and other users’ input. The goal is to write down everything I know about prompts, so you can find it all in one place.
Anatomy of a good prompt
A good prompt needs to be detailed and specific. A good process is to look through a list of keyword categories and decide whether you want to use any of them.
The keyword categories are
Subject
Medium
Style
Artist
Website
Resolution
Additional details
Color
Lighting
An extensive list of keywords from each category is available in the prompt generator. You can also find a short list here.
You don’t have to include keywords from all categories. Treat them as a checklist to remind you what could be used.
Let’s review each category and generate some images by adding keywords from each. I will use the v1.5 base model. To see the effect of the prompt alone, I won’t be using negative prompts for now. Don’t worry, we will study negative prompts in the later part of this article. All images are generated with 30 steps of the DPM++ 2M Karras sampler and an image size of 512×704.
Subject
The subject is what you want to see in the image. A common mistake is not writing enough about the subjects.
Let’s say we want to generate a sorceress casting magic. A newbie may just write
A sorceress
That leaves too much room for imagination. How do you want the sorceress to look? Any words describing her that would narrow down her image? What does she wear? What kind of magic is she casting? Is she standing, running, or floating in the air? What’s the background scene?
Stable Diffusion cannot read our minds. We have to say exactly what we want.
A common trick for human subjects is to use celebrity names. They have a strong effect and are an excellent way to control the subject’s appearance. However, be aware that these names may change not only the face but also the pose and something else. I will defer this topic to a later part of this article.
As a demo, let’s cast the sorceress to look like Emma Watson, the most used keyword in Stable Diffusion. Let’s say she is powerful and mysterious and uses lightning magic. We want her outfit to be very detailed so she would look interesting.
Emma Watson as a powerful mysterious sorceress, casting lightning magic, detailed clothing
We get Emma Watson 11 out of 10 times. Her name has such a strong effect on the model. I think she’s popular among Stable Diffusion users because she looks decent, young, and consistent across a wide range of scenes. Trust me, we cannot say the same for all actresses, especially the ones who have been active in the 90s or earlier…
Medium
Medium is the material used to make artwork. Some examples are illustration, oil painting, 3D rendering, and photography. Medium has a strong effect because one keyword alone can dramatically change the style.
Let’s add the keyword digital painting.
Emma Watson as a powerful mysterious sorceress, casting lightning magic, detailed clothing, digital painting
We see what we expected! The images changed from photographs to digital paintings. So far so good. I think we can stop here. Just kidding.
Style
The style refers to the artistic style of the image. Examples include impressionist, surrealist, pop art, etc.
Let’s add hyperrealistic, fantasy, surrealist, full body to the prompt.
Emma Watson as a powerful mysterious sorceress, casting lightning magic, detailed clothing, digital painting, hyperrealistic, fantasy, Surrealist, full body
Mmm… not sure if they have added much. Perhaps these keywords were already implied by the previous ones. But I guess it doesn’t hurt to keep them.
Artist
Artist names are strong modifiers. They allow you to dial in the exact style using a particular artist as a reference. It is also common to use multiple artist names to blend their styles. Now let’s add Stanley Artgerm Lau, a superhero comic artist, and Alphonse Mucha, a portrait painter in the 19th century.
Emma Watson as a powerful mysterious sorceress, casting lightning magic, detailed clothing, digital painting, hyperrealistic, fantasy, Surrealist, full body, by Stanley Artgerm Lau and Alphonse Mucha
We can see the styles of both artists blending in and taking effect nicely.
Website
Niche graphic websites such as Artstation and Deviant Art aggregate many images of distinct genres. Using them in a prompt is a sure way to steer the image toward these styles.
Let’s add artstation to the prompt.
Emma Watson as a powerful mysterious sorceress, casting lightning magic, detailed clothing, digital painting, hyperrealistic, fantasy, Surrealist, full body, by Stanley Artgerm Lau and Alphonse Mucha, artstation
It’s not a huge change but the images do look like what you would find on Artstation.
Resolution
Resolution represents how sharp and detailed the image is. Let’s add keywords highly detailed and sharp focus.
Emma Watson as a powerful mysterious sorceress, casting lightning magic, detailed clothing, digital painting, hyperrealistic, fantasy, Surrealist, full body, by Stanley Artgerm Lau and Alphonse Mucha, artstation, highly detailed, sharp focus
Well, not a huge effect perhaps because the previous images are already pretty sharp and detailed. But it doesn’t hurt to add.
Additional details
Additional details are sweeteners added to modify an image. We will add sci-fi, stunningly beautiful and dystopian to add some vibe to the image.
Emma Watson as a powerful mysterious sorceress, casting lightning magic, detailed clothing, digital painting, hyperrealistic, fantasy, Surrealist, full body, by Stanley Artgerm Lau and Alphonse Mucha, artstation, highly detailed, sharp focus, sci-fi, stunningly beautiful, dystopian
Color
You can control the overall color of the image by adding color keywords. The colors you specified may appear as a tone or in objects.
Let’s add some golden color to the image with the keyword iridescent gold.
Emma Watson as a powerful mysterious sorceress, casting lightning magic, detailed clothing, digital painting, hyperrealistic, fantasy, Surrealist, full body, by Stanley Artgerm Lau and Alphonse Mucha, artstation, highly detailed, sharp focus, sci-fi, stunningly beautiful, dystopian, iridescent gold
The gold comes out great!
Lighting
Any photographer would tell you lighting is a key factor in creating successful images. Lighting keywords can have a huge effect on how the image looks. Let’s add cinematic lighting and dark to the prompt.
Emma Watson as a powerful mysterious sorceress, casting lightning magic, detailed clothing, digital painting, hyperrealistic, fantasy, Surrealist, full body, by Stanley Artgerm Lau and Alphonse Mucha, artstation, highly detailed, sharp focus, sci-fi, stunningly beautiful, dystopian, iridescent gold, cinematic lighting, dark
This completes our example prompt.
Remarks
As you may have noticed, the images are already pretty good with just a few keywords added to the subject. When it comes to building a prompt for Stable Diffusion, you often don’t need many keywords to get good images.
Negative prompt
Using negative prompts is another great way to steer the image, but instead of putting in what you want, you put in what you don’t want. They don’t need to be objects. They can also be styles and unwanted attributes. (e.g. ugly, deformed)
Using negative prompts is a must for v2 models. Without it, the images would look far inferior to v1’s. They are optional for v1 models, but I routinely use them because they either help or don’t hurt.
I will use a universal negative prompt. You can read more about it if you want to understand how it works.
ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, bad anatomy, watermark, signature, cut off, low contrast, underexposed, overexposed, bad art, beginner, amateur, distorted face, blurry, draft, grainy
With universal negative prompt.
The negative prompt helped the images to pop out more, making them less flat.
Process of building a good prompt
Iterative prompt building
You should approach prompt building as an iterative process. As you see from the previous section, the images could be pretty good with just a few keywords added to the subject.
I always start with a simple prompt with subject, medium, and style only. Generate at least 4 images at a time to see what you get. Most prompts do not work 100% of the time. You want to get some idea of what they can do statistically.
Add at most two keywords at a time. Likewise, generate at least 4 images to assess its effect.
Using negative prompt
You can use a universal negative prompt if you are starting out.
Adding keywords to the negative prompt can be part of the iterative process. The keywords can be objects or body parts you want to avoid (Since v1 models are not very good at rendering hands, it’s not a bad idea to use “hand” in the negative prompt to hide them.)
Prompting techniques
You can modify a keyword’s importance by switching to a different one at a certain sampling step.
Keyword weight
(This syntax applies to AUTOMATIC1111 GUI.)
You can adjust the weight of a keyword by the syntax (keyword: factor). factor is a value such that less than 1 means less important and larger than 1 means more important.
For example, we can adjust the weight of the keyword dog in the following prompt
dog, autumn in paris, ornate, beautiful, atmosphere, vibe, mist, smoke, fire, chimney, rain, wet, pristine, puddles, melting, dripping, snow, creek, lush, ice, bridge, forest, roses, flowers, by stanley artgerm lau, greg rutkowski, thomas kindkade, alphonse mucha, loish, norman rockwell.
(dog: 0.5)
dog
(dog: 1.5)
Increasing the weight of dog tends to generate more dogs. Decreasing it tends to generate fewer. It is not always true for every single image. But it is true in a statistical sense.
This technique can be applied to subject keywords and all categories, such as style and lighting.
() and [] syntax
(This syntax applies to AUTOMATIC1111 GUI.)
An equivalent way to adjust keyword strength is to use () and []. (keyword) increases the strength of the keyword by a factor of 1.1 and is the same as (keyword:1.1). [keyword] decreases the strength by a factor of 0.9 and is the same as (keyword:0.9).
You can use multiple of them, just like in algebra… The effect is multiplicative: (keyword) is 1.1, ((keyword)) is 1.1 × 1.1 = 1.21, and (((keyword))) is 1.33.
Similarly, the effects of using multiple [] are 0.9, 0.81, and 0.73.
Keyword blending
(This syntax applies to AUTOMATIC1111 GUI.)
You can mix two keywords. The proper term is prompt scheduling. The syntax is
[keyword1 : keyword2: factor]
factor controls at which step keyword1 is switched to keyword2. It is a number between 0 and 1.
For example, if I use the prompt
Oil painting portrait of [Joe Biden: Donald Trump: 0.5]
for 30 sampling steps.
That means the prompt in steps 1 to 15 is
Oil painting portrait of Joe Biden
And the prompt in steps 16 to 30 becomes
Oil painting portrait of Donald Trump
The factor determines when the keyword is changed. Here, it is after 30 steps × 0.5 = 15 steps.
The effect of changing the factor is blending the two presidents to different degrees.
You may have noticed Trump is in a white suit which is more of a Joe outfit. This is a perfect example of a very important rule for keyword blending: The first keyword dictates the global composition. The early diffusion steps set the overall composition. The later steps refine details.
Quiz: What would you get if you swapped Donald Trump and Joe Biden?
Blending faces
A common use case is to create a new face with a particular look, borrowing from actors and actresses. For example, [Emma Watson: Amber Heard: 0.85], 40 steps is a look between the two:
When carefully choosing the two names and adjusting the factor, we can get the look we want precisely.
Poor man’s prompt-to-prompt
Using keyword blending, you can achieve effects similar to prompt-to-prompt, generating pairs of highly similar images with edits. The following two images are generated with the same prompt except for a prompt schedule to substitute apple with fire. The seed and number of steps were kept the same.
holding an [apple: fire: 0.9]
holding an [apple: fire: 0.2]
The factor needs to be carefully adjusted. How does it work? The theory behind this is the overall composition of the image was set by the early diffusion process. Once the diffusion is trapped in a small space, swapping any keywords won’t have a large effect on the overall image. It would only change a small part.
How long can a prompt be?
Depending on what Stable Diffusion service you are using, there could be a maximum number of keywords you can use in the prompt. In the basic Stable Diffusion v1 model, that limit is 75 tokens.
Note that tokens are not the same as words. The CLIP model Stable Diffusion uses automatically converts the prompt into tokens, a numerical representation of the words it knows. If you put in a word it has not seen before, it will be broken up into two or more sub-words until it knows what they are. The words it knows are called tokens, which are represented as numbers. For example, dream is one token and beach is one token, but dreambeach is two tokens because the model doesn’t know this word, so it breaks the word up into dream and beach, which it does know.
Prompt limit in AUTOMATIC1111
AUTOMATIC1111 has no token limits. If a prompt contains more than 75 tokens, the limit of the CLIP tokenizer, it will start a new chunk of another 75 tokens, so the new “limit” becomes 150. The process can continue forever or until your computer runs out of memory…
Each chunk of 75 tokens is processed independently, and the resulting representations are concatenated before feeding into Stable Diffusion’s U-Net.
In AUTOMATIC1111, you can check the number of tokens by looking at the small box at the top-right corner of the prompt input box.
Token counter in AUTOMATIC1111
Checking keywords
The fact that you see people using a keyword doesn’t mean that it is effective. Like homework, we all copy each other’s prompts, sometimes without much thought.
You can check the effectiveness of a keyword by just using it as a prompt. For example, does the v1.5 model know the American painter Henry Asencio? Let’s check with the prompt
henry asencio
Positive!
How about the Artstation sensation wlop?
wlop
Well, doesn’t look like it. That’s why you shouldn’t use “by wlop”. That’s just adding noise.
Josephine Wall is a resounding yes:
You can use this technique to examine the effect of mixing two or more artists.
Henry asencio, Josephine Wall
Limiting the variation
To be good at building prompts, you need to think like Stable Diffusion. At its core, it is an image sampler, generating pixel values that we humans would likely say are legitimate and good. You can even use it without a prompt, and it would generate many unrelated images. In technical terms, this is called unconditioned or unguided diffusion.
The prompt is a way to guide the diffusion process toward the part of the sampling space that matches it. I said earlier that a prompt needs to be detailed and specific. That’s because a detailed prompt narrows down the sampling space. Let’s look at an example.
castle
castle, blue sky background
wide angle view of castle, blue sky background
By adding more descriptive keywords to the prompt, we narrow down the sampling of castles. We asked for any image of a castle in the first example. Then we asked to get only those with a blue sky background. Finally, we demanded that it be taken as a wide-angle photo.
The more you specify in the prompt, the less variation in the images.
Association effect
Attribute association
Some attributes are strongly correlated. When you specify one, you will get the other. Stable Diffusion generates the most likely images, which can create an unintended association effect.
Let’s say we want to generate photos of women with blue eyes.
a young female with blue eyes, highlights in hair, sitting outside restaurant, wearing a white outfit, side light
Blue eyes
What if we change to brown eyes?
a young female with brown eyes, highlights in hair, sitting outside restaurant, wearing a white outfit, side light
Brown eyes
Nowhere in the prompts did I specify ethnicity. But because people with blue eyes are predominantly European, Caucasians were generated. Brown eyes are more common across different ethnicities, so you will see a more diverse sample of races.
Stereotyping and bias are a big topic in AI models. I will confine myself to the technical aspects in this article.
Association of celebrity names
Every keyword has some unintended associations. That’s especially true for celebrity names. Some actors and actresses like to be in certain poses or wear certain outfits when taking pictures, and hence in the training data. If you think about it, model training is nothing but learning by association. If Taylor Swift (in the training data) always crosses her legs, the model would think leg crossing is Taylor Swift too.
Prompt: full body taylor swift in future high tech dystopian city, digital painting
When you use Taylor Swift in the prompt, you may mean to use her face. But there’s an effect of the subject’s pose and outfit too. The effect can be studied by using her name alone as the prompt.
Poses and outfits are global compositions. If you want her face but not her poses, you can use keyword blending to swap her in at a later sampling step.
Association of artist names
Perhaps the most prominent example of association is seen when using artist names.
The 19th-century Czech painter Alphonse Mucha is a popular occurrence in portrait prompts because the name helps generate interesting embellishments, and his style blends very well with digital illustrations. But it also often leaves a signature circular or dome-shaped pattern in the background. They could look unnatural in outdoor settings.
Prompt: digital painting of [Emma Watson:Taylor Swift: 0.6] by Alphonse Mucha. (30 steps)
Embeddings are keywords
Embeddings, the result of textual inversion, are nothing but combinations of keywords. You can expect them to do a bit more than what they claim.
Let’s see the following base images of Iron Man making a meal, without using embeddings.
Prompt: iron man cooking in kitchen.
Style-Empire is an embedding I like to use because it adds a dark tone to portrait images and creates an interesting lighting effect. Since it was trained on an image of a street scene at night, you can expect it to add some blacks AND perhaps buildings and streets. See the images below with the embedding added.
Prompt: iron man cooking in kitchen Style-Empire.
Note some interesting effects:
The background of the first image changed to city buildings at night.
Iron man tends to show his face. Perhaps the training image is a portrait?
So even if an embedding is intended to modify the style, it is just a bunch of keywords and can have unintended effects.
Effect of custom models
Using a custom model is the easiest way to achieve a style, guaranteed. This is also a unique charm of Stable Diffusion. Because of the large open-source community, hundreds of custom models are freely available.
When using a model, we need to be aware that the meaning of a keyword can change. This is especially true for styles.
Let’s use Henry Asencio again as an example. In v1.5, his name alone generates:
Using DreamShaper, a model fine-tuned for portrait illustrations, with the same prompt gives
It is a very decent but distinctly different style. The model’s strong bias toward generating clear and pretty faces is revealed here.
So make sure to check when you use a style in custom models. van Gogh may not be van Gogh anymore!
Region-specific prompts
Do you know you can specify different prompts for different regions of the image?
For example, you can put the moon at the top left:
Or at the top right:
You can do that by using the Regional Prompter extension. It’s a great way to control image composition!
How To Create & Use Data-Driven Content
Whoever coined the term “content is king” didn’t warn us about all of the steps needed to create link-worthy content.
It’s easy enough to write your copy, post it to your blog, and call it a day, but the “king” part only comes when you structure your content to get found in the SERPs.
I’ve worked with a lot of brands to create data-driven content.
One, in particular, was an education company based in New York City. I worked with the editorial team for 3 months to create a long-form piece of ego bait content.
Not only did it gain 52 backlinks in one month, but it generated more than 100 press mentions and drove over 100,000 people to the website.
Learn how to create and use data-driven content for your link building strategy below.
Be the Source
Creating your own data for an article is typically one big headache.
If you’ve ever tried to survey customers, you know what I’m talking about.
But, as Search Engine Journal’s founder Loren Baker says, “Be the source.”
When you create your own data, people will want to link back to the place they cited.
Using tools like Google Trends (you can also subscribe to Google Trends) and Google Consumer Survey, you can search for trending topics and build your own data.
Take Echelon Insights, for example.
They leveraged Google Consumer Surveys to understand the Republican Primary Electorate. Echelon Insights found that Donald Trump was leading at 32% going into the first Republican Primary Debate.
This study generated links from top sites like Wired, The Washington Post, The Observer, and many more.
Pick Your Topic
Good data doesn’t always equal good content.
You have to figure out how to tell a story with the data you have.
First, you must decide what your content is going to be about.
With data, this can be a chicken and egg situation – do you use the data you have to form your topic or do you choose your topic and then collect some data around it?
It may depend on whether you have pre-existing data or whether you already have a subject matter in mind that’s newsworthy or trending.
When researching what topics I may want to cover, I’ll start researching with Google Trends and BuzzSumo. These tools are built for research and exploring trends.
Gather Your Data
The first step to creating data-driven content is to collect the data.
I begin to gather my resources of data, whether I’m surveying users or if I’m using my own data.
Important note: When building content with your own proprietary data, it’s not about quantity.
For example, Shutterstock uses its proprietary data to create a genuinely useful piece of content with its 2023 Creative Trends infographic. This infographic generated more than 50 links.
Conduct Surveys
The go-to place for collecting fresh data, surveys are a fantastic way to gather information and to get statistics and data around subjects that you specifically want to focus on.
Think carefully about your questions before asking them. You want to get the best results possible to generate a variety of angles for you to use in your content.
Make sure your questions will support your story and limit the number of open-ended questions you ask. Like what I did here with our SEJ survey for an article I was working on:
Include a variety of demographic questions so that you can cross-reference answers given with details about the respondents. This will allow you to create multiple sub-stories and angles to push out to the local press.
Ask Your Community
Do you have your own community of customers or fans?
Then ask them a few questions, survey them or send out a questionnaire to turn that data into content.
Like Moz does with their survey.
You can see the survey questions here. And, the results of the survey here.
The results alone drove 32 backlinks.
If you work for a bigger brand and have forums where your customers come together to discuss a range of different topics, this is a great place to start a conversation about the topic you want to create content around.
Many businesses also have a large database of customer contact details and some regularly send out newsletters.
If you have a large social media following, you can use Facebook and Twitter polls to gather data.
Use Your Own Data & Reports
Many SaaS companies don’t realize the amount of data they are already sitting on.
You likely have some analytical tools to track the success of your own website and marketing efforts. These tools could be used to give you useful insights and data you could use as part of your content marketing strategy.
Google Analytics is a good place to start, as you can look into different consumer demographics such as the age, gender, and location of your customers, along with the industries they work in, what they buy, what devices they use, and more.
You can also carry out your own tests and experiments to generate data and insight that will interest others in your industry or your customers.
Look for Interesting Angles
Once you’ve got your data, you need to analyze it and pull out the angles you want to use to tell your story and make your content as newsworthy as possible.
Try and highlight any key points and statistics that support the storyline or headline you want to use and pull out any compelling insights into your results.
Use conditional formatting and create pivot charts to find correlations between different data sets.
If you don’t get the answer or result you were hoping for, don’t force it — put it to one side and focus on a different angle.
Once you have some strong data in front of you, segment your results demographically. This will help you find a range of local angles you can pull out for your content based on gender, age, location, etc. – perfect for pushing out to regional press and publishers for extra coverage.
Visualize Your Data
The way you present your data is key to the success of your content.
Data visualization is the first step in making your content engaging and shareable. But it isn’t easy.
Ideally, you should work with a designer to visualize your data. But if you don’t have access to one (or don’t have the budget), you can make it yourself using a data visualization tool.
This is one of my favorite visuals that came from data by Podio.
Once you’ve created your visual, you need to make sure there is still some content around it to tell your story and make your data come to life.
Always keep in mind how you want your readers to digest your content and that it needs to be responsive on mobile and tablet devices.
How to Structure Your Content Support Activities
If the content is truly a unicorn, as Larry Kim would say, you need to do all the supporting activities around this piece.
Here’s how I structure my content support activities:
Collaborate with the PR team to create a strategy. PR teams develop some of the highest-quality link opportunities, but they leave a lot of opportunities on the table. This is where link builders come in to do the manual outreach.
Conduct manual outreach to industry blogs for backlinks and guest blogs.
Partner with other companies on a webinar to discuss the data.
Create a blog post series to give further context to the data and optimize for new search terms.
Use the data in presentations at conferences.
Recreate the data in infographics, charts, and graphs.
Awesome Examples of Data-Driven Content
Here are a few pieces of data-driven content to inspire you:
The Guardian has really taken the lead with data visualization and has a whole section on their site dedicated to it. It’s a great place to go for inspiration on how you can shape your data into eye-catching graphics.
Here’s another really cool example of some data visualization based on A Day in the Life of Americans:
Don’t Have Any Data?
Don’t have time to collect data yourself?
No problem!
There are plenty of data sources you can use and combine to make a whole new data set.
For example, you could take two similar data sets that were created 10 years apart and then compare and contrast them.
Or, you could analyze someone else’s data and pull out some new angles that haven’t been used yet.
Here are some other resources to find some interesting data to use in your content or as a starting point for a bigger piece of data journalism:
You can also simply type into Google “[keyword] market research” or “[keyword] data sets” to find a range of different information available online.
Read this article for a case study and even more ideas: Building Links with Data-Driven Content (Even When You Don’t Have Any Data)
Summary
Timeframe: Every 3 months
Results detected: 2-6 months
Average links sent per month: 60
Tools:
Google Trends
Google Consumer Survey
BuzzSumo
Google Analytics
Benefits:
Great content has no shelf life. With high-quality content, you will see a spike at the beginning and again 6 months later as you start to rise in the search rankings.
Data-driven content always works because you created something people want. If you did your research right, you should have a powerful piece of content.
How To Use Ai Music Generator?
AI music generators have gained significant popularity in recent years, offering a wide range of possibilities for creating unique and personalized music compositions. These tools utilize artificial intelligence algorithms to generate original melodies, harmonies, and rhythms, allowing musicians, producers, and enthusiasts to explore new creative avenues. In this article, we will delve into the various ways to use an AI music generator to enhance your music production and unleash your artistic potential.
When it comes to AI music generators, there are several tools available that offer unique features and functionalities. Some of the popular options include Beatoven, Soundraw, Boomy, and Jukebox. Each tool comes with its own strengths and characteristics, providing users with a diverse array of possibilities. Explore these tools and determine which one aligns best with your creative vision and requirements.
Once you have chosen an AI music generator, it’s time to dive into the creative process. Most AI music generators offer a range of options that allow you to tailor the music according to your preferences. Here are some common options you may encounter:
Mood: Determine the overall mood or atmosphere of the music. Options may include happy, sad, energetic, calm, and more. Consider the emotional tone you want the music to convey.
Genre: Select the genre that best aligns with your vision. Genres could include classical, rock, electronic, hip-hop, jazz, or ambient. Experimenting with different genres can lead to surprising and inspiring results.
Length: Specify the desired duration of the music. This could be in minutes or seconds, depending on the tool. Consider the intended use of the music and its compatibility with the project you’re working on.
Tempo: Determine the tempo or speed of the music. This can range from slow and relaxed to fast and energetic. The tempo can greatly influence the mood and impact.
Instruments: Choose the instruments or instrument combinations to be included in the composition. This option allows you to customize the sound and texture of the music. Experiment with different instrument combinations to create unique sonic landscapes.
Melody and Harmony: Depending on the AI music generator, you may have the ability to input specific melodies or harmonies, or you can let the algorithm generate them for you. If you have a specific musical idea in mind, inputting your own melodies or harmonies can help shape the direction of the composition.
Additional Effects: Some AI music generators provide additional effects or parameters to modify the sound, such as reverb, delay, or filters. These options can add depth and character to the music, allowing you to further customize the composition to your liking.
To avoid any legal complications and ensure smooth integration of the AI-generated music into your projects, many AI music generators come with built-in royalty-free music features. This means you can confidently use the music compositions without worrying about licensing issues. It not only saves you time and effort but also provides peace of mind, knowing that you are using music that is legally safe to incorporate into your productions.
If you are looking to take your AI-generated music to the next level, some tools, such as Boomy, offer the option to add custom vocals to the compositions. This feature opens up exciting possibilities for vocalists and artists who want to collaborate with the AI-generated music. By adding your own vocals, you can infuse the music with your unique style and release the songs as singles, showcasing your talent and creativity to a wider audience.
AI music generators provide an excellent platform for experimentation and exploration. While they can certainly be used in production, it’s important to note that these tools excel in generating a variety of musical samples, which can serve as a starting point for your creative process. By experimenting with different parameters, options, and styles, you can push the boundaries of your music and discover new sounds and compositions that may have otherwise remained unexplored.
After the initial generation, listen to the composition and evaluate how well it aligns with your creative vision. If you’re satisfied with the result, you can proceed to the next step. However, if you feel that some adjustments are needed, most AI music generators allow you to refine the composition.
Here are a few ways to refine the generated music:
Editing: Some AI music generators offer editing capabilities, allowing you to make changes to specific sections of the composition. You can adjust the timing, modify melodies, or tweak the dynamics to better match your desired outcome.
Variation: If you’re looking for more options or alternative versions of the music, many AI music generators provide the ability to generate variations based on the initial composition. This feature can help you explore different interpretations of your musical idea.
Iterative Process: Generating and refining music with AI is often an iterative process. Don’t be afraid to experiment and generate multiple versions of the composition. This allows you to compare and select the best outcome or combine elements from different versions to create a cohesive and unique composition.
AI music generators are becoming increasingly popular, and there are numerous options available, each offering unique features and functionality. Here are a few popular options:
Beatoven.ai: This simple and free AI music generator allows you to create a new song by providing input parameters such as track title, length, tempo, genre, and emotion. It provides a quick and intuitive way to generate music tailored to your specifications.
Soundraw: If you’re looking for a tool with a wide range of options, Soundraw is an excellent choice. It enables you to choose the mood, genre, and length of the music, giving you greater control over the output.
Boomy: For those interested in adding custom vocals to AI-generated songs, Boomy is a web-based AI music generator that allows you to release your creations as singles. It offers various musical styles like electronic dance, rap beats, Lo-Fi, global groove, and relaxing meditation.
MuseNet: If you’re eager to experiment with easy AI music creation for free, MuseNet is a great option. Powered by machine learning, it generates high-quality music across various genres, providing endless possibilities for exploration.
Aiva: With deep learning algorithms at its core, Aiva specializes in composing original pieces of music in genres like classical, jazz, and pop. While free to use, registration is required to access its full range of features.
Soundful: Designed specifically for creating custom background music for videos or podcasts, Soundful offers a diverse selection of genres and moods to choose from. It’s a valuable tool for content creators seeking to enhance their productions with captivating music.
Riffusion: Riffusion takes a unique approach to music generation by visualizing the music and offering customization options such as tempo, key, and style. While it requires registration, it provides a seamless and engaging user experience.
By exploring these options and considering your specific needs, you can find an AI music generator that resonates with your creative visions.
Limitations of AI music generators include:
Lack of musical depth, creativity, and complexity that human composers bring to their work.
A lack of originality: in some cases, the AI will generate new music that sounds similar to a previously generated track.
AI-generated music can possibly lack the emotional depth and creativity of human composers.
The AI used in some projects is not sophisticated enough to fully compose its own melody and lyrics, so humans on the project picked out the best parts.
Despite these limitations, AI music generators are becoming increasingly popular and are a versatile tool for music producers, who can use them to create new and unique musical styles that are not possible with traditional music production methods. One of the biggest benefits of AI music generators for music producers is the ability to produce new music quickly and efficiently.
AI music generators offer musicians and producers a wealth of possibilities by providing instant inspiration, generating unlimited music compositions, and allowing for customization. They can help overcome creative blocks, provide a starting point for new ideas, and serve as a valuable tool for exploration and experimentation.
Yes, many AI music generators come with features like royalty-free music, allowing you to use the generated compositions in your commercial projects without worrying about licensing issues. However, it is essential to check the specific terms and conditions of the tool you are using to ensure compliance.
AI music generators are not intended to replace human musicians but rather to complement and augment their creativity. These tools serve as a source of inspiration and generate musical ideas that can be further developed, enhanced, and personalized by human musicians.
Integrating AI-generated music into your projects is a straightforward process. Once you have generated the desired music composition, you can export it as a MIDI file or audio file and import it into your preferred digital audio workstation (DAW) or music production software. From there, you can further refine, mix, and arrange the music to fit your project’s requirements.
While AI music generators have come a long way in terms of sophistication and realism, they still have limitations. The generated music may lack the emotional depth and human touch that can only be achieved by skilled human musicians. Additionally, the AI algorithms may not always capture the exact intent or nuances that a human composer or producer would envision.
Some AI music generators utilize machine learning techniques that enable them to learn and improve over time. As more data is fed into these systems and more iterations of the algorithms are developed, the AI-generated music can become more refined and closer to human-created compositions.
AI music generators have revolutionized the music production landscape, offering musicians, producers, and enthusiasts new ways to create and explore music. By choosing the right tool, customizing instruments and structures, and utilizing the various options available, you can unlock a world of endless possibilities. Whether you are seeking inspiration, experimenting with new sounds, or looking to augment your compositions, AI music generators provide a powerful toolset to unleash your creativity. Embrace the future of music creation and embark on a journey of innovation and sonic exploration with AI music generators.
How To Use Microsoft Designer To Create Graphics
Microsoft Designer is a tool that allows you to create professional designs for various purposes, but it has been designed primarily to share graphics on social media and other channels.
It is a web-based application that you can access through a browser (but you can also install it as an app on Windows 11), and it offers a wide range of design templates and tools. In addition, Microsoft Designer integrates with DALL-E 2.5 from OpenAI, which can translate text into images, and in this case into graphics, posters, and presentations.
Furthermore, the Designer app can offer suggestions, generate captions and hashtags, create animated visuals, backgrounds, text transitions, and more.
In this guide, you will learn the basic steps to get started with Microsoft Designer on Windows 11, 10, macOS, or Linux.
Get started using Microsoft Designer
To use Microsoft Designer to create graphics, use these steps:
Quick note: During the preview trial, the app is free for anyone, but eventually, it’ll become available as a free addition for Microsoft 365 subscribers.
In the web app, you will find a chatbot prompt box where you can use AI to generate templates based on your description. On the right side, you will see a list of templates you can select to get started and understand how to ask the app to generate an image. For example, selecting one of the templates inserted the “a thumbnail for my YouTube video offering tips on sustainable living” prompt.
You can also use the “Add image” option to upload images that the tool can use to generate a new design.
The “Generate image” option is a separate feature that uses the DALL-E 2.5 model from OpenAI to generate images from a text description. You can then select one of the AI-generated images the Designer can use to create graphic suggestions.
If you want to create an image manually from scratch, you will have to use the “blank canvas” option at the bottom of the page.
The Microsoft Designer app is divided into three sections. The toolbar at the top includes some basic options to zoom, undo and redo steps, start a new design or resize the current project, and download the project.
On the left, you will find the tools you will use to edit and create your image. The “Templates” tab includes every design template available with Microsoft Designer.
Quick note: If you are using an AI-generated template, you won’t be using this option.
The “My media” tab allows you to upload images from your computer or other services, such as Dropbox, Google Photos, Google Drive, and OneDrive, and from your phone. You can drag and drop the file directly if you’re uploading an image from your computer.
The “Visuals” tab includes AI-curated images to add to your project. You can also add different types of shapes, videos, and other types of illustrations. You can use the search or tabs at the top to find the visual you want to use in your project. The “Generate” tab allows you to use AI to create images you add to your design.
The “Text” tab includes various styles of text that you can use for heading, subheading, and body text. You can also use the chatbot to generate rich text for a description, catchy title, etc.
On the right side, you have the “Ideas” panel that shows additional templates you can use related to your project. The “New ideas” button brings you back to the start to create a new design.
While editing or creating an image, when selecting an element, a floating toolbar will appear with the different options you can use for that particular element. Some of the options include the ability to change the opacity, change the layer position (front, forward, back, and backward), color, text style, cropping tool, effects, and more.
Although this is an excellent tool for beginners and content creators to create various types of graphics, it’s not meant to be a replacement for Photoshop and many other tools that provide many different functionalities and capabilities.
The idea of the Microsoft Designer app is to use AI tools to automate the creation process as much as possible to stay more productive when you don’t have a graphics designer on speed dial.