In the previous article, [AI Painting] Midjourney suffix parameters --ar, --iw, --s, --r, --stop explained, we covered the --ar, --iw, --s, --r, and --stop suffix parameters. Today we introduce the remaining commonly used Midjourney suffix parameters: --seed, --tile, --q, --chaos, --weird, and --no. These parameters give us finer control over how images are generated and what they look like; used well, they open up far more flexibility and creative room in Midjourney work.
Function: --seed fixes the random noise that Midjourney starts from, so the same prompt with the same seed produces very similar images; it is the key to reproducible results.
Test prompt:
A cat sitting by the window looking out at the street
The prompt with sunset added:
A cat sitting by the window looking out at the street, sunset
The prompt with seed 0906:
A cat sitting by the window looking out at the street --ar 16:9 --seed 0906
With the seed unchanged, adding a new word to the prompt:
A cat sitting by the window looking out at the street, sunset --ar 16:9 --seed 0906
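Midjourney does not expose its internals, but the idea behind --seed is easy to show in code: a fixed seed reproduces the same starting noise, and everything downstream follows from it. A minimal PyTorch sketch (initial_noise is an illustrative stand-in, not Midjourney's API):

import torch

def initial_noise(seed: int, shape=(1, 4, 64, 64)) -> torch.Tensor:
    # Fixing the RNG seed makes the starting latent reproducible,
    # which is what --seed does conceptually in Midjourney.
    gen = torch.Generator().manual_seed(seed)
    return torch.randn(shape, generator=gen)

a = initial_noise(906)
b = initial_noise(906)
c = initial_noise(907)
print(torch.equal(a, b))  # True  -> same seed, same starting noise
print(torch.equal(a, c))  # False -> different seed, different noise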
Function: with this parameter, the generated image connects seamlessly top-to-bottom and left-to-right, creating a tileable pattern. Useful for wallpapers, fabrics, and textures. Midjourney --tile official documentation
intricate geometric pattern with floral elements, vibrant colors --tile
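A quick way to confirm a --tile output is truly seamless is to tile it yourself and look for seams. A small Pillow sketch (pattern.png is a placeholder file name):

from PIL import Image

def tile_preview(path: str, nx: int = 2, ny: int = 2) -> Image.Image:
    # Repeat the image nx * ny times; visible seams at the joins
    # mean the pattern is not actually tileable.
    img = Image.open(path).convert('RGB')
    w, h = img.size
    canvas = Image.new('RGB', (w * nx, h * ny))
    for ix in range(nx):
        for iy in range(ny):
            canvas.paste(img, (ix * w, iy * h))
    return canvas

tile_preview('pattern.png', 3, 3).save('pattern_tiled.png')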
In effect: it determines the quality of the generated image. Under the hood: it determines how much time is spent generating the image. Midjourney --quality official documentation
--q 0.5:
a detailed portrait of a futuristic robot with glowing eyes, intricate metallic textures --q 0.5
--q 1:
a detailed portrait of a futuristic robot with glowing eyes, intricate metallic textures --q 1
--q 2:
a detailed portrait of a futuristic robot with glowing eyes, intricate metallic textures --q 2
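The docs frame --q as a time budget: higher quality means the model spends longer on the image, so a rough mental model is "more refinement passes at higher --q". A toy sketch of that tradeoff, with made-up step counts (Midjourney's real numbers are not public):

import time

def fake_sample(steps: int) -> float:
    # Stand-in for a sampler: each extra step is one more refinement
    # pass, so runtime grows roughly linearly with the step count.
    t0 = time.perf_counter()
    x = 1.0
    for _ in range(steps * 100_000):
        x = x - 0.00001 * x  # placeholder "denoise" update
    return time.perf_counter() - t0

# Hypothetical mapping from --q value to step count.
for q, steps in {0.5: 15, 1: 30, 2: 60}.items():
    print(f'--q {q}: {steps} steps, {fake_sample(steps):.4f}s')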
Function: chaos mode. This parameter controls how much the four images in a single Midjourney grid differ from one another in content, color, and style. Midjourney --chaos official documentation
Chaos controls the amount of variation in a generation. Previously, four images generated from the same prompt tended to share a fairly consistent style; by raising the chaos value we can widen the differences between them. Prompt for a cute 2D cartoon character:
a cute 2D cartoon character in a vibrant fantasy world, simple shapes and bright colors --chaos 80
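One way to picture chaos: the four images in a grid start from latents that are allowed to drift further apart as the value rises. A toy illustration of that intuition, not Midjourney's actual algorithm:

import torch

def grid_latents(chaos: int, n: int = 4) -> torch.Tensor:
    # chaos is 0-100 in Midjourney; here it simply scales how far each
    # of the n starting latents is perturbed away from a shared base.
    base = torch.randn(1, 4, 64, 64)
    noise = torch.randn(n, 4, 64, 64)
    return base + (chaos / 100.0) * noise

low = grid_latents(0)    # four near-identical starting points
high = grid_latents(80)  # four widely scattered starting points
print(low.std(dim=0).mean().item(), high.std(dim=0).mean().item())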
Function: weird mode. This parameter introduces quirky, offbeat qualities into the generated image, producing unique and unexpected results. Midjourney --weird official documentation
A tranquil countryside cottage:
a beautiful cottage in a peaceful countryside, surrounded by trees and flowers --ar 16:9
--weird 2800:
a beautiful cottage in a peaceful countryside, surrounded by trees and flowers --ar 16:9 --weird 2800
Original:
a mystical forest with glowing plants and ancient ruins, ethereal atmosphere, detailed trees and vines --ar 16:9
The original prompt produces a mystical forest scene with glowing plants and ancient ruins in the background, and a quiet, mysterious overall atmosphere. The four images are fairly consistent in style: soft light, dense trees, and naturally handled environmental detail give the scene a calm, slightly mysterious feel.
--chaos 50:
a mystical forest with glowing plants and ancient ruins, ethereal atmosphere, detailed trees and vines --chaos 50 --ar 16:9
With --chaos 50, the differences between the four images become much more pronounced. The colors shift dramatically, and the ruins, lighting, vegetation, and overall mood all vary significantly: some images are full of bright green and yellow light, while others use dark tones and a mysterious atmosphere. This setting increases the variation in composition and style, showing how the chaos value drives randomness across the grid.
--stylize 1000:
a mystical forest with glowing plants and ancient ruins, ethereal atmosphere, detailed trees and vines --stylize 1000 --ar 16:9
With --stylize 1000, Midjourney's own aesthetic becomes far more prominent. The overall image feels more dreamlike and painterly, with more refined handling of light and shadow. In each image the forest trees and the light on the ruins show a strong artistic touch, and the details become more expressive, as if drawn from a fairy-tale fantasy world. This stylization gives the images greater emotional and aesthetic weight.
--weird 1500:
a mystical forest with glowing plants and ancient ruins, ethereal atmosphere, detailed trees and vines --weird 1500 --ar 16:9
With --weird 1500, the images take on a much stranger, more surreal quality. The vegetation and ruins look more bizarre, and the colors and compositions are full of unusual elements: in some images the light and the trees look eerie, and some show unnatural colors and shapes with a strong sense of otherness. The whole scene carries a fantastical, surreal visual punch, showing the dramatic effect the weird value has on generation.
Function: a negative prompt that tells Midjourney what we do not want. Midjourney --no official documentation
Format:
--no item1, item2, item3, item4
Original:
a beautiful park with children playing, sunny day, colorful balloons --ar 16:9
--no balloons:
a beautiful park with children playing, sunny day, colorful balloons --no balloons --ar 16:9
Because "balloons" appears in the main prompt itself, --no balloons cannot remove every balloon here; it only reduces their number.
--no item is equivalent to giving that item a weight of -0.5 in the prompt, i.e. item::-0.5.
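In embedding terms, weighted prompts are combined into a single conditioning vector, so a negative weight pushes the result away from a concept. A hedged sketch with a stand-in embed() function (a real system would use a text encoder such as CLIP):

import torch

def embed(text: str) -> torch.Tensor:
    # Placeholder for a real text encoder; seeded from the characters
    # so the example is deterministic.
    gen = torch.Generator().manual_seed(sum(map(ord, text)))
    return torch.randn(512, generator=gen)

def combine(weighted_prompts: dict[str, float]) -> torch.Tensor:
    # "Perfume still life --no flower" behaves like
    # {"Perfume still life": 1.0, "flower": -0.5}: the negative weight
    # steers the conditioning away from the unwanted concept.
    return sum(w * embed(p) for p, w in weighted_prompts.items())

cond = combine({'Perfume still life': 1.0, 'flower': -0.5})
print(cond.shape)  # torch.Size([512])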
Original:
Perfume still life --ar 16:9
flower::-0.5:
flower::-0.5, Perfume still life --ar 16:9
--no flower:
Perfume still life --ar 16:9 --no flower
By using --seed, --tile, --q, --chaos, --weird, and --no appropriately, you can not only raise the quality of generated images but also gain more control and inspiration in your work. The goal of this article was to demonstrate each parameter's effect through real examples so that you can realize your own ideas more flexibly when creating AI art. I hope it proves practical and inspiring for every reader as we make progress together in AI painting.
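Appendix: for readers who want to look one layer below the prompt box, the script below sketches two classic techniques behind AI painting: a small CycleGAN-style generator/discriminator pair trained on a folder of paintings, followed by VGG19 neural style transfer. It is a teaching sketch rather than anything Midjourney actually runs; the dataset path and hyperparameters are placeholders.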
import os

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
from torchvision import models, transforms, utils
from PIL import Image
class PaintingDataset(Dataset):
    # Flat folder of painting images; returns one transformed image per index.
    def __init__(self, root_dir, transform=None):
        self.root_dir = root_dir
        self.transform = transform
        self.image_files = os.listdir(root_dir)

    def __len__(self):
        return len(self.image_files)

    def __getitem__(self, idx):
        img_name = os.path.join(self.root_dir, self.image_files[idx])
        image = Image.open(img_name).convert('RGB')
        if self.transform:
            image = self.transform(image)
        return image
class ResidualBlock(nn.Module):
    def __init__(self, in_channels):
        super(ResidualBlock, self).__init__()
        self.conv_block = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, kernel_size=3, stride=1, padding=1),
            nn.InstanceNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, in_channels, kernel_size=3, stride=1, padding=1),
            nn.InstanceNorm2d(in_channels)
        )

    def forward(self, x):
        # Skip connection: output is the input plus the block's residual.
        return x + self.conv_block(x)
class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        # Encoder: a 7x7 stem plus two stride-2 convs (256 -> 64 spatial).
        self.downsampling = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=1, padding=3),
            nn.InstanceNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),
            nn.InstanceNorm2d(128),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, kernel_size=3, stride=2, padding=1),
            nn.InstanceNorm2d(256),
            nn.ReLU(inplace=True)
        )
        # Bottleneck: nine residual blocks, as in the CycleGAN generator.
        self.residuals = nn.Sequential(
            *[ResidualBlock(256) for _ in range(9)]
        )
        # Decoder: two stride-2 transposed convs back to full resolution,
        # ending in Tanh so outputs live in [-1, 1].
        self.upsampling = nn.Sequential(
            nn.ConvTranspose2d(256, 128, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.InstanceNorm2d(128),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.InstanceNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, kernel_size=7, stride=1, padding=3),
            nn.Tanh()
        )

    def forward(self, x):
        x = self.downsampling(x)
        x = self.residuals(x)
        x = self.upsampling(x)
        return x
class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        # PatchGAN-style critic: outputs a score map (about 15x15 for a
        # 256x256 input) rather than a single scalar per image.
        self.model = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),
            nn.InstanceNorm2d(128),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1),
            nn.InstanceNorm2d(256),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(256, 512, kernel_size=4, stride=2, padding=1),
            nn.InstanceNorm2d(512),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(512, 1, kernel_size=4, stride=1, padding=1)
        )

    def forward(self, x):
        return self.model(x)
def initialize_weights(model):
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
            nn.init.normal_(m.weight.data, 0.0, 0.02)
        elif isinstance(m, nn.InstanceNorm2d) and m.weight is not None:
            # InstanceNorm2d is affine=False by default, so weight/bias
            # may be None; only initialize them when they exist.
            nn.init.normal_(m.weight.data, 1.0, 0.02)
            nn.init.constant_(m.bias.data, 0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
generator = Generator().to(device)
discriminator = Discriminator().to(device)
initialize_weights(generator)
initialize_weights(discriminator)

# Resize to a fixed 256x256 so images can be batched, then map to [-1, 1]
# to match the generator's Tanh output range.
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
dataset = PaintingDataset(root_dir='path_to_paintings', transform=transform)
dataloader = DataLoader(dataset, batch_size=16, shuffle=True)

criterion = nn.MSELoss()  # least-squares GAN loss
optimizerG = optim.Adam(generator.parameters(), lr=0.0002, betas=(0.5, 0.999))
optimizerD = optim.Adam(discriminator.parameters(), lr=0.0002, betas=(0.5, 0.999))
def generate_noise_image(batch_size, height, width):
    # The generator here is image-to-image, so it is fed a random
    # "noise image" with the same shape as a real batch.
    return torch.randn(batch_size, 3, height, width, device=device)

os.makedirs('output', exist_ok=True)
for epoch in range(100):
    for i, data in enumerate(dataloader):
        real_images = data.to(device)
        batch_size = real_images.size(0)

        # --- Discriminator update: real images vs. detached fakes ---
        optimizerD.zero_grad()
        noise_image = generate_noise_image(batch_size, 256, 256)
        fake_images = generator(noise_image)
        output_real = discriminator(real_images)
        output_fake = discriminator(fake_images.detach())
        # Build label maps that match the discriminator's patch output
        # (a 256x256 input yields a 15x15 score map, not a scalar).
        real_labels = torch.ones_like(output_real)
        fake_labels = torch.zeros_like(output_fake)
        loss_real = criterion(output_real, real_labels)
        loss_fake = criterion(output_fake, fake_labels)
        lossD = (loss_real + loss_fake) / 2
        lossD.backward()
        optimizerD.step()

        # --- Generator update: try to make fakes score as real ---
        optimizerG.zero_grad()
        output_fake = discriminator(fake_images)
        lossG = criterion(output_fake, real_labels)
        lossG.backward()
        optimizerG.step()

    # Save a sample grid at the end of each epoch.
    with torch.no_grad():
        fake_image = generator(generate_noise_image(1, 256, 256)).detach().cpu()
        grid = utils.make_grid(fake_image, normalize=True)
        utils.save_image(grid, f'output/fake_painting_epoch_{epoch}.png')
def apply_style_transfer(content_img, style_img, output_img, num_steps=500, style_weight=1000000, content_weight=1):
    # Classic Gatys-style neural style transfer on top of VGG19 features.
    vgg = models.vgg19(pretrained=True).features.to(device).eval()
    for param in vgg.parameters():
        param.requires_grad = False

    content_img = Image.open(content_img).convert('RGB')
    style_img = Image.open(style_img).convert('RGB')
    content_img = transform(content_img).unsqueeze(0).to(device)
    style_img = transform(style_img).unsqueeze(0).to(device)

    # Optimize the pixels of a copy of the content image directly.
    target = content_img.clone().requires_grad_(True).to(device)
    optimizer = optim.LBFGS([target])

    content_layers = ['conv_4']
    style_layers = ['conv_1', 'conv_2', 'conv_3', 'conv_4', 'conv_5']

    def get_features(image, model):
        # Indices of the first conv in each VGG19 block.
        layers = {'0': 'conv_1', '5': 'conv_2', '10': 'conv_3', '19': 'conv_4', '28': 'conv_5'}
        features = {}
        x = image
        for name, layer in model._modules.items():
            x = layer(x)
            if name in layers:
                features[layers[name]] = x
        return features

    def gram_matrix(tensor):
        # Channel-by-channel correlation matrix; captures texture/style.
        _, d, h, w = tensor.size()
        tensor = tensor.view(d, h * w)
        return torch.mm(tensor, tensor.t())

    content_features = get_features(content_img, vgg)
    style_features = get_features(style_img, vgg)
    style_grams = {layer: gram_matrix(style_features[layer]) for layer in style_features}

    for step in range(num_steps):
        def closure():
            optimizer.zero_grad()
            target_features = get_features(target, vgg)
            content_loss = torch.mean((target_features[content_layers[0]] - content_features[content_layers[0]]) ** 2)
            style_loss = 0
            for layer in style_layers:
                target_gram = gram_matrix(target_features[layer])
                style_gram = style_grams[layer]
                layer_style_loss = torch.mean((target_gram - style_gram) ** 2)
                style_loss += layer_style_loss / (target_gram.shape[1] ** 2)
            total_loss = content_weight * content_loss + style_weight * style_loss
            total_loss.backward()
            return total_loss
        optimizer.step(closure)

    # Undo the [-1, 1] normalization before saving as an image file.
    result = (target.detach().squeeze().cpu() * 0.5 + 0.5).clamp(0, 1)
    utils.save_image(result, output_img)
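A minimal invocation of the routine above, with placeholder file names; LBFGS performs several internal line-search evaluations per step, so even a reduced num_steps refines the image substantially:

apply_style_transfer('content.jpg', 'style.jpg', 'stylized.png', num_steps=100)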