昇思MindSpore Application Practice
This series records my study notes from the 昇思MindSpore 25-day learning camp (昇思25天学习打卡营).
Pix2Pix Image Translation Based on MindSpore
1. Pix2Pix Overview
Pix2Pix is a network designed specifically for image-to-image translation: it can turn semantic/label maps into realistic photos, grayscale images into color, aerial photos into maps, day scenes into night, and line sketches into pictures of real objects. Pix2Pix is the classic work that applies the conditional GAN (CGAN) to supervised image-to-image translation, trained on paired data (a sketch image and its ground-truth image, GT). Like all GANs, the model consists of two parts: a generator and a discriminator.
CGAN: a conditional GAN aims to generate data samples that match a given condition. The condition can be a label, partially annotated instance data, or any other form of multimodal auxiliary information; CGAN steers the generation process by feeding the condition into both the generator and the discriminator. For comparison, the ordinary adversarial loss is

$$L_{GAN}(G,D)=\mathbb{E}_{y}[\log(D(y))]+\mathbb{E}_{(x,z)}[\log(1-D(G(x,z)))]$$

where:

- $x$ is the observed image;
- $z$ is the random noise;
- $y=G(x,z)$ is the "fake" image the generator produces from the observed image $x$ and the noise $z$ (here $x$ comes from the training data, not from the generator);
- $D(x,G(x,z))$ is the probability the discriminator assigns to an image being real, where $x$ comes from the training data and $G(x,z)$ comes from the generator.
Relative to this, CGAN adds the condition $x$ taken from the observed image, so Pix2Pix trains in a supervised fashion on annotated paired data, such as the Map2Aerial dataset or the Anime Sketch Colorization Pair (sketch-to-anime) dataset.
The CGAN objective can therefore be written as:

$$L_{CGAN}(G,D)=\mathbb{E}_{(x,y)}[\log(D(x,y))]+\mathbb{E}_{(x,z)}[\log(1-D(x,G(x,z)))]$$
Pix2Pix also includes an L1 loss, which pushes the generator toward results that are structurally close to the real image; this matters a great deal in image translation tasks:

$$L_{L1}(G)=\mathbb{E}_{(x,y,z)}[\lVert y-G(x,z)\rVert_{1}]$$
This gives the final objective:

$$\arg\min_{G}\max_{D}\,L_{CGAN}(G,D)+\lambda L_{L1}(G)$$
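To make the objective concrete, here is a minimal sketch (my own illustration, not from the original tutorial) of the generator's side of this objective: an adversarial term that asks the discriminator to score the fake pair as real, plus a $\lambda$-weighted L1 term. Section 5 implements exactly this combination with `BCEWithLogitsLoss` and `L1Loss`.

```python
import mindspore.nn as nn
import mindspore.ops as ops

adv_loss = nn.BCEWithLogitsLoss()  # cGAN term
l1_loss = nn.L1Loss()              # L1 reconstruction term
lambda_l1 = 100                    # lambda value used in the Pix2Pix paper

def generator_objective(pred_fake, fake_b, real_b):
    """pred_fake: D's logits on (x, G(x, z)); fake_b: G(x, z); real_b: y."""
    gan_term = adv_loss(pred_fake, ops.ones_like(pred_fake))
    return gan_term + lambda_l1 * l1_loss(fake_b, real_b)
```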
Image translation is, at heart, a pixel-to-pixel mapping problem: with exactly the same network structure and objective function, Pix2Pix can accomplish each of the tasks above simply by switching the training dataset.
2. U-Net Architecture
Pix2Pix uses a U-Net as its generator. Building on the traditional encoder-decoder structure, the U-Net adds skip connections between the encoder and the decoder; these capture image detail and context better and greatly improve gradient flow compared with a plain encoder-decoder, which makes the architecture well suited to image-to-image translation.
Defining the UNet Skip Connection Block
```python
import mindspore
import mindspore.nn as nn
import mindspore.ops as ops


class UNetSkipConnectionBlock(nn.Cell):
    def __init__(self, outer_nc, inner_nc, in_planes=None, dropout=False,
                 submodule=None, outermost=False, innermost=False, alpha=0.2, norm_mode="batch"):
        super(UNetSkipConnectionBlock, self).__init__()
        down_norm = nn.BatchNorm2d(inner_nc)
        up_norm = nn.BatchNorm2d(outer_nc)
        use_bias = False
        if norm_mode == "instance":
            down_norm = nn.BatchNorm2d(inner_nc, affine=False)
            up_norm = nn.BatchNorm2d(outer_nc, affine=False)
            use_bias = True
        if in_planes is None:
            in_planes = outer_nc
        down_conv = nn.Conv2d(in_planes, inner_nc, kernel_size=4,
                              stride=2, padding=1, has_bias=use_bias, pad_mode="pad")
        down_relu = nn.LeakyReLU(alpha)
        up_relu = nn.ReLU()
        if outermost:
            up_conv = nn.Conv2dTranspose(inner_nc * 2, outer_nc,
                                         kernel_size=4, stride=2,
                                         padding=1, pad_mode="pad")
            down = [down_conv]
            up = [up_relu, up_conv, nn.Tanh()]
            model = down + [submodule] + up
        elif innermost:
            up_conv = nn.Conv2dTranspose(inner_nc, outer_nc,
                                         kernel_size=4, stride=2,
                                         padding=1, has_bias=use_bias, pad_mode="pad")
            down = [down_relu, down_conv]
            up = [up_relu, up_conv, up_norm]
            model = down + up
        else:
            up_conv = nn.Conv2dTranspose(inner_nc * 2, outer_nc,
                                         kernel_size=4, stride=2,
                                         padding=1, has_bias=use_bias, pad_mode="pad")
            down = [down_relu, down_conv, down_norm]
            up = [up_relu, up_conv, up_norm]
            model = down + [submodule] + up
            if dropout:
                model.append(nn.Dropout(p=0.5))
        self.model = nn.SequentialCell(model)
        self.skip_connections = not outermost

    def construct(self, x):
        out = self.model(x)
        # Skip connection: concatenate the block's input onto its output along channels.
        if self.skip_connections:
            out = ops.concat((out, x), axis=1)
        return out
```
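As a quick sanity check (my own usage sketch, not part of the tutorial): an innermost block halves and then restores the spatial resolution, and because `construct` concatenates the block's input onto its output, the channel count comes back doubled.

```python
# Innermost block: 64 -> 128 channels down, 128 -> 64 channels up, plus the skip.
block = UNetSkipConnectionBlock(outer_nc=64, inner_nc=128, innermost=True)
x = ops.ones((1, 64, 32, 32), mindspore.float32)
print(block(x).shape)  # (1, 128, 32, 32): 64 decoded channels + 64 skip channels
```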
Generator

The original CGAN takes two kinds of input, the condition $x$ and the noise $z$. The generator here uses only the condition, which by itself cannot produce diverse outputs, so Pix2Pix applies dropout at both training and test time: the dropout acts as the source of randomness in place of $z$ and lets the model generate diverse results.
Implementing the U-Net-based generator in MindSpore:
```python
class UNetGenerator(nn.Cell):
    def __init__(self, in_planes, out_planes, ngf=64, n_layers=8, norm_mode="bn", dropout=False):
        super(UNetGenerator, self).__init__()
        # Build from the innermost (bottleneck) block outward.
        unet_block = UNetSkipConnectionBlock(ngf * 8, ngf * 8, in_planes=None, submodule=None,
                                             norm_mode=norm_mode, innermost=True)
        for _ in range(n_layers - 5):
            unet_block = UNetSkipConnectionBlock(ngf * 8, ngf * 8, in_planes=None, submodule=unet_block,
                                                 norm_mode=norm_mode, dropout=dropout)
        unet_block = UNetSkipConnectionBlock(ngf * 4, ngf * 8, in_planes=None, submodule=unet_block,
                                             norm_mode=norm_mode)
        unet_block = UNetSkipConnectionBlock(ngf * 2, ngf * 4, in_planes=None, submodule=unet_block,
                                             norm_mode=norm_mode)
        unet_block = UNetSkipConnectionBlock(ngf, ngf * 2, in_planes=None, submodule=unet_block,
                                             norm_mode=norm_mode)
        self.model = UNetSkipConnectionBlock(out_planes, ngf, in_planes=in_planes, submodule=unet_block,
                                             outermost=True, norm_mode=norm_mode)

    def construct(self, x):
        return self.model(x)
```
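A quick smoke test (assumed usage, not from the tutorial): with the default `n_layers=8` the encoder halves the spatial size eight times, so a 256×256 input reaches a 1×1 bottleneck and is decoded back to a 256×256 output with `out_planes` channels.

```python
gen = UNetGenerator(in_planes=3, out_planes=3)
x = ops.ones((1, 3, 256, 256), mindspore.float32)
print(gen(x).shape)  # (1, 3, 256, 256)
```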
3. PatchGAN-Based Discriminator

The discriminator uses a PatchGAN structure, which can be viewed as a stack of convolutions: each value in the output matrix corresponds to a small region (patch) of the input image, and the discriminator judges each corresponding patch as real or fake.
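How big is each patch? A small helper (my own illustration, not tutorial code) that walks through the convolution stack of the discriminator defined below (kernel 4; strides 2, 2, 2, 1, 1) shows that every output value has a 70×70 receptive field on the input pair, i.e., this is the 70×70 PatchGAN from the paper.

```python
def receptive_field(convs):
    """Receptive field of stacked convolutions.

    convs: list of (kernel_size, stride) tuples, ordered input-to-output.
    """
    rf, jump = 1, 1
    for kernel, stride in convs:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) * stride product so far
        jump *= stride
    return rf

print(receptive_field([(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]))  # 70
```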
```python
import mindspore.nn as nn
import mindspore.ops as ops


class ConvNormRelu(nn.Cell):
    def __init__(self, in_planes, out_planes, kernel_size=4, stride=2, alpha=0.2,
                 norm_mode="batch", pad_mode="CONSTANT", use_relu=True, padding=None):
        super(ConvNormRelu, self).__init__()
        norm = nn.BatchNorm2d(out_planes)
        if norm_mode == "instance":
            norm = nn.BatchNorm2d(out_planes, affine=False)
        has_bias = (norm_mode == "instance")
        if not padding:
            padding = (kernel_size - 1) // 2
        if pad_mode == "CONSTANT":
            conv = nn.Conv2d(in_planes, out_planes, kernel_size, stride, pad_mode="pad",
                             has_bias=has_bias, padding=padding)
            layers = [conv, norm]
        else:
            paddings = ((0, 0), (0, 0), (padding, padding), (padding, padding))
            pad = nn.Pad(paddings=paddings, mode=pad_mode)
            conv = nn.Conv2d(in_planes, out_planes, kernel_size, stride, pad_mode="pad", has_bias=has_bias)
            layers = [pad, conv, norm]
        if use_relu:
            relu = nn.ReLU()
            if alpha > 0:
                relu = nn.LeakyReLU(alpha)
            layers.append(relu)
        self.features = nn.SequentialCell(layers)

    def construct(self, x):
        output = self.features(x)
        return output


class Discriminator(nn.Cell):
    def __init__(self, in_planes=3, ndf=64, n_layers=3, alpha=0.2, norm_mode="batch"):
        super(Discriminator, self).__init__()
        kernel_size = 4
        layers = [nn.Conv2d(in_planes, ndf, kernel_size, 2, pad_mode="pad", padding=1),
                  nn.LeakyReLU(alpha)]
        nf_mult = ndf
        for i in range(1, n_layers):
            nf_mult_prev = nf_mult
            nf_mult = min(2 ** i, 8) * ndf
            layers.append(ConvNormRelu(nf_mult_prev, nf_mult, kernel_size, 2, alpha, norm_mode, padding=1))
        nf_mult_prev = nf_mult
        nf_mult = min(2 ** n_layers, 8) * ndf
        layers.append(ConvNormRelu(nf_mult_prev, nf_mult, kernel_size, 1, alpha, norm_mode, padding=1))
        # Final 1-channel map: one real/fake logit per patch.
        layers.append(nn.Conv2d(nf_mult, 1, kernel_size, 1, pad_mode="pad", padding=1))
        self.features = nn.SequentialCell(layers)

    def construct(self, x, y):
        # Condition and candidate image are concatenated along the channel axis.
        x_y = ops.concat((x, y), axis=1)
        output = self.features(x_y)
        return output
```
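A shape check (assumed usage, not from the tutorial): for a pair of 256×256 images, the default configuration emits a 30×30 grid of patch logits, each covering one 70×70 patch of the input.

```python
disc = Discriminator(in_planes=6, ndf=64, n_layers=3)
a = ops.ones((1, 3, 256, 256), mindspore.float32)  # condition image
b = ops.ones((1, 3, 256, 256), mindspore.float32)  # candidate image
print(disc(a, b).shape)  # (1, 1, 30, 30)
```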
4. Initializing the Pix2Pix Generator and Discriminator

Instantiate the Pix2Pix generator and discriminator:
```python
import mindspore.nn as nn
from mindspore.common import initializer as init

g_in_planes = 3
g_out_planes = 3
g_ngf = 64
g_layers = 8
d_in_planes = 6
d_ndf = 64
d_layers = 3
alpha = 0.2
init_gain = 0.02
init_type = "normal"

net_generator = UNetGenerator(in_planes=g_in_planes, out_planes=g_out_planes,
                              ngf=g_ngf, n_layers=g_layers)
for _, cell in net_generator.cells_and_names():
    if isinstance(cell, (nn.Conv2d, nn.Conv2dTranspose)):
        if init_type == "normal":
            cell.weight.set_data(init.initializer(init.Normal(init_gain), cell.weight.shape))
        elif init_type == "xavier":
            cell.weight.set_data(init.initializer(init.XavierUniform(init_gain), cell.weight.shape))
        elif init_type == "constant":
            cell.weight.set_data(init.initializer(0.001, cell.weight.shape))
        else:
            raise NotImplementedError("initialization method [%s] is not implemented" % init_type)
    elif isinstance(cell, nn.BatchNorm2d):
        cell.gamma.set_data(init.initializer("ones", cell.gamma.shape))
        cell.beta.set_data(init.initializer("zeros", cell.beta.shape))

net_discriminator = Discriminator(in_planes=d_in_planes, ndf=d_ndf,
                                  alpha=alpha, n_layers=d_layers)
for _, cell in net_discriminator.cells_and_names():
    if isinstance(cell, (nn.Conv2d, nn.Conv2dTranspose)):
        if init_type == "normal":
            cell.weight.set_data(init.initializer(init.Normal(init_gain), cell.weight.shape))
        elif init_type == "xavier":
            cell.weight.set_data(init.initializer(init.XavierUniform(init_gain), cell.weight.shape))
        elif init_type == "constant":
            cell.weight.set_data(init.initializer(0.001, cell.weight.shape))
        else:
            raise NotImplementedError("initialization method [%s] is not implemented" % init_type)
    elif isinstance(cell, nn.BatchNorm2d):
        cell.gamma.set_data(init.initializer("ones", cell.gamma.shape))
        cell.beta.set_data(init.initializer("zeros", cell.beta.shape))


class Pix2Pix(nn.Cell):
    """Pix2Pix model network"""
    def __init__(self, discriminator, generator):
        super(Pix2Pix, self).__init__(auto_prefix=True)
        self.net_discriminator = discriminator
        self.net_generator = generator

    def construct(self, reala):
        fakeb = self.net_generator(reala)
        return fakeb
```
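With everything initialized, the wrapper can be assembled like this (assumed usage; note that calling the `Pix2Pix` cell runs only the generator, which is all that inference needs):

```python
pix2pix = Pix2Pix(net_discriminator, net_generator)
fake_b = pix2pix(ops.ones((1, 3, 256, 256), mindspore.float32))
print(fake_b.shape)  # (1, 3, 256, 256)
```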
5. Model Training

Training splits into two main parts: training the discriminator and training the generator. The discriminator is trained to maximize the probability of correctly telling real images from fakes; the generator is trained to produce better and better fake images. In both parts we record the loss along the way and summarize it at the end of each epoch.
Training with MindSpore:
```python
import numpy as np
import os
import datetime
import mindspore.dataset as ds  # needed for MindDataset below
from mindspore import value_and_grad, Tensor

epoch_num = 3
ckpt_dir = "results/ckpt"
dataset_size = 400
val_pic_size = 256
lr = 0.0002
n_epochs = 100
n_epochs_decay = 100

def get_lr():
    # Constant lr for the first n_epochs, then a linear decay over n_epochs_decay.
    lrs = [lr] * dataset_size * n_epochs
    lr_epoch = 0
    for epoch in range(n_epochs_decay):
        lr_epoch = lr * (n_epochs_decay - epoch) / n_epochs_decay
        lrs += [lr_epoch] * dataset_size
    lrs += [lr_epoch] * dataset_size * (epoch_num - n_epochs_decay - n_epochs)
    return Tensor(np.array(lrs).astype(np.float32))

dataset = ds.MindDataset("./dataset/dataset_pix2pix/train.mindrecord",
                         columns_list=["input_images", "target_images"],
                         shuffle=True, num_parallel_workers=1)
```
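A quick sanity check on the schedule (my own illustration, not tutorial code): the learning rate stays at `lr` for the first `n_epochs` epochs and then decays linearly toward zero over `n_epochs_decay` epochs, matching the schedule used in the Pix2Pix paper.

```python
schedule = get_lr().asnumpy()
print(schedule[0])                        # 0.0002 (constant phase)
print(schedule[dataset_size * n_epochs])  # first step of the linear decay
print(schedule[-1])                       # ~0.0 at the end of the decay
```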
```python
steps_per_epoch = dataset.get_dataset_size()
loss_f = nn.BCEWithLogitsLoss()
l1_loss = nn.L1Loss()

def forword_dis(reala, realb):
    # Discriminator loss: score the real pair as 1 and the fake pair as 0.
    lambda_dis = 0.5
    fakeb = net_generator(reala)
    pred0 = net_discriminator(reala, fakeb)
    pred1 = net_discriminator(reala, realb)
    loss_d = loss_f(pred1, ops.ones_like(pred1)) + loss_f(pred0, ops.zeros_like(pred0))
    loss_dis = loss_d * lambda_dis
    return loss_dis

def forword_gan(reala, realb):
    # Generator loss: adversarial term plus the lambda-weighted L1 term.
    lambda_gan = 0.5
    lambda_l1 = 100
    fakeb = net_generator(reala)
    pred0 = net_discriminator(reala, fakeb)
    loss_1 = loss_f(pred0, ops.ones_like(pred0))
    loss_2 = l1_loss(fakeb, realb)
    loss_gan = loss_1 * lambda_gan + loss_2 * lambda_l1
    return loss_gan

d_opt = nn.Adam(net_discriminator.trainable_params(), learning_rate=get_lr(),
                beta1=0.5, beta2=0.999, loss_scale=1)
g_opt = nn.Adam(net_generator.trainable_params(), learning_rate=get_lr(),
                beta1=0.5, beta2=0.999, loss_scale=1)

grad_d = value_and_grad(forword_dis, None, net_discriminator.trainable_params())
grad_g = value_and_grad(forword_gan, None, net_generator.trainable_params())

def train_step(reala, realb):
    loss_dis, d_grads = grad_d(reala, realb)
    loss_gan, g_grads = grad_g(reala, realb)
    d_opt(d_grads)
    g_opt(g_grads)
    return loss_dis, loss_gan

if not os.path.isdir(ckpt_dir):
    os.makedirs(ckpt_dir)

g_losses = []
d_losses = []
data_loader = dataset.create_dict_iterator(output_numpy=True, num_epochs=epoch_num)

for epoch in range(epoch_num):
    for i, data in enumerate(data_loader):
        start_time = datetime.datetime.now()
        input_image = Tensor(data["input_images"])
        target_image = Tensor(data["target_images"])
        dis_loss, gen_loss = train_step(input_image, target_image)
        end_time = datetime.datetime.now()
        delta = (end_time - start_time).microseconds
        if i % 2 == 0:
            print("ms per step:{:.2f}  epoch:{}/{}  step:{}/{}  Dloss:{:.4f}  Gloss:{:.4f}".format(
                (delta / 1000), (epoch + 1), (epoch_num), i, steps_per_epoch,
                float(dis_loss), float(gen_loss)))
        d_losses.append(dis_loss.asnumpy())
        g_losses.append(gen_loss.asnumpy())
    if (epoch + 1) == epoch_num:
        mindspore.save_checkpoint(net_generator, ckpt_dir + "Generator.ckpt")
```
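The collected `d_losses` and `g_losses` are not used further in the tutorial; here is an optional sketch (my addition) for visualizing them after training.

```python
import matplotlib.pyplot as plt

plt.plot(d_losses, label="discriminator")
plt.plot(g_losses, label="generator")
plt.xlabel("step")
plt.ylabel("loss")
plt.legend()
plt.show()
```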
6. Model Inference

Load the generator weights saved during training:
```python
import matplotlib.pyplot as plt  # needed for the plots below
from mindspore import load_checkpoint, load_param_into_net

param_g = load_checkpoint(ckpt_dir + "Generator.ckpt")
load_param_into_net(net_generator, param_g)
dataset = ds.MindDataset("./dataset/dataset_pix2pix/train.mindrecord",
                         columns_list=["input_images", "target_images"], shuffle=True)
data_iter = next(dataset.create_dict_iterator())
predict_show = net_generator(data_iter["input_images"])
plt.figure(figsize=(10, 3), dpi=140)
for i in range(10):
    # Top row: input images; bottom row: generated translations.
    # Pixel values are in [-1, 1], so map them back to [0, 1] for display.
    plt.subplot(2, 10, i + 1)
    plt.imshow((data_iter["input_images"][i].asnumpy().transpose(1, 2, 0) + 1) / 2)
    plt.axis("off")
    plt.subplots_adjust(wspace=0.05, hspace=0.02)
    plt.subplot(2, 10, i + 11)
    plt.imshow((predict_show[i].asnumpy().transpose(1, 2, 0) + 1) / 2)
    plt.axis("off")
    plt.subplots_adjust(wspace=0.05, hspace=0.02)
plt.show()
```

The image translation results are shown below:
Reference
- 昇思MindSpore official documentation: Pix2Pix实现图像转换 (Pix2Pix for image translation)
- 昇思大模型平台 (MindSpore large model platform)
- AI 助你无码看片:生成对抗网络GAN大显身手