# DDPM (Denoising Diffusion Probabilistic Models)

## Prerequisite Mathematics

1. **Prior and posterior probability.** A prior probability is obtained from past experience and analysis; it usually plays the role of the "cause" in a cause-to-effect problem, e.g. $q(x_t \mid x_{t-1})$. A posterior probability is revised after the "result" has been observed; it is the "cause" in an effect-to-cause problem, e.g. $p(x_{t-1} \mid x_t)$.

2. **Conditional probability.** Let $A$ and $B$ be any two events with $P(A) > 0$. The probability that $B$ occurs given that $A$ has occurred is written $P(B \mid A)$:
$$P(B \mid A) = \frac{P(A, B)}{P(A)}$$

3. **Multiplication rule.**
$$P(A, B) = P(B \mid A)\,P(A)$$

4. **General form of the multiplication rule.**
$$P(A, B, C) = P(C \mid B, A)\,P(B, A) = P(C \mid B, A)\,P(B \mid A)\,P(A)$$

5. **Bayes' rule.**
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

6. **Multivariate Bayes' rule.**
$$P(A \mid B, C) = \frac{P(A, B, C)}{P(B, C)} = \frac{P(B \mid A, C)\,P(A, C)}{P(B, C)} = \frac{P(B \mid A, C)\,P(A \mid C)\,P(C)}{P(B \mid C)\,P(C)} = \frac{P(B \mid A, C)\,P(A \mid C)}{P(B \mid C)}$$

7. **Additivity of Gaussians.** Given two independent Gaussian variables $N_1$ and $N_2$ with means $\mu_1, \mu_2$ and variances $\sigma_1^2, \sigma_2^2$, their combination $N = aN_1 + bN_2$ has
$$E(N) = E(aN_1 + bN_2) = a\mu_1 + b\mu_2, \qquad Var(N) = Var(aN_1 + bN_2) = a^2\sigma_1^2 + b^2\sigma_2^2$$
For the difference:
$$E(N) = E(aN_1 - bN_2) = a\mu_1 - b\mu_2, \qquad Var(N) = Var(aN_1 - bN_2) = a^2\sigma_1^2 + b^2\sigma_2^2$$

8. **Reparameterization.** Sampling from $\mathcal{N}(\mu, \sigma^2)$ is equivalent to sampling $\epsilon \sim \mathcal{N}(0, 1)$ and computing $\epsilon \cdot \sigma + \mu$.

9. **Gaussian probability density function.**
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

10. **KL divergence between two Gaussians.** For $p = \mathcal{N}(\mu_1, \sigma_1^2)$ and $q = \mathcal{N}(\mu_2, \sigma_2^2)$:
$$KL(p \,\|\, q) = \log\frac{\sigma_2}{\sigma_1} + \frac{\sigma_1^2 + (\mu_1 - \mu_2)^2}{2\sigma_2^2} - \frac{1}{2}$$

11. **Completing the square.**
$$ax^2 + bx = a\left(x + \frac{b}{2a}\right)^2 + c, \qquad \text{where } c = -\frac{b^2}{4a} \text{ is a constant independent of } x$$

12. **Expectation of a function of a random variable.** Let $X$ be a random variable and $Y = g(X)$. Then
$$E(Y) = E[g(X)] = \begin{cases} \displaystyle\sum_{k=1}^\infty g(x_k)\,p_k & \text{(discrete)} \\ \displaystyle\int_{-\infty}^{\infty} g(x)\,p(x)\,dx & \text{(continuous)} \end{cases}$$

13. **KL divergence.**
$$KL(p(x) \,\|\, q(x)) = E_{x \sim p(x)}\left[\log\frac{p(x)}{q(x)}\right] = \int p(x)\,\log\frac{p(x)}{q(x)}\,dx$$
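The reparameterization trick (item 8) is used constantly in what follows, so a minimal numerical sanity check is worth a few lines: drawing $\epsilon \sim \mathcal{N}(0,1)$ and computing $\mu + \sigma\epsilon$ should reproduce the target mean and standard deviation.

```python
import torch

# Item 8: sampling from N(mu, sigma^2) == sampling eps ~ N(0, 1), then mu + sigma * eps
mu, sigma = 2.0, 0.5
eps = torch.randn(100_000)      # eps ~ N(0, 1)
samples = mu + sigma * eps      # reparameterized samples from N(2.0, 0.25)
print(samples.mean().item())    # ~ 2.0
print(samples.std().item())     # ~ 0.5
```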
## Introducing DDPM

In 2020, Berkeley proposed DDPM (Denoising Diffusion Probabilistic Models), commonly called the diffusion model, a core algorithm of AIGC. It surpasses GANs in both the realism and the diversity of generated images, and its training is stable. Its drawbacks are high computational cost and difficult real-time inference, although related techniques reduce the computation along both the time and space dimensions.

A diffusion model consists of two processes: a forward diffusion (noising) process and a reverse denoising process. Both are Markov chains; the full process takes roughly 1000 steps, and the reverse process is the one used to generate data. Its derivation can be described as follows.

## The Forward Diffusion Process

The forward diffusion process gradually adds Gaussian noise to the original data until it becomes a standard Gaussian.

Sample $x_0 \sim q(x_0)$ from the original dataset and add random noise according to a predefined noise schedule, yielding a sequence of noisy images $x_1, x_2, \dots, x_T$. In probabilistic terms:
$$\begin{aligned} q(x_{1:T} \mid x_0) &= \prod_{t=1}^{T} q(x_t \mid x_{t-1}) \\ q(x_t \mid x_{t-1}) &= \mathcal{N}\!\left(x_t;\ \sqrt{\alpha_t}\,x_{t-1},\ \beta_t I\right) \end{aligned}$$

Reparameterizing (prerequisite 8) gives
$$x_t = \sqrt{\alpha_t}\,x_{t-1} + \sqrt{\beta_t}\,\epsilon_t, \qquad \epsilon_t \sim \mathcal{N}(0, I), \qquad \alpha_t = 1 - \beta_t$$

Iterating this formula:
$$\begin{aligned} x_t &= \sqrt{\alpha_t}\,x_{t-1} + \sqrt{\beta_t}\,\epsilon_t \\ &= \sqrt{\alpha_t}\left(\sqrt{\alpha_{t-1}}\,x_{t-2} + \sqrt{\beta_{t-1}}\,\epsilon_{t-1}\right) + \sqrt{\beta_t}\,\epsilon_t \\ &= \sqrt{\alpha_t \cdots \alpha_1}\,x_0 + \sqrt{(\alpha_t \cdots \alpha_2)\beta_1}\,\epsilon_1 + \sqrt{(\alpha_t \cdots \alpha_3)\beta_2}\,\epsilon_2 + \dots + \sqrt{\alpha_t \beta_{t-1}}\,\epsilon_{t-1} + \sqrt{\beta_t}\,\epsilon_t \end{aligned}$$

Let $\bar\alpha_t = \alpha_1\alpha_2\cdots\alpha_t$. By the additivity of Gaussians (prerequisite 7):
$$x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1 - \bar\alpha_t}\,\epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)$$
$$q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar\alpha_t}\,x_0,\ (1 - \bar\alpha_t)\,I\right)$$

This formula says that the noisy image $x_t$ at any step $t$ can be obtained by noising $x_0$ directly; we will need it later. (Note: this is why the forward process is implemented in a single step in code.)
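A standalone sketch of that single-step implementation, assuming a linear $\beta$ schedule (the same choice the full script below makes):

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)     # linear noise schedule
alphas = 1.0 - betas                      # alpha_t = 1 - beta_t
alpha_bar = torch.cumprod(alphas, dim=0)  # \bar{alpha}_t = alpha_1 * ... * alpha_t

def q_sample(x0, t, eps=None):
    """Draw x_t ~ q(x_t | x_0) in one step: sqrt(a_bar_t) x_0 + sqrt(1 - a_bar_t) eps."""
    if eps is None:
        eps = torch.randn_like(x0)
    return alpha_bar[t].sqrt() * x0 + (1.0 - alpha_bar[t]).sqrt() * eps

x0 = torch.randn(1, 1, 32, 32)  # stand-in for a normalized image
x500 = q_sample(x0, t=500)      # heavily noised version of x0
```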
## The Reverse Denoising Process (the process the neural network fits)

The reverse denoising process is the data-generation process: it starts from a noise sample drawn from the standard Gaussian and denoises it step by step, finally producing a sample from the data distribution.

If we knew the true conditional distribution $q(x_{t-1} \mid x_t)$ of every reverse step, we could generate a realistic sample by starting from random noise and sampling step by step. But writing the true conditional with Bayes' rule,
$$q(x_{t-1} \mid x_t) = \frac{q(x_t \mid x_{t-1})\,q(x_{t-1})}{q(x_t)}$$
it cannot be solved directly, because $q(x_{t-1})$ and $q(x_t)$ are unknown; hence we cannot get from $x_t$ to $x_{t-1}$ this way, and we must approximate it with a neural network $p_\theta(x_{t-1} \mid x_t)$. For simplicity, the reverse process is also defined as a **Markov chain with Gaussian transitions**, modeled as:
$$p_\theta(x_{0:T}) = p(x_T)\prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t), \qquad p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\!\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)$$

*(The following passage is a bit of a digression from the above; it lays the groundwork for the loss function.)*

Although the true conditional $q(x_{t-1} \mid x_t)$ is intractable, the posterior conditioned additionally on $x_0$, namely $q(x_{t-1} \mid x_t, x_0)$, can be solved with Bayes' rule. Combining it with the Markov property of the forward process:
$$q(x_{t-1} \mid x_t, x_0) = \frac{q(x_t \mid x_{t-1}, x_0)\,q(x_{t-1} \mid x_0)}{q(x_t \mid x_0)} = \frac{q(x_t \mid x_{t-1})\,q(x_{t-1} \mid x_0)}{q(x_t \mid x_0)}$$

From the forward process we already know all three factors:
$$\begin{aligned} q(x_{t-1} \mid x_0)&:\ x_{t-1} = \sqrt{\bar\alpha_{t-1}}\,x_0 + \sqrt{1-\bar\alpha_{t-1}}\,\epsilon \ \sim\ \mathcal{N}\!\left(\sqrt{\bar\alpha_{t-1}}\,x_0,\ (1-\bar\alpha_{t-1})I\right) \\ q(x_t \mid x_0)&:\ x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon \ \sim\ \mathcal{N}\!\left(\sqrt{\bar\alpha_t}\,x_0,\ (1-\bar\alpha_t)I\right) \\ q(x_t \mid x_{t-1})&:\ x_t = \sqrt{\alpha_t}\,x_{t-1} + \sqrt{\beta_t}\,\epsilon \ \sim\ \mathcal{N}\!\left(\sqrt{\alpha_t}\,x_{t-1},\ \beta_t I\right) \end{aligned}$$

Therefore, writing out the Gaussian densities (prerequisite 9):
$$\begin{aligned} q(x_{t-1} \mid x_t, x_0) &\propto \exp\!\left(-\frac{1}{2}\left(\frac{(x_t - \sqrt{\alpha_t}\,x_{t-1})^2}{\beta_t} + \frac{(x_{t-1} - \sqrt{\bar\alpha_{t-1}}\,x_0)^2}{1-\bar\alpha_{t-1}} - \frac{(x_t - \sqrt{\bar\alpha_t}\,x_0)^2}{1-\bar\alpha_t}\right)\right) \\ &= \exp\!\left(-\frac{1}{2}\left(\left(\frac{\alpha_t}{\beta_t} + \frac{1}{1-\bar\alpha_{t-1}}\right)x_{t-1}^2 - \left(\frac{2\sqrt{\alpha_t}}{\beta_t}x_t + \frac{2\sqrt{\bar\alpha_{t-1}}}{1-\bar\alpha_{t-1}}x_0\right)x_{t-1} + C(x_t, x_0)\right)\right) \end{aligned}$$

Completing the square (prerequisite 11) then yields
$$\widetilde{\beta}_t = 1\Big/\left(\frac{\alpha_t}{\beta_t} + \frac{1}{1-\bar\alpha_{t-1}}\right) = \frac{1-\bar\alpha_{t-1}}{1-\bar\alpha_t}\,\beta_t$$
$$\widetilde{\mu}_t = \left(\frac{\sqrt{\alpha_t}}{\beta_t}x_t + \frac{\sqrt{\bar\alpha_{t-1}}}{1-\bar\alpha_{t-1}}x_0\right)\Big/\left(\frac{\alpha_t}{\beta_t} + \frac{1}{1-\bar\alpha_{t-1}}\right) = \frac{\sqrt{\alpha_t}(1-\bar\alpha_{t-1})}{1-\bar\alpha_t}\,x_t + \frac{\sqrt{\bar\alpha_{t-1}}\,\beta_t}{1-\bar\alpha_t}\,x_0$$

And since, from the forward closed form,
$$x_0 = \frac{1}{\sqrt{\bar\alpha_t}}\left(x_t - \sqrt{1-\bar\alpha_t}\,\epsilon\right)$$
we obtain
$$\widetilde{\mu}_t = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1-\bar\alpha_t}}\,\epsilon\right)$$
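The algebra above is easy to get wrong, so a short numerical check (a self-contained sketch, repeating the schedule setup from the earlier snippet) that the $(x_t, x_0)$ form and the $(x_t, \epsilon)$ form of $\widetilde{\mu}_t$ agree:

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)

t = 500
x0 = torch.randn(4)
eps = torch.randn(4)
xt = alpha_bar[t].sqrt() * x0 + (1 - alpha_bar[t]).sqrt() * eps

# Posterior mean written in terms of (x_t, x_0)
mu_from_x0 = (alphas[t].sqrt() * (1 - alpha_bar[t - 1]) / (1 - alpha_bar[t]) * xt
              + alpha_bar[t - 1].sqrt() * betas[t] / (1 - alpha_bar[t]) * x0)

# Posterior mean written in terms of (x_t, eps)
mu_from_eps = (xt - betas[t] / (1 - alpha_bar[t]).sqrt() * eps) / alphas[t].sqrt()

print(torch.allclose(mu_from_x0, mu_from_eps, atol=1e-5))  # True
```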
## The Sampling Process (prediction once the model is trained)

$$\mu_\theta(x_t, t) = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1-\bar\alpha_t}}\,\epsilon_\theta(x_t, t)\right)$$
$$x_{t-1} \sim p_\theta(x_{t-1} \mid x_t): \qquad x_{t-1} = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1-\bar\alpha_t}}\,\epsilon_\theta(x_t, t)\right) + \sqrt{\widetilde{\beta}_t}\,z, \qquad z \sim \mathcal{N}(0, I)$$

Here $z$ is written instead of $\epsilon$ to keep it distinct from the noise used earlier.

## Loss Function

See https://blog.csdn.net/weixin_45453121/article/details/131223653 for the derivation of the loss function.

## Code

First the imports, the noise schedule, the closed-form forward diffusion, the dataset utilities, and the building blocks of the U-Net:

```python
import torch
import torchvision
import matplotlib.pyplot as plt
import torch.nn.functional as F
from torchvision import transforms
from torch.utils.data import DataLoader
import numpy as np
from torch.optim import Adam
from torch import nn
import math
from torchvision.utils import save_image


def show_images(data, num_samples=20, cols=4):
    """Plots some samples from the dataset."""
    plt.figure(figsize=(15, 15))
    for i, img in enumerate(data):
        if i == num_samples:
            break
        plt.subplot(int(num_samples / cols) + 1, cols, i + 1)
        plt.imshow(img[0])


def linear_beta_schedule(timesteps, start=0.0001, end=0.02):
    return torch.linspace(start, end, timesteps)


def get_index_from_list(vals, t, x_shape):
    """Returns a specific index t of a passed list of values vals
    while considering the batch dimension."""
    batch_size = t.shape[0]
    out = vals.gather(-1, t.cpu())
    return out.reshape(batch_size, *((1,) * (len(x_shape) - 1))).to(t.device)


def forward_diffusion_sample(x_0, t, device="cpu"):
    """Takes an image and a timestep as input and
    returns the noisy version of it (plus the noise itself)."""
    noise = torch.randn_like(x_0)
    sqrt_alphas_cumprod_t = get_index_from_list(sqrt_alphas_cumprod, t, x_0.shape)
    sqrt_one_minus_alphas_cumprod_t = get_index_from_list(
        sqrt_one_minus_alphas_cumprod, t, x_0.shape)
    # mean + variance
    return sqrt_alphas_cumprod_t.to(device) * x_0.to(device) \
        + sqrt_one_minus_alphas_cumprod_t.to(device) * noise.to(device), noise.to(device)


def load_transformed_dataset(IMG_SIZE):
    data_transforms = [
        transforms.Resize((IMG_SIZE, IMG_SIZE)),
        transforms.ToTensor(),                    # Scales data into [0, 1]
        transforms.Lambda(lambda t: (t * 2) - 1)  # Scale between [-1, 1]
    ]
    data_transform = transforms.Compose(data_transforms)
    train = torchvision.datasets.MNIST(root="./Data", transform=data_transform,
                                       train=True, download=True)
    test = torchvision.datasets.MNIST(root="./Data", transform=data_transform,
                                      train=False, download=True)
    return torch.utils.data.ConcatDataset([train, test])


def show_tensor_image(image):
    reverse_transforms = transforms.Compose([
        transforms.Lambda(lambda t: (t + 1) / 2),
        transforms.Lambda(lambda t: t.permute(1, 2, 0)),  # CHW to HWC
        transforms.Lambda(lambda t: t * 255.),
        transforms.Lambda(lambda t: t.numpy().astype(np.uint8)),
        transforms.ToPILImage(),
    ])
    # Take first image of batch
    if len(image.shape) == 4:
        image = image[0, :, :, :]
    plt.imshow(reverse_transforms(image))


class Block(nn.Module):
    def __init__(self, in_ch, out_ch, time_emb_dim, up=False):
        super().__init__()
        self.time_mlp = nn.Linear(time_emb_dim, out_ch)
        if up:
            # The skip connection concatenated in SimpleUnet doubles the input channels
            self.conv1 = nn.Conv2d(2 * in_ch, out_ch, 3, padding=1)
            self.transform = nn.ConvTranspose2d(out_ch, out_ch, 4, 2, 1)
        else:
            self.conv1 = nn.Conv2d(in_ch, out_ch, 3, padding=1)
            self.transform = nn.Conv2d(out_ch, out_ch, 4, 2, 1)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        self.bnorm1 = nn.BatchNorm2d(out_ch)
        self.bnorm2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU()

    def forward(self, x, t):
        # First Conv
        h = self.bnorm1(self.relu(self.conv1(x)))
        # Time embedding
        time_emb = self.relu(self.time_mlp(t))
        # Extend last 2 dimensions
        time_emb = time_emb[(..., ) + (None, ) * 2]
        # Add time channel
        h = h + time_emb
        # Second Conv
        h = self.bnorm2(self.relu(self.conv2(h)))
        # Down or Upsample
        return self.transform(h)


class SinusoidalPositionEmbeddings(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.dim = dim

    def forward(self, time):
        device = time.device
        half_dim = self.dim // 2
        embeddings = math.log(10000) / (half_dim - 1)
        embeddings = torch.exp(torch.arange(half_dim, device=device) * -embeddings)
        embeddings = time[:, None] * embeddings[None, :]
        embeddings = torch.cat((embeddings.sin(), embeddings.cos()), dim=-1)
        # TODO: Double check the ordering here
        return embeddings
```
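A quick shape check of the time-embedding module (a sketch; run it after the definitions above): each timestep maps to a `dim`-dimensional vector of concatenated sin/cos features.

```python
emb = SinusoidalPositionEmbeddings(32)
t_batch = torch.tensor([0.0, 10.0, 100.0])
print(emb(t_batch).shape)  # torch.Size([3, 32])
```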
Next, the U-Net itself, the training loss, and the reverse-process sampling:

```python
class SimpleUnet(nn.Module):
    """A simplified variant of the Unet architecture."""

    def __init__(self):
        super().__init__()
        image_channels = 1  # 1 for grayscale, 3 for RGB
        down_channels = (64, 128, 256, 512, 1024)
        up_channels = (1024, 512, 256, 128, 64)
        out_dim = 1  # 1 for grayscale, 3 for RGB
        time_emb_dim = 32

        # Time embedding
        self.time_mlp = nn.Sequential(
            SinusoidalPositionEmbeddings(time_emb_dim),
            nn.Linear(time_emb_dim, time_emb_dim),
            nn.ReLU()
        )

        # Initial projection
        self.conv0 = nn.Conv2d(image_channels, down_channels[0], 3, padding=1)

        # Downsample
        self.downs = nn.ModuleList([
            Block(down_channels[i], down_channels[i + 1], time_emb_dim)
            for i in range(len(down_channels) - 1)])
        # Upsample
        self.ups = nn.ModuleList([
            Block(up_channels[i], up_channels[i + 1], time_emb_dim, up=True)
            for i in range(len(up_channels) - 1)])

        # Edit: Corrected a bug found by Jakub C (see YouTube comment)
        self.output = nn.Conv2d(up_channels[-1], out_dim, 1)

    def forward(self, x, timestep):
        # Embed time
        t = self.time_mlp(timestep)
        # Initial conv
        x = self.conv0(x)
        # Unet
        residual_inputs = []
        for down in self.downs:
            x = down(x, t)
            residual_inputs.append(x)
        for up in self.ups:
            residual_x = residual_inputs.pop()
            # Add residual x as additional channels
            x = torch.cat((x, residual_x), dim=1)
            x = up(x, t)
        return self.output(x)


def get_loss(model, x_0, t):
    x_noisy, noise = forward_diffusion_sample(x_0, t, device)
    noise_pred = model(x_noisy, t)
    return F.l1_loss(noise, noise_pred)


@torch.no_grad()
def sample_timestep(x, t):
    """Calls the model to predict the noise in the image and returns
    the denoised image. Applies noise to this image, if we are not
    in the last step yet."""
    betas_t = get_index_from_list(betas, t, x.shape)
    sqrt_one_minus_alphas_cumprod_t = get_index_from_list(
        sqrt_one_minus_alphas_cumprod, t, x.shape)
    sqrt_recip_alphas_t = get_index_from_list(sqrt_recip_alphas, t, x.shape)
    # Call model (current image - noise prediction)
    model_mean = sqrt_recip_alphas_t * (
        x - betas_t * model(x, t) / sqrt_one_minus_alphas_cumprod_t)
    posterior_variance_t = get_index_from_list(posterior_variance, t, x.shape)
    if t == 0:
        # As pointed out by Luis Pereira (see YouTube comment)
        # The t's are offset from the t's in the paper
        return model_mean
    else:
        noise = torch.randn_like(x)
        return model_mean + torch.sqrt(posterior_variance_t) * noise


@torch.no_grad()
def sample_plot_image(IMG_SIZE):
    # Sample noise
    img_size = IMG_SIZE
    img = torch.randn((1, 1, img_size, img_size), device=device)  # pure noise at step T
    plt.figure(figsize=(15, 15))
    plt.axis("off")
    num_images = 10
    stepsize = int(T / num_images)
    for i in range(0, T)[::-1]:
        t = torch.full((1,), i, device=device, dtype=torch.long)
        img = sample_timestep(img, t)
        # Edit: This is to maintain the natural range of the distribution
        img = torch.clamp(img, -1.0, 1.0)
        if i % stepsize == 0:
            plt.subplot(1, num_images, int(i / stepsize) + 1)
            plt.title(str(i))
            show_tensor_image(img.detach().cpu())
    plt.show()
```
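One detail worth flagging: `get_loss` trains $\epsilon_\theta$ with an L1 loss between the true and predicted noise, whereas the simplified objective in the DDPM paper is a mean-squared error. A drop-in MSE variant (a sketch with the same signature, not part of the original script) would be:

```python
def get_loss_mse(model, x_0, t):
    # Identical to get_loss above, but with the paper's L2 (MSE) objective
    x_noisy, noise = forward_diffusion_sample(x_0, t, device)
    noise_pred = model(x_noisy, t)
    return F.mse_loss(noise, noise_pred)
```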
Finally, the schedule precomputation and the training loop:

```python
if __name__ == "__main__":
    # Define beta schedule
    T = 300
    betas = linear_beta_schedule(timesteps=T)

    # Pre-calculate different terms for the closed form
    alphas = 1. - betas
    alphas_cumprod = torch.cumprod(alphas, dim=0)
    alphas_cumprod_prev = F.pad(alphas_cumprod[:-1], (1, 0), value=1.0)
    sqrt_recip_alphas = torch.sqrt(1.0 / alphas)
    sqrt_alphas_cumprod = torch.sqrt(alphas_cumprod)
    sqrt_one_minus_alphas_cumprod = torch.sqrt(1. - alphas_cumprod)
    # This is exactly \tilde{beta}_t = beta_t * (1 - a_bar_{t-1}) / (1 - a_bar_t)
    posterior_variance = betas * (1. - alphas_cumprod_prev) / (1. - alphas_cumprod)

    IMG_SIZE = 32
    BATCH_SIZE = 16
    data = load_transformed_dataset(IMG_SIZE)
    dataloader = DataLoader(data, batch_size=BATCH_SIZE, shuffle=True, drop_last=True)

    model = SimpleUnet()
    print("Num params: ", sum(p.numel() for p in model.parameters()))
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)
    optimizer = Adam(model.parameters(), lr=0.001)
    epochs = 1  # Try more!

    for epoch in range(epochs):
        for step, batch in enumerate(dataloader):
            # The batch contains labels as well, so take batch[0] (the images)
            optimizer.zero_grad()
            t = torch.randint(0, T, (BATCH_SIZE,), device=device).long()
            loss = get_loss(model, batch[0], t)
            loss.backward()
            optimizer.step()
            if epoch % 1 == 0 and step % 5 == 0:
                print(f"Epoch {epoch} | step {step:03d} Loss: {loss.item()}")
                sample_plot_image(IMG_SIZE)
```
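A last note on the script: `save_image` is imported but never used. If you also want the final sample written to disk, a hypothetical one-line addition at the end of `sample_plot_image` (variable names taken from that function) could be:

```python
# Hypothetical addition at the end of sample_plot_image:
# rescale from [-1, 1] back to [0, 1] and save the final denoised sample.
save_image((img + 1) / 2, "ddpm_sample.png")
```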
## References

- https://zhuanlan.zhihu.com/p/630354327
- https://blog.csdn.net/weixin_45453121/article/details/131223653
- https://www.cnblogs.com/risejl/p/17448442.html
- https://zhuanlan.zhihu.com/p/569994589
