7. Network Optimization and Regularization

7.1 Characteristics of Neural Network Optimization

So it is enough to find a flat minimum; a global minimum is not strictly necessary. A flat minimum is preferable because the loss changes little under small parameter perturbations, which tends to generalize better.

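
As a toy numeric illustration of the point above (both loss functions are invented for this sketch, not taken from the slides):

```python
# Two toy 1-D losses, both with minimum value 0 at w = 0:
# one sharp (large curvature), one flat (small curvature).
def sharp_loss(w):
    return 50.0 * w ** 2

def flat_loss(w):
    return 0.5 * w ** 2

# Perturb the parameter slightly, as a train/test mismatch would.
eps = 0.1
print(sharp_loss(eps))  # jumps to about 0.5
print(flat_loss(eps))   # only about 0.005: the flat minimum is robust
```
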
7.2 Improving Optimization Algorithms

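
A minimal mini-batch stochastic gradient descent loop, the basic improvement over full-batch gradient descent; the data and hyperparameters (`lr`, `batch_size`) are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear-regression data: y = X @ w_true + noise
w_true = np.array([2.0, -3.0])
X = rng.normal(size=(256, 2))
y = X @ w_true + 0.01 * rng.normal(size=256)

w = np.zeros(2)
lr, batch_size = 0.1, 32
for epoch in range(50):
    idx = rng.permutation(len(X))          # reshuffle each epoch
    for start in range(0, len(X), batch_size):
        b = idx[start:start + batch_size]  # one mini-batch
        grad = 2 * X[b].T @ (X[b] @ w - y[b]) / len(b)
        w -= lr * grad

print(w)  # close to w_true
```

Each step uses only a mini-batch gradient, trading a little noise per step for far cheaper updates.
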
7.3 Dynamic Learning Rates

The overall trend of the learning rate is still downward; the occasional increases are there to help the optimizer escape a poor basin and find a better local optimum.

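
The pattern described above (overall decay with occasional jumps back up) matches cosine annealing with warm restarts; a sketch of that schedule, with `eta_max`, `eta_min`, `T0`, and `t_mult` chosen arbitrarily:

```python
import math

def cosine_warm_restarts(step, eta_min=0.001, eta_max=0.1, T0=10, t_mult=2):
    """Learning rate at a given step: cosine decay from eta_max to
    eta_min, restarting (jumping back up) with the period growing
    by a factor of t_mult after each restart."""
    T = T0
    while step >= T:      # locate the current restart cycle
        step -= T
        T *= t_mult
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * step / T))

lrs = [cosine_warm_restarts(s) for s in range(30)]
# Decays over steps 0..9, then restarts near eta_max at step 10.
```
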
7.4 Optimizing the Gradient Direction

These methods perform better than plain stochastic gradient descent.

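
A sketch of one such method, heavy-ball momentum, on an ill-conditioned quadratic where plain gradient descent zigzags; the learning rate and momentum coefficient are illustrative:

```python
import numpy as np

def momentum_step(w, v, grad, lr=0.05, rho=0.9):
    """Heavy-ball momentum: keep an exponentially decaying sum of
    past gradients and move along it instead of the raw gradient."""
    v = rho * v - lr * grad
    return w + v, v

# Minimize f(w) = 0.5 * w^T A w, an ill-conditioned quadratic:
# the accumulated velocity cancels the oscillating component of
# the gradient while reinforcing the consistent one.
A = np.diag([1.0, 20.0])
w, v = np.array([1.0, 1.0]), np.zeros(2)
for _ in range(200):
    w, v = momentum_step(w, v, A @ w)
print(np.linalg.norm(w))  # shrinks toward 0
```
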
7.5 Parameter Initialization

This initialization scheme is typically used in recurrent networks.
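
Sketches of the common variance-scaling initializers (Xavier/Glorot and He) plus orthogonal initialization, the scheme typically used for recurrent weight matrices; the dimensions are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_init(fan_in, fan_out):
    """Glorot/Xavier uniform: keeps activation variance stable
    for tanh-like units."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def he_init(fan_in, fan_out):
    """He/Kaiming normal: variance 2/fan_in, suited to ReLU."""
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

def orthogonal_init(n):
    """Orthogonal matrix via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(n, n)))
    return q * np.sign(np.diag(r))  # fix column signs

W = orthogonal_init(8)
# W.T @ W is (numerically) the identity, so repeatedly multiplying
# a hidden state by W neither explodes nor vanishes it.
```
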

7.6 Data Preprocessing

The problem is that the scale of the raw features affects parameter initialization, and it also affects optimization.

A feature with a standard deviation of 0 carries no information, so it can simply be dropped.
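
A standardization sketch that implements both points: z-score each feature and drop the zero-variance columns (the example matrix is made up):

```python
import numpy as np

def standardize(X):
    """Zero-mean, unit-variance scaling per feature; constant
    (std = 0) columns carry no information and are removed."""
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    keep = sigma > 0                      # drop degenerate features
    return (X[:, keep] - mu[keep]) / sigma[keep]

X = np.array([[1.0, 5.0, 3.0],
              [2.0, 5.0, 1.0],
              [3.0, 5.0, 2.0]])          # middle column is constant
Z = standardize(X)
print(Z.shape)   # (3, 2): the constant column was dropped
```
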

7.7 Layer-wise Normalization

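
A training-time forward pass of batch normalization, sketched in NumPy (`gamma` and `beta` are the learnable scale and shift; inference would use running statistics instead of batch statistics):

```python
import numpy as np

def batch_norm(X, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then rescale.
    Training-time forward pass only."""
    mu = X.mean(axis=0)
    var = X.var(axis=0)
    X_hat = (X - mu) / np.sqrt(var + eps)   # standardized activations
    return gamma * X_hat + beta             # learnable re-scaling

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=3.0, size=(64, 4))
Y = batch_norm(X, gamma=np.ones(4), beta=np.zeros(4))
# Each output feature now has (approximately) mean 0 and variance 1.
```

Layer normalization is the same computation applied per sample (over `axis=1`) instead of per feature over the batch, which removes the dependence on batch statistics.
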
7.8 Hyperparameter Optimization

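
A random-search sketch over two hyperparameters; `val_error` is a made-up stand-in for training a model and measuring its validation error:

```python
import numpy as np

rng = np.random.default_rng(0)

def val_error(lr, reg):
    """Stand-in for 'train a model, return validation error':
    an invented smooth function minimized near lr=0.1, reg=0.01."""
    return (np.log10(lr) + 1) ** 2 + (np.log10(reg) + 2) ** 2

best = (None, np.inf)
for _ in range(100):
    # Sample on a log scale, the usual practice for rates/penalties.
    lr = 10 ** rng.uniform(-4, 0)
    reg = 10 ** rng.uniform(-5, -1)
    err = val_error(lr, reg)
    if err < best[1]:
        best = ((lr, reg), err)
print(best)
```

Unlike grid search, random search does not waste trials on repeated values of an unimportant hyperparameter.
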
7.9 Regularization

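
One widely used regularizer in this family is early stopping: halt training when the validation error stops improving. A sketch with a synthetic error curve (the `patience` parameter is illustrative):

```python
def early_stop_index(val_errors, patience=3):
    """Return the step to stop at: the last improvement before the
    validation error failed to improve `patience` times in a row."""
    best, best_i, bad = float("inf"), 0, 0
    for i, e in enumerate(val_errors):
        if e < best:
            best, best_i, bad = e, i, 0
        else:
            bad += 1
            if bad >= patience:
                break
    return best_i

# Validation error dips, then overfitting makes it climb again.
errs = [1.0, 0.7, 0.5, 0.45, 0.47, 0.50, 0.55, 0.60]
print(early_stop_index(errs))  # 3: the step with the lowest error
```
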
7.10 Dropout

Dropout can improve the network's generalization ability.

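
A sketch of inverted dropout, the usual formulation: units are zeroed with probability `p` at training time and the survivors are rescaled so the expected activation is unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(X, p=0.5, train=True):
    """Inverted dropout: zero each unit with probability p during
    training and rescale by 1/(1-p), so inference needs no change."""
    if not train:
        return X
    mask = rng.random(X.shape) >= p
    return X * mask / (1.0 - p)

X = np.ones((1000, 10))
Y = dropout(X, p=0.5)
# About half the entries are zero; the rest are 2.0,
# so the expected activation stays 1.0.
```
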
7.11 L1 and L2 Regularization

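
The two penalties and their (sub)gradients side by side; `lam` is the regularization strength, and the subgradient of |w| at 0 is taken to be 0:

```python
import numpy as np

def l2_penalty(w, lam):
    # L2: lam/2 * ||w||^2, gradient lam * w ("weight decay")
    return 0.5 * lam * np.sum(w ** 2), lam * w

def l1_penalty(w, lam):
    # L1: lam * ||w||_1, subgradient lam * sign(w); pushes small
    # weights all the way to zero, yielding sparse solutions
    return lam * np.sum(np.abs(w)), lam * np.sign(w)

w = np.array([0.5, -2.0, 0.0])
print(l2_penalty(w, 0.1))  # value 0.2125, gradient 0.1 * w
print(l1_penalty(w, 0.1))  # value 0.25, gradient 0.1 * sign(w)
```

L2 shrinks all weights proportionally, while the constant-magnitude L1 subgradient drives small weights exactly to zero.
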
7.12 Data Augmentation

Data augmentation can also improve the model's ability to generalize.

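
Two simple label-preserving transforms (horizontal flip and random crop), sketched on a NumPy array standing in for an image; the sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img, crop=24):
    """Random horizontal flip plus a random crop back to crop x crop.
    Both transforms preserve the label, so they effectively enlarge
    the training set for free."""
    if rng.random() < 0.5:
        img = img[:, ::-1]                 # horizontal flip
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop + 1)    # random crop offsets
    left = rng.integers(0, w - crop + 1)
    return img[top:top + crop, left:left + crop]

img = rng.random((32, 32, 3))              # a fake 32x32 RGB image
out = augment(img)
print(out.shape)  # (24, 24, 3)
```
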