14.2. Case Studies
14.2.1. Analysis of the SSD300 Object Detection Network
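The computational cost of the SSD300 object detection network is concentrated in its convolutional layers. The table below gives a per-layer analysis of SSD300, including input and output shapes, parameter count (params), multiply-accumulate count (fused multiply-add, MAdd), floating-point operation count (FLOPs), and memory read/write volume.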
| No. | Module name | Input shape | Output shape | Params | Memory (MB) | MAdd | FLOPs | MemRead (B) | MemWrite (B) | Duration (%) | MemR+W (B) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | vgg.0 | 3 300 300 | 64 300 300 | 1792.0 | 21.97 | 311,040,000.0 | 161,280,000.0 | 1087168.0 | 23040000.0 | 9.40% | 24127168.0 | 
| 1 | vgg.1 | 64 300 300 | 64 300 300 | 0.0 | 21.97 | 5,760,000.0 | 5,760,000.0 | 23040000.0 | 23040000.0 | 0.30% | 46080000.0 | 
| 2 | vgg.2 | 64 300 300 | 64 300 300 | 36928.0 | 21.97 | 6,635,520,000.0 | 3,323,520,000.0 | 23187712.0 | 23040000.0 | 9.76% | 46227712.0 | 
| 3 | vgg.3 | 64 300 300 | 64 300 300 | 0.0 | 21.97 | 5,760,000.0 | 5,760,000.0 | 23040000.0 | 23040000.0 | 0.26% | 46080000.0 | 
| 4 | vgg.4 | 64 300 300 | 64 150 150 | 0.0 | 5.49 | 4,320,000.0 | 5,760,000.0 | 23040000.0 | 5760000.0 | 21.20% | 28800000.0 | 
| 5 | vgg.5 | 64 150 150 | 128 150 150 | 73856.0 | 10.99 | 3,317,760,000.0 | 1,661,760,000.0 | 6055424.0 | 11520000.0 | 7.03% | 17575424.0 | 
| 6 | vgg.6 | 128 150 150 | 128 150 150 | 0.0 | 10.99 | 2,880,000.0 | 2,880,000.0 | 11520000.0 | 11520000.0 | 0.16% | 23040000.0 | 
| 7 | vgg.7 | 128 150 150 | 128 150 150 | 147584.0 | 10.99 | 6,635,520,000.0 | 3,320,640,000.0 | 12110336.0 | 11520000.0 | 6.33% | 23630336.0 | 
| 8 | vgg.8 | 128 150 150 | 128 150 150 | 0.0 | 10.99 | 2,880,000.0 | 2,880,000.0 | 11520000.0 | 11520000.0 | 0.04% | 23040000.0 | 
| 9 | vgg.9 | 128 150 150 | 128 75 75 | 0.0 | 2.75 | 2,160,000.0 | 2,880,000.0 | 11520000.0 | 2880000.0 | 4.44% | 14400000.0 | 
| 10 | vgg.10 | 128 75 75 | 256 75 75 | 295168.0 | 5.49 | 3,317,760,000.0 | 1,660,320,000.0 | 4060672.0 | 5760000.0 | 2.32% | 9820672.0 | 
| 11 | vgg.11 | 256 75 75 | 256 75 75 | 0.0 | 5.49 | 1,440,000.0 | 1,440,000.0 | 5760000.0 | 5760000.0 | 0.04% | 11520000.0 | 
| 12 | vgg.12 | 256 75 75 | 256 75 75 | 590080.0 | 5.49 | 6,635,520,000.0 | 3,319,200,000.0 | 8120320.0 | 5760000.0 | 3.26% | 13880320.0 | 
| 13 | vgg.13 | 256 75 75 | 256 75 75 | 0.0 | 5.49 | 1,440,000.0 | 1,440,000.0 | 5760000.0 | 5760000.0 | 0.02% | 11520000.0 | 
| 14 | vgg.14 | 256 75 75 | 256 75 75 | 590080.0 | 5.49 | 6,635,520,000.0 | 3,319,200,000.0 | 8120320.0 | 5760000.0 | 3.01% | 13880320.0 | 
| 15 | vgg.15 | 256 75 75 | 256 75 75 | 0.0 | 5.49 | 1,440,000.0 | 1,440,000.0 | 5760000.0 | 5760000.0 | 0.03% | 11520000.0 | 
| 16 | vgg.16 | 256 75 75 | 256 38 38 | 0.0 | 1.41 | 1,108,992.0 | 1,440,000.0 | 5760000.0 | 1478656.0 | 1.74% | 7238656.0 | 
| 17 | vgg.17 | 256 38 38 | 512 38 38 | 1180160.0 | 2.82 | 3,406,823,424.0 | 1,704,151,040.0 | 6199296.0 | 2957312.0 | 1.70% | 9156608.0 | 
| 18 | vgg.18 | 512 38 38 | 512 38 38 | 0.0 | 2.82 | 739,328.0 | 739,328.0 | 2957312.0 | 2957312.0 | 0.02% | 5914624.0 | 
| 19 | vgg.19 | 512 38 38 | 512 38 38 | 2359808.0 | 2.82 | 6,813,646,848.0 | 3,407,562,752.0 | 12396544.0 | 2957312.0 | 3.86% | 15353856.0 | 
| 20 | vgg.20 | 512 38 38 | 512 38 38 | 0.0 | 2.82 | 739,328.0 | 739,328.0 | 2957312.0 | 2957312.0 | 0.02% | 5914624.0 | 
| 21 | vgg.21 | 512 38 38 | 512 38 38 | 2359808.0 | 2.82 | 6,813,646,848.0 | 3,407,562,752.0 | 12396544.0 | 2957312.0 | 4.16% | 15353856.0 | 
| 22 | vgg.22 | 512 38 38 | 512 38 38 | 0.0 | 2.82 | 739,328.0 | 739,328.0 | 2957312.0 | 2957312.0 | 0.02% | 5914624.0 | 
| 23 | vgg.23 | 512 38 38 | 512 19 19 | 0.0 | 0.71 | 554,496.0 | 739,328.0 | 2957312.0 | 739328.0 | 0.99% | 3696640.0 | 
| 24 | vgg.24 | 512 19 19 | 512 19 19 | 2359808.0 | 0.71 | 1,703,411,712.0 | 851,890,688.0 | 10178560.0 | 739328.0 | 1.98% | 10917888.0 | 
| 25 | vgg.25 | 512 19 19 | 512 19 19 | 0.0 | 0.71 | 184,832.0 | 184,832.0 | 739328.0 | 739328.0 | 0.02% | 1478656.0 | 
| 26 | vgg.26 | 512 19 19 | 512 19 19 | 2359808.0 | 0.71 | 1,703,411,712.0 | 851,890,688.0 | 10178560.0 | 739328.0 | 1.03% | 10917888.0 | 
| 27 | vgg.27 | 512 19 19 | 512 19 19 | 0.0 | 0.71 | 184,832.0 | 184,832.0 | 739328.0 | 739328.0 | 0.02% | 1478656.0 | 
| 28 | vgg.28 | 512 19 19 | 512 19 19 | 2359808.0 | 0.71 | 1,703,411,712.0 | 851,890,688.0 | 10178560.0 | 739328.0 | 1.00% | 10917888.0 | 
| 29 | vgg.29 | 512 19 19 | 512 19 19 | 0.0 | 0.71 | 184,832.0 | 184,832.0 | 739328.0 | 739328.0 | 0.02% | 1478656.0 | 
| 30 | vgg.30 | 512 19 19 | 512 19 19 | 0.0 | 0.71 | 1,478,656.0 | 184,832.0 | 739328.0 | 739328.0 | 1.40% | 1478656.0 | 
| 31 | vgg.31 | 512 19 19 | 1024 19 19 | 4719616.0 | 1.41 | 3,406,823,424.0 | 1,703,781,376.0 | 19617792.0 | 1478656.0 | 6.17% | 21096448.0 | 
| 32 | vgg.32 | 1024 19 19 | 1024 19 19 | 0.0 | 1.41 | 369,664.0 | 369,664.0 | 1478656.0 | 1478656.0 | 0.04% | 2957312.0 | 
| 33 | vgg.33 | 1024 19 19 | 1024 19 19 | 1049600.0 | 1.41 | 757,071,872.0 | 378,905,600.0 | 5677056.0 | 1478656.0 | 2.00% | 7155712.0 | 
| 34 | vgg.34 | 1024 19 19 | 1024 19 19 | 0.0 | 1.41 | 369,664.0 | 369,664.0 | 1478656.0 | 1478656.0 | 0.02% | 2957312.0 | 
| 35 | L2Norm | 512 38 38 | 512 38 38 | 512.0 | 2.82 | 0.0 | 0.0 | 0.0 | 0.0 | 0.22% | 0.0 | 
| 36 | extras.0 | 1024 19 19 | 256 19 19 | 262400.0 | 0.35 | 189,267,968.0 | 94,726,400.0 | 2528256.0 | 369664.0 | 0.28% | 2897920.0 | 
| 37 | extras.1 | 256 19 19 | 512 10 10 | 1180160.0 | 0.20 | 235,929,600.0 | 118,016,000.0 | 5090304.0 | 204800.0 | 0.37% | 5295104.0 | 
| 38 | extras.2 | 512 10 10 | 128 10 10 | 65664.0 | 0.05 | 13,107,200.0 | 6,566,400.0 | 467456.0 | 51200.0 | 0.18% | 518656.0 | 
| 39 | extras.3 | 128 10 10 | 256 5 5 | 295168.0 | 0.02 | 14,745,600.0 | 7,379,200.0 | 1231872.0 | 25600.0 | 0.39% | 1257472.0 | 
| 40 | extras.4 | 256 5 5 | 128 5 5 | 32896.0 | 0.01 | 1,638,400.0 | 822,400.0 | 157184.0 | 12800.0 | 0.18% | 169984.0 | 
| 41 | extras.5 | 128 5 5 | 256 3 3 | 295168.0 | 0.01 | 5,308,416.0 | 2,656,512.0 | 1193472.0 | 9216.0 | 1.05% | 1202688.0 | 
| 42 | extras.6 | 256 3 3 | 128 3 3 | 32896.0 | 0.00 | 589,824.0 | 296,064.0 | 140800.0 | 4608.0 | 0.13% | 145408.0 | 
| 43 | extras.7 | 128 3 3 | 256 1 1 | 295168.0 | 0.00 | 589,824.0 | 295,168.0 | 1185280.0 | 1024.0 | 0.12% | 1186304.0 | 
| 44 | loc.0 | 512 38 38 | 16 38 38 | 73744.0 | 0.09 | 212,926,464.0 | 106,486,336.0 | 3252288.0 | 92416.0 | 0.31% | 3344704.0 | 
| 45 | loc.1 | 1024 19 19 | 24 19 19 | 221208.0 | 0.03 | 159,694,848.0 | 79,856,088.0 | 2363488.0 | 34656.0 | 0.23% | 2398144.0 | 
| 46 | loc.2 | 512 10 10 | 24 10 10 | 110616.0 | 0.01 | 22,118,400.0 | 11,061,600.0 | 647264.0 | 9600.0 | 0.51% | 656864.0 | 
| 47 | loc.3 | 256 5 5 | 24 5 5 | 55320.0 | 0.00 | 2,764,800.0 | 1,383,000.0 | 246880.0 | 2400.0 | 0.13% | 249280.0 | 
| 48 | loc.4 | 256 3 3 | 16 3 3 | 36880.0 | 0.00 | 663,552.0 | 331,920.0 | 156736.0 | 576.0 | 0.11% | 157312.0 | 
| 49 | loc.5 | 256 1 1 | 16 1 1 | 36880.0 | 0.00 | 73,728.0 | 36,880.0 | 148544.0 | 64.0 | 0.09% | 148608.0 | 
| 50 | conf.0 | 512 38 38 | 84 38 38 | 387156.0 | 0.46 | 1,117,863,936.0 | 559,053,264.0 | 4505936.0 | 485184.0 | 0.70% | 4991120.0 | 
| 51 | conf.1 | 1024 19 19 | 126 19 19 | 1161342.0 | 0.17 | 838,397,952.0 | 419,244,462.0 | 6124024.0 | 181944.0 | 0.57% | 6305968.0 | 
| 52 | conf.2 | 512 10 10 | 126 10 10 | 580734.0 | 0.05 | 116,121,600.0 | 58,073,400.0 | 2527736.0 | 50400.0 | 0.23% | 2578136.0 | 
| 53 | conf.3 | 256 5 5 | 126 5 5 | 290430.0 | 0.01 | 14,515,200.0 | 7,260,750.0 | 1187320.0 | 12600.0 | 0.14% | 1199920.0 | 
| 54 | conf.4 | 256 3 3 | 84 3 3 | 193620.0 | 0.00 | 3,483,648.0 | 1,742,580.0 | 783696.0 | 3024.0 | 0.11% | 786720.0 | 
| 55 | conf.5 | 256 1 1 | 84 1 1 | 193620.0 | 0.00 | 387,072.0 | 193,620.0 | 775504.0 | 336.0 | 0.09% | 775840.0 | 
| 56 | softmax | 8732 21 | 8732 21 | 0.0 | 0.70 | 550,115.0 | 0.0 | 0.0 | 0.0 | 0.06% | 0.0 | 
| total | | | | 26285486.0 | 207.65 | 62,782,359,651.0 | 31,435,153,596.0 | 0.0 | 0.0 | 100.00% | 542786664.0 |
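The convolution rows in the table above can be reproduced by hand. The sketch below is a minimal illustration, not the profiler's actual implementation: the counting rules are inferred from the table values, the helper name `conv2d_stats` is hypothetical, and all tensors are assumed to be stored as 4-byte float32 values.

```python
BYTES_PER_FLOAT32 = 4  # assumption: every tensor is stored as float32


def conv2d_stats(c_in, c_out, kernel, in_hw, out_hw, bias=True):
    """Estimate the table columns for one square-kernel 2D convolution.

    The formulas are inferred from the table, so they describe how the
    listed numbers fit together rather than any specific tool's source code.
    """
    in_h, in_w = in_hw
    out_h, out_w = out_hw
    bias_term = 1 if bias else 0
    weights_per_output = c_in * kernel * kernel
    out_elems = c_out * out_h * out_w

    # Weights plus one bias per output channel.
    params = c_out * (weights_per_output + bias_term)
    # MAdd column: multiplications plus additions per output element.
    muls = weights_per_output
    adds = weights_per_output - 1 + bias_term
    madd = (muls + adds) * out_elems
    # Flops column: one operation per weight, plus one for the bias.
    flops = (weights_per_output + bias_term) * out_elems
    # Bytes read: input feature map plus parameters; bytes written: output.
    mem_read = (c_in * in_h * in_w + params) * BYTES_PER_FLOAT32
    mem_write = out_elems * BYTES_PER_FLOAT32
    return {
        "params": params,
        "memory_mb": mem_write / 1024**2,
        "madd": madd,
        "flops": flops,
        "mem_read": mem_read,
        "mem_write": mem_write,
    }


if __name__ == "__main__":
    # vgg.0: 3 -> 64 channels, 3x3 kernel, 300x300 input and output.
    print(conv2d_stats(3, 64, 3, (300, 300), (300, 300)))
```

Under these assumptions the memory column equals the size of the layer's output feature map in units of 1024² bytes, which is why it halves or quarters after each pooling stage.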
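The table below summarizes the total cost of SSD300: the network has 26,285,486 parameters, occupies \(207.65 {\rm MB}\) of memory, and requires \(62.78 {\rm G}\) multiply-accumulate operations and \(31.44 {\rm GFLOPs}\) of floating-point operations in total.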
| Total params | Total memory | Total MAdd | Total FLOPs | Total MemR+W | 
|---|---|---|---|---|
| 26,285,486 | 207.65 MB | 62.78 GMAdd | 31.44 GFLOPs | 517.64 MB |
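The summary values follow from the raw totals in the per-layer table by unit conversion. The sketch below shows that arithmetic; the helper names are hypothetical. Judging from the numbers, the operation counts use a decimal prefix (1 G = 10⁹), while the memory traffic is reported in units of 1024² bytes.

```python
def to_giga(count):
    """Convert a raw operation count to units of 10**9."""
    return count / 1e9


def bytes_to_mb(num_bytes):
    """Convert bytes to megabytes, taking 1 MB as 1024**2 bytes."""
    return num_bytes / 1024**2


# Raw totals copied from the last row of the per-layer table.
total_madd = 62_782_359_651
total_flops = 31_435_153_596
total_mem_rw_bytes = 542_786_664

summary = {
    "MAdd (G)": round(to_giga(total_madd), 2),
    "FLOPs (G)": round(to_giga(total_flops), 2),
    "MemR+W (MB)": round(bytes_to_mb(total_mem_rw_bytes), 2),
}
print(summary)
```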