14.2. Case Studies
14.2.1. Analysis of the SSD300 Object Detection Network
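The computation of the SSD300 object detection network is concentrated in its convolutional layers. The table below gives a per-layer analysis of SSD300, including the input and output shapes, the number of parameters (params), the number of multiply-accumulate operations (fused multiply-add, MAdd), the number of floating-point operations (FLOPs), and the amount of memory read and written.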
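As a rough illustration of how the per-layer figures for a convolution can be derived, the sketch below reproduces the `vgg.0` row of the table. The counting convention is an assumption inferred from the table values, not something the source states: MAdd counts multiplications and additions separately, FLOPs counts one operation per kernel tap plus the bias, and memory traffic assumes 4-byte elements. The names `conv2d_cost` and `ConvCost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConvCost:
    params: int
    madd: int
    flops: int
    mem_read_bytes: int
    mem_write_bytes: int


def conv2d_cost(c_in, c_out, kernel, in_hw, out_hw, bias=True, bytes_per_elem=4):
    """Estimate the cost of a dense (groups=1) 2-D convolution.

    Assumed convention (inferred from the table, not confirmed by the source):
    MAdd counts multiplies and adds separately, FLOPs counts taps plus bias.
    """
    taps = c_in * kernel * kernel            # multiplies per output element
    b = 1 if bias else 0
    in_elems = c_in * in_hw[0] * in_hw[1]
    out_elems = c_out * out_hw[0] * out_hw[1]

    params = c_out * (taps + b)
    # taps multiplies + (taps - 1) additions + one bias addition per output element
    madd = (taps + (taps - 1) + b) * out_elems
    flops = (taps + b) * out_elems
    # Read the input feature map and the weights; write the output feature map.
    mem_read = (in_elems + params) * bytes_per_elem
    mem_write = out_elems * bytes_per_elem
    return ConvCost(params, madd, flops, mem_read, mem_write)


# vgg.0: 3x3 convolution, 3 -> 64 channels, 300x300 input and output.
vgg0 = conv2d_cost(c_in=3, c_out=64, kernel=3, in_hw=(300, 300), out_hw=(300, 300))
print(vgg0)
```

Under these assumptions the result matches the `vgg.0` row: 1792 parameters, 311,040,000 MAdd, 161,280,000 FLOPs, 1,087,168 bytes read, and 23,040,000 bytes written.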
number | module name | input shape | output shape | params | memory (MB) | MAdd | FLOPs | MemRead (B) | MemWrite (B) | duration (%) | MemR+W (B) |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | vgg.0 | 3 300 300 | 64 300 300 | 1792.0 | 21.97 | 311,040,000.0 | 161,280,000.0 | 1087168.0 | 23040000.0 | 9.40% | 24127168.0 |
1 | vgg.1 | 64 300 300 | 64 300 300 | 0.0 | 21.97 | 5,760,000.0 | 5,760,000.0 | 23040000.0 | 23040000.0 | 0.30% | 46080000.0 |
2 | vgg.2 | 64 300 300 | 64 300 300 | 36928.0 | 21.97 | 6,635,520,000.0 | 3,323,520,000.0 | 23187712.0 | 23040000.0 | 9.76% | 46227712.0 |
3 | vgg.3 | 64 300 300 | 64 300 300 | 0.0 | 21.97 | 5,760,000.0 | 5,760,000.0 | 23040000.0 | 23040000.0 | 0.26% | 46080000.0 |
4 | vgg.4 | 64 300 300 | 64 150 150 | 0.0 | 5.49 | 4,320,000.0 | 5,760,000.0 | 23040000.0 | 5760000.0 | 21.20% | 28800000.0 |
5 | vgg.5 | 64 150 150 | 128 150 150 | 73856.0 | 10.99 | 3,317,760,000.0 | 1,661,760,000.0 | 6055424.0 | 11520000.0 | 7.03% | 17575424.0 |
6 | vgg.6 | 128 150 150 | 128 150 150 | 0.0 | 10.99 | 2,880,000.0 | 2,880,000.0 | 11520000.0 | 11520000.0 | 0.16% | 23040000.0 |
7 | vgg.7 | 128 150 150 | 128 150 150 | 147584.0 | 10.99 | 6,635,520,000.0 | 3,320,640,000.0 | 12110336.0 | 11520000.0 | 6.33% | 23630336.0 |
8 | vgg.8 | 128 150 150 | 128 150 150 | 0.0 | 10.99 | 2,880,000.0 | 2,880,000.0 | 11520000.0 | 11520000.0 | 0.04% | 23040000.0 |
9 | vgg.9 | 128 150 150 | 128 75 75 | 0.0 | 2.75 | 2,160,000.0 | 2,880,000.0 | 11520000.0 | 2880000.0 | 4.44% | 14400000.0 |
10 | vgg.10 | 128 75 75 | 256 75 75 | 295168.0 | 5.49 | 3,317,760,000.0 | 1,660,320,000.0 | 4060672.0 | 5760000.0 | 2.32% | 9820672.0 |
11 | vgg.11 | 256 75 75 | 256 75 75 | 0.0 | 5.49 | 1,440,000.0 | 1,440,000.0 | 5760000.0 | 5760000.0 | 0.04% | 11520000.0 |
12 | vgg.12 | 256 75 75 | 256 75 75 | 590080.0 | 5.49 | 6,635,520,000.0 | 3,319,200,000.0 | 8120320.0 | 5760000.0 | 3.26% | 13880320.0 |
13 | vgg.13 | 256 75 75 | 256 75 75 | 0.0 | 5.49 | 1,440,000.0 | 1,440,000.0 | 5760000.0 | 5760000.0 | 0.02% | 11520000.0 |
14 | vgg.14 | 256 75 75 | 256 75 75 | 590080.0 | 5.49 | 6,635,520,000.0 | 3,319,200,000.0 | 8120320.0 | 5760000.0 | 3.01% | 13880320.0 |
15 | vgg.15 | 256 75 75 | 256 75 75 | 0.0 | 5.49 | 1,440,000.0 | 1,440,000.0 | 5760000.0 | 5760000.0 | 0.03% | 11520000.0 |
16 | vgg.16 | 256 75 75 | 256 38 38 | 0.0 | 1.41 | 1,108,992.0 | 1,440,000.0 | 5760000.0 | 1478656.0 | 1.74% | 7238656.0 |
17 | vgg.17 | 256 38 38 | 512 38 38 | 1180160.0 | 2.82 | 3,406,823,424.0 | 1,704,151,040.0 | 6199296.0 | 2957312.0 | 1.70% | 9156608.0 |
18 | vgg.18 | 512 38 38 | 512 38 38 | 0.0 | 2.82 | 739,328.0 | 739,328.0 | 2957312.0 | 2957312.0 | 0.02% | 5914624.0 |
19 | vgg.19 | 512 38 38 | 512 38 38 | 2359808.0 | 2.82 | 6,813,646,848.0 | 3,407,562,752.0 | 12396544.0 | 2957312.0 | 3.86% | 15353856.0 |
20 | vgg.20 | 512 38 38 | 512 38 38 | 0.0 | 2.82 | 739,328.0 | 739,328.0 | 2957312.0 | 2957312.0 | 0.02% | 5914624.0 |
21 | vgg.21 | 512 38 38 | 512 38 38 | 2359808.0 | 2.82 | 6,813,646,848.0 | 3,407,562,752.0 | 12396544.0 | 2957312.0 | 4.16% | 15353856.0 |
22 | vgg.22 | 512 38 38 | 512 38 38 | 0.0 | 2.82 | 739,328.0 | 739,328.0 | 2957312.0 | 2957312.0 | 0.02% | 5914624.0 |
23 | vgg.23 | 512 38 38 | 512 19 19 | 0.0 | 0.71 | 554,496.0 | 739,328.0 | 2957312.0 | 739328.0 | 0.99% | 3696640.0 |
24 | vgg.24 | 512 19 19 | 512 19 19 | 2359808.0 | 0.71 | 1,703,411,712.0 | 851,890,688.0 | 10178560.0 | 739328.0 | 1.98% | 10917888.0 |
25 | vgg.25 | 512 19 19 | 512 19 19 | 0.0 | 0.71 | 184,832.0 | 184,832.0 | 739328.0 | 739328.0 | 0.02% | 1478656.0 |
26 | vgg.26 | 512 19 19 | 512 19 19 | 2359808.0 | 0.71 | 1,703,411,712.0 | 851,890,688.0 | 10178560.0 | 739328.0 | 1.03% | 10917888.0 |
27 | vgg.27 | 512 19 19 | 512 19 19 | 0.0 | 0.71 | 184,832.0 | 184,832.0 | 739328.0 | 739328.0 | 0.02% | 1478656.0 |
28 | vgg.28 | 512 19 19 | 512 19 19 | 2359808.0 | 0.71 | 1,703,411,712.0 | 851,890,688.0 | 10178560.0 | 739328.0 | 1.00% | 10917888.0 |
29 | vgg.29 | 512 19 19 | 512 19 19 | 0.0 | 0.71 | 184,832.0 | 184,832.0 | 739328.0 | 739328.0 | 0.02% | 1478656.0 |
30 | vgg.30 | 512 19 19 | 512 19 19 | 0.0 | 0.71 | 1,478,656.0 | 184,832.0 | 739328.0 | 739328.0 | 1.40% | 1478656.0 |
31 | vgg.31 | 512 19 19 | 1024 19 19 | 4719616.0 | 1.41 | 3,406,823,424.0 | 1,703,781,376.0 | 19617792.0 | 1478656.0 | 6.17% | 21096448.0 |
32 | vgg.32 | 1024 19 19 | 1024 19 19 | 0.0 | 1.41 | 369,664.0 | 369,664.0 | 1478656.0 | 1478656.0 | 0.04% | 2957312.0 |
33 | vgg.33 | 1024 19 19 | 1024 19 19 | 1049600.0 | 1.41 | 757,071,872.0 | 378,905,600.0 | 5677056.0 | 1478656.0 | 2.00% | 7155712.0 |
34 | vgg.34 | 1024 19 19 | 1024 19 19 | 0.0 | 1.41 | 369,664.0 | 369,664.0 | 1478656.0 | 1478656.0 | 0.02% | 2957312.0 |
35 | L2Norm | 512 38 38 | 512 38 38 | 512.0 | 2.82 | 0.0 | 0.0 | 0.0 | 0.0 | 0.22% | 0.0 |
36 | extras.0 | 1024 19 19 | 256 19 19 | 262400.0 | 0.35 | 189,267,968.0 | 94,726,400.0 | 2528256.0 | 369664.0 | 0.28% | 2897920.0 |
37 | extras.1 | 256 19 19 | 512 10 10 | 1180160.0 | 0.20 | 235,929,600.0 | 118,016,000.0 | 5090304.0 | 204800.0 | 0.37% | 5295104.0 |
38 | extras.2 | 512 10 10 | 128 10 10 | 65664.0 | 0.05 | 13,107,200.0 | 6,566,400.0 | 467456.0 | 51200.0 | 0.18% | 518656.0 |
39 | extras.3 | 128 10 10 | 256 5 5 | 295168.0 | 0.02 | 14,745,600.0 | 7,379,200.0 | 1231872.0 | 25600.0 | 0.39% | 1257472.0 |
40 | extras.4 | 256 5 5 | 128 5 5 | 32896.0 | 0.01 | 1,638,400.0 | 822,400.0 | 157184.0 | 12800.0 | 0.18% | 169984.0 |
41 | extras.5 | 128 5 5 | 256 3 3 | 295168.0 | 0.01 | 5,308,416.0 | 2,656,512.0 | 1193472.0 | 9216.0 | 1.05% | 1202688.0 |
42 | extras.6 | 256 3 3 | 128 3 3 | 32896.0 | 0.00 | 589,824.0 | 296,064.0 | 140800.0 | 4608.0 | 0.13% | 145408.0 |
43 | extras.7 | 128 3 3 | 256 1 1 | 295168.0 | 0.00 | 589,824.0 | 295,168.0 | 1185280.0 | 1024.0 | 0.12% | 1186304.0 |
44 | loc.0 | 512 38 38 | 16 38 38 | 73744.0 | 0.09 | 212,926,464.0 | 106,486,336.0 | 3252288.0 | 92416.0 | 0.31% | 3344704.0 |
45 | loc.1 | 1024 19 19 | 24 19 19 | 221208.0 | 0.03 | 159,694,848.0 | 79,856,088.0 | 2363488.0 | 34656.0 | 0.23% | 2398144.0 |
46 | loc.2 | 512 10 10 | 24 10 10 | 110616.0 | 0.01 | 22,118,400.0 | 11,061,600.0 | 647264.0 | 9600.0 | 0.51% | 656864.0 |
47 | loc.3 | 256 5 5 | 24 5 5 | 55320.0 | 0.00 | 2,764,800.0 | 1,383,000.0 | 246880.0 | 2400.0 | 0.13% | 249280.0 |
48 | loc.4 | 256 3 3 | 16 3 3 | 36880.0 | 0.00 | 663,552.0 | 331,920.0 | 156736.0 | 576.0 | 0.11% | 157312.0 |
49 | loc.5 | 256 1 1 | 16 1 1 | 36880.0 | 0.00 | 73,728.0 | 36,880.0 | 148544.0 | 64.0 | 0.09% | 148608.0 |
50 | conf.0 | 512 38 38 | 84 38 38 | 387156.0 | 0.46 | 1,117,863,936.0 | 559,053,264.0 | 4505936.0 | 485184.0 | 0.70% | 4991120.0 |
51 | conf.1 | 1024 19 19 | 126 19 19 | 1161342.0 | 0.17 | 838,397,952.0 | 419,244,462.0 | 6124024.0 | 181944.0 | 0.57% | 6305968.0 |
52 | conf.2 | 512 10 10 | 126 10 10 | 580734.0 | 0.05 | 116,121,600.0 | 58,073,400.0 | 2527736.0 | 50400.0 | 0.23% | 2578136.0 |
53 | conf.3 | 256 5 5 | 126 5 5 | 290430.0 | 0.01 | 14,515,200.0 | 7,260,750.0 | 1187320.0 | 12600.0 | 0.14% | 1199920.0 |
54 | conf.4 | 256 3 3 | 84 3 3 | 193620.0 | 0.00 | 3,483,648.0 | 1,742,580.0 | 783696.0 | 3024.0 | 0.11% | 786720.0 |
55 | conf.5 | 256 1 1 | 84 1 1 | 193620.0 | 0.00 | 387,072.0 | 193,620.0 | 775504.0 | 336.0 | 0.09% | 775840.0 |
56 | softmax | 8732 21 | 8732 21 | 0.0 | 0.70 | 550,115.0 | 0.0 | 0.0 | 0.0 | 0.06% | 0.0 |
total | | | | 26285486.0 | 207.65 | 62,782,359,651.0 | 31,435,153,596.0 | 0.0 | 0.0 | 100.00% | 542786664.0 |
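The `softmax` row operates on an 8732 x 21 tensor, and that shape can be cross-checked against the `loc.*` and `conf.*` rows. The sketch below is a minimal check, assuming each default box regresses 4 offsets (the usual SSD convention, not stated in this section). The feature-map sizes and channel counts are copied from the table; `HEADS` and `count_default_boxes` are hypothetical names.

```python
# (feature-map side, loc output channels, conf output channels),
# copied from the loc.* and conf.* rows of the per-layer table.
HEADS = [
    (38, 16, 84),
    (19, 24, 126),
    (10, 24, 126),
    (5, 24, 126),
    (3, 16, 84),
    (1, 16, 84),
]
COORDS_PER_BOX = 4  # assumption: 4 regression offsets per default box


def count_default_boxes(heads):
    """Return (total default boxes, set of per-box class counts)."""
    total = 0
    classes = set()
    for side, loc_ch, conf_ch in heads:
        boxes_per_cell = loc_ch // COORDS_PER_BOX
        # conf channels = boxes per cell * number of classes
        classes.add(conf_ch // boxes_per_cell)
        total += side * side * boxes_per_cell
    return total, classes


total_boxes, class_counts = count_default_boxes(HEADS)
print(total_boxes, class_counts)  # 8732 {21}
```

Both numbers agree with the `8732 21` shape of the `softmax` row: six feature maps contribute 4 or 6 boxes per cell, and every head predicts 21 class scores per box.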
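The totals for SSD300 are summarized in the table below. The network has 26,285,486 parameters and a memory footprint of \(207.65 {\rm MB}\); it requires \(62.78 {\rm G}\) multiply-accumulate operations in total, and the total floating-point operation count is \(31.44 {\rm GFLOPs}\).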
Total params | Total memory | Total MAdd | Total FLOPs | Total MemR+W |
---|---|---|---|---|
26,285,486 | 207.65 MB | 62.78 GMAdd | 31.44 GFLOPs | 517.64 MB |
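The summary figures follow from the `total` row of the per-layer table by unit conversion. The sketch below assumes, based only on the numbers themselves, that the G prefix is decimal (10^9) while MB is binary (2^20 bytes); `summarize` is a hypothetical helper.

```python
# Raw totals copied from the "total" row of the per-layer table.
TOTAL_MADD = 62_782_359_651
TOTAL_FLOPS = 31_435_153_596
TOTAL_MEM_RW_BYTES = 542_786_664


def summarize(madd, flops, mem_rw_bytes):
    """Format raw totals using decimal G and binary MB (assumed units)."""
    return {
        "MAdd": f"{madd / 1e9:.2f} GMAdd",
        "FLOPs": f"{flops / 1e9:.2f} GFLOPs",
        "MemR+W": f"{mem_rw_bytes / 2**20:.2f} MB",
    }


summary = summarize(TOTAL_MADD, TOTAL_FLOPS, TOTAL_MEM_RW_BYTES)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these unit assumptions the output reproduces the summary table: 62.78 GMAdd, 31.44 GFLOPs, and 517.64 MB.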