
InceptionNet

2019/07/12

InceptionNet, also known as GoogLeNet, was originally designed to increase network width: the core Inception structure contains several parallel branches, each with a different receptive field. Large receptive fields suit large targets and small receptive fields suit small targets, giving the network a degree of scale invariance.
The outputs of the different receptive fields are finally merged by a channel-wise concat; to avoid an explosion in channel count, a 1×1 convolution is introduced on each branch to reduce the number of channels.
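As a minimal sketch of that 1×1 reduction (channel counts here are made up for illustration; the full block implementation is in the code section below), in the same TF 1.x API:

import tensorflow as tf

# Reduce 256 input channels to 64 with a cheap 1x1 conv, so the
# expensive 5x5 conv runs on far fewer channels.
inputs = tf.placeholder(tf.float32, [None, 32, 32, 256])
reduced = tf.layers.conv2d(inputs, 64, (1, 1), padding='same',
                           activation=tf.nn.relu, name='reduce_1x1')
branch5_5 = tf.layers.conv2d(reduced, 64, (5, 5), padding='same',
                             activation=tf.nn.relu, name='conv5_5_reduced')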

1. Motivation

Deeper networks overfit more easily and also cost more to compute. The earlier dropout technique yields a sparse network and reduces the number of parameters, but it does not reduce the amount of computation.

2. Model structure

v1 structure

  • Grouped convolutions

    Several kernel sizes are used in parallel within one layer, so the layer sees features at multiple levels. Feature computation does not cross between the groups, which reduces computation (see the code implementation below).

v2 structure

  • Replaces a larger convolution with stacked 3×3 convolutions covering the same receptive field (e.g. one 5×5 convolution becomes two stacked 3×3 convolutions), as sketched below.
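As a rough sketch (shapes and channel counts assumed, not from the original post): for C input and output channels, two stacked 3×3 convolutions cost 2 × 9 × C² = 18C² weights instead of the 25C² of one 5×5.

import tensorflow as tf

# Assumed input feature map (shape is illustrative).
feat = tf.placeholder(tf.float32, [None, 32, 32, 64])
# Two stacked 3x3 convs see the same 5x5 receptive field as one 5x5 conv.
s = tf.layers.conv2d(feat, 64, (3, 3), padding='same',
                     activation=tf.nn.relu, name='stack3x3_a')
s = tf.layers.conv2d(s, 64, (3, 3), padding='same',
                     activation=tf.nn.relu, name='stack3x3_b')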

v3 structure

  • 3×3 is not the smallest convolution
    Factorizing a 3×3 convolution into a 1×3 followed by a 3×1 reduces parameters by about 33%, as sketched below.
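A minimal factorization sketch (shapes assumed): per input/output channel pair, a 1×3 plus a 3×1 kernel costs 3 + 3 = 6 weights versus 9 for a full 3×3, i.e. (9 − 6)/9 ≈ 33% fewer.

import tensorflow as tf

feat = tf.placeholder(tf.float32, [None, 32, 32, 64])  # illustrative shape
# Asymmetric factorization of a 3x3 conv: 1x3 followed by 3x1.
f = tf.layers.conv2d(feat, 64, (1, 3), padding='same',
                     activation=tf.nn.relu, name='conv1x3')
f = tf.layers.conv2d(f, 64, (3, 1), padding='same',
                     activation=tf.nn.relu, name='conv3x1')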

v4 structure

  • Introduces skip connections (i.e. residual connections), as sketched below.
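A minimal residual-connection sketch (the wrapper function and its arguments are hypothetical, not from the original post):

import tensorflow as tf

# Wrap any block with a skip connection; a 1x1 conv projects the input
# so its channel count matches the block output before the addition.
def residual_wrap(x, block_fn, out_channels, name):
    with tf.variable_scope(name):
        out = block_fn(x)                              # main path
        shortcut = tf.layers.conv2d(x, out_channels, (1, 1),
                                    padding='same', name='proj')
        return tf.nn.relu(out + shortcut)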

Code implementation

import tensorflow as tf

def inception_block(x, output_channel_for_each_path, name):
    '''Inception block implementation.

    Args:
        x: input tensor
        output_channel_for_each_path: output channels of each branch
        name: block name
    '''
    with tf.variable_scope(name):
        # 1x1 convolution branch
        conv1_1 = tf.layers.conv2d(x,
                                   output_channel_for_each_path[0],
                                   (1, 1),
                                   strides=(1, 1),
                                   padding='same',
                                   activation=tf.nn.relu,
                                   name='conv1_1')
        # 3x3 convolution branch
        conv3_3 = tf.layers.conv2d(x,
                                   output_channel_for_each_path[1],
                                   (3, 3),
                                   strides=(1, 1),
                                   padding='same',
                                   activation=tf.nn.relu,
                                   name='conv3_3')
        # 5x5 convolution branch
        conv5_5 = tf.layers.conv2d(x,
                                   output_channel_for_each_path[2],
                                   (5, 5),
                                   strides=(1, 1),
                                   padding='same',
                                   activation=tf.nn.relu,
                                   name='conv5_5')
        # Max-pooling branch: 2x2/2 pooling halves the spatial size,
        # so pad it back to the input size before concatenation.
        max_pooling = tf.layers.max_pooling2d(x,
                                              (2, 2),
                                              (2, 2),
                                              name='max_pooling')
        max_pooling_shape = max_pooling.get_shape().as_list()[1:]
        input_shape = x.get_shape().as_list()[1:]
        width_padding = (input_shape[0] - max_pooling_shape[0]) // 2
        height_padding = (input_shape[1] - max_pooling_shape[1]) // 2
        padded_pooling = tf.pad(max_pooling,
                                [[0, 0],
                                 [width_padding, width_padding],
                                 [height_padding, height_padding],
                                 [0, 0]])
        # Merge the four branches along the channel axis
        concat_layer = tf.concat([conv1_1, conv3_3, conv5_5, padded_pooling],
                                 axis=3)
    return concat_layer

x = tf.placeholder(tf.float32, [None, 3072])
y = tf.placeholder(tf.int64, [None])

# CIFAR-10 vectors -> [N, 3, 32, 32] -> NHWC [N, 32, 32, 3]
x_image = tf.reshape(x, [-1, 3, 32, 32])
x_image = tf.transpose(x_image, perm=[0, 2, 3, 1])

conv1 = tf.layers.conv2d(x_image, 32, (3, 3), strides=(1, 1),
                         padding='same', activation=tf.nn.relu, name='conv1')
pooling1 = tf.layers.max_pooling2d(conv1, (2, 2), (2, 2), name='pooling1')

inception_block1a = inception_block(pooling1, [16, 16, 16], 'inception1')
inception_block1b = inception_block(inception_block1a, [16, 16, 16], 'inception2')

pooling2 = tf.layers.max_pooling2d(inception_block1b, (2, 2), (2, 2), name='pooling2')

inception_block2a = inception_block(pooling2, [16, 16, 16], 'inception3')
inception_block2b = inception_block(inception_block2a, [16, 16, 16], 'inception4')

pooling3 = tf.layers.max_pooling2d(inception_block2b, (2, 2), (2, 2), name='pooling3')

# flatten to [None, 4 * 4 * 224], then the fully connected layer
flatten = tf.layers.flatten(pooling3)

y_ = tf.layers.dense(flatten, 10)

loss = tf.losses.sparse_softmax_cross_entropy(labels=y, logits=y_)
predict = tf.math.argmax(y_, 1)

correct_prediction = tf.equal(predict, y)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float64))

with tf.name_scope('train_op'):
    train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
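The snippet above only builds the graph. A minimal smoke-test loop (hypothetical; the original post does not include its data pipeline, so random data stands in for CIFAR-10 batches):

import numpy as np

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for step in range(100):
        batch_data = np.random.rand(20, 3072)             # stand-in images
        batch_labels = np.random.randint(0, 10, size=20)  # stand-in labels
        loss_val, acc_val, _ = sess.run(
            [loss, accuracy, train_op],
            feed_dict={x: batch_data, y: batch_labels})
        if step % 20 == 0:
            print('step %d, loss %.4f, acc %.4f' % (step, loss_val, acc_val))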