
conv = common.convolutional(conv, (3, 3, 128, 256))
conv = common.convolutional(conv, (1, 1, 256, 128))
conv = common.convolutional(conv, (3, 3, 128, 256))
conv = common.convolutional(conv, (1, 1, 256, 128))
conv_sobj_branch = common.convolutional(conv, (3, 3, 128, 256))
conv_sbbox = common.convolutional(conv_sobj_branch, (1, 1, 256, 3*(NUM_CLASS + 5)), activate=False, bn=False)
return [conv_sbbox, conv_mbbox, conv_lbbox]
End of Function
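Note the channel count of each detection head: every grid cell predicts 3 anchor boxes, and each box carries 4 offsets, 1 objectness score, and NUM_CLASS class probabilities. A quick sanity check of this arithmetic (assuming NUM_CLASS = 80, as for COCO; the value depends on your dataset):

```python
# Channel count of a YOLOv3 head: anchors_per_scale * (x, y, w, h, objectness + classes).
# NUM_CLASS = 80 is an assumption (COCO); substitute your own class count.
NUM_CLASS = 80
anchors_per_scale = 3
channels = anchors_per_scale * (NUM_CLASS + 5)
print(channels)  # 255
```

This is why the final 1x1 convolution above outputs 3*(NUM_CLASS + 5) filters with no batch norm and no activation: the raw values are decoded later.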
def decode(conv_output, i=0):
"""
return tensor of shape [batch_size, output_size, output_size, anchor_per_scale, 5 + num_classes]
contains (x, y, w, h, score, probability)
"""
Here conv_output is the output of one YOLOv3 detection head (conv_sbbox, conv_mbbox or conv_lbbox).
conv_shape = tf.shape(conv_output)
batch_size = conv_shape[0]
output_size = conv_shape[1]
conv_output = tf.reshape(conv_output, (batch_size, output_size, output_size, 3, 5 + NUM_CLASS))
conv_raw_dxdy = conv_output[:, :, :, :, 0:2]
conv_raw_dwdh = conv_output[:, :, :, :, 2:4]
conv_raw_conf = conv_output[:, :, :, :, 4:5]
conv_raw_prob = conv_output[:, :, :, :, 5:]
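The reshape and slicing above can be mirrored in NumPy to confirm the resulting shapes (a sketch assuming NUM_CLASS = 80 and a 13x13 head; only shapes matter here, so zeros stand in for real activations):

```python
import numpy as np

NUM_CLASS = 80  # assumed class count (COCO); the text's NUM_CLASS may differ
batch_size, output_size = 2, 13

# Stand-in for a head output: (batch, grid, grid, 3 * (5 + NUM_CLASS)) channels.
conv_output = np.zeros((batch_size, output_size, output_size, 3 * (5 + NUM_CLASS)), dtype=np.float32)

# Split the channel axis into 3 anchors, each with 5 + NUM_CLASS values.
conv_output = conv_output.reshape(batch_size, output_size, output_size, 3, 5 + NUM_CLASS)
conv_raw_dxdy = conv_output[:, :, :, :, 0:2]  # box center offsets (x, y)
conv_raw_dwdh = conv_output[:, :, :, :, 2:4]  # box size offsets (w, h)
conv_raw_conf = conv_output[:, :, :, :, 4:5]  # objectness score
conv_raw_prob = conv_output[:, :, :, :, 5:]   # per-class probabilities

print(conv_raw_dxdy.shape)  # (2, 13, 13, 3, 2)
print(conv_raw_conf.shape)  # (2, 13, 13, 3, 1)
print(conv_raw_prob.shape)  # (2, 13, 13, 3, 80)
```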
y = tf.tile(tf.range(output_size, dtype=tf.int32)[:, tf.newaxis], [1, output_size])
x = tf.tile(tf.range(output_size, dtype=tf.int32)[tf.newaxis, :], [output_size, 1])
For example, let’s take output_size = 13, then
y = np.tile(np.arange(13)[:, np.newaxis], [1, 13])
and
x = np.tile(np.arange(13)[np.newaxis, :], [13, 1])
are respectively:
y =
[[ 0  0  0  0  0  0  0  0  0  0  0  0  0]
 [ 1  1  1  1  1  1  1  1  1  1  1  1  1]
 [ 2  2  2  2  2  2  2  2  2  2  2  2  2]
 [ 3  3  3  3  3  3  3  3  3  3  3  3  3]
 [ 4  4  4  4  4  4  4  4  4  4  4  4  4]
 [ 5  5  5  5  5  5  5  5  5  5  5  5  5]
 [ 6  6  6  6  6  6  6  6  6  6  6  6  6]
 [ 7  7  7  7  7  7  7  7  7  7  7  7  7]
 [ 8  8  8  8  8  8  8  8  8  8  8  8  8]
 [ 9  9  9  9  9  9  9  9  9  9  9  9  9]
 [10 10 10 10 10 10 10 10 10 10 10 10 10]
 [11 11 11 11 11 11 11 11 11 11 11 11 11]
 [12 12 12 12 12 12 12 12 12 12 12 12 12]]

x =
[[ 0  1  2  3  4  5  6  7  8  9 10 11 12]
 [ 0  1  2  3  4  5  6  7  8  9 10 11 12]
 [ 0  1  2  3  4  5  6  7  8  9 10 11 12]
 [ 0  1  2  3  4  5  6  7  8  9 10 11 12]
 [ 0  1  2  3  4  5  6  7  8  9 10 11 12]
 [ 0  1  2  3  4  5  6  7  8  9 10 11 12]
 [ 0  1  2  3  4  5  6  7  8  9 10 11 12]
 [ 0  1  2  3  4  5  6  7  8  9 10 11 12]
 [ 0  1  2  3  4  5  6  7  8  9 10 11 12]
 [ 0  1  2  3  4  5  6  7  8  9 10 11 12]
 [ 0  1  2  3  4  5  6  7  8  9 10 11 12]
 [ 0  1  2  3  4  5  6  7  8  9 10 11 12]
 [ 0  1  2  3  4  5  6  7  8  9 10 11 12]]
For x and y we add a new axis of length 1 along the last dimension (turning each scalar entry into a one-element vector) before concatenating:
xy_grid = tf.concat([x[:, :, tf.newaxis], y[:, :, tf.newaxis]], axis=-1)
At this point, xy_grid is (13, 13, 2) dimensional.
xy_grid = tf.tile(xy_grid[tf.newaxis, :, :, tf.newaxis, :], [batch_size, 1, 1, 3, 1])
xy_grid = tf.cast(xy_grid, tf.float32)
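The same grid construction can be checked numerically in NumPy (np.tile, np.concatenate and slicing behave the same way as their tf counterparts here; batch_size = 2 is an arbitrary choice for the check):

```python
import numpy as np

batch_size, output_size = 2, 13

# Row indices (y) and column indices (x) of the 13x13 grid.
y = np.tile(np.arange(output_size)[:, np.newaxis], [1, output_size])
x = np.tile(np.arange(output_size)[np.newaxis, :], [output_size, 1])

# Stack (x, y) along a new last axis: shape (13, 13, 2).
xy_grid = np.concatenate([x[:, :, np.newaxis], y[:, :, np.newaxis]], axis=-1)

# Broadcast over the batch and the 3 anchors: shape (batch_size, 13, 13, 3, 2).
xy_grid = np.tile(xy_grid[np.newaxis, :, :, np.newaxis, :], [batch_size, 1, 1, 3, 1]).astype(np.float32)

print(xy_grid.shape)  # (2, 13, 13, 3, 2)
# Cell (row i, column j) holds its own (x, y) = (j, i) coordinates:
print(xy_grid[0, 5, 7, 0])  # [7. 5.]
```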
Now xy_grid is (batch_size, 13, 13, 3, 2) dimensional. Recall that