
CenterNet Keypoint Triplets for Object Detection Review

 4 years ago
source link: https://towardsdatascience.com/centernet-keypoint-triplets-for-object-detection-review-a314a8e4d4b0?gi=8ba9c038b65e

The CenterNet paper is a follow-up to CornerNet. CornerNet uses a pair of corner keypoints to avoid the drawbacks of anchor-based methods. However, its performance is still limited when detecting object boundaries, because it has a weak ability to refer to the global information of an object. The authors of the CenterNet paper analyze the performance of CornerNet and find that its false discovery rate on the MS-COCO validation set is high (especially on small objects), meaning that a large proportion of its predicted bounding boxes are incorrect.


Diagram of the CenterNet

CenterNet tries to overcome the restrictions encountered in CornerNet. As the name suggests, the network uses additional information (the centeredness of the object) to perceive the visual patterns within each proposed region. Instead of a pair of corners, it uses a triplet of keypoints (two corners and a center) to localize each object. The paper's reasoning is that if a predicted bounding box has a high IoU with the ground-truth box, then there is a high probability that a center keypoint of the same class is predicted within its central region, and vice versa. During inference, the corner pairs serve as proposals, and the network verifies whether each proposal is indeed an object by checking whether a center keypoint of the same class falls within its central region. This use of object centeredness keeps the network a one-stage detector while giving it some of the functionality of the RoI pooling used in two-stage detectors.
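The verification step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the tuple layouts, and the choice of a central region covering the middle 1/n of the box in each dimension (with an odd `n`, default 3) are assumptions made for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_tl, y_tl, x_br, y_br), hypothetical layout
Center = Tuple[float, float, int]        # (x, y, class_id), hypothetical layout


def central_region(box: Box, n: int = 3) -> Box:
    """Return the middle 1/n-by-1/n region of a box (n is an assumed parameter)."""
    x_tl, y_tl, x_br, y_br = box
    c_tl_x = ((n + 1) * x_tl + (n - 1) * x_br) / (2 * n)
    c_tl_y = ((n + 1) * y_tl + (n - 1) * y_br) / (2 * n)
    c_br_x = ((n - 1) * x_tl + (n + 1) * x_br) / (2 * n)
    c_br_y = ((n - 1) * y_tl + (n + 1) * y_br) / (2 * n)
    return (c_tl_x, c_tl_y, c_br_x, c_br_y)


def is_verified(box: Box, box_class: int, centers: List[Center], n: int = 3) -> bool:
    """Keep a corner-pair proposal only if a same-class center keypoint
    falls inside its central region."""
    x0, y0, x1, y1 = central_region(box, n)
    return any(
        cls == box_class and x0 <= cx <= x1 and y0 <= cy <= y1
        for cx, cy, cls in centers
    )
```

For a box spanning (0, 0) to (90, 90) with `n = 3`, the central region runs from (30, 30) to (60, 60), so a center keypoint at (45, 45) of the same class would confirm the proposal, while one at (10, 10) would not.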

The diagram of CenterNet shows the addition of a branch that predicts the center heatmap. The branch predicting the corner points works in a similar way to the one described in the CornerNet paper. CornerNet outputs two heatmaps, one for the top-left corners and one for the bottom-right corners of the objects. It also predicts an embedding for each corner and a group of offsets that learn to remap the corners from the heatmaps to the input image.

Center Pooling


A new pooling method is proposed to capture richer and more recognizable visual patterns. It is needed because the center point of an object does not necessarily convey very recognizable visual patterns. The figure above shows how center pooling is performed. Given the feature map from the backbone, we need to determine whether a pixel in the feature map is a center keypoint, but the pixel by itself does not contain enough information about the centeredness of the object. Therefore, the maximum values along its horizontal and vertical directions are found and added together. The authors claim that this leads to better detection of center keypoints.
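As a rough illustration of the operation described above, the sketch below applies it to a single 2-D feature map with NumPy. The function name `center_pool` is made up for this example, and the convolution layers that surround the pooling in the real network are left out.

```python
import numpy as np


def center_pool(feature: np.ndarray) -> np.ndarray:
    """Simplified center pooling on one (H, W) feature map.

    out[i, j] = (max of row i) + (max of column j)
    """
    row_max = feature.max(axis=1, keepdims=True)  # shape (H, 1): horizontal maximum
    col_max = feature.max(axis=0, keepdims=True)  # shape (1, W): vertical maximum
    return row_max + col_max                      # broadcasts to (H, W)


if __name__ == "__main__":
    fmap = np.array([[1.0, 2.0],
                     [3.0, 0.0]])
    print(center_pool(fmap))
```

Every output pixel thus sees the strongest response anywhere along its own row and its own column, which is how a weak center pixel can pick up evidence from more distinctive parts of the object.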

Cascade Corner Pooling

In the CornerNet paper, corner pooling is proposed to capture local appearance features at the corner points of objects. Unlike center pooling, which takes the maximum values across the horizontal and vertical directions, corner pooling takes the maximum values only along the boundary directions.
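A minimal NumPy sketch of corner pooling for the top-left corner follows. It assumes a single 2-D feature map for simplicity, whereas the network applies the two directional poolings to separate feature maps; the function name is hypothetical.

```python
import numpy as np


def top_left_corner_pool(feature: np.ndarray) -> np.ndarray:
    """Simplified top-left corner pooling on one (H, W) feature map.

    For each pixel, take the maximum over all pixels to its right in the
    same row and the maximum over all pixels below it in the same column
    (both inclusive), then add the two.
    """
    # Running maximum scanned from the right edge toward the left.
    right_max = np.maximum.accumulate(feature[:, ::-1], axis=1)[:, ::-1]
    # Running maximum scanned from the bottom edge toward the top.
    down_max = np.maximum.accumulate(feature[::-1, :], axis=0)[::-1, :]
    return right_max + down_max


if __name__ == "__main__":
    fmap = np.array([[1, 5, 2],
                     [0, 3, 4],
                     [7, 1, 0]])
    print(top_left_corner_pool(fmap))
```

A top-left corner usually lies outside the object itself, so looking rightward and downward from a candidate pixel is what lets it find the object's top and left edges.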


a) center pooling, taking max values in both horizontal and vertical directions; b) corner pooling, taking max values in boundary directions; c) cascade corner pooling, taking max values in both boundary directions and internal directions of objects

However, taking the maximum values only in boundary directions makes the detection of corner points sensitive to edges. To address this problem, cascade corner pooling is proposed. Instead of taking the maximum values only in boundary directions, it first looks along a boundary to find a boundary maximum value, then looks inside the object from the location of that boundary maximum to find an internal maximum value, and finally adds the two maximum values together.

a) center pooling module and b) the cascade top corner pooling module

Both center pooling and cascade corner pooling can be easily achieved by combining corner pooling in different directions. Figure a) above shows the structure of the center pooling module, and b) shows the structure of a cascade top corner pooling module. Compared with the top corner pooling in CornerNet, a left corner pooling step is added before the top corner pooling.
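The composition described above can be sketched with four directional running-maximum helpers. This is a simplified illustration under stated assumptions: the helper names are hypothetical, the convolution and normalization layers between the pooling steps are omitted, and merging the first pooling result with the input by element-wise addition before the second pooling is an assumption about how the module combines them.

```python
import numpy as np


def pool_rightward(x: np.ndarray) -> np.ndarray:
    """Max over the pixel and everything to its right in the same row."""
    return np.maximum.accumulate(x[:, ::-1], axis=1)[:, ::-1]


def pool_leftward(x: np.ndarray) -> np.ndarray:
    """Max over the pixel and everything to its left in the same row."""
    return np.maximum.accumulate(x, axis=1)


def pool_downward(x: np.ndarray) -> np.ndarray:
    """Max over the pixel and everything below it in the same column."""
    return np.maximum.accumulate(x[::-1, :], axis=0)[::-1, :]


def pool_upward(x: np.ndarray) -> np.ndarray:
    """Max over the pixel and everything above it in the same column."""
    return np.maximum.accumulate(x, axis=0)


def center_pool(x: np.ndarray) -> np.ndarray:
    """Chaining two opposite directional pools yields the full row or column
    maximum, so center pooling is two such chains added together."""
    return pool_leftward(pool_rightward(x)) + pool_upward(pool_downward(x))


def cascade_top_pool(x: np.ndarray) -> np.ndarray:
    """Horizontal pooling first, then vertical pooling on the merged map
    (assumed merge: element-wise addition)."""
    return pool_downward(x + pool_rightward(x))


if __name__ == "__main__":
    fmap = np.array([[0, 1],
                     [2, 5]])
    print(center_pool(fmap))
    print(cascade_top_pool(fmap))
```

Building everything from the same directional primitive is the design point here: no new low-level operator is needed beyond the corner pooling already used in CornerNet.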

Results

The table above shows the performance of CenterNet on the MS-COCO dataset. The results demonstrate that CenterNet addresses the weaknesses found in CornerNet and outperforms most one-stage methods.

