Analysing object detectors from the perspective of co-occurring object categories
01 January 2018
The accuracy of state-of-the-art Faster R-CNN and YOLO object detectors are evaluated and compared on a special masked MS COCO data-set to measure how much their predictions rely on contextual information encoded at object category level. Category level representation of context is motivated by the fact that it could be an adequate way transfer knowledge between visual and no-visual domains. According to our measurements, current detectors usually do not build strong dependency on contextual information at category level, however, when they does, they does it in a similar way, suggesting that contextual dependence of object categories is an independent property that can be relevant to be transferred.