Publication detail

BoxCars: Improving Fine-Grained Recognition of Vehicles using 3-D Bounding Boxes in Traffic Surveillance

SOCHOR, J. ŠPAŇHEL, J. HEROUT, A.

Original Title

BoxCars: Improving Fine-Grained Recognition of Vehicles using 3-D Bounding Boxes in Traffic Surveillance

Type

journal article in Web of Science

Language

English

Original Abstract

In this paper, we focus on fine-grained recognition of vehicles mainly in traffic surveillance applications. We propose an approach that is orthogonal to recent advancements in fine-grained recognition (automatic part discovery, bilinear pooling). Also, in contrast to other methods focused on fine-grained recognition of vehicles, we do not limit ourselves to a frontal/rear viewpoint, but allow the vehicles to be seen from any viewpoint. Our approach is based on 3D bounding boxes built around the vehicles. The bounding box can be automatically constructed from traffic surveillance data. For scenarios where it is not possible to use precise construction, we propose a method for an estimation of the 3D bounding box. The 3D bounding box is used to normalize the image viewpoint by "unpacking" the image into a plane. We also propose to randomly alter the color of the image and add a rectangle with random noise to a random position in the image during the training of Convolutional Neural Networks. We have collected a large fine-grained vehicle dataset BoxCars116k, with 116k images of vehicles from various viewpoints taken by numerous surveillance cameras. We performed a number of experiments which show that our proposed method significantly improves CNN classification accuracy (the accuracy is increased by up to 12 percentage points and the error is reduced by up to 50% compared to CNNs without the proposed modifications). We also show that our method outperforms state-of-the-art methods for fine-grained recognition.
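The training-time augmentation mentioned in the abstract (randomly altering the image color and pasting a rectangle of random noise at a random position) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `augment`, the per-channel additive color shift, and the parameters `max_shift` and `max_box_frac` are assumptions chosen for illustration, since the abstract does not state how the color is altered or how large the rectangle is.

```python
import random


def augment(image, rng, max_shift=30, max_box_frac=0.3):
    """Return an augmented copy of `image`.

    `image` is a list of rows; each row is a list of (r, g, b) tuples
    with integer channels in 0..255. The input is not modified.
    All parameter names and defaults are hypothetical.
    """
    height, width = len(image), len(image[0])

    # 1) Random color alteration (assumed form: one additive offset
    #    per channel, clamped to the valid 0..255 range).
    shift = [rng.randint(-max_shift, max_shift) for _ in range(3)]
    out = [
        [tuple(min(255, max(0, c + s)) for c, s in zip(px, shift)) for px in row]
        for row in image
    ]

    # 2) Rectangle of random noise at a random position
    #    (assumed size: at most `max_box_frac` of each image dimension).
    box_h = rng.randint(1, max(1, int(height * max_box_frac)))
    box_w = rng.randint(1, max(1, int(width * max_box_frac)))
    top = rng.randint(0, height - box_h)
    left = rng.randint(0, width - box_w)
    for y in range(top, top + box_h):
        for x in range(left, left + box_w):
            out[y][x] = tuple(rng.randint(0, 255) for _ in range(3))
    return out


# Usage: augment a small uniform gray image with a seeded generator.
rng = random.Random(0)
gray = [[(128, 128, 128)] * 8 for _ in range(6)]
augmented = augment(gray, rng)
```

Passing a seeded `random.Random` instance keeps the sketch reproducible. In a real training pipeline the same idea would normally be applied to image tensors once per sample.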

Keywords

fine-grained recognition, traffic surveillance, 3D bounding boxes, convolutional neural networks

Released

07.03.2018

Publisher

Not specified

Location

Not specified

ISSN

1524-9050

Periodical

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS

Volume

2019

Number

1

Country

US

Pages from

97

Pages to

108

Pages count

12

BibTex


@article{BUT146507,
  author="Jakub {Sochor} and Jakub {Špaňhel} and Adam {Herout}",
  title="BoxCars: Improving Fine-Grained Recognition of Vehicles using 3-D Bounding Boxes in Traffic Surveillance",
  annote="In this paper, we focus on fine-grained recognition of vehicles mainly in traffic
surveillance applications. We propose an approach that is orthogonal to recent
advancements in fine-grained recognition (automatic part discovery, bilinear
pooling). Also, in contrast to other methods focused on fine-grained recognition
of vehicles, we do not limit ourselves to a frontal/rear viewpoint, but allow the
vehicles to be seen from any viewpoint. Our approach is based on 3D bounding
boxes built around the vehicles. The bounding box can be automatically
constructed from traffic surveillance data. For scenarios where it is not
possible to use precise construction, we propose a method for an estimation of
the 3D bounding box. The 3D bounding box is used to normalize the image viewpoint
by {"}unpacking{"} the image into a plane. We also propose to randomly alter the
color of the image and add a rectangle with random noise to a random position in
the image during the training of Convolutional Neural Networks. We have collected
a large fine-grained vehicle dataset BoxCars116k, with 116k images of vehicles
from various viewpoints taken by numerous surveillance cameras. We performed
a number of experiments which show that our proposed method significantly
improves CNN classification accuracy (the accuracy is increased by up to 12
percentage points and the error is reduced by up to 50% compared to CNNs without
the proposed modifications). We also show that our method outperforms
state-of-the-art methods for fine-grained recognition.",
  chapter="146507",
  doi="10.1109/TITS.2018.2799228",
  howpublished="print",
  issn="1524-9050",
  journal="IEEE Transactions on Intelligent Transportation Systems",
  number="1",
  volume="2019",
  year="2018",
  month="march",
  pages="97--108",
  type="journal article in Web of Science"
}