
Multi-view Traffic Sign Detection, Recognition and 3D Localisation


Academic year: 2021


(1)

Multi-view Traffic Sign Detection, Recognition and 3D Localisation

Radu Timofte, Karel Zimmermann, and Luc Van Gool

(2)

Problem definition

Input: Large set of views and corresponding camera locations

Output: List of traffic signs

(3)

Outline

Single view

• Segmentation – very fast selection of candidate bounding boxes, tuned so that the false-negative rate (FN) approaches 0.

– Traffic signs are designed to stand out from the background, so they have distinctive colors and shapes.

• Detection – AdaBoost classifiers applied to the candidate bounding boxes.

• Recognition – based on SVM classifiers.

Multi-view

• Global optimization – over single-view detections constrained by 3D geometry

(4)

Color-based segmentation (thresholding)

• Estimation of connected components of a thresholded image ($T = [0.5, 0.2, -0.4, 1.0]^\top$)
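The thresholding step can be sketched as follows. This is only an illustration under stated assumptions: the slide does not say how the four entries of T are used, so the sketch assumes the first entry is a scalar threshold t and the remaining three are weights on the R, G and B channels (values in [0, 1]). The function names `threshold_mask` and `connected_boxes` are hypothetical.

```python
from collections import deque

def threshold_mask(image, T):
    """Mark pixels whose weighted colour exceeds a threshold.

    image: list of rows, each row a list of (r, g, b) tuples in [0, 1].
    T: (t, a, b, c). ASSUMPTION: t is the threshold and a, b, c weight R, G, B;
    the slide gives only the numbers, not their roles.
    """
    t, a, b, c = T
    return [[a * r + b * g + c * bl > t for (r, g, bl) in row] for row in image]

def connected_boxes(mask):
    """Return bounding boxes (top, left, bottom, right) of 4-connected regions."""
    h = len(mask)
    w = len(mask[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

# Toy image: bluish pixels (B) on a grey background (G).
B, G = (0.1, 0.1, 0.9), (0.5, 0.5, 0.5)
image = [[B, G, G, B],
         [B, G, G, G],
         [G, G, G, G]]
boxes = connected_boxes(threshold_mask(image, (0.5, 0.2, -0.4, 1.0)))
```

Each returned box is one candidate region; a real pipeline would run this per threshold and pool the candidates.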

(5)

Shape-based segmentation

• Searching for specific shapes (rectangles, circles, triangles).

– Not all traffic signs are locally threshold separable.

– More time consuming, many responses for small shapes (every small region is approximately some basic shape).

(6)

Learning segmentation

• There are thousands of possible settings of such methods, e.g. different projections from color space.

• Learning is searching for a reasonable subset of these methods/settings.

• Optimal trade-off among FN, FP and the number of methods:

$T^* = \arg\min_T \big( FP(T) + K_1 \, FN(T) + K_2 \, \mathrm{card}(T) \big)$

• Boolean Linear Programming selects ≈ 50 methods out of 10000 in 2 hours.

• Example segmentation results: FN_BB = 1.5% (false negatives counted per bounding box), FP = 3443 per 2-megapixel image, FN_TS = 0.5% (false negatives counted per traffic sign).
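The trade-off between FP, FN and the number of selected methods can be illustrated on a toy problem. The slides solve it with Boolean Linear Programming; the sketch below swaps in exhaustive search over subsets, which is feasible only for a handful of methods. It also assumes that a sign counts as found if any selected method finds it and that false positives simply add up across methods. The function name `select_methods` and all numbers are hypothetical.

```python
from itertools import combinations

def select_methods(detected, false_pos, n_signs, K1, K2):
    """Pick the subset T minimising FP(T) + K1 * FN(T) + K2 * card(T).

    detected[m]: set of sign ids found by method m (toy data).
    false_pos[m]: number of false positives produced by method m.
    ASSUMPTION: FN(T) counts signs missed by every method in T, and
    FP(T) is the sum of the per-method false positives.
    """
    best_cost, best_subset = None, None
    for size in range(len(detected) + 1):
        for subset in combinations(range(len(detected)), size):
            found = set().union(*(detected[m] for m in subset))
            fn = n_signs - len(found)
            fp = sum(false_pos[m] for m in subset)
            cost = fp + K1 * fn + K2 * len(subset)
            if best_cost is None or cost < best_cost:
                best_cost, best_subset = cost, subset
    return best_subset, best_cost

# Four hypothetical segmentation methods evaluated on four annotated signs.
detected = [{0, 1, 2}, {2, 3}, {3}, {0, 1, 2, 3}]
false_pos = [5, 4, 1, 30]
subset, cost = select_methods(detected, false_pos, n_signs=4, K1=10, K2=1)
```

Here methods 0 and 2 together find every sign with few false positives, so they beat method 3, which finds everything alone but at a high FP cost.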

(7)

What does the output of segmentation look like?

(8)

Detection

• Detection: suppression of bounding boxes that do not look like a traffic sign.

– Haar features computed on each channel of HSI space.

– Separate shape-specific cascades of AdaBoost classifiers.
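As a rough illustration of the Haar features mentioned above, the sketch below computes a two-rectangle feature on a single channel using an integral image, which is the usual way such features are evaluated in constant time. The channel is a plain 2D list here; the HSI conversion, the feature set actually used, and the cascade training are not shown. All function names are hypothetical.

```python
def integral_image(channel):
    """Return a (h+1) x (w+1) table where ii[y][x] is the sum of channel[:y][:x]."""
    h, w = len(channel), len(channel[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += channel[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def haar_two_rect_horizontal(ii, top, left, height, width):
    """Left half minus right half of the window (a vertical-edge response)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))

# Toy channel with a bright left half and a dark right half.
channel = [[1, 1, 0, 0],
           [1, 1, 0, 0]]
ii = integral_image(channel)
feature = haar_two_rect_horizontal(ii, 0, 0, 2, 4)
```

A boosted classifier would threshold many such feature values, computed per channel, to accept or reject each candidate bounding box.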

• Detection (+segmentation) results:

(9)

What does the output of detection look like?

(10)

3D optimization - introduction

• Single-view detection and recognition are only preprocessing; the final decision is made by global optimization over multiple views.

• The idea is based on Minimum Description Length (MDL), i.e. explaining the detected bounding boxes with the smallest number of real-world traffic signs.

• If a group of detections satisfies certain visual and geometrical constraints, then all of these detections can be explained by a single real-world traffic sign.
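The MDL idea can be sketched as a small selection problem. The sketch assumes a common MDL-style quadratic objective over a binary indicator vector x (one entry per candidate 3D sign); the slides do not specify the interaction matrix, so the one built here is hypothetical: the diagonal rewards explained detections minus a fixed cost per sign, and the off-diagonal entries penalise two hypotheses that claim the same detections. Exhaustive search stands in for whatever optimizer the authors use.

```python
from itertools import product

def build_matrix(explained, cost_per_sign=1.0):
    """Build a symmetric interaction matrix S (ASSUMED form, not from the slides).

    explained[i]: set of detection ids that 3D hypothesis i would explain.
    """
    n = len(explained)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        S[i][i] = len(explained[i]) - cost_per_sign
        for j in range(i + 1, n):
            shared = len(explained[i] & explained[j])
            # Split the penalty so that x^T S x subtracts `shared` once in total.
            S[i][j] = S[j][i] = -shared / 2.0
    return S

def best_hypotheses(S):
    """Maximise x^T S x over binary x by brute force (small n only)."""
    n = len(S)
    best_x, best_val = None, None
    for x in product((0, 1), repeat=n):
        val = sum(S[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
        if best_val is None or val > best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Three hypothetical 3D signs; hypothesis 1 only re-explains part of hypothesis 0.
explained = [{"a", "b", "c"}, {"b", "c"}, {"d", "e"}]
x, value = best_hypotheses(build_matrix(explained))
```

In this toy case the redundant hypothesis 1 is dropped, so five detections are explained by two signs rather than three.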

(11)

Minimum Description Length in 3D

(20)

Problem formulation

$\max_{X} \; X^{\top} X$

(21)

Example with 16 views

(23)

Results

• The summary of 3D results:

• The average 3D localisation error is 24.54 cm.

(24)

Visualisation of 3D results in one camera

(25)

3D visualisation

(26)

Conclusions

• Traffic Sign Detection, Recognition and 3D Localisation is a challenging problem.

• We propose a multi-view scheme, which combines 2D and 3D analysis.

• The main contributions are:

– Boolean Linear Programming formulation for fast candidate extraction in 2D

– Minimum Description Length formulation for best 3D hypothesis selection

• Work in progress…
