Academic year: 2021

URBAN WP 1

Towards real-time video de-bayering and compression: benchmarking and initial optimizations


Work package 1: Digital Surveying

• Digital Surveying

– Processing on board of the van

• Multi-video capture

– de-bayering and coding

– Processing on the servers

• 3d reconstruction

– decoding, automatic feature extraction and matching…


Patrice Rondao Alface

imec restricted 2008

Work package 1: Digital Surveying

• Main Goal:

Algorithm optimizations for lower computing complexity and improved efficiency

[Diagram: bitrate, quality, performance]

Moore’s law?

ILP, memory and power walls

Parallelization


Hardware and Parallelization

• Parallel programming

– Not all algorithms are parallelizable (e.g. entropy coding…)

– Modifications to enable parallelism at the expense of quality performance

– Different algorithms call for different programming patterns for different strategies and hardware…

• CPU:

• Multithread/Multicore: OpenMP

• SIMD: MMX/SSE e.g. Intel’s Performance Primitives lib. (IPP)

• GPU

– General Purpose GPU programming

– CUDA programming model


Task 1.1: On-board the recording van

Benchmark on video compression tools


D1.1.1 quality analysis and selection of compression tools to optimize

• Benchmark of state-of-the-art video compression algorithms:

– Encoders:

• DCT- and block-based algorithms: AVC (H.264)

– AVC (H.264) : reference software JM 13.0, Intel IPP and IMEC encoders

• DWT- based algorithms:

– JPEG 2000: Kakadu 5.2.4

– PGF: pgfconsole 1.0

– Data:

• 8 x 200 frames at 12 fps in 1628 x 1236 format
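For scale, the raw data volume implied by this format can be worked out directly. The sketch below assumes 1 byte per raw Bayer sample, which the slides do not state; the function name is illustrative only.

```python
# Raw data volume for the benchmark format (1628 x 1236 at 12 fps).
# ASSUMPTION: 1 byte per raw Bayer sample; the bit depth is not given in the slides.

WIDTH, HEIGHT, FPS = 1628, 1236, 12
BYTES_PER_SAMPLE = 1  # assumed, not from the source


def raw_bytes_per_second(width, height, fps, bytes_per_sample=BYTES_PER_SAMPLE):
    """Bytes per second produced by one camera delivering raw Bayer frames."""
    return width * height * bytes_per_sample * fps


samples_per_frame = WIDTH * HEIGHT
rate = raw_bytes_per_second(WIDTH, HEIGHT, FPS)

print(f"samples per frame: {samples_per_frame}")
print(f"raw rate per camera: {rate / 1e6:.1f} MB/s")
print(f"time budget per frame at {FPS} fps: {1000 / FPS:.1f} ms")
```

Under that assumption, each camera delivers about 2 million samples per frame, and the whole chain has roughly 83 ms per frame to keep up with 12 fps.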


D1.1.1 quality analysis and selection of compression tools to optimize

• Related benchmarks:

The European Broadcasting Union (EBU) has organized objective and subjective quality experiments comparing JPEG2000, H.264 (AVC) and MPEG-2 at different image/video HD resolutions. Results are due in Feb. 2009

– Rate Distortion (objective metrics only)

• Marpe et al. 2005:

– low and intermediate resolutions: AVC Intra Main Profile, for HD: JPEG2000

• Topiwala 2005 and JVT team of ITU-T 2004:

– AVC HP Intra offers an R-D gain of between about 0.2 and 1 dB in PSNR over JPEG2000

• Ouaret and Ebrahimi 2006:

– JPEG2000 is very competitive with AVC HP Intra

– Objective and subjective assessment:

• Cho et al. 2007 :

– At high bitrates (high quality), AVC distortions are less perceptible than those of JPEG2000

– At low bitrates, blocking effects of AVC are more annoying than distortions caused by wavelets


D1.1.1 quality analysis and selection of compression tools to optimize

• Data

Front views

Right views

Left views

Rear views


D1.1.1 quality analysis and selection of compression tools to optimize

• Front and rear views

• Quality measured with PSNR on Y component

• Total bits for 200 frames

[Plot: PSNR (dB) versus total bits]
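As a reminder of what the quality axis measures, the following is a minimal sketch of PSNR on a luma (Y) plane. It assumes 8-bit samples (peak value 255) and planes given as lists of rows; it is not the measurement code used for the benchmark.

```python
import math


def psnr(reference, decoded, peak=255):
    """PSNR in dB between two equally sized Y planes (lists of rows of ints).

    ASSUMPTION: 8-bit samples, so the default peak value is 255.
    """
    if len(reference) != len(decoded):
        raise ValueError("planes must have the same height")
    squared_error = 0
    count = 0
    for ref_row, dec_row in zip(reference, decoded):
        if len(ref_row) != len(dec_row):
            raise ValueError("rows must have the same width")
        for r, d in zip(ref_row, dec_row):
            squared_error += (r - d) ** 2
            count += 1
    if count == 0:
        raise ValueError("planes must not be empty")
    if squared_error == 0:
        return math.inf  # identical planes
    mse = squared_error / count
    return 10 * math.log10(peak ** 2 / mse)
```

For a rate-distortion curve, this value would be averaged over the 200 frames of a sequence and plotted against the total number of bits.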


D1.1.1 quality analysis and selection of compression tools to optimize

• Speed measurements

– Intel Core 2 Quad CPU @ 2.4 GHz, 2 GB RAM

– CPUs do not reach the real-time constraint at the desired quality

[Plot: encoding speed (fps)]


D1.1.1 quality analysis and selection of compression tools to optimize

• Temporal Analysis

[Plot: PSNR (dB) versus frame number]


D1.1.1 quality analysis and selection of compression tools to optimize

• Conclusions on the benchmark

– Compression performance depends heavily on the content:

• Quality: results similar to Cho et al. 2007 (subjective tests):

– For lower bitrates, wavelets outperform AVC Baseline Profile and AVC Intra High Profile

– Vice versa for higher bitrates

• Speed/complexity:

– Data dependency!

– Kakadu, PGF and Intel’s AVC are the fastest but do not reach 12 fps

– Imec AVC Baseline Profile is as fast as or faster than Intel’s AVC when parallelism is disabled

• Temporal stability:

– AVC shows significant variations between frames when compared to wavelets

– Parallelization of AVC Intra or Intra High Profile on CPU (IPP / Hyperthreading with OpenMP) or GPU (CUDA)?

– More info available in deliverable D 1.1.1


Task 1.1: On-board the recording van

Towards real-time HD video encoding: initial optimizations


Towards real-time HD video encoding

• Simplified acquisition pipeline: objective 12 fps

[Diagram: two pipeline variants built from the blocks cameras, (raw data) storage, de-bayering, compression and 3D reconstruction, with blocks marked "on van"]


Towards real-time HD video encoding

Measurements without optimizations

De-bayering: 40 fps

AVC encoding: 3 fps

De-bayering is not the bottleneck, but:

CPU-GPU bandwidth limitations!

The best choice for a single functional unit may not be the most suitable one for the end-to-end application

De-bayering speed (fps):

CPU: 40
IPP: 170
CUDA: 500

De-bayering bandwidth (CUDA), labels Bayered and YUV: 195, 250
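The point that de-bayering is not the bottleneck can be checked with a small calculation. The sketch below assumes the stages run one after another on each frame, so the per-frame times add up; this execution model and the function name are assumptions, and CPU-GPU transfer time is deliberately left out.

```python
def serial_throughput(stage_fps):
    """End-to-end frames per second when every frame passes through the
    stages one after another (ASSUMED model: per-frame times simply add)."""
    if not stage_fps or any(fps <= 0 for fps in stage_fps):
        raise ValueError("stage rates must be positive")
    return 1.0 / sum(1.0 / fps for fps in stage_fps)


AVC_FPS = 3  # unoptimized AVC encoding rate from the slide

# De-bayering rates from the slide: CPU 40 fps, IPP 170 fps, CUDA 500 fps.
for name, debayer_fps in [("CPU", 40), ("IPP", 170), ("CUDA", 500)]:
    total = serial_throughput([debayer_fps, AVC_FPS])
    print(f"{name:4s} de-bayering + AVC encoding: {total:.2f} fps")
```

Under this model, even the fastest de-bayering variant leaves the end-to-end rate below 3 fps, because the encoder dominates the per-frame time.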


Towards real-time HD video encoding

• H.264 AVC Intra


Towards real-time HD video encoding

• Intra prediction

– 4x4

– 16x16
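To make the 4x4 case concrete, here is a minimal sketch of intra prediction with SAD-based mode selection. It covers only three of the nine H.264 4x4 luma modes (vertical, horizontal, DC) and assumes both the row above and the column to the left are available; the function names are illustrative and not taken from any of the benchmarked encoders.

```python
def predict_vertical(above, left):
    """Each column repeats the reconstructed sample directly above the block."""
    return [list(above) for _ in range(4)]


def predict_horizontal(above, left):
    """Each row repeats the reconstructed sample directly left of the block."""
    return [[left[y]] * 4 for y in range(4)]


def predict_dc(above, left):
    """All samples take the rounded mean of the 8 neighbouring samples."""
    dc = (sum(above) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]


# Only 3 of the 9 standard 4x4 luma modes are sketched here.
MODES = [
    ("vertical", predict_vertical),
    ("horizontal", predict_horizontal),
    ("dc", predict_dc),
]


def sad(block, prediction):
    """Sum of absolute differences between a block and its prediction."""
    return sum(
        abs(b - p)
        for block_row, pred_row in zip(block, prediction)
        for b, p in zip(block_row, pred_row)
    )


def best_mode(block, above, left):
    """Try every sketched mode and keep the one with the lowest SAD."""
    best_name, best_cost = None, None
    for name, predictor in MODES:
        cost = sad(block, predictor(above, left))
        if best_cost is None or cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost
```

Trying every mode for every block gives each block the same amount of work, whatever its content.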


Towards real-time HD video encoding

• Parallelization Opportunities in H.264

– Hierarchical data organization and data-domain decomposition

– Intra coding : frames and slices are independently coded

– Scalability

• CPU: in frames but not in slices (rate distortion performance drops with the number of slices)

• GPU: in blocks, but with dependencies

– Load-balance

• CPU and GPU: Intra “full” search

enables better load-balance than
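Because intra-coded frames are independent, frame-level parallelism amounts to handing whole frames to a pool of workers. The sketch below shows only that structure: `encode_intra_frame` is a hypothetical stand-in for a real encoder, and a thread pool is used for simplicity. In CPython, threads give no speed-up for pure-Python work, so a real implementation would rely on native code (for example OpenMP, as named in the slides) or on processes.

```python
from concurrent.futures import ThreadPoolExecutor


def encode_intra_frame(frame):
    """HYPOTHETICAL stand-in for an intra-only encoder.

    It coarsely quantizes each sample; a real encoder would perform
    prediction, transform, quantization and entropy coding here.
    """
    return bytes(sample >> 2 for sample in frame)


def encode_sequence(frames, workers=4):
    """Encode independent frames in parallel, keeping the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the order of the inputs.
        return list(pool.map(encode_intra_frame, frames))


if __name__ == "__main__":
    frames = [bytes([i] * 16) for i in range(8)]  # toy 16-sample "frames"
    bitstreams = encode_sequence(frames, workers=4)
    print(len(bitstreams), "frames encoded")
```

Since no frame depends on another, the output is identical to the sequential result whatever the number of workers.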


Towards real-time HD video encoding

• Using IPP blocks

Module: speed-up
SAD: 3.5x
Integer Transform and Quantization: 1.3x
De-blocking filter: 1.3x
Intra prediction (one mode): 1.4x

• Using frame parallelism, preliminary results

– With 4 CPUs:

• speed-up 3.2x, 7 fps

– With 2 CPUs:

• speed-up 1.8x, 4 fps
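The preliminary frame-parallelism figures can be put in perspective with two simple ratios: parallel efficiency (speed-up divided by the number of CPUs) and the further speed-up still needed to reach the 12 fps objective. The helper names below are illustrative.

```python
TARGET_FPS = 12  # real-time objective stated in the slides


def parallel_efficiency(speedup, workers):
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / workers


def remaining_speedup(target_fps, achieved_fps):
    """Additional speed-up factor needed to reach the target frame rate."""
    return target_fps / achieved_fps


# Reported results: number of CPUs -> (speed-up, frames per second).
results = {4: (3.2, 7.0), 2: (1.8, 4.0)}

for cpus, (speedup, fps) in results.items():
    eff = parallel_efficiency(speedup, cpus)
    gap = remaining_speedup(TARGET_FPS, fps)
    print(f"{cpus} CPUs: efficiency {eff:.0%}, still {gap:.2f}x short of {TARGET_FPS} fps")
```

Efficiency is 90% with 2 CPUs and 80% with 4 CPUs, and the 4-CPU result is still about 1.7x short of the 12 fps objective.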


Towards real-time HD video encoding

• Conclusion and Next steps

– Combining frame and slice parallelism

– Mapping to GPU with CUDA

– Combining OpenMP and CUDA

– Prototypes

• M12:

– D1.1.2: Implementation of the flexible compression framework using OpenMAX DL

• M18:

– D1.1.3: Implementation of an optimal compression engine that fits in the GeoAutomation tool-chain.

– Task 2: Processing on the servers

• Planning to improve feature detection with CUDA (SURF or SIFT)
