Multimedia Systems

Research Topics

Multimedia Signal Processing for Big Data

Clearly, we live in the age of a data deluge: users of social networks upload hundreds of millions of multimedia items every single day. Decoding the human genome requires analyzing 3 billion base pairs. The Large Hadron Collider at CERN records about 800 gigabytes per second. Global Internet traffic in 2016 is expected to significantly exceed the zettabyte barrier (one zettabyte equals 10²¹ bytes).

In view of this data deluge we face the fundamental question of how to process and how to interpret big data. Our research focuses on the concept of sparsity: loosely speaking, sparsity refers to the fact that a seemingly complex object can be represented by a few numbers with respect to a suitable representation system. This idea is at the core of some rather popular everyday technologies like JPEG or MP3, which exploit the fact that natural images and sounds allow for a sparse representation.
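To illustrate what sparsity with respect to a suitable representation system means, the following small sketch (our own illustration, not taken from any specific standard or product) approximates a smooth signal from only a few of its discrete cosine transform (DCT) coefficients, the same kind of transform-domain sparsity that JPEG and MP3 build on:

```python
# Illustrative sketch: a smooth signal has only a few significant coefficients
# in a suitable transform domain (here the DCT).
import numpy as np
from scipy.fft import dct, idct

n = 256
t = np.linspace(0.0, 1.0, n)
signal = np.cos(2 * np.pi * 3 * t) + 0.5 * np.cos(2 * np.pi * 7 * t)  # smooth test signal

coeffs = dct(signal, norm="ortho")            # transform to the DCT domain
k = 10                                        # keep only the k largest coefficients
small = np.argsort(np.abs(coeffs))[:-k]
sparse_coeffs = coeffs.copy()
sparse_coeffs[small] = 0.0

approx = idct(sparse_coeffs, norm="ortho")    # reconstruct from the sparse representation
rel_err = np.linalg.norm(signal - approx) / np.linalg.norm(signal)
print(f"relative error with {k}/{n} coefficients: {rel_err:.2e}")
```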

Compressed Sensing (CS):

Compressed sensing is about (re-)sampling (digital) signals at a rate far below what the classical sampling theorem demands; perfect (or at least very high-quality) recovery is still possible, provided the signal has suitable structure (such as “sparsity”). In practical applications (such as MRI scanners) the vector dimension of the signals can be very large (100,000 and more), so extremely efficient recovery algorithms are crucial, as are good dictionaries for the sampling process, which is in fact a multiplication with a measurement matrix (the dictionary).
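As a toy illustration of this setting (with made-up dimensions, far smaller than the 100,000+ of real applications), the following sketch measures a sparse vector with a random measurement matrix and recovers it with orthogonal matching pursuit, one of many possible recovery algorithms:

```python
# Minimal compressed-sensing sketch: a k-sparse vector x is observed through
# m << n random measurements y = A @ x and recovered with orthogonal matching
# pursuit (OMP). All dimensions are toy-sized and purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 400, 100, 8                          # signal length, measurements, sparsity

x = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x[support] = rng.standard_normal(k)

A = rng.standard_normal((m, n)) / np.sqrt(m)   # measurement matrix ("dictionary")
y = A @ x                                      # compressed measurements

# OMP: greedily pick the column most correlated with the residual,
# then re-fit the coefficients on the selected support.
residual, chosen = y.copy(), []
for _ in range(k):
    chosen.append(int(np.argmax(np.abs(A.T @ residual))))
    coef, *_ = np.linalg.lstsq(A[:, chosen], y, rcond=None)
    residual = y - A[:, chosen] @ coef

x_hat = np.zeros(n)
x_hat[chosen] = coef
print("recovery error:", np.linalg.norm(x - x_hat))
```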

We work on fast recovery of very high-dimensional signals with different types of “structure” (including sparsity and density) by iterative algorithms, in particular the application of approximate message passing (AMP) to problems in image processing but also in communications. An example is the detection of radio-frequency identification (RFID) tags: the classical problem of “collisions” is re-interpreted as a compressed sensing measurement, and AMP then allows the detection of a sparse selection of RFID tags from a huge list that in fact forms the compressed sensing measurement matrix.
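The following is a rough sketch of AMP with a soft-thresholding denoiser for sparse recovery; the threshold rule and all parameters are simple heuristics chosen for illustration and do not reflect our actual implementations (for RFID detection or otherwise):

```python
# Rough AMP sketch for sparse recovery: a soft-thresholding step plus the
# characteristic "Onsager" correction term in the residual update.
import numpy as np

def soft(v, tau):
    """Soft-thresholding denoiser used here for sparse signals."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def amp(A, y, n_iter=30, theta=1.3):
    m, n = A.shape
    x, z = np.zeros(n), y.copy()
    for _ in range(n_iter):
        tau = theta * np.linalg.norm(z) / np.sqrt(m)   # heuristic threshold choice
        pseudo = x + A.T @ z                           # effective noisy observation of x
        x_new = soft(pseudo, tau)
        onsager = (z / m) * np.count_nonzero(x_new)    # Onsager correction term
        z = y - A @ x_new + onsager
        x = x_new
    return x

# toy problem
rng = np.random.default_rng(1)
n, m, k = 500, 150, 10
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_hat = amp(A, A @ x_true)
print("AMP recovery error:", np.linalg.norm(x_true - x_hat))
```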

Machine Learning:

While CS exploits sparsity to actively compress the observed signals, one can also use the raw data directly to extract the inherent structure and obtain a sparse model of the observed data. This can be cast as an (unsupervised) machine learning problem based on unlabeled data. In particular, we focus on two specific machine learning methods:

Dictionary Learning: The idea is to represent the observed signals as sparse linear combinations of the atoms of a single underlying dictionary. Thus, a sort of analog sparse source coding is performed, with the dictionary representing the source code. The problem is that the underlying dictionary is unknown and has to be determined from the data. We investigate fundamental limits on how accurately this is possible.
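As a hedged illustration of the dictionary learning problem itself (our research concerns its fundamental limits rather than any particular algorithm), the following sketch generates signals from a hidden dictionary and then fits a dictionary with scikit-learn's DictionaryLearning, used here purely as a convenient off-the-shelf tool:

```python
# Illustrative dictionary learning sketch: learn a dictionary D such that each
# observed signal is approximated by a sparse combination of its columns (atoms).
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(2)
n_atoms, dim, n_signals, sparsity = 20, 15, 300, 3

# synthetic data: signals are sparse combinations of an unknown ground-truth dictionary
D_true = rng.standard_normal((dim, n_atoms))
D_true /= np.linalg.norm(D_true, axis=0)
codes = np.zeros((n_signals, n_atoms))
for c in codes:
    c[rng.choice(n_atoms, sparsity, replace=False)] = rng.standard_normal(sparsity)
X = codes @ D_true.T

learner = DictionaryLearning(n_components=n_atoms, transform_algorithm="omp",
                             transform_n_nonzero_coefs=sparsity, max_iter=50)
sparse_codes = learner.fit_transform(X)          # sparse coefficients per signal
D_learned = learner.components_.T                # learned dictionary (dim x n_atoms)
print("mean residual:", np.linalg.norm(X - sparse_codes @ learner.components_) / n_signals)
```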

Sparse Graphical Models: Graphical models are a popular way of representing complex systems with a large number of components (nodes) and complicated interactions between them. If we assume that the observed data are realizations of a random vector with a fixed probability distribution, the problem of graphical model selection is to determine the underlying graphical model. In particular, we are interested in the conditional independence graph (CIG) of a random process. Two nodes of a CIG are connected if the corresponding random variables are conditionally dependent, given the remaining variables.
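For the special case of i.i.d. Gaussian samples (a simplification of the random-process setting we actually study), a CIG can be estimated with the graphical lasso, since zeros in the precision matrix correspond to missing edges; the following sketch is purely illustrative and uses scikit-learn as an off-the-shelf tool:

```python
# Illustrative CIG estimation for i.i.d. Gaussian data via the graphical lasso.
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(3)
p, n_samples = 6, 2000

# sparse ground-truth precision matrix: its zero pattern encodes the CIG
precision = np.eye(p)
precision[0, 1] = precision[1, 0] = 0.4
precision[2, 3] = precision[3, 2] = -0.3
cov = np.linalg.inv(precision)
X = rng.multivariate_normal(np.zeros(p), cov, size=n_samples)

model = GraphicalLassoCV().fit(X)
edges = np.abs(model.precision_) > 1e-2          # nonzero partial correlations = edges
np.fill_diagonal(edges, False)
print("estimated CIG edges:", np.argwhere(np.triu(edges)))
```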

Multimedia Transmission

Source and Channel Coding:

Error-correction channel coding plays a key role in digital communication systems. Due to the delay constraints of the applications, codes with small-to-medium block size are of particular interest. A special problem of great practical importance is the realisation of a flexible, “adjustable” code rate, as time-variant fading channels necessarily require adaptive modulation and coding. We investigate Low-Density Parity-Check (LDPC) codes specifically constructed for those situations. A key problem of code design is to avoid short cycles in the code graph in order to allow for better decoding results. On the one hand, we investigate code designs that avoid short cycles by construction; on the other hand, we have introduced a transformation-based representation of quasi-cyclic LDPC codes that allows for a simple cycle analysis, which can be used in numerical code design. We also consider extensions of those techniques to design codes with adjustable code rate.
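As a minimal illustration of why the shortest cycles are easy to test for (this is not our transformation-based method for quasi-cyclic codes), length-4 cycles in the Tanner graph exist exactly when two rows of the parity-check matrix H share ones in two or more columns:

```python
# Illustrative check for length-4 cycles in the Tanner graph of a binary
# parity-check matrix H: an off-diagonal entry of H @ H.T that is >= 2 means
# two check nodes share two variable nodes, i.e. a 4-cycle.
import numpy as np

def has_four_cycles(H):
    overlap = H @ H.T                        # pairwise row overlaps
    np.fill_diagonal(overlap, 0)
    return bool(np.any(overlap >= 2))

H_bad = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 1],              # shares two columns with row 0 -> 4-cycle
                  [0, 0, 1, 1]])
H_good = np.array([[1, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1]])
print(has_four_cycles(H_bad), has_four_cycles(H_good))   # True False
```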

Another field of research in coding concerns codes and soft-encoding algorithms for systems in which the transmission from the source to the sink is supported by relays. One problem is that the data received and decoded at the relay may not be error-free. It is therefore desirable to retain and forward the decoded soft information when the data is re-encoded at the relay. A particular problem in this context is how to combine information received directly with information received from the relay.
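A minimal sketch of soft-information combining is given below, under the idealised assumption that the relayed observation is error-free, in which case the log-likelihood ratios (LLRs) of the direct and the relayed link simply add; how to weight imperfectly decoded relay information is precisely the open problem and is not addressed here:

```python
# Illustrative LLR combining for one BPSK bit received over a direct link and a
# relay link, both modelled as AWGN channels; assumes the relay is error-free.
import numpy as np

def bpsk_llr(received, noise_var):
    """LLR of a BPSK symbol (bit 0 -> +1, bit 1 -> -1) over an AWGN channel."""
    return 2.0 * received / noise_var

rng = np.random.default_rng(4)
bit = 1                                        # transmitted bit
symbol = -1.0 if bit else 1.0
y_direct = symbol + rng.normal(scale=1.0)      # noisy direct observation
y_relay = symbol + rng.normal(scale=0.5)       # cleaner relayed observation

llr_direct = bpsk_llr(y_direct, 1.0)
llr_relay = bpsk_llr(y_relay, 0.25)
llr_combined = llr_direct + llr_relay          # valid only for an error-free relay
decision = 0 if llr_combined > 0 else 1
print(llr_direct, llr_relay, llr_combined, "->", decision)
```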

Beyond the conceptual side of research in algorithms, we also work on implementing channel encoders and decoders in programmable hardware (FPGAs, Field-Programmable Gate Arrays). The goal is a highly efficient realisation of practically relevant LDPC channel encoders and decoders in order to investigate their performance at low bit-error rates such as 10⁻⁶ to 10⁻⁸ and to compare measurements with analytical results.
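The following toy example shows the basic idea of comparing a simulated error rate with an analytical result, here for uncoded BPSK over an AWGN channel rather than an LDPC-coded system; at the coded bit-error rates of 10⁻⁶ to 10⁻⁸ mentioned above, simulating enough bits in software becomes infeasible, which is what motivates the FPGA implementations:

```python
# Illustrative Monte Carlo BER estimate for uncoded BPSK over AWGN, compared
# with the analytical BER 0.5 * erfc(sqrt(Eb/N0)).
import numpy as np
from scipy.special import erfc

rng = np.random.default_rng(6)
ebn0_db = 6.0
ebn0 = 10 ** (ebn0_db / 10)
n_bits = 2_000_000

bits = rng.integers(0, 2, n_bits)
symbols = 1.0 - 2.0 * bits                          # BPSK mapping: 0 -> +1, 1 -> -1
noise = rng.normal(scale=np.sqrt(1.0 / (2.0 * ebn0)), size=n_bits)
decisions = (symbols + noise) < 0.0                 # hard decisions back to bits

ber_sim = np.mean(decisions != bits)
ber_theory = 0.5 * erfc(np.sqrt(ebn0))              # analytical BER of BPSK over AWGN
print(f"simulated BER {ber_sim:.2e} vs. analytical {ber_theory:.2e}")
```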

Cross-Layer Design:

The classical design paradigm for communication networks is the Open Systems Interconnection (OSI) reference model. The basic notion is to define separate network layers with clear-cut interfaces and to optimise those layers independently. The major advantage of this principle lies in its simplicity and its universal applicability. The drawback is that the best possible performance will not always be achieved, and this applies in particular to wireless networks. In wireless networks, the notion of “multiuser diversity” allows for very significant performance gains: when many users with time-varying channels are served by the system, at any time instant at least one of them is very likely to have a high-quality channel. High gains in total throughput can be achieved when, at each time instant, the user who currently has the best channel is served. This concept does, however, not necessarily maximise the quality of service for all users. A good example is delay: if the latency constraints of an application are very stringent (e.g. a phone call), there may not be enough time to wait for a better channel, and in those situations the user will have to be scheduled for channel access even though the channel might be of bad quality. Compared to pure throughput maximisation, this implies an “inefficient” use of more power.

The goal of our research in this field is to find good trade-offs between multiuser diversity gains and quality of service for all users by means of novel cross-layer scheduling and resource-allocation algorithms. The solutions we investigate are guided by concepts from multiuser information theory; the theoretical results have to be adapted to the problem at hand to include practical constraints (such as delay limits).
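The trade-off between multiuser diversity and per-user quality of service can be illustrated with a toy scheduling simulation (our own illustration, not one of the algorithms we investigate): a max-rate scheduler maximises throughput but rarely serves users with weak channels, while a proportional-fair scheduler gives every user a share of the channel at some throughput cost:

```python
# Toy comparison of max-rate and proportional-fair scheduling over a fading
# downlink with four users of different average channel quality.
import numpy as np

rng = np.random.default_rng(5)
n_users, n_slots = 4, 10_000
mean_snr = np.array([4.0, 2.0, 1.0, 0.5])            # different average qualities
# time-varying (Rayleigh-fading) per-slot rates in bit/s/Hz
rates = np.log2(1.0 + mean_snr * rng.exponential(size=(n_slots, n_users)))

# max-rate: always schedule the instantaneously best user
max_rate_user = np.argmax(rates, axis=1)

# proportional fair: schedule the user with the best rate relative to its own average
avg = np.full(n_users, 1e-3)
pf_user = np.empty(n_slots, dtype=int)
for t in range(n_slots):
    u = int(np.argmax(rates[t] / avg))
    pf_user[t] = u
    served = np.zeros(n_users)
    served[u] = rates[t, u]
    avg = 0.99 * avg + 0.01 * served                  # exponentially weighted throughput

for name, sched in [("max-rate", max_rate_user), ("prop-fair", pf_user)]:
    total = rates[np.arange(n_slots), sched].sum() / n_slots
    share = np.bincount(sched, minlength=n_users) / n_slots
    print(f"{name}: throughput {total:.2f} bit/s/Hz, time share per user {share.round(2)}")
```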