• People
  • Research
  • Projects
  • Publications
  • Resources
ViCoS Lab

Multivariate online kernel density estimation

Subtopic of Online learning with mixture models

Researchers

Matej Kristan, PhD
Matej Kristan, PhD
Danijel Skočaj, PhD
Danijel Skočaj, PhD
Aleš Leonardis, PhD
Aleš Leonardis, PhD

We propose a novel approach for online estimation of probability density functions. Our approach is based on the kernel density estimation (KDE) and produces models that enable online adaptation, which at the same time maintain a low (or bounded) complexity that scales sublinearly with the observed samples. As a result we get an Online Gaussian Mixture Model.

Key idea 1: maintain a non-parametric model of the data itself in a form of a sample distribution ps(x) and use this model to calculate the kernel density estimate pkde(x) of the target distribution when required.

Key idea 2: each new observation as a distribution in a form of a Dirac-delta function and we model the sample distribution by a mixture of Gaussian and Dirac-delta functions.

Key idea 3: each component carries with it its detailed model in a form of a two-component mixture. This is used during online operation to detect whether a component is oversmoothing the distribution. In that case, the component is replaced by its detailed model.

Update in a nutshell: (1) we update the sample model with the observed data-point. (2) the updated model is used to recalculate the optimal bandwidth for the KDE. (3) the sample distribution is refined and compressed.

Estimation of N-D distributions:

3D distributions can be fully visualized, or via sets of 2D projections. The image below shows a 3D reference distributions from which we drew 10k samples (left). The KDE estimated from these and the respective projections are also shown (right).

Mode detection:

The oKDE toolbox contains some potprocessing tools for analysis of the estimated density. Below we show an example of mode detection in the oKDE and reclustering of the components that form each of the detected modes. The reclustering is performed either by moment matching (matching first two moments of a Gaussian to the subset of components in GMM) or by L2 distance minimization between (minimizing the l2 distance between a Gaussian and the subset of components in the GMM).

Density estimation examples

A 1D nonstationary density estimation:

A 3D and 2D statinary density estimation:

Online resources

The Java implementation of the oKDE (implemented by Jonas Luthke) is available from the following GIT hub repository:

  • Github project: https://github.com/joluet/okde-java
  • library - https://github.com/joluet/okde-java
  • example - https://github.com/joluet/okde-java-example

The C++ implementation of a faster and stable oKDE by Fereira, de Matos nad Riberio is available from the following repository:

  • Link to source and installation: link

Matlab code:

  • A toolbox for online KDE: link
  • Fast batch multivariate bandwidth (kernel size) estimation: link or this link
  • 2D batch bandwidth estimation code: link
  • Unscented Hellinger distance between Gaussian mixture models: link

Publications

Faculty of Computer and Information Science

Visual Cognitive Systems Laboratory

University of Ljubljana

Faculty of Computer and Information Science

Večna pot 113
SI-1000 Ljubljana
Slovenia
Tel.: +386 1 479 8245