Next: Results Up: Stochastic Time-Frequency Dictionaries for Previous: Discrete Gabor dictionary


Matching Pursuit with stochastic dictionaries

In forming the dictionary used for MP decomposition, any fixed scheme of subsampling the parameter space introduces a statistical bias into the resulting parametrization. This bias becomes apparent only when statistics come into play, as in the parametrization of large amounts of data--examples are given in the next section. Unbiased MP decompositions can be obtained by an analogue of Monte Carlo methods.

We propose MP with stochastic dictionaries, where the parameters of the dictionary's atoms are randomized before each decomposition. A stochastic dictionary $D$ is constructed for a given signal length $N$ and chosen "resolutions"[*] in time, frequency and scale ( $\Delta t, \Delta \omega$ and $\Delta s$). The space of the parameters $t$, $\omega$ and $s$ is thereby divided into $\frac{\pi}{\Delta \omega}\frac{N^2}{\Delta t \Delta s}$ bricks of size $\Delta t$ by $\Delta \omega$ by $\Delta s$ each, where $\omega \in (0, \pi)$. In each of these bricks, one time-frequency atom is chosen by drawing its parameters from flat (uniform) distributions within the given ranges of the continuous parameters.
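The brick-wise randomization described above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function name `stochastic_dictionary` and the choice to start the scale grid at 1 (to avoid degenerate zero-width atoms) are assumptions made for this example.

```python
import numpy as np

def stochastic_dictionary(N, dt, dw, ds, rng=None):
    """Draw one (t, omega, s) parameter triple per brick.

    The (t, omega, s) space is tiled into bricks of size dt x dw x ds,
    with t and s ranging over the signal length N and omega over (0, pi);
    within each brick the parameters are drawn from flat (uniform)
    distributions, as described in the text.
    """
    rng = np.random.default_rng() if rng is None else rng
    atoms = []
    for t0 in np.arange(0, N, dt):
        for w0 in np.arange(0, np.pi, dw):
            for s0 in np.arange(1, N, ds):  # scale grid starts at 1 (assumption)
                t = rng.uniform(t0, min(t0 + dt, N))
                w = rng.uniform(w0, min(w0 + dw, np.pi))
                s = rng.uniform(s0, min(s0 + ds, N))
                atoms.append((t, w, s))
    return atoms
```

Drawing the dictionary anew before each decomposition removes the fixed-grid bias, since no particular parameter value is systematically favored across decompositions.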

After the first iteration, the dictionary $D$ is reduced to a subset $D_\alpha$, constructed by choosing a fixed percentage of the atoms having the largest correlations with the original signal. In each iteration, the parameters of the atom $g_{\gamma_i}$ chosen from $D_\alpha$ (or from $D$ in the first iteration) are further optimized by a complete search in a dense dictionary $D_{\gamma_i}$, constructed from parameters in the neighborhood of $\gamma_i$.
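The iteration with the reduced subset $D_\alpha$ can be sketched as below. This is a simplified illustration under stated assumptions: the Gabor-atom helper `gabor` and the function `mp_with_subset` are hypothetical names, real-valued atoms are used, and the further refinement over the dense local dictionary $D_{\gamma_i}$ is omitted for brevity.

```python
import numpy as np

def gabor(N, t, w, s):
    """Real Gabor atom of length N, normalized to unit energy (illustrative)."""
    n = np.arange(N)
    g = np.exp(-np.pi * ((n - t) / s) ** 2) * np.cos(w * (n - t))
    return g / np.linalg.norm(g)

def mp_with_subset(signal, params, n_iter=3, keep=0.1):
    """MP sketch: after the first iteration, only the fraction `keep` of
    atoms best correlated with the ORIGINAL signal (D_alpha) is searched."""
    N = len(signal)
    atoms = np.array([gabor(N, *p) for p in params])
    residual = signal.copy()
    # correlations with the original signal determine the subset D_alpha
    c0 = np.abs(atoms @ signal)
    top = np.argsort(c0)[::-1][: max(1, int(keep * len(atoms)))]
    decomposition = []
    for it in range(n_iter):
        pool = np.arange(len(atoms)) if it == 0 else top
        corr = atoms[pool] @ residual
        best = pool[np.argmax(np.abs(corr))]
        a = atoms[best] @ residual
        residual = residual - a * atoms[best]  # standard MP residual update
        decomposition.append((params[best], a))
    return decomposition, residual
```

Restricting later iterations to $D_\alpha$ keeps the per-iteration cost proportional to the subset size, while the first full search ensures that the strongest structures are still seen.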

The numerical implementation uses the formula for fast correlation updates given in [1] and computes some of the products (for long atoms) in the FFT domain. Even so, the computational complexity of this aggregate procedure was empirically approximated as ${\mathcal O}\left(N {\mathrm size}(D)\right)$ per iteration, where $N$ and ${\mathrm size}(D)$ denote the sizes of the signal and the dictionary, respectively.

We must stress at this point that this approach was not driven by a quest to improve the speed/compression ratio. We are interested in a parametrization of certain signal structures that is as exact as possible, free from bias and with a constant time-frequency resolution, primarily for research purposes. Routine applications, such as the parametrization of selected EEG structures for clinical purposes, must be built upon a solid framework; speed improvements at the cost of introducing artifacts, such as the one presented in the next section, are out of the question. Therefore, optimization of this implementation was constrained by the basic requirements: stability and a uniform, predictable resolution. The latter led to the choice of uniform (before randomization) grids of the parameters $t$, $\omega$ and $s$, contrary to the generally preferred structured dictionaries [1,6].



Footnotes

... ''resolutions''[*]
The resolution of the matching pursuit is hard to define in general, since the procedure is nonlinear and signal-dependent. It should be related to the distance between neighboring dictionary waveforms available for decomposition. In the described procedure, this distance does not exceed twice the value of the corresponding parameter ($\Delta t$, $\Delta \omega$ or $\Delta s$).

Piotr J. Durka 2001-03-23