next up previous
Next: Time-frequency energy distribution Up: The method Previous: Matching Pursuit algorithm


Discrete dyadic Gabor dictionary

A waveform (atom) from a time-frequency dictionary can be expressed as translation ($u$), dilation ($s$) and modulation ($\omega$) of a window function $g(t) \in L^2(R)$
\begin{displaymath}
g_\gamma (t) = \frac {1} {\sqrt{s}} \: g \left ( \frac {t - u} {s} \right )
e^{i \omega t}
\end{displaymath} (7)

Optimal time-frequency resolution is obtained for gaussian $g(t)$, which for the analysis of real-valued discrete signals gives a dictionary of Gabor functions (sine-modulated gaussians):


\begin{displaymath}
g_\gamma(n)=K(\gamma,\phi)e^{-\pi{ \left( {n-u} \over {s} \right) }^2}
\sin(2 \pi \frac \omega N (n-u)+\phi))
\end{displaymath} (8)

The value of $K(\gamma, \phi)$ is such that $\vert\vert g_{\gamma, \phi}\vert\vert=1$. Complete sampling of discrete parameters $u=1\dots N, \omega=1\dots
N/2, s=1\dots N$, where $N$ is the signal's size in points, produces a huge dictionary even for relatively small $N$. Therefore in the "classical" implementation proposed by Mallat and Zhang [Mallat and Zhang, 1993] dictionary's atoms parameters are chosen from dyadic sequences. For a discrete signal of length $N=2^L$ sampling is governed by a new parameter--octave $j$ (integer). Scale $s$, corresponding to atom's width in time, is chosen from dyadic sequence $s=2^j, 0 \leq j \leq L$. Parameters $u$ and $\omega$, corresponding to atom's position in time and frequency, respectively, are sampled for each octave with an interval $s = 2^j$.

Size of this dictionary (and the resolution of decomposition) can be increased by oversampling by $2^l$ ($l\geq 0$) the time and frequency parameters $u$ and $\omega$. Resulting dictionary has $O(2^{2l}N
\log_2 N )$ waveforms, so the computational complexity increases with oversampling by $2^l$. Time and frequency resolutions increase by the same factor:


$\displaystyle \Delta t = 2^{-l} \frac {2^j} {f_s}, \quad\Delta f = 2^{-l} \frac {f_s} {2^j}$     (9)

where $f_s$ is the sampling frequency of analyzed signal. Resolution is hereby understood as the distance between centers of dictionary's atoms neighboring in time or frequency, and depends on the octave $j$ (scale $s = 2^j$). Scale $s$ in turn corresponds to the width of an atom in time (and frequency). We can define a time width of a time-frequency atom as a half-width of the window function $g(t)$:
\begin{displaymath}
T_{1/2} = 2 \frac {2^j} {f_s} \sqrt{ \frac {\ln 2} {\pi} }
\end{displaymath} (10)

In spite of the oversampling, the algorithm still looks for signal's expansion only over a relatively small subset of the possible dictionary's functions. Issues related to the particular structure of this subset will be discussed in the section ``Stochastic dictionaries''.


next up previous
Next: Time-frequency energy distribution Up: The method Previous: Matching Pursuit algorithm
Piotr J. Durka 2001-06-11