Results and Discussion

Example applications of the proposed methodology to the datasets described in sections II-E.1 and II-E.2 are presented in Figures 2-5. These plots were constructed for only some of the possible parameters; results and graphics for other parameters can be easily reproduced using the software and data available via the Internet.

Figures 2 and 3 present results for Dataset I (section II-E.1) in the frequency range 0-40 Hz. Normally it is enough to investigate only the frequency range of interest, e.g. from 5 Hz up, but we wanted to show that the applied statistical procedures are also robust at low frequencies. Since the signals were not detrended before decomposition, most of the energy is concentrated at low frequencies. This makes it difficult to present the whole energy spectrum at once, so for display (panels a) we used the logarithmic scale (all further computations used the actual values of energy). Statistically significant regions in Figure 2f clearly relate to known phenomena: $\mu$ desynchronization (marked as A), desynchronization of the $\mu$ harmonic (B), post-movement $\beta$ synchronization (C) and desynchronization of the harmonic of $\beta$ (D). We observe that the low-frequency non-stationarities present e.g. around the 5th second (probably a movement artifact) do not show up as statistically significant effects.

Similarly, Figures 4 and 5 present results for Dataset II (section II-E.2) in the same frequency range. This dataset was collected with longer inter-movement intervals, so we could analyze longer epochs. As expected, there are no significant effects more than 1-2 seconds away from the movement onset, except for the two resels present in the STFT results (Figure 5b); these can be attributed to the allowed 5% of false discoveries (section II-D.5).

**Figure 2:** Calculating the high resolution ERD/ERS from the MP decomposition in statistically significant regions (Dataset I, sec. II-E.1). a) Average time-frequency energy density approximated from the MP decomposition (eq. 5), for clarity presented in the logarithmic scale (in further computations the actual values of energy are used). Reference epoch marked by black vertical lines, movement onset in the fifth second marked by white dashed line. b) Energy from (a) integrated in resels of 0.25 s $\times$ 2 Hz. c) Average values of ERD/ERS calculated for the time from the end of the reference epoch to the end of the recorded epoch (black dashed vertical lines in a and b, the last resel dropped to avoid border conditions). d) ERD/ERS from (c) indicated as statistically different from the reference epoch by the pseudo-$t$ bootstrap procedure (sec. II-D.3) corrected by a 5% FDR (sec. II-D.5). e) High resolution map of ERD/ERS calculated from (a). f) High resolution ERD/ERS in statistically significant regions from (d): A: $\mu$ desynchronization, B: desynchronization of the $\mu$ harmonic, C: post-movement $\beta$ synchronization, D: harmonic of $\beta$. Horizontal scales in seconds, vertical in Hz.
$\includegraphics[width=\textwidth]{fig/fig2.eps}$

**Figure 3:** (a) STFT estimate of power displayed in the logarithmic scale for Dataset I (same as in Figure 2). (b) ERD/ERS calculated for the same epoch as in Figure 2, displayed for resels revealing significant change (Section II-D.2) corrected by a 5% FDR (Section II-D.5). Epochs and areas marked as in Figure 2. Horizontal scales in seconds, vertical in Hz.
$\includegraphics[width=\textwidth]{fig/fig3.eps}$

**Figure 4:** MP results for Dataset II (sec. II-E.2), where long epochs of EEG were recorded prior to the movement to test for the absence of false positive detections in the stationary pre-movement epoch. a) Average time-frequency energy density approximated from the MP decomposition (eq. 5), for clarity presented in the logarithmic scale (in further computations the actual values of energy are used). Reference epoch marked by vertical lines. b) High resolution map of ERD/ERS. c) High resolution ERD/ERS in statistically significant regions from (b), resel size 0.4 s $\times$ 1.25 Hz: we observe the $\alpha$ desynchronization (A) and synchronization of $\beta$ in the 18-30 Hz band (B), divided in two by the desynchronization of the $\alpha$ harmonic at 24 Hz. Horizontal scales in seconds, vertical in Hz.
$\includegraphics[width=\textwidth]{fig/fig4.eps}$

**Figure 5:** STFT results for Dataset II (sec. II-E.2, movement in the 12th second), the same as analyzed in Figure 4 using MP estimates. The harmonic of $\alpha$ is hardly visible in (a) and (b) and its effect is absent in (c). Two isolated resels indicated as significant can be attributed to the allowed 5% of false discoveries (section II-D.5). Horizontal scales in seconds, vertical in Hz.
$\includegraphics[width=\textwidth]{fig/fig5.eps}$

Time-frequency resolution

Following the considerations from section II-B, we calculated the significant changes in resels corresponding to the same time-frequency resolution for both MP and STFT. However, this by no means implies that the resolutions of MP and STFT are equalized by this approach. Within the significant resels, whose size is equivalent to the resolution of the STFT, we can display the fine microstructure revealed by the MP estimator. In addition, the energy estimated by MP within resels of the same size as for STFT gives higher maximum values of ERD/ERS. Both these effects are clearly visible in Figures 2 and 4, as compared to Figures 3 and 5. For Dataset I, ERD/ERS estimated by STFT reach -51/65%, while MP gives estimates between -90 and 409%. Similarly, for Dataset II (Figures 4 and 5) we obtained -32/44% for STFT and -68/209% for MP.

Nevertheless, in spite of the generally better sensitivity and resolution of MP, we observe that those two methods give similar and consistent results. Taking into account the high computational cost of the MP procedure, we may consider the STFT estimator as an alternative for cases when speed is more important than sensitivity and resolution.
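The resel-level ERD/ERS values compared above can be illustrated with a minimal sketch. It assumes the conventional definition of ERD/ERS as the percent change of energy relative to the mean reference-epoch energy at the same frequency, and a time-frequency map stored as a frequency $\times$ time array; the function names `integrate_in_resels` and `erd_ers_percent`, and the toy data, are hypothetical and not taken from the software accompanying this study.

```python
import numpy as np

def integrate_in_resels(energy, n_t, n_f):
    """Sum a (frequency x time) energy map over non-overlapping resels
    spanning n_f frequency bins and n_t time bins each."""
    n_freq, n_time = energy.shape
    # Drop incomplete resels at the borders of the map.
    f_trim, t_trim = (n_freq // n_f) * n_f, (n_time // n_t) * n_t
    e = energy[:f_trim, :t_trim]
    return e.reshape(f_trim // n_f, n_f, t_trim // n_t, n_t).sum(axis=(1, 3))

def erd_ers_percent(resel_energy, ref_cols):
    """Percent change of each resel relative to the mean energy of the
    reference-epoch resels at the same frequency (assumed definition)."""
    ref = resel_energy[:, ref_cols].mean(axis=1, keepdims=True)
    return 100.0 * (resel_energy - ref) / ref

# Toy map: flat background with one energy decrease and one increase.
energy = np.ones((4, 8))
energy[0:2, 4:6] = 0.1   # simulated desynchronization
energy[2:4, 6:8] = 5.0   # simulated synchronization

resels = integrate_in_resels(energy, n_t=2, n_f=2)    # shape (2, 4)
erds = erd_ers_percent(resels, ref_cols=slice(0, 2))  # first two resel columns as reference
print(erds)
```

In this toy map the decrease appears as -90% and the increase as 400%, which shows why ERS values are unbounded from above while ERD cannot fall below -100%.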

Statistics

In an exploratory approach to the delimitation of significant ``bursts'' of energy, statistical tests for different frequency bands cannot be treated separately if we want to speak of a significance level for the whole procedure. On the other hand, the dramatic loss of power incurred by the Bonferroni correction in this setup has led to the issue of multiplicity being neglected, and hence to the lack of a statistically correct way to delimit the significant changes over the entire time-frequency range of interest.

The results in Figures 2-3 suggest that nonparametric statistics combined with a properly chosen correction for multiplicity (FDR or Bonferroni-Holm) preserve the power needed to detect significant changes even with a low number of repetitions (57).

Figures 4-5 present the performance of these statistics in a case designed especially to contain large (over 80%) time epochs where no activation was expected. This artificially increases the size of the problem, which would make the Bonferroni correction unusable, but FDR still seems to lead to perfectly reasonable results.
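The difference in power between the two corrections can be illustrated with a small sketch. It assumes the Benjamini-Hochberg step-up procedure as the FDR control, which may differ from the exact variant described in section II-D.5; the p-values are invented for illustration and do not come from the datasets of this study.

```python
import numpy as np

def bonferroni(p, alpha=0.05):
    """Reject hypotheses with p <= alpha / m (controls family-wise error)."""
    p = np.asarray(p, dtype=float)
    return p <= alpha / p.size

def fdr_bh(p, q=0.05):
    """Benjamini-Hochberg step-up procedure: find the largest rank k with
    p_(k) <= q * k / m and reject the k smallest p-values."""
    p = np.asarray(p, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()  # largest rank meeting the criterion
        reject[order[:k + 1]] = True
    return reject

# Hypothetical p-values, one per resel.
p_values = np.array([0.001, 0.008, 0.012, 0.019, 0.03,
                     0.2, 0.5, 0.7, 0.9, 0.95])

print("Bonferroni rejections:", bonferroni(p_values).sum())
print("FDR (BH) rejections:  ", fdr_bh(p_values).sum())
```

With these values the Bonferroni threshold of 0.005 retains a single resel, while the FDR procedure retains four. The Bonferroni threshold shrinks as more null resels are added to the map, whereas the FDR thresholds scale with the rank of each p-value.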

Among the proposed and tested methods, bootstrap estimation of the pseudo-$t$ statistics in the reference region (section II-D.3) and FDR correction for multiple comparisons (section II-D.5) seem to be the methods of choice, offering good accuracy at a reasonable computational cost. As expected, FDR proved to be less conservative than the Bonferroni-Holm correction. For the MP estimates, it indicated significant changes in time-frequency areas consistent with other studies. When applied to STFT, it usually left a few isolated resels indicated as significant (cf. Figure 5) that are unrelated to known physiological phenomena, which can be accounted for by the allowed 5% of false discoveries. Application of the Bonferroni-Holm correction cleared the dubious resels from Figure 5, but this should not be interpreted as a recommendation to use this correction for the STFT in general.
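For comparison, the more conservative step-down correction mentioned above can be sketched as follows. This is a generic implementation of Holm's procedure under stated assumptions, not the code used in this study; the function name `holm` and the p-values are hypothetical.

```python
import numpy as np

def holm(p, alpha=0.05):
    """Holm step-down procedure: compare the sorted p-values with
    alpha/m, alpha/(m-1), ..., alpha and stop at the first failure."""
    p = np.asarray(p, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha / (m - np.arange(m))
    passed = p[order] <= thresholds
    # Number of rejections = index of the first failed comparison.
    n_reject = m if passed.all() else int(np.argmin(passed))
    reject = np.zeros(m, dtype=bool)
    reject[order[:n_reject]] = True
    return reject

# Hypothetical p-values, one per resel.
p_values = np.array([0.001, 0.0055, 0.012, 0.019, 0.03,
                     0.2, 0.5, 0.7, 0.9, 0.95])

print("Holm rejections:", holm(p_values).sum())
```

Here the second smallest p-value (0.0055) fails the plain Bonferroni threshold of 0.005 but passes Holm's second threshold of 0.05/9, so two resels are retained. The procedure stops at the third value, which illustrates why it clears isolated resels with moderate p-values that an FDR correction would keep.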

Reference epoch

The experiment providing Dataset II (sec. II-E.2) was designed especially to allow different settings of the reference epoch, owing to the long pre-movement epoch of recorded EEG. Figures 4 and 5 present results for a 2-second-long reference epoch, positioned far away from the movement onset. This indicates the robustness of the presented methodology, which gives no false positive detections in the long pre-movement epoch.

The availability of such a long pre-movement EEG also allows testing different choices of the reference epoch. We found that all the settings consistent with the general considerations from section II-D.1 (including an 11-second-long reference) give similar results, i.e. the resulting statistics designate a similar time-frequency area of significant changes. These figures are not presented, but experiments with different settings of this and other parameters on the datasets used in this study can be easily reproduced using the software and datasets freely available via the Internet.