P. J. Durka, H. Klekowicz, K. J. Blinowska, W. Szelenberger, Sz. Niemcewicz
Abstract
We present an efficient parametric system for the automatic
detection of EEG artifacts in polysomnographic recordings. For each of
the selected types of artifacts, a relevant parameter was calculated
for a given epoch. If any of these parameters exceeded a threshold, the
epoch was marked as an artifact. Evaluation of the system on
18 overnight polysomnographic recordings revealed concordance with
the decisions of human experts close to the inter-expert agreement and to the
repeatability of an expert's decisions, assessed via a double-blind
test. Complete software (Matlab source code) for the presented system
is freely available from the Internet at http://brain.fuw.edu.pl/artifacts.
Experimental data
Each of the 20 polysomnographic recordings of overnight sleep of
healthy volunteers contained 21 EEG channels from the 10-20 system
plus A1/A2 references, two EOG channels, breathing, EMG and
ECG. EEG was sampled at 128 Hz and visually examined for the
presence of artifacts in fixed 4-sec epochs. The data used in this study
was free of ECG artifacts, hence this type was not taken
into account.
For the purpose of evaluation of inter-expert concordance, in three of these recordings artifacts were independently marked by two experts. One of the experts marked artifacts in the same recording after three weeks, not knowing that the filename was changed to satisfy the conditions of a double-blind test. Additionally, in two all-night recordings, different types of artifacts were marked separately, in time intervals related to their actual occurrence, not constrained by fixed epoch length. Only these two recordings were used for optimization of the default values of the thresholds.
ROC curves
In a single EEG epoch, the system can detect an artifact
(positive detection) or not (negative detection). Depending on the
``true'' status of this epoch, as indicated by the expert's decision,
these detections can be true or false, leading to four
possibilities:
(1) true positive (TP), when an artifact was detected in an epoch marked
as an artifact also by the expert,
(2) false positive (FP), when an artifact was detected in an epoch marked by
the expert as non-artifact EEG,
(3) true negative (TN), when no artifact was detected in an epoch marked as
EEG, and
(4) false negative (FN), when no artifact was detected in an epoch marked by
the expert as an artifact.
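We can introduce strict definitions for the
true positive proportion:
$$\mathrm{TPP} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}$$
and the
false positive proportion:
$$\mathrm{FPP} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}}.$$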
These two values are on the axes of the ROC (Receiver Operating
Characteristics) curves [1].
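As an illustration, the four outcomes and the two proportions can be computed from per-epoch decisions as in the minimal Python sketch below. It assumes the standard ROC definitions, TPP = TP/(TP+FN) and FPP = FP/(FP+TN); the function name `roc_point` and the boolean encoding (True = artifact) are hypothetical choices made for this example and are not taken from the published Matlab software.

```python
def roc_point(expert, system):
    """Return (TPP, FPP) for two equal-length sequences of epoch labels.

    expert[i] and system[i] are truthy when epoch i was marked as an
    artifact by the expert and by the automatic system, respectively.
    """
    if len(expert) != len(system):
        raise ValueError("both label sequences must cover the same epochs")

    tp = fp = tn = fn = 0
    for e, s in zip(expert, system):
        e, s = bool(e), bool(s)
        if s and e:
            tp += 1      # detection confirmed by the expert
        elif s and not e:
            fp += 1      # detection in an epoch the expert marked as EEG
        elif not s and not e:
            tn += 1      # no detection, expert agrees
        else:
            fn += 1      # artifact missed by the system

    # Proportions are undefined when the expert marked no artifacts
    # (TPP) or no clean epochs (FPP); NaN signals that case.
    tpp = tp / (tp + fn) if (tp + fn) else float("nan")
    fpp = fp / (fp + tn) if (fp + tn) else float("nan")
    return tpp, fpp
```

Each setting of the thresholds yields one such (FPP, TPP) pair, i.e. one point in the ROC plane.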
Types of artifacts and their parametrization
The above parameters are relatively insensitive to calibration and to other conditions that may change between recordings. The threshold values reported above were optimized on the two overnight recordings in which different types of artifacts were marked separately. The ranges and default values of these thresholds were left unchanged for all the analyzed recordings.
Other parameters (presented below) are directly related to the distribution of the signal's energy in the frequency or time domain, which for EEG can vary between subjects and recordings. Therefore, the actual values of the thresholds for artifact detection were set relative to the statistical properties of the signals in each analyzed recording. To estimate these statistics, the values of a given parameter are calculated for all the EEG epochs in each analyzed recording. All the thresholds used for detection of the following artifacts were related to these statistics.
[Equation (1)]
Distribution of this parameter was estimated separately for each
derivation: it was calculated for all the epochs of a given recording,
and the median
of its distribution in each channel was used
to set the default threshold (different for each channel) as
. Allowed range of this parameter was
.
Estimation of statistical distributions of the remaining parameters (reflecting abrupt slopes, electrode-pop and muscle artifacts) was slightly more complicated. Logarithmic transformation of the values provided distributions close to Gaussian. Based upon the assumption of Gaussian distribution, the variance was calculated only from the data between the first and third quartiles, i.e. neglecting the 25% of tails from each side of the distribution. Such a procedure estimates distribution for the EEG epochs not contaminated by the given type of artifact, if we assume that the parameters related to artifacts fall into the region of outliers.
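The idea of estimating the distribution of artifact-free epochs from the central half of the log-transformed values can be sketched in Python as follows. This is only an illustration under stated assumptions: instead of computing the variance of the data between the quartiles, it uses a closely related shortcut, scaling the interquartile range by the Gaussian constant (the IQR of a Gaussian equals about 1.349 standard deviations). The names `robust_log_stats` and `threshold_from_stats`, and the multiplier `k`, are hypothetical and do not come from the original software.

```python
import math
from statistics import NormalDist, quantiles


def robust_log_stats(values):
    """Estimate mean and standard deviation of log(values) from the
    central 50% of the data, so that outliers in the tails (assumed
    to be artifacts) do not inflate the estimate.

    All values must be strictly positive.
    """
    logs = [math.log(v) for v in values]
    q1, median, q3 = quantiles(logs, n=4)    # quartile cut points
    z75 = NormalDist().inv_cdf(0.75)         # about 0.6745
    sigma = (q3 - q1) / (2.0 * z75)          # IQR -> Gaussian sigma
    return median, sigma


def threshold_from_stats(mu, sigma, k):
    """Map a log-domain threshold mu + k*sigma back to the original
    units of the parameter; k is a hypothetical sensitivity setting."""
    return math.exp(mu + k * sigma)
```

Because only the quartiles enter the estimate, a moderate number of extreme values in either tail leaves the result almost unchanged, which is the property the procedure relies on.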
[Equation (2)]
The values of all these parameters, reflecting the presence of different types of artifacts, must be combined into a final decision regarding the analyzed epoch: artifact or ``good'' EEG. We chose a simple logical disjunction (OR), i.e. if any of the parameters exceeded the corresponding threshold, the epoch was marked as an artifact.
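The decision rule described above (an epoch is an artifact if any parameter exceeds its threshold) amounts to a one-line test. The Python sketch below is illustrative only; the function name `is_artifact` and the parameter names in the usage note are hypothetical placeholders, not identifiers from the published system.

```python
def is_artifact(parameters, thresholds):
    """Return True if at least one parameter exceeds its threshold.

    parameters: dict mapping a parameter name to its value in one epoch
    thresholds: dict mapping the same names to their threshold values
    """
    return any(parameters[name] > limit
               for name, limit in thresholds.items())
```

For example, with thresholds `{"slope": 2.0, "muscle": 5.0}`, an epoch with values `{"slope": 1.0, "muscle": 6.0}` is marked as an artifact because the second parameter exceeds its threshold, regardless of the first.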
The threshold values draw the border between EEG and artifacts. Encephalographers who use the system can adapt the thresholds to changing environments; this procedure was implemented in a graphical user interface. Each threshold can be changed independently of the others, so the relative sensitivity of the system to different artifacts can be controlled.
Results
To quantify the consistency of visual detection and its concordance with the automatic procedure, we approximated the expert-expert and expert-system ROC by curves fitted numerically to the available points; their shape reflected the ROC in the idealized case of classifying items drawn from two overlapping Gaussian distributions. This allowed for calculation of the AUC (area under the curve) parameter proposed in [1]. The resulting values were 0.954 for expert-expert concordance and 0.915 for the average expert-system concordance. When there is a particular priority, e.g. obtaining very ``clean'' EEG at the cost of tagging more of the ``good'' EEG as artifacts, the thresholds are easily set to move the system to another point on the ROC curve, where the proportion of true-positive detections reaches an almost arbitrary value.
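To make the two-Gaussian ROC model concrete, the Python sketch below computes the AUC implied by a single (FPP, TPP) point. It is a simplification and not the fitting procedure used in this study: it assumes two Gaussians of equal variance, for which the ROC curve is fully determined by their separation d' and the AUC equals Phi(d'/sqrt(2)), where Phi is the standard normal cumulative distribution function. The function name `binormal_auc` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def binormal_auc(fpp, tpp):
    """AUC of the equal-variance two-Gaussian ROC curve passing
    through the point (fpp, tpp).

    Both proportions must lie strictly between 0 and 1, otherwise
    the inverse normal CDF is undefined and an error is raised.
    """
    # Separation of the two distributions in units of their common
    # standard deviation.
    d_prime = _STD_NORMAL.inv_cdf(tpp) - _STD_NORMAL.inv_cdf(fpp)
    return _STD_NORMAL.cdf(d_prime / sqrt(2.0))
```

A point on the diagonal (TPP equal to FPP) gives d' = 0 and an AUC of 0.5, i.e. chance-level agreement; points further toward the upper left corner give values approaching 1.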
Discussion
The importance of the problem of EEG artifacts is generally acknowledged
(for a review see e.g. [2]). The proposed system operates in
the space of simple parameters and gives promising, repeatable
results. Easy adjustment of settings depending on priority (very
strict rejection of artifacts or availability of more EEG data)
allows us to get almost arbitrarily small ratios of either false-negative
or false-positive detections. Artifacts occurring in some electrodes
can be ignored by restricting the set of derivations taken into
account. This feature can be useful when only a subset of EEG
derivations will be used in further analysis. After setting these
parameters (thresholds and rejected derivations), the procedure is
fully repeatable and insensitive to any arbitrary factor, such as the
context of the current sleep stage (limited repeatability of sleep
staging is discussed in [3]).
In its current form, the presented system can be used for pre-screening large and uniform datasets. However, the proposed approach has some limitations:
These simplifications were driven by the main goal of portability to different environments (montages and data formats). This goal is crucial, since the system is intended for testing in different laboratories on different types of recordings, including e.g. those from patients with epileptic activity and from patients receiving CNS-activating medications. This communication is the first step towards a generally applicable and robust system, as well as an invitation to those interested in improving the software (freely available at http://brain.fuw.edu.pl/artifacts) or in cooperating in its clinical evaluation.
Acknowledgements
This work was supported by a grant from the Committee for Scientific
Research (Poland) to the Institute of Experimental Physics, Warsaw
University. The EEG was recorded on equipment donated by the
AJUS&KAJUS Foundation, in memory of Prof. Andrzej Jus, M.D., the
pioneer of Polish Clinical EEG and the first person to introduce
polygraphic studies of sleep in Poland.