The brain mechanisms underlying the human ability to process faces have been studied extensively in the last two decades. Brain imaging techniques, particularly functional magnetic resonance imaging (fMRI), which has high spatial resolution but limited temporal resolution, are advancing our knowledge of the spatial and anatomical organization of face-sensitive brain areas. At the other end are EEG recording techniques, with high temporal resolution but poor spatial resolution. They reveal event-related potentials (ERPs) that serve as correlates of various aspects of face processing. The best-established ERP marker of face perception is N170, a negative component that occurs roughly 170 ms after stimulus onset. Other markers, such as N250, N400 and P600, which occur later than N170, are considered to contribute to face recognition and identification. In a typical fMRI or ERP study, signals are recorded while the subject is exposed to a face or a non-face stimulus, in order to isolate the brain areas that respond differentially to faces (fMRI) or the temporal intervals that exhibit major differences between face and non-face responses (ERP). Early studies in both fMRI and ERP employed simple signal processing techniques: averaging over multiple instances, followed by manual examination to detect differentiating components. More advanced statistical signal analysis techniques were first applied to fMRI signals; systematic analyses generally lagged behind in the ERP domain. A recent principled approach is that of Moulson et al., who applied statistical classification in the form of Linear Discriminant Analysis (LDA) to face/non-face ERPs and obtained classification accuracies of about 70%.
The current ERP study has two major goals. The first is to emphasize higher brain areas of visual processing at the expense of early visual areas by using a strategically designed non-face control stimulus. The second, and more important, is to apply systematic machine learning and pattern recognition techniques to classify face and non-face responses in both the spatial and temporal domains. Toward the first goal, we use a face and a non-face stimulus derived from the well-known vase/faces illusion.
The key feature of the face and non-face stimuli is that they share the same defining contours, differing only slightly in stimulus space. The defining contours are attributed to whatever forms the figure; as the percept alternates spontaneously, these contours are attributed alternately to the faces or to the vase.
This attribution is biased, using minimal image manipulations, toward either the vase or the faces, but the same contours are used in both cases. These shared contours result in relatively similar responses to the faces and vase stimuli and make classification of the ensuing ERP signals difficult. With this choice of face and non-face stimuli, we bias the analysis toward detecting signal differences that are elicited more by the high-level percepts (face or non-face) than by low-level image differences.
With respect to the second major goal, we note that several types of classifiers have been used in previous EEG classification studies: kNN, logistic regression, multilayer perceptrons, support vector machines and LDA. Other authors have used a group penalty for lasso regression in the frequency domain for EEG signal classification and showed the utility of grouping in that domain. The present study extends this systematic use of classifiers by using the ERP signals obtained with the faces/vase stimuli to test three major classification schemes: kNN, L1-norm logistic regression, and group lasso logistic regression. We perform the classification analysis between the two classes of face and vase responses agnostically, based on purely statistical estimates, without favoring any sensors or temporal intervals. Our goal is to use the data to point to the salient spatio-temporal ERP signatures most indicative of the stimulus classes. ERP classification also has important applications in neuroscience, such as brain-computer interfaces (BCI) and the detection of pathologies such as epilepsy.
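For reference, the two sparse schemes are commonly written as penalized logistic regression objectives. The formulation below is the standard textbook form, not necessarily the exact one used in the study, and the notation is an assumption: $x_i$ is the feature vector of trial $i$, $y_i \in \{-1, +1\}$ its class label, $w$ the weight vector, $b$ the bias, $\lambda$ the regularization strength, and $w_g$ the sub-vector of the $p_g$ weights in group $g$. How features are grouped (for example, by electrode) is likewise an assumption.

```latex
% L1-norm (lasso) logistic regression
\min_{w,\,b} \; \frac{1}{N} \sum_{i=1}^{N}
  \log\!\bigl(1 + \exp\bigl(-y_i (w^\top x_i + b)\bigr)\bigr)
  + \lambda \lVert w \rVert_1

% Group lasso logistic regression
\min_{w,\,b} \; \frac{1}{N} \sum_{i=1}^{N}
  \log\!\bigl(1 + \exp\bigl(-y_i (w^\top x_i + b)\bigr)\bigr)
  + \lambda \sum_{g=1}^{G} \sqrt{p_g}\, \lVert w_g \rVert_2
```

The L1 penalty drives individual weights to zero, whereas the group penalty tends to drive all weights of a group to zero together, which is what makes grouping useful for selecting whole channels or time windows.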
The main results of our tests with the three major classifier schemes are: First, kNN produced the worst performance, having to rely on all, potentially noisy, features. Second, the other two schemes were able to classify the signals with an overall accuracy of roughly 80%. Finally, the learned weights of L1-norm logistic regression and group lasso were able to locate the salient features in space (electrode position) and time in close agreement with the accepted wisdom of previous studies, confirming various markers of face perception such as N170, N400 and P600.
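The way an L1 penalty isolates a few informative features can be illustrated with a minimal sketch. It is not the authors' implementation: the solver (proximal gradient descent, chosen for brevity), the hyperparameters, and the synthetic data are all assumptions, and the function names `soft_threshold` and `fit_l1_logistic` are hypothetical.

```python
import numpy as np

def soft_threshold(w, t):
    """Elementwise soft-thresholding: proximal operator of t * ||w||_1."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def fit_l1_logistic(X, y, lam=0.1, step=0.1, n_iter=500):
    """Proximal gradient descent for L1-penalized logistic regression.

    X: (n_trials, n_features); y: labels in {0, 1}. No intercept, for brevity.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))   # predicted P(y = 1)
        grad = X.T @ (p - y) / n             # gradient of the mean log-loss
        w = soft_threshold(w - step * grad, step * lam)
    return w

# Synthetic stand-in for ERP features (assumption): 200 trials, 50 features,
# of which only features 3 and 17 carry class information.
rng = np.random.default_rng(0)
n_trials, n_features = 200, 50
y = rng.integers(0, 2, size=n_trials)
X = rng.standard_normal((n_trials, n_features))
X[:, 3] += 2.0 * (2 * y - 1)
X[:, 17] -= 2.0 * (2 * y - 1)

w = fit_l1_logistic(X, y)
selected = np.flatnonzero(w)                      # features with nonzero weight
top_two = set(np.argsort(-np.abs(w))[:2].tolist())
train_accuracy = np.mean((X @ w > 0) == (y == 1))
print(selected, top_two, train_accuracy)
```

In this toy setting the largest weights fall on the two informative features and most of the remaining weights are exactly zero, which mirrors how the learned weights can be read as a feature selector.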
Applying sparse dictionary methods not only improved the classification performance but also pointed out the spatial and temporal regions of highest interest. That is, the most discriminant features in time and space (channels) correspond to the ERPs and active regions of the brain that are responsible for distinguishing between a face and a non-face. The following figures illustrate them.
Mapping of L1 regression coefficients to channels (first figure) and time (second figure)
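One plausible way to produce such maps is to reshape the learned weight vector into a channels-by-time grid and aggregate the absolute weights along each axis. The sketch below assumes a channel-major feature layout and uses a small hand-made weight vector; the sizes, the layout, and the values are hypothetical and not taken from the study.

```python
import numpy as np

n_channels, n_times = 4, 6   # hypothetical sizes, far smaller than a real recording

# Hypothetical sparse weight vector, laid out channel-major:
# feature index = channel * n_times + time_sample
w = np.zeros(n_channels * n_times)
w[2 * n_times + 3] = -1.5    # channel 2, time sample 3
w[2 * n_times + 4] = 0.5     # channel 2, time sample 4
w[0 * n_times + 3] = 0.25    # channel 0, time sample 3

W = w.reshape(n_channels, n_times)            # rows: channels, columns: time samples
channel_importance = np.abs(W).sum(axis=1)    # one value per channel
time_importance = np.abs(W).sum(axis=0)       # one value per time sample

top_channel = int(np.argmax(channel_importance))
top_time = int(np.argmax(time_importance))
print(top_channel, top_time)
```

Here channel 2 and time sample 3 receive the largest aggregate weight. Summing absolute values is only one possible aggregation; a maximum or an L2 norm per row or column would serve the same purpose.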
Related Publications
- [1] S. Shariat, V. Pavlovic, T. Papathomas, A. Braun and P. Sinha. “Sparse dictionary methods for EEG signal classification in face perception”. Machine Learning for Signal Processing (MLSP), 2010 IEEE International Workshop on. 2010. pp. 331 – 336.