Validation of FaceReader’s Attention Detection: high accuracy for a custom expression based on head and gaze direction
Why attention is important
Attention is a crucial metric across many different fields, ranging from marketing to human-computer interaction. In marketing, understanding whether consumers are paying attention to advertisements is essential for developing effective campaigns, because consumers who maintain high levels of attention are more likely to develop a positive perception of the brand. A valid measure of attention is equally important in fields like human-computer interaction, where understanding how people engage with content can help improve user interfaces and training tools.
Attention is often assessed with eye trackers; however, these require a calibration task and/or additional hardware (though remote eye tracking is possible). Assessing attention purely from the face therefore has the added benefit that only a webcam is required. With a facial analysis tool like FaceReader, the possible applications of such attention measures are endless. For this data to be useful, however, it is important to validate the accuracy of these measures. In this project we determined the most effective approach for measuring attention and validated this custom expression of Attention.
How the study was conducted
For this project, we used two methods to build the dataset: 1) we semi-randomly selected participants from previous studies, and 2) we recruited additional participants who were instructed to vary their attention (e.g. looking away from the screen, looking at their phone). This resulted in a dataset of 101 participants with an average video length of 1.5 minutes. These videos were then manually annotated according to a predetermined protocol.
Manual labeling is a process in which human raters review a video, frame by frame, to categorize a participant’s behavior; in this context, whether the participant is paying attention or not. This method is considered the gold standard in behavioral research because it relies on human judgment to identify subtle cues. The manual labeling was done by our intern Nicholas Keattch. In addition, two colleagues assessed 10% of the dataset to determine the inter-rater reliability, reaching a high agreement score of 99%.
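To give a sense of how such an agreement score can be computed, here is a minimal sketch of frame-by-frame percentage agreement between two raters. The labels and numbers are illustrative, not the actual annotations:

```python
# Minimal sketch: frame-by-frame percentage agreement between two raters.
# Labels: 1 = attention, 0 = no attention. The example labels below are
# illustrative, not taken from the actual dataset.

def percent_agreement(rater_a, rater_b):
    """Fraction of frames on which both raters assign the same label."""
    assert len(rater_a) == len(rater_b)
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

rater_a = [1, 1, 0, 1, 0, 1, 1, 1, 1, 1]
rater_b = [1, 1, 0, 1, 1, 1, 1, 1, 1, 1]
print(f"Agreement: {percent_agreement(rater_a, rater_b):.0%}")  # Agreement: 90%
```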
It must be noted that attention is a complex and multidimensional process; it includes, for example, covert vs. overt and external vs. internal attention. In this case, we are only measuring the attention that is visibly present (i.e. external overt attention). When people assess their own attention, by contrast, they will likely judge their cognitive attentional effort (related to internal attention). Although these overlap, this means we do not expect our measure to correlate highly with self-reported attention. It is therefore important to measure attention both implicitly (visual attention) and explicitly (self-report) to capture the full story.
Gaze and pose can be used for accurate attention detection
Several custom expressions were created to assess which would have the best quality. To determine the quality of each custom expression, we calculated the sensitivity (i.e. the accuracy for attention classifications) and the specificity (i.e. the accuracy for non-attention classifications). Some participants were recorded in less than ideal situations, e.g. with insufficient lighting or with the face only partly visible; we therefore took the median score to reduce the effect of outliers.
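In code terms, the evaluation amounts to comparing predicted labels against the manual annotations per participant and taking the median across participants. Below is a minimal sketch with illustrative data; this is not the actual analysis pipeline:

```python
import statistics

def sensitivity_specificity(predicted, annotated):
    """Frame-level sensitivity and specificity for one participant.

    Sensitivity: fraction of manually annotated attention frames (1)
    that the expression also classified as attention. Specificity:
    the same for non-attention frames (0).
    """
    true_pos = sum(p == 1 and a == 1 for p, a in zip(predicted, annotated))
    true_neg = sum(p == 0 and a == 0 for p, a in zip(predicted, annotated))
    pos = sum(a == 1 for a in annotated)
    neg = sum(a == 0 for a in annotated)
    return true_pos / pos, true_neg / neg

# One (predicted, annotated) pair per participant; values are illustrative.
participants = [
    ([1, 1, 0, 0, 1], [1, 1, 0, 1, 1]),
    ([1, 0, 0, 1, 1], [1, 0, 1, 1, 1]),
]
scores = [sensitivity_specificity(p, a) for p, a in participants]

# Medians across participants, so that a few poorly recorded videos
# do not dominate the overall result.
median_sensitivity = statistics.median(sens for sens, _ in scores)
median_specificity = statistics.median(spec for _, spec in scores)
print(median_sensitivity, median_specificity)
```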
The custom expressions functionality in the Action Unit module allows for many simple and complex operations. All four example custom expressions work by setting a threshold for some of the inputs (e.g. action units, head orientation) available in FaceReader. Generally, when sensitivity increases, specificity decreases: the expression becomes more likely to classify frames as attention and less likely to classify them as non-attention. For example, with a horizontal gaze angle threshold of 25 degrees (meaning that someone whose horizontal gaze angle exceeds 25 degrees is considered not to be paying attention, see below), specificity is higher than when this threshold is set to 30 degrees. This allows the user to decide which of the two they find more important. The results of the analysis (see graph) indicated that using only the head rotation measures (yaw and pitch) gave high sensitivity (0.99) but very low specificity (0.28). Adding the gaze direction improved the specificity a great deal (0.59-0.85).
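The sketch below illustrates what such a threshold-based rule looks like. The threshold values and variable names are assumptions for illustration; the actual custom expressions are defined inside FaceReader’s Action Unit module, not in Python:

```python
# Minimal sketch of a threshold-based attention rule, assuming per-frame
# head-pose and gaze estimates in degrees, like those FaceReader exposes.
# The threshold values are illustrative, not the validated settings.

HEAD_YAW_MAX = 30.0     # head rotation left/right
HEAD_PITCH_MAX = 20.0   # head rotation up/down
GAZE_H_MAX = 25.0       # horizontal gaze angle

def is_attentive(head_yaw, head_pitch, gaze_h):
    """Classify a single frame as attention (True) or non-attention."""
    return (abs(head_yaw) <= HEAD_YAW_MAX
            and abs(head_pitch) <= HEAD_PITCH_MAX
            and abs(gaze_h) <= GAZE_H_MAX)

print(is_attentive(head_yaw=5, head_pitch=2, gaze_h=10))   # True
print(is_attentive(head_yaw=5, head_pitch=2, gaze_h=40))   # False: gaze off screen
```

Raising GAZE_H_MAX from 25 to 30 degrees makes the rule more permissive, which is exactly the trade-off described above: sensitivity goes up while specificity goes down.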
In some cases, a person can turn their face away from the screen but still keep their eyes focused on it (see example below). By adding an additional calculation it is possible to handle these more difficult situations (V4 in the graph). This also makes it possible to control for other existing biases; however, it does make the custom expression a bit more difficult to understand. In conclusion, this is a well-validated custom expression with a high average accuracy of 0.87, balancing sensitivity (0.88) and specificity (0.85).
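One way to picture that additional calculation is shown below. This is a hypothetical sketch, assuming the gaze angle is reported relative to the head; it is not the exact formula of the validated expression. Instead of thresholding head rotation and gaze separately, the two are combined, so eyes turned back toward the screen offset a head turned away from it:

```python
# Hypothetical sketch of compensating head rotation with gaze, assuming
# the gaze angle is reported relative to the head. If the head is turned
# away but the eyes are turned back toward the screen, the combined
# angle stays small. Not the exact formula of the validated expression.

COMBINED_MAX = 25.0

def is_attentive_v4(head_yaw, gaze_h):
    """True if the gaze, corrected for head rotation, stays on screen."""
    effective_angle = head_yaw + gaze_h
    return abs(effective_angle) <= COMBINED_MAX

# Head turned 30 degrees, eyes turned 25 degrees back toward the screen:
print(is_attentive_v4(head_yaw=30, gaze_h=-25))  # True: still on screen
# Head and eyes both turned away:
print(is_attentive_v4(head_yaw=30, gaze_h=10))   # False
```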
How to use attention in your own study
Are you excited to incorporate this metric into your own marketing or psychological research? You can easily do so! If you already own the FaceReader desktop software, including the Action Unit module, you can get started right away. If you are interested in using one of our ready-made custom expressions, just contact us. To purchase FaceReader desktop, you can contact our partner Noldus.
You can also use the attention metric in FaceReader Online. FaceReader Online is an online platform tailored for professionals in advertising, market research, and UX design. This platform helps decode consumer responses to different creative materials. By using this metric you can assess which content draws attention, supporting well-informed creative decisions. Request a trial and get started instantly.