Automatic human affect recognition is a key step towards more natural human-computer interaction. Recent trends include recognition in the wild using a fusion of audiovisual and physiological sensors, a challenging setting for conventional machine learning algorithms. Since 2010, novel deep learning algorithms have been applied increasingly in this field. In this paper, we review the literature on human affect recognition between 2010 and 2017, with a special focus on approaches using deep neural networks. By classifying a total of 950 studies according to their usage of shallow or deep architectures, we are able to show a trend towards deep learning. Reviewing a subset of 233 studies that employ deep neural networks, we comprehensively quantify their applications in this field. We find that deep learning is used for learning of (i) spatial feature representations, (ii) temporal feature representations, and (iii) joint feature representations for multimodal sensor data. Exemplary state-of-the-art architectures illustrate the recent progress. Our findings show the role deep architectures will play in human affect recognition, and can serve as a reference point for researchers working on related applications.
The rising prevalence of non-communicable diseases calls for more sophisticated approaches to support individuals in engaging in healthy lifestyle behaviors, particularly in terms of their dietary intake. Building on recent advances in information technology, user assistance systems hold the potential of combining active and passive data collection methods to monitor dietary intake and, subsequently, to support individuals in making better decisions about their diet. In this paper, we review the state-of-the-art in active and passive dietary monitoring along with the issues being faced. Building on this groundwork, we propose a research framework for user assistance systems that combine active and passive methods with three distinct levels of assistance. Finally, we outline a proof-of-concept study using video obtained from a 360-degree camera to automatically detect eating behavior from video data as a source of passive dietary monitoring for decision support.
Remote photoplethysmography (rPPG) allows remote measurement of the heart rate using low-cost RGB imaging equipment. In this study, we review the development of the field of rPPG since its emergence in 2008. We also classify existing rPPG approaches and derive a framework that provides an overview of modular steps. Based on this framework, practitioners can use our classification to design algorithms for an rPPG approach that suits their specific needs. Researchers can use the reviewed and classified algorithms as a starting point to improve particular features of an rPPG algorithm.
As a source of valuable information about a person’s affective state, heart rate data has the potential to improve both understanding and experience of human-computer interaction. Conventional methods for measuring heart rate use skin contact methods, where a measuring device must be worn by the user. In an Information Systems setting, a contactless approach without interference in the user’s natural environment could prove to be advantageous. We develop an application that fulfils these conditions. The algorithm is based on remote photoplethysmography, taking advantage of the slight skin color variation that occurs periodically with the user’s pulse. When evaluating this application in an Information Systems setting with various arousal levels and naturally moving subjects, we achieve an average root mean square error of 7.32 bpm for the best performing configuration. We find that a higher frame rate yields better results than a larger size moving measurement window. Regarding algorithm specifics, we find that a more detailed algorithm using the three RGB signals slightly outperforms a simple algorithm using only the green signal.
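To make the green-signal baseline mentioned above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the per-frame mean green value of a skin region has already been extracted, and it picks the dominant frequency within a plausible heart-rate band for each moving window. The function names, the 42 to 240 bpm band, the Hann taper, and the 10-second window are illustrative assumptions, not values taken from the paper.

```python
import numpy as np


def estimate_bpm(green_trace, fps, min_bpm=42.0, max_bpm=240.0):
    """Estimate heart rate in bpm from per-frame mean green values.

    Assumption: `green_trace` holds one mean green intensity per video
    frame for a skin region; the band limits are illustrative defaults.
    """
    x = np.asarray(green_trace, dtype=float)
    x = x - x.mean()                    # remove the constant (DC) component
    x = x * np.hanning(len(x))          # taper to reduce spectral leakage
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    band = (freqs >= min_bpm / 60.0) & (freqs <= max_bpm / 60.0)
    if not band.any():
        raise ValueError("window too short for the requested band")
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz


def sliding_bpm(green_trace, fps, window_s=10.0, step_s=1.0):
    """Apply estimate_bpm over a moving measurement window."""
    win = int(round(window_s * fps))
    step = max(1, int(round(step_s * fps)))
    return [
        estimate_bpm(green_trace[i:i + win], fps)
        for i in range(0, len(green_trace) - win + 1, step)
    ]


def rmse(estimates, reference):
    """Root mean square error between estimated and reference bpm."""
    e = np.asarray(estimates, dtype=float)
    r = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((e - r) ** 2)))


if __name__ == "__main__":
    # Synthetic 72 bpm pulse sampled at 30 fps for 20 seconds.
    fps = 30
    t = np.arange(20 * fps) / fps
    trace = 100.0 + 0.5 * np.sin(2 * np.pi * 1.2 * t)
    print(sliding_bpm(trace, fps)[:3])
```

In a sketch like this, the frequency resolution of each estimate is fps / window length in samples, so frame rate and window size both bound how finely the pulse frequency can be resolved. An approach using all three RGB signals would replace the single green trace with a combination of channels.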
In this demo, the small version of our 2D CNN for intake gesture recognition runs directly in your browser. It takes one raw video frame at a time as input to predict the frame-level probability of an intake event. These probabilities are displayed in the graph on the right.
For best results, place the device on a table and sit so that your upper body fills most of the video frame.