Detecting Cognitive Impairment and Psychological Well-being among Older Adults Using Facial, Acoustic, Linguistic, and Cardiovascular Patterns Derived from Remote Conversations


Abstract


INTRODUCTION: The aging society urgently requires scalable methods to monitor cognitive decline and identify social and psychological factors indicative of dementia risk in older adults.

METHODS: Our machine learning models captured facial, acoustic, linguistic, and cardiovascular features derived from remote video conversations with 39 individuals with normal cognition or Mild Cognitive Impairment, and classified cognitive status, social isolation, neuroticism, and psychological well-being.

RESULTS: Our model could distinguish a Clinical Dementia Rating of 0.5 (vs. 0) with 0.78 area under the receiver operating characteristic curve (AUC), social isolation with 0.75 AUC, neuroticism with 0.71 AUC, and negative affect scales with 0.79 AUC.

DISCUSSION: Our findings demonstrate the feasibility of remotely monitoring cognitive status, social isolation, neuroticism, and psychological well-being. Speech and language patterns were more useful for quantifying cognitive impairment, whereas facial expression and cardiovascular patterns using remote photoplethysmography were more useful for quantifying personality and psychological well-being.

1 INTRODUCTION↩︎

The number of older adults living with Alzheimer’s disease and related dementias (ADRD) is expected to rise by nearly 13 million in the U.S. by the year 2060 [1], [2]. Mild Cognitive Impairment (MCI) often precedes ADRD and is characterized by cognitive decline greater than expected for an individual’s age and education level, while the person remains capable of performing daily activities independently [3], [4]. Along with cognition, aspects of emotional well-being such as anxiety, loneliness, or depressive symptoms potentially have bidirectional links with cognitive impairment, due to shared neurobiological and behavioral mechanisms that impair brain functions [3]–[13]. Additionally, mood disorders, social isolation, and negative emotions often co-occur with or even precede MCI and can further accelerate cognitive decline [12], [14], [15]. Cognitive impairment and psychological well-being significantly impact the lives of older adults, highlighting the importance of early detection, monitoring, and intervention to maintain their quality of life.

Clinical tools such as the Montreal Cognitive Assessment (MoCA) and Clinical Dementia Rating (CDR), which assess a range of cognitive functions, have become widely accepted standardized methods for evaluating and monitoring cognitive impairment [16], [17]. However, traditional cognitive assessments are not sensitive enough to identify MCI or monitor progression during the early MCI stage [18]–[20]. In standard geriatric care, subtle behavioral changes related to psychological well-being, such as reduced social engagement and declining mental health, are often overlooked. These factors, however, can signal early dementia risk and offer opportunities for timely intervention [5], [21], highlighting the urgent need for innovative approaches to monitor them [22]–[26]. The expected shortage of care services further exacerbates this situation: over 1 million additional direct care workers will be needed by 2031, and the U.S. must nearly triple its number of geriatricians by 2050 [1]. The global shortage of mental health professionals and geriatricians further deepens disparities in care, especially in underserved regions [27], [28]. Even developed countries face these challenges, with twenty states in the U.S. identified as "dementia neurology deserts" [29], and the situation is more severe in developing countries.

The recent widespread adoption of video conferencing platforms for telehealth [30] presents an opportunity to use these tools to prescreen cognitive impairment and remotely identify associated factors, including social isolation and psychological well-being, in older adults [31]–[34]. Recent advancements in artificial intelligence (AI) have spurred research into using telehealth platforms to quantify psychologically relevant behaviors in individuals with MCI through facial, audio, and text analysis [35]–[37]. Despite the active research in computerized interview analysis [32]–[34], relatively few publications focus on automatically quantifying cognitive abilities, social isolation, and psychological well-being in older adults through remote interviews.

Furthermore, with the recent emergence of generative AI, so-called foundation models trained on vast amounts of video, audio, and text data from internet sources, it is imperative to explore whether these innovations can be effectively leveraged to quantify behaviors in older adults [38]–[40]. Advances in machine learning for video analysis also enable non-contact assessment of cardiovascular features, such as remote photoplethysmography (rPPG) from facial videos [41], which could serve as another valuable modality for assessing cognitive impairment and psychological well-being in older adults. For example, rPPG allows the extraction of heart rate variability (HRV) features that have been linked with anxiety and other negative emotions [42], [43].

This work investigates the feasibility of quantifying the psychological well-being, social network, and cognitive ability of individuals with normal cognition or MCI, using digital markers extracted from facial, acoustic, linguistic, and cardiovascular patterns detected by foundation AI models pretrained on large-scale internet datasets. By objectively quantifying various modalities associated with cognitive decline through a scalable, remote, and automated assessment system, we expect this work to provide a step toward enhancing accessibility, reducing disparities in mental health and dementia services [44], and promoting evidence-based therapeutics [45].

2 METHODS↩︎

2.1 Participants and Interview Protocol↩︎

Table 1: Demographics and Clinical Characteristics of the Participants. For sex, "M" stands for male, and "F" is for female. For race, "C" denotes Caucasians, "A" denotes African American, and "O" denotes other. The years of education reflect the total number of academic years completed in formal education. The CDR categories include no impairment (CDR=0) and questionable dementia (CDR=0.5).
| Type | Measure | Normal Cognition (NC) | MCI | Combined (NC+MCI) |
|---|---|---|---|---|
| Demographics | Sex (F/M) | /2 | /8 | /10 |
| | Race (C/A/O) | /1/0 | /3/1 | /4/1 |
| | Age | ± 4.39 | ± 4.81 | 80.69 ± 4.6 |
| | Education (years) | ± 2.12 | ± 2.54 | 15.44 ± 2.34 |
| Cognitive ability | MoCA | ± 2.51 | ± 3.57 | 24.54 ± 3.61 |
| | CDR (no / questionable dementia) | /3 | /16 | /19 |
| Social network, personality, psychological well-being | LSNS-6 | ± 6.08 | ± 5.71 | 14.13 ± 5.86 |
| | Neuroticism | ± 9.16 | ± 7.89 | 16.44 ± 8.41 |
| | Negative affect | ± 8.29 | ± 12.82 | 48.57 ± 10.93 |
| | Social satisfaction | ± 10.70 | ± 13.12 | 49.02 ± 11.77 |
| | Psychological well-being | ± 7.17 | ± 11.12 | 50.87 ± 9.99 |

The data source for this project was the Internet-Based Conversational Engagement Clinical Trial (I-CONECT) (NCT02871921) study [46], [47]. This behavioral intervention aimed to enhance cognitive function by providing social interactions (conversational interactions) to socially isolated older adults, based on accumulating evidence that social isolation is a risk factor for dementia [48]. The study recruited older adults (>75 years old) with MCI or normal cognition from 2 sites: Portland, Oregon, which focused on Caucasian participants, and Detroit, Michigan, which focused on African American participants. The experimental group participated in 30-minute video chats with trained conversational specialists four times a week, along with weekly 10-minute phone check-ins, for 6 months. In contrast, the control group received only the weekly 10-minute phone check-ins. Conversations were semi-structured, with predetermined themes each day ranging from historical events to leisure activities and with pictures used to prompt conversation. Participants with severe depressive symptoms (GDS-15 >= 7) [49] or a clinical diagnosis of dementia were excluded. Clinical diagnoses were made through a consensus process involving neurologists and neuropsychologists, using the National Alzheimer’s Coordinating Center (NACC) Uniform Data Set Version 3 (UDS-3) [50], [51]. Inclusion criteria required participants to be socially isolated according to at least one of the following: (1) a score of 12 or less on the 6-item Lubben Social Network Scale (LSNS-6) [52], (2) engaging in conversations lasting 30 minutes or more no more than twice a week, based on self-report, or (3) responding “often” to at least one question on the 3-item UCLA Loneliness Scale [53].
The intervention results showed that global cognitive function improved significantly in the intervention group (i.e., the video-chat group) compared with the control group after 6 months of intervention, with a Cohen’s d of 0.73. The topline results of this behavioral intervention were published earlier [47], [54].

The COVID-19 pandemic occurred during trial recruitment, and cognitive tests were administered by telephone during that period; the MoCA was replaced with the Telephone MoCA. Out of 94 subjects randomized into the intervention group, 52 had an in-person MoCA (as opposed to the Telephone MoCA). Among them, 39 participants with all the available data, including transcribed data, personality, and NIH Toolbox emotional battery assessments (discussed later), were used in the current study. The demographic characteristics of these participants are shown in Table 1.

2.2 Outcomes and Clinical Assessment↩︎

In this study, we aimed to classify participants with cognitive assessments derived from various scales. The MoCA scores were dichotomized into ‘high’ and ‘low’ categories using a cutoff of 24, which is based on the median score of our participants. A high score indicates better cognitive function. The Normal Cognition (vs. MCI) assessment is the binary encoding of clinician evaluations (NACC UDS V3, Form D1: Clinical Diagnosis Section 1). This encoding assigns a value of 1 to indicate normal cognition and 0 to indicate MCI. Regarding the CDR, participants had scores of either 0, indicating no cognitive impairment, or 0.5, indicating questionable or very mild dementia. These scores were dichotomized accordingly.
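The dichotomization described above can be sketched as follows; the helper name is hypothetical, and the cutoffs are those reported in the text (a MoCA cutoff of 24 based on the participant median, and CDR 0 vs. 0.5):

```python
def dichotomize(score, cutoff):
    """Return 1 for the 'high' group (score above cutoff), else 0."""
    return int(score > cutoff)

# MoCA: median-based cutoff of 24 (high = better cognitive function)
moca_high = dichotomize(25, 24)     # -> 1 (high)
moca_low = dichotomize(24, 24)      # -> 0 (low)

# CDR: scores are either 0 (no impairment) or 0.5 (questionable dementia)
cdr_impaired = dichotomize(0.5, 0)  # -> 1 (questionable dementia)
```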

Social network and psychological well-being assessments included the LSNS-6 [52] for the amount of social interaction, neuroticism from the NEO Five-Factor Inventory (Neuro) [55], and the NIH Toolbox Emotional Battery (NIHTB-EB) [56]. The latter has three composite scores: negative affect, social satisfaction, and psychological well-being [57]. For the 6-item LSNS version used here, a cutoff score of 12 is the suggested threshold to define social isolation [58]. For neuroticism, our participants’ median score of 16 was used to dichotomize participants into groups with higher or lower negative emotional reactivity to stressful stimuli. The NIHTB-EB negative affect, social satisfaction, and psychological well-being composite scores [57] were dichotomized at their medians of 44.10, 48.66, and 53.70, respectively, to group participants into high- and low-score groups.

2.3 Predictors and Multimodal Analysis System for Remote Interview↩︎

Figure 1: Overview of the processing pipeline illustrating the extraction and analysis of language, audio, facial, and cardiovascular patterns from conversation video data. The pipeline integrates state-of-the-art feature representation models for each modality, including LLaMA-65B for language feature embedding, RoBERTa[59] and WavLM[39] for sentiment and audio processing, DINOv2 for facial feature extraction, and rPPG for cardiovascular signal estimation.

2.3.0.0.1 Overall Pipeline

Our proposed multimodal analysis framework uses facial, acoustic, linguistic, and cardiovascular patterns to quantify the cognitive function and psychological well-being of the participants during remote interviews (i.e., semi-structured conversations). The participant segments of the interview recordings are extracted, and facial, acoustic, linguistic, and cardiovascular patterns are derived from them. The extracted time-series multimodal features are aggregated over the video using temporal pooling or a Hidden Markov Model (HMM). The video-level features are processed with binary classifiers, logistic regression and/or gradient boosting, to classify the dichotomized (high or low) rating scales of the cognitive, social network, personality, and psychological well-being assessments. The overall pipeline is shown in Figure 1.

2.3.0.0.2 Preprocessing

Our video data records both the participant’s and the moderator’s activities during the remote interview, which requires segmenting the participant portion of the recording for analysis. Each recording displays an on-screen indicator of the current speaker, either the moderator or a participant ID (starting with "C" followed by 4 digits). We used optical character recognition (OCR), namely EasyOCR [60]–[62], and kept only the frames indicating a participant ID for further analysis. From the participant video segments, we used RetinaFace [63], a facial detection model, for detecting, tracking, and segmenting participants’ faces. For the language analysis, participants’ speech was transcribed with transcription models [64] developed specifically for older adults.
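The speaker-segmentation step can be illustrated with a minimal sketch. Here the per-frame OCR text is assumed to have already been produced (e.g., by EasyOCR); the function name and the exact ID pattern handling are illustrative assumptions:

```python
import re

# Participant IDs appear on screen as "C" followed by 4 digits
PARTICIPANT_ID = re.compile(r"\bC\d{4}\b")

def keep_participant_frames(frame_texts):
    """Return indices of frames whose OCR text shows a participant ID
    (i.e., the participant is speaking) rather than the moderator."""
    return [i for i, text in enumerate(frame_texts)
            if PARTICIPANT_ID.search(text)]

keep_participant_frames(["Moderator", "C1234", "C0042 speaking"])  # [1, 2]
```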

2.3.0.0.3 Facial Biomarkers

We extracted generic facial expression features using DINOv2 [38], a foundation model for facial representation with 1024-dimensional visual embeddings. We also extracted facial emotion, landmarks, and action units with facial analysis pipelines used in previous mental health studies [65]–[67]. Facial emotion comprised 7 categories: neutral, happy, sad, surprised, fearful, disgusted, and angry [66]. Overall, our facial features were extracted at a 1 Hz sampling rate.

2.3.0.0.4 Cardiovascular Biomarkers

rPPG signals convey physiological information by capturing subtle variations in skin color that result from blood volume changes in peripheral blood vessels; these signals are extracted from video recordings of a person’s face. We extracted rPPG signals using the pyVHR package [68], an rPPG extraction toolkit. To estimate heart rate from the rPPG signals, we analyzed their power spectral density in six-second windows advancing 1 second at a time. The final HRV features were derived by taking the 5th, 25th, 50th, 75th, and 95th quantiles of the estimated beats per minute, representing the statistical properties of HRV during the interview, using the pyVHR package.
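The heart-rate estimation step can be sketched with a simple FFT-based power spectral density over sliding windows; the 6-second window, 1-second hop, and quantile summary follow the text, while the sampling rate, the plausible heart-rate band, and the function name are illustrative assumptions (the paper itself uses pyVHR):

```python
import numpy as np

def hr_quantiles_from_rppg(signal, fs=30.0, win_s=6, hop_s=1):
    """Estimate per-window heart rate (bpm) from an rPPG trace via the
    spectral peak, then summarize with the 5/25/50/75/95th quantiles."""
    win, hop = int(win_s * fs), int(hop_s * fs)
    bpms = []
    for start in range(0, len(signal) - win + 1, hop):
        seg = signal[start:start + win]
        seg = seg - seg.mean()                      # remove DC component
        freqs = np.fft.rfftfreq(win, d=1.0 / fs)
        psd = np.abs(np.fft.rfft(seg)) ** 2
        band = (freqs >= 0.7) & (freqs <= 4.0)      # 42-240 bpm (assumed band)
        peak_hz = freqs[band][np.argmax(psd[band])]
        bpms.append(peak_hz * 60.0)
    return np.quantile(bpms, [0.05, 0.25, 0.5, 0.75, 0.95])
```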

2.3.0.0.5 Acoustic Biomarkers

We first downsampled the audio to 16 kHz for acoustic feature extraction. Then, we extracted generic acoustic features from vocal tones every 20 ms using the WavLM [39] model, a foundation model for human speech analysis. We also extracted hand-crafted statistical acoustic features, such as spectral energy and entropy, every 100 ms using the pyAudioAnalysis package [69]; these features were effective for depression and anxiety analysis [65].

2.3.0.0.6 Linguistic Biomarkers

We encoded the entire interview transcript for each participant using LLaMA-65B [40] to capture the high-level context of the text in an 8192-dimensional representation. We also extracted 7 emotions (neutral, happiness, sadness, surprise, fear, disgust, and anger) and positive and negative sentiment at the utterance level using RoBERTa models [59], [70], [71]. Both are large language foundation models (LLMs).

2.3.0.0.7 Participant-level Feature Aggregation

Once all modality features are extracted, they are aggregated over time to represent the entire interview sequence for predicting the participant outcomes. Specifically, we extracted statistical features, such as the average and standard deviation, over the entire video sequence. We also trained a two-state Hidden Markov Model (HMM) with Gaussian observation models to capture the temporal dynamics of each biomarker time series using the SSM package [72]. The dimensionality of the observations was determined by the length of the feature set corresponding to the participant with the longest sequence. 1 shows the details of the statistical features used in our study.
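As a minimal illustration of the statistical (temporal-pooling) branch of this aggregation, a per-frame feature series can be collapsed into a fixed-length participant-level vector; the particular statistics pooled here (mean, standard deviation, quartiles) are illustrative, and the HMM-based aggregation is not shown:

```python
import numpy as np

def pool_features(seq):
    """seq: (T, D) array of per-frame features for one participant.
    Returns a fixed-length (4*D,) participant-level feature vector,
    independent of the sequence length T."""
    seq = np.asarray(seq, dtype=float)
    stats = [seq.mean(axis=0),
             seq.std(axis=0),
             np.quantile(seq, 0.25, axis=0),
             np.quantile(seq, 0.75, axis=0)]
    return np.concatenate(stats)
```

Pooling makes interviews of different lengths comparable, which is why a fixed set of statistics (rather than the raw time series) feeds the downstream classifiers.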

2.3.0.0.8 Multi-modal Fusion and Classification

We applied a late-fusion approach for classification, which was more effective than early-fusion approaches in previous studies [65]. For each modality, we first used a logistic regression classifier with L2 regularization or a gradient-boosting classifier to classify the dichotomized ratings (high vs. low) for cognitive impairment and the other outcomes. Then, we aggregated the classification scores from all modalities using majority-voting or average-score approaches. Majority voting outputs the final prediction by voting on the classification results from all modalities. The average-score approach first averages the predicted class probabilities across all modalities and then selects the class with the higher average probability.
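The two late-fusion rules can be sketched as follows, taking each modality classifier's predicted probability of the "high" class; breaking voting ties toward the "low" class is an implementation assumption here, not specified in the text:

```python
import numpy as np

def majority_vote(probs):
    """probs: per-modality probability of the 'high' class.
    Each modality votes 1 if its probability >= 0.5; the class with
    a strict majority of votes wins (ties fall to class 0)."""
    votes = (np.asarray(probs) >= 0.5).astype(int)
    return int(votes.sum() * 2 > len(votes))

def average_score(probs):
    """Average the per-modality probabilities, then threshold at 0.5."""
    return int(np.mean(probs) >= 0.5)
```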


2.4 Experiment↩︎

To evaluate our multimodal analysis system, we used participant-independent 5-fold cross-validation with 20 repetitions. For each fold, we used 64%, 16%, and 20% of the data as training, validation, and testing splits, respectively. For the evaluation metrics, we used the area under the receiver operating characteristic curve (AUROC or AUC) and accuracy, following previous work [65]. We also report the standard deviation of performance across all cross-validated models to characterize the variability of model performance.
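The 64/16/20 splits follow from holding out each fold as the test set (20%) and splitting the remaining 80% of participants 80/20 into training and validation. A sketch of this scheme, assuming a simple random permutation per repetition (hypothetical function name):

```python
import numpy as np

def repeated_cv_splits(n, n_folds=5, n_repeats=20, seed=0):
    """Yield (train, val, test) participant-index arrays. Each fold's
    held-out 20% is the test set; the remaining 80% is split 80/20 into
    train (64% of all data) and validation (16% of all data)."""
    rng = np.random.default_rng(seed)
    for _ in range(n_repeats):
        order = rng.permutation(n)
        folds = np.array_split(order, n_folds)
        for k in range(n_folds):
            test = folds[k]
            rest = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            n_val = int(round(0.2 * len(rest)))
            val, train = rest[:n_val], rest[n_val:]
            yield train, val, test
```

Splitting by participant index (rather than by frame or utterance) is what makes the evaluation participant-independent.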

We evaluated unimodal and multimodal fusion models in various combinations to understand the relevance of each modality and the interplay between modalities for assessing cognitive function and psychological well-being in older adults. For multimodal fusion, we explored 1) acoustic and language fusion, 2) face and cardiovascular fusion, and 3) all modalities combined. In addition to majority voting and the average score, we also studied the selected-score voting approach, which was effective for late fusion in previous work [65]. Unlike majority voting, which considers all modalities, selected voting includes only the classification results from modalities that achieved AUC > 0.5 on the validation set. This excludes noise from classifiers performing poorly due to irrelevant features in multimodal fusion.
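Selected-score voting can be sketched as a filtered majority vote; the fallback to all modalities when none exceed 0.5 validation AUC is our assumption, not specified in the text:

```python
def selected_vote(probs, val_aucs, thr=0.5):
    """probs: per-modality probability of the 'high' class.
    val_aucs: each modality's AUC on the validation set.
    Only modalities with validation AUC above chance (thr) vote."""
    kept = [p for p, a in zip(probs, val_aucs) if a > thr]
    if not kept:            # fallback (assumed): use all modalities
        kept = list(probs)
    votes = [p >= 0.5 for p in kept]
    return int(sum(votes) * 2 > len(votes))
```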

We also evaluated classification performance using demographic variables, such as years of education, gender, age, and race, to understand their predictive value for cognitive impairment and the other outcomes. Years of education and age were used as continuous real-valued variables, and gender and race were represented as one-hot-encoded categorical variables. We separately evaluated an age-only classifier in addition to the full demographic classifier, following previous work using age to prescreen for MCI [73]. We also included demographic variables in all multimodal fusion analyses, assuming this information is usually available from participants’ input in real-world deployment scenarios.

3 Results↩︎

Our experimental results for quantifying cognitive functions are shown in [table:uni95cog] for each modality and [table:multi95cog] for multimodal fusion analysis. The results for quantifying social network, neuroticism, and psychological well-being are shown in [table:uni95psych] for each modality and [table:multi95psych] for multimodal fusion.

3.1 Quantifying Cognitive Assessment↩︎


3.1.0.0.1 MoCA

Our results show that differentiating high vs. low MoCA scores (4th column in [table:uni95cog] & [table:multi95cog]) is best achieved using the linguistic modality, LLaMA-65B, alone (0.64 AUC and 0.63 accuracy). This was followed by acoustic features (0.63 AUC and 0.58 accuracy) and demographic variables (0.62 AUC and 0.59 accuracy). Facial features and HRV features derived from rPPG were not useful indicators for quantifying MoCA. All multimodal fusion approaches in [table:multi95cog] underperformed the best unimodal approach using the linguistic feature, LLaMA-65B.

3.1.0.0.2 MCI diagnosis

Quantifying MCI diagnosis (5th column in [table:uni95cog] & [table:multi95cog]) was also most effective when using language-based sentiment and emotion features (0.66 AUC and 0.69 accuracy), followed by acoustic and language fusion model (0.66 AUC and 0.63 accuracy).

3.1.0.0.3 CDR

For CDR (6th column in [table:uni95cog] & [table:multi95cog]), the acoustic and language fusion model performed the best (0.78 AUC and 0.74 accuracy).

3.2 Quantifying Social Network and Psychological Well-being Assessment↩︎


3.2.0.0.1 LSNS

Our results showed that language-based emotion and sentiment features were most effective for quantifying the social network score, LSNS (4th column in [table:uni95psych] & [table:multi95psych]), with 0.75 AUC and 0.73 accuracy. Facial emotion, landmark, and action unit features showed the second-best performance with 0.6 AUC and 0.59 accuracy. When fusing modalities, only the all-modality fusion with majority voting matched the facial model’s AUC of 0.6, with 0.56 accuracy.

3.2.0.0.2 Neuroticism

For quantifying neuroticism (5th column in [table:uni95psych] & [table:multi95psych]), multimodal fusion from all modalities performed effectively with 0.71 AUC and 0.65 accuracy. This was followed by features from facial emotion, landmark, and action units (0.69 AUC and 0.66 accuracy), which contributed the most when fusing multiple modalities.

3.2.0.0.3 Negative Affect

Facial and cardiovascular fusion performed effectively when quantifying negative affect (6th column in [table:uni95psych] & [table:multi95psych]) with 0.79 AUC and 0.75 accuracy. Cardiovascular features alone showed 0.76 AUC and 0.67 accuracy, contributing the most when fused with facial features.

3.2.0.0.4 Social Satisfaction

Facial emotion, landmark, and action unit features were most useful when quantifying social satisfaction (7th column in [table:uni95psych] & [table:multi95psych]) during remote interviews (0.68 AUC and 0.63 accuracy).

3.2.0.0.5 Psychological Well-being

When quantifying overall psychological well-being (8th column in [table:uni95psych] & [table:multi95psych]), facial emotion, landmark, and action unit features were most effective with 0.66 AUC and 0.61 accuracy. This was followed by cardiovascular features (0.62 AUC and 0.60 accuracy), but fusing facial and cardiovascular features showed no improvement compared to using cardiovascular features alone.

4 Discussion↩︎

4.1 Identifying Cognitive States.↩︎

Overall, identifying low MoCA scores and a clinical diagnosis of MCI was challenging with our multimodal analysis system, with 0.64 and 0.66 AUCs, respectively. This is possibly due to the narrow range of MoCA scores in this group: the majority (68%) of our subjects had MoCA scores between 21 and 28, with a median score of 24, leaving limited variability to detect distinguishable features associated with MoCA \(\leq24\) vs. \(>24\). Conversely, quantifying CDR (0 vs. 0.5) proved more effective, achieving an AUC of 0.78. The CDR assesses functional outcomes in daily life (e.g., forgetfulness of events, functionality in shopping, and participation in volunteer or social groups), reflecting cognitive performance but not necessarily linked to cognitive testing scores. These functional aspects are likely better captured by acoustic and linguistic features from remote interviews than by measures like MoCA and clinical MCI diagnoses. It is also worth noting that, while age has been a significant factor for predicting MoCA scores [74], [75] in the general population, it does not show any predictive capacity in our analysis. This discrepancy may be due to the participants’ narrow age range (overall, 80.69 \(\pm\) 4.6), shown in Table 1.

Our analysis indicates that the language- and audio-based approaches were effective for quantifying cognitive impairment, compared to facial and HRV features. This is consistent with previous work reporting the effectiveness of speech and text analysis for quantifying cognitive impairment [76]–[81]. Using only acoustic and linguistic features, the proposed pipeline can potentially serve marginalized communities without internet connectivity sufficient for video transmission, such as rural areas or low- and middle-income countries [82]–[84], for prescreening older adults at high risk of cognitive impairment. Regarding the facial analysis, previous work reported that facial expressions are significantly different and heterogeneous among individuals with MCI [85], [86]. We suspect this heterogeneity caused the facial-feature models to underperform.

4.2 Quantifying Social Network and Psychological Well-being Assessment.↩︎

Overall, quantifying social isolation (LSNS) was most effective using language features. This is consistent with prior research indicating the utility of text sentiment analysis in understanding the social health of individuals [81], [87], [88]. The other psychological well-being scales required facial videos to quantify, and cardiovascular measures were most effective at quantifying negative affect, similar to findings reported in mental health studies for various populations using wearable-based cardiovascular health monitoring [89]–[91]. Negative affect, including mood disturbances, anxiety, and depression, can be an early sign of dementia; these emotional changes often precede noticeable cognitive decline and may be linked to neurodegenerative processes in the brain [92]. Our study demonstrates that contactless cardiovascular measures have the potential to quantify behavioral symptoms that often accompany MCI, which calls for further exploration.

Our findings show that facial, acoustic, and linguistic features from foundation models (DINOv2, WavLM, and LLaMA-65B) significantly underperformed, with an average absolute decrease of 22% in AUC compared to the best models using features from facial expression, acoustic, and emotion-based linguistic signals. These foundation models are trained to capture generic facial appearance, acoustic waveforms, and language embeddings from large-scale internet databases containing individuals with various demographics and contexts. We suspect such features, when used directly, are not designed to capture the specific behavior patterns associated with cognitive impairment or psychological well-being in older adults, especially in the MCI population. In future work, we will study transfer learning for these foundation models to capture behavior patterns in older subjects with MCI [93].

4.3 Limitations & Future work↩︎

The proposed work studies the association of facial, acoustic, linguistic, and cardiovascular patterns with cognitive impairment and the associated social and psychological well-being in older adults with normal cognition or MCI. Our study demonstrated that features extracted from remotely conducted conversations can detect a broad range of symptoms linked to cognitive decline, including social engagement and emotional well-being. This remote assessment approach holds promise for the early identification of individuals at risk for cognitive decline, ultimately creating opportunities for timely interventions to slow or prevent further deterioration. Our study has several limitations.

First, this study includes subjects with normal cognition and MCI, which helps to quantify and contrast the behavioral characteristics related to the early signs of MCI. However, the proposed method is evaluated on a small number of participants (N=39), most of whom were white. Our findings may not fully apply to larger populations with diverse gender, race, and ethnic backgrounds. Also, comprehensively quantifying cognitive impairment requires including various subtypes of MCI, such as amnestic or non-amnestic and single-domain or multi-domain MCI [94]–[96]. We excluded individuals diagnosed with ADRD, whose behavioral patterns could differ significantly from those examined in this study [97], [98]. Individuals also experience various comorbidities that influence behaviors or the progression of cognitive impairment [99]–[101]. Future studies should therefore expand to diverse ethnic, racial, and gender groups with a larger number of participants with varying conditions associated with aging, MCI, and ADRD to test the generalizability of our findings.

Second, the model was validated on conversations captured during the 1st week of the 48-week intervention, i.e., a cross-sectional analysis. With cognitive decline, older adults can manifest behavior changes over time [102], [103]. Our next exploration will take advantage of all the data and aim to identify longitudinal changes in outcomes using the features examined here. For this, we will explore model adaptation methods to tackle potential model degradation due to behavior changes over time [104]. Moreover, model personalization needs to be studied, as each patient’s rate of cognitive decline and behavior change could depend on their personality and background [105], [106].

Third, a few video recordings had very low quality, with low resolution or pixelation due to weak internet connectivity. The effect of such connectivity issues on model bias and performance degradation needs to be quantified, as internet connectivity can be less than ideal when the system is deployed in low-resource communities.

Fourth, we utilized the baseline interview data recorded during the first week of the trial to capture behaviors closest to the time of assessments on our subjects’ cognitive function and psychological well-being. However, it is important to note that participants were still becoming familiar with the study procedures during the initial sessions. As a result, participants may have exhibited heightened levels of nervousness or unfamiliarity, potentially influencing their behavior and responses in ways that do not fully represent their typical cognitive or emotional state. This consideration was not factored into our analysis. Future studies will explore data from later sessions to understand and mitigate this potential bias.

We will also explore other methods for modeling the temporal dynamics of facial, cardiovascular, audio, and language features. In this work, we mainly used an HMM, which showed varying performance across modalities. Contrary to our original hypothesis that modeling temporal dynamics would improve performance, we observed large changes in absolute AUC, from -29% ([table:uni95cog], Acoustic) to +30% ([table:uni95psych], Acoustic), after adding HMM-based features. We chose a two-state HMM with Gaussian observations to ensure model convergence when trained on our datasets. However, this model may be too simple and suboptimal for modeling the complex temporal dynamics of facial, cardiovascular, audio, and language variation over the entire interview period in relation to the rating scales used in this study. To investigate this, we will explore state-of-the-art sequential models, such as recurrent neural networks, to extract temporal features in future work [107].

Acknowledgments↩︎

Hyeokhyen Kwon, Salman Seyedi, Bolaji Omofojoye, and Gari Clifford are partially funded by the National Institute on Deafness and Other Communication Disorders (grant # 1R21DC021029-01A1). Hyeokhyen Kwon and Gari Clifford are also partially supported by the James M. Cox Foundation and Cox Enterprises, Inc., in support of Emory’s Brain Health Center and Georgia Institute of Technology. Gari Clifford is partially supported by the National Center for Advancing Translational Sciences of the National Institutes of Health (NIH) under Award Number UL1TR002378. Gari Clifford and Allen Levey are partially funded by NIH grant #R56AG083845 from the National Institute on Aging. Hiroko Dodge is funded by NIH grants and serves as the CEO of the I-CONNECT Foundation, a 501(c)(3) non-profit organization. The I-CONECT study received funding from the NIH: R01AG051628 and R01AG056102. The authors extend their gratitude to the participants of the I-CONECT study.

References↩︎

[1]
Alzheimer’s Association. 2023 Alzheimer’s disease facts and figures. Alzheimers Dement 2023; 19(4): 1598–1695.
[2]
Kinsella KG, Phillips DR. Global Aging: The Challenge of Success. Population Reference Bureau Washington, DC 2005.
[3]
Gauthier S, Reisberg B, Zaudig M, et al. Mild Cognitive Impairment. The Lancet 2006; 367. http://dx.doi.org/10.1016/S0140-6736(06)68542-5.
[4]
Petersen RC. Mild Cognitive Impairment. Continuum 2016. http://dx.doi.org/10.1212/CON.0000000000000313.
[5]
Hikichi H, Kondo K, Takeda T, Kawachi I. Social interaction and cognitive decline: Results of a 7-year community intervention. Alzheimer’s & Dementia: Translational Research & Clinical Interventions 2017; 3(1): 23–32.
[6]
Holwerda TJ, Deeg DJ, Beekman AT, et al. Feelings of loneliness, but not social isolation, predict dementia onset: results from the Amsterdam Study of the Elderly (AMSTEL). Journal of Neurology, Neurosurgery & Psychiatry 2014; 85(2): 135–142.
[7]
Ma L. Depression, Anxiety, and Apathy in Mild Cognitive Impairment: Current Perspectives. Frontiers in Aging Neuroscience 2020; 12: 9.
[8]
Fernández Fernández R, Martín JI, Antón MAM. Depression as a Risk Factor for Dementia: A Meta-Analysis. The Journal of Neuropsychiatry and Clinical Neurosciences 2024; 36(2): 101–109.
[9]
Kasper S, Bancher C, Eckert A, et al. Management of mild cognitive impairment (MCI): The need for national and international guidelines. The World Journal of Biological Psychiatry 2020.
[10]
Kuiper J, Zuidersma M, Oude Voshaar R, et al. Social relationships and risk of dementia: A systematic review and meta-analysis of longitudinal cohort studies. Ageing Research Reviews 2015. http://dx.doi.org/10.1016/j.arr.2015.04.006.
[11]
Fallahpour M, Borell L, Luborsky M, Nygård L. Leisure-activity participation to prevent later-life cognitive decline: a systematic review. Scandinavian Journal of Occupational Therapy 2015. http://dx.doi.org/10.3109/11038128.2015.1102320.
[12]
Steenland K, Karnes C, Seals R, Carnevale C, Hermida A, Levey A. Late-life depression as a risk factor for mild cognitive impairment or Alzheimer’s disease in 30 US Alzheimer’s disease centers. Journal of Alzheimer’s Disease 2012; 31(2): 265–275.
[13]
Tsormpatzoudi SO, Moraitou D, Papaliagkas V, Pezirkianidis C, Tsolaki M. Resilience in Mild Cognitive Impairment (MCI): Examining the Level and the Associations of Resilience with Subjective Wellbeing and Negative Affect in Early and Late-Stage MCI. Behavioral Sciences 2023; 13(10): 792.
[14]
Gardener H, Levin B, DeRosa J, et al. Social Connectivity is Related to Mild Cognitive Impairment and Dementia. Journal of Alzheimer’s Disease 2021; 84(4): 1811–1820.
[15]
Mourao RJ, Mansur G, Malloy-Diniz LF, Castro Costa E, Diniz BS. Depressive symptoms increase the risk of progression to dementia in subjects with mild cognitive impairment: systematic review and meta-analysis. International Journal of Geriatric Psychiatry 2016; 31(8): 905–911.
[16]
Nasreddine ZS, Phillips NA, Bédirian V, et al. The Montreal Cognitive Assessment, MoCA: a brief screening tool for mild cognitive impairment. Journal of the American Geriatrics Society 2005; 53(4): 695–699.
[17]
Morris JC. The Clinical Dementia Rating (CDR): current version and scoring rules. Neurology 1993; 43(11): 2412-2414. http://dx.doi.org/10.1212/WNL.43.11.2412-a.
[18]
Mora-Simon S, Garcia-Garcia R, Perea-Bartolome M, et al. Mild cognitive impairment: early detection and new perspectives. Revista de Neurología 2012; 54(5): 303–310.
[19]
Pinto TC, Machado L, Bulgacov TM, et al. Is the Montreal Cognitive Assessment (MoCA) screening superior to the Mini-Mental State Examination (MMSE) in the detection of mild cognitive impairment (MCI) and Alzheimer’s Disease (AD) in the elderly?. International Psychogeriatrics 2019; 31(4): 491–504.
[20]
Chehrehnegar N, Nejati V, Shati M, et al. Early detection of cognitive disturbances in mild cognitive impairment: a systematic review of observational studies. Psychogeriatrics 2020; 20(2): 212–228.
[21]
Zhaoyang R, Sliwinski MJ, Martire LM, Katz MJ, Scott SB. Features of daily social interactions that discriminate between older adults with and without mild cognitive impairment. The Journals of Gerontology, Series B: Psychological Sciences and Social Sciences 2024; 79(4): gbab019.
[22]
Baumeister RF, Leary MR. The need to belong: Desire for interpersonal attachments as a fundamental human motivation. Psychological Bulletin 1995; 117(3): 497-529.
[23]
Safavi R, Wearden A, Berry K. Psychological well-being in persons with dementia: The role of caregiver expressed emotion. British Journal of Clinical Psychology 2023; 62(2): 431–443.
[24]
John A, Saunders R, Desai R, et al. Associations between psychological therapy outcomes for depression and incidence of dementia. Psychological Medicine 2023; 53(11): 4869–4879.
[25]
Gates N, Valenzuela M, Sachdev PS, Singh MAF. Psychological well-being in individuals with mild cognitive impairment. Clinical Interventions in Aging 2014; 9: 779-792. http://dx.doi.org/10.2147/CIA.S58866.
[26]
Chen LY, Tsai TH, Ho A, et al. Predicting neuropsychiatric symptoms of persons with dementia in a day care center using a facial expression recognition system. Aging (Albany NY) 2022; 14(3): 1280.
[27]
Cummings JR, Zhang X, Gandré C, et al. Challenges facing mental health systems arising from the COVID-19 pandemic: Evidence from 14 European and North American countries. Health Policy 2023; 136: 104878.
[28]
Liu Y, Jun H, Becker A, Wallick C, Mattke S. Detection Rates of Mild Cognitive Impairment in Primary Care for the United States Medicare Population. The Journal of Prevention of Alzheimer’s Disease 2024; 11.
[29]
Rao A, Manteau-Rao M, Aggarwal NT. Dementia neurology deserts: What are they and where are they located in the U.S.?. Alzheimer’s & Dementia 2017.
[30]
Worster B, Waldman L, Garber G, et al. Increasing equitable access to telehealth oncology care in the COVID-19 National Emergency: Creation of a telehealth task force. Cancer Medicine 2023; 12(3): 2842–2849.
[31]
Griffiths L, Blignault I, Yellowlees P. Telemedicine as a means of delivering cognitive-behavioural therapy to rural and remote mental health clients. Journal of Telemedicine and Telecare 2006; 12(3): 136–140.
[32]
D’Alfonso S. AI in mental health. Current Opinion in Psychology 2020; 36: 112–117.
[33]
Graham S, Depp C, Lee EE, et al. Artificial Intelligence for Mental Health and Mental Illnesses: An Overview. Current Psychiatry Reports 2019; 21: 1–18.
[34]
Garcia-Ceja E, Riegler M, Nordgreen T, Jakobsen P, Oedegaard KJ, Tørresen J. Mental health monitoring with multimodal sensing and machine learning: A survey. Pervasive and Mobile Computing 2018; 51: 1–26.
[35]
Zhou Y, Han W, Yao X, Xue J, Li Z, Li Y. Developing a machine learning model for detecting depression, anxiety, and apathy in older adults with mild cognitive impairment using speech and facial expressions: A cross-sectional observational study. International Journal of Nursing Studies 2023.
[36]
Cohen J, Richter V, Neumann M, et al. A multimodal dialog approach to mental state characterization in clinically depressed, anxious, and suicidal populations. Frontiers in Psychology 2023.
[37]
Jiang Z, Harati S, Crowell A, Mayberg HS, Nemati S, Clifford GD. Classifying Major Depressive Disorder and Response to Deep Brain Stimulation Over Time by Analyzing Facial Expressions. IEEE Transactions on Biomedical Engineering 2020; 68(2): 664–672.
[38]
Oquab M, Darcet T, Moutakanni T, et al. DINOv2: Learning Robust Visual Features without Supervision. arXiv preprint arXiv:2304.07193 2023.
[39]
Chen S, Wang C, Chen Z, et al. WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing. IEEE Journal of Selected Topics in Signal Processing 2022; 16(6): 1505–1518.
[40]
Touvron H, Lavril T, Izacard G, et al. LLaMA: Open and Efficient Foundation Language Models. arXiv preprint arXiv:2302.13971 2023.
[41]
Casado CA, López MB. Face2PPG: An unsupervised pipeline for blood volume pulse extraction from faces. IEEE Journal of Biomedical and Health Informatics 2023; 27(11): 5530-5541. http://dx.doi.org/10.1109/JBHI.2023.3307942.
[42]
Jung H, Yoo HJ, Choi P, et al. Changes in Negative Emotions Across Five Weeks of HRV Biofeedback Intervention were Mediated by Changes in Resting Heart Rate Variability. Applied Psychophysiology and Biofeedback 2024: 1–24.
[43]
Chalmers JA, Quintana DS, Abbott MJA, Kemp AH. Anxiety Disorders are Associated with Reduced Heart Rate Variability: A Meta-Analysis. Frontiers in Psychiatry 2014; 5: 80.
[44]
Viers BR, Lightner DJ, Rivera ME, et al. Efficiency, satisfaction, and costs for remote video visits following radical prostatectomy: a randomized controlled trial. European Urology 2015; 68(4): 729–735.
[45]
Cook S, Schwartz A, Kaslow N. Evidence-Based Psychotherapy: Advantages and Challenges. Neurotherapeutics 2017.
[46]
Yu K, Wild K, Potempa K, et al. The Internet-Based Conversational Engagement Clinical Trial (I-CONECT) in Socially Isolated Adults 75+ Years Old: Randomized Controlled Trial Protocol and COVID-19 Related Study Modifications. Frontiers in Digital Health 2021; 3: 714813. http://dx.doi.org/10.3389/fdgth.2021.714813.
[47]
Dodge HH, Yu K, Wu CY, et al. Internet-Based Conversational Engagement Randomized Controlled Clinical Trial (I-CONECT) Among Socially Isolated Adults 75+ Years Old With Normal Cognition or Mild Cognitive Impairment: Topline Results. The Gerontologist 2024; 64(4): gnad147.
[48]
Livingston G, Huntley J, Liu KY, et al. Dementia prevention, intervention, and care: 2024 report of the Lancet standing Commission. The Lancet 2024; 404(10452): 572–628.
[49]
Yesavage JA, Brink TL, Rose TL, et al. Development and validation of a geriatric depression screening scale: a preliminary report. Journal of Psychiatric Research 1982; 17(1): 37–49.
[50]
Weintraub S, Besser L, Dodge HH, et al. Version 3 of the Alzheimer Disease Centers’ Neuropsychological Test Battery in the Uniform Data Set (UDS). Alzheimer Disease and Associated Disorders 2018; 32(1): 10-17. http://dx.doi.org/10.1097/WAD.0000000000000223.
[51]
Dodge HH, Goldstein FC, Wakim NI, et al. Differentiating among stages of cognitive impairment: Comparisons of versions two and three of the National Alzheimer’s Coordinating Center (NACC) Uniform Data Set (UDS) neuropsychological test battery: Health services research / Cost-effectiveness of treatment/prevention and diagnosis. Alzheimer’s & Dementia 2020; 16: e040648.
[52]
Lubben J, Blozik E, Gillmann G, et al. Performance of an abbreviated version of the Lubben Social Network Scale among three European community-dwelling older adult populations. The Gerontologist 2006; 46(4): 503-513. http://dx.doi.org/10.1093/geront/46.4.503.
[53]
Hughes ME, Waite LJ, Hawkley LC, Cacioppo JT. A Short Scale for Measuring Loneliness in Large Surveys: Results From Two Population-Based Studies. Research on Aging 2004; 26(6): 655-672. http://dx.doi.org/10.1177/0164027504268574.
[54]
Wu CY, Yu K, Arnold S, Das S, Dodge H. Who Benefited Most from the Internet-Based Conversational Engagement RCT (I-CONECT)? Application of the Personalized Medicine Approach to a Behavioral Intervention Study. The Journal of Prevention of Alzheimer’s Disease 2024: 1–10.
[55]
McCrae RR, Costa PTJ, Martin TA. The NEO-PI-3: a more readable revised NEO Personality Inventory. Journal of Personality Assessment 2005; 84(3): 261-270.
[56]
HealthMeasures. NIH Toolbox. HealthMeasures website; 2024. Accessed: August 6, 2024.
[57]
Babakhanyan I, McKenna BS, Casaletto KB, Nowinski CJ, Heaton RK. National Institutes of Health Toolbox Emotion Battery for English- and Spanish-speaking adults: normative data and factor-based summary scores. Patient Related Outcome Measures 2018; 9: 115-127. http://dx.doi.org/10.2147/PROM.S151658.
[58]
Lubben JE. Lubben Social Network Scale (LSNS). APA PsycTests. 1984.
[59]
Liu Y, Ott M, Goyal N, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692 2019.
[60]
EasyOCR. https://github.com/jaidedai/easyocr. Accessed: 2024-12-11.
[61]
Baek J, Kim G, Lee J, et al. What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV); 2019.
[62]
Shi B, Bai X, Yao C. An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 2016; 39(11): 2298–2304.
[63]
Serengil S, Ozpinar A. A Benchmark of Facial Recognition Pipelines and Co-Usability Performances of Modules. Journal of Information Technologies 2024; 17(2): 95-107. http://dx.doi.org/10.17671/gazibtd.1399077.
[64]
Chen L, Asgari M. Refining Automatic Speech Recognition System for Older Adults. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2021.
[65]
Jiang Z, Seyedi S, Griner E, et al. Multimodal Mental Health Digital Biomarker Analysis From Remote Interviews Using Facial, Vocal, Linguistic, and Cardiovascular Patterns. IEEE Journal of Biomedical and Health Informatics 2024.
[66]
Ekman P. Universal Facial Expressions of Emotion. California Mental Health Research Digest 1970.
[67]
Shao Z, Liu Z, Cai J, Ma L. Deep Adaptive Attention for Joint Facial Action Unit Detection and Face Alignment. In: Proceedings of the European Conference on Computer Vision (ECCV); 2018: 725–740.
[68]
Boccignone G, Conte D, Cuculo V, et al. pyVHR: a Python framework for remote photoplethysmography. PeerJ Computer Science 2022; 8: e929.
[69]
Giannakopoulos T. pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis. PLoS ONE 2015; 10(12): e0144610.
[70]
Hartmann J. Emotion English DistilRoBERTa-base. Hugging Face; 2022.
[71]
Hartmann J, Heitmann M, Siebert C, Schamp C. More than a Feeling: Accuracy and Application of Sentiment Analysis. International Journal of Research in Marketing 2023; 40(1): 75–87.
[72]
SSM: Bayesian Learning and Inference for State Space Models. https://github.com/lindermanlab/ssm. Accessed: 2024-08-19.
[73]
Celsis P. Age-related cognitive decline, mild cognitive impairment or preclinical Alzheimer’s disease?. Annals of Medicine 2000; 32(1): 6–14.
[74]
Nasreddine ZS, Phillips NA, Bédirian V, et al. The Montreal Cognitive Assessment, MoCA: a brief screening tool for mild cognitive impairment. Journal of the American Geriatrics Society 2005; 53(4): 695-699. http://dx.doi.org/10.1111/j.1532-5415.2005.53221.x.
[75]
Ciesielska N, Sokołowski R, Mazur E, Podhorecka M, Polak-Szabela A, Kędziora-Kornatowska K. Is the Montreal Cognitive Assessment (MoCA) test better suited than the Mini-Mental State Examination (MMSE) in mild cognitive impairment (MCI) detection among people aged over 60? Meta-analysis. Psychiatria Polska 2016; 50(5): 1039-1052. http://dx.doi.org/10.12740/PP/45368.
[76]
Themistocleous C, Eckerström M, Kokkinakis D. Voice quality and speech fluency distinguish individuals with Mild Cognitive Impairment from Healthy Controls. PLoS ONE 2020; 15(7): e0236009.
[77]
Tóth L, Hoffmann I, Gosztolya G, et al. A Speech Recognition-based Solution for the Automatic Detection of Mild Cognitive Impairment from Spontaneous Speech. Current Alzheimer Research 2018; 15(2): 130–138.
[78]
Laguarta J, Subirana B. Longitudinal Speech Biomarkers for Automated Alzheimer’s Detection. Frontiers in Computer Science 2021; 3: 624694.
[79]
Haulcy R, Glass J. Classifying Alzheimer’s Disease Using Audio and Text-Based Representations of Speech. Frontiers in Psychology 2021; 11: 624137.
[80]
Tang F, Chen J, Dodge HH, Zhou J. The Joint Effects of Acoustic and Linguistic Markers for Early Identification of Mild Cognitive Impairment. Frontiers in Digital Health 2022; 3: 702772.
[81]
Asgari M, Kaye J, Dodge H. Predicting mild cognitive impairment from spontaneous spoken utterances. Alzheimer’s & Dementia: Translational Research & Clinical Interventions 2017; 3(2): 219–228.
[82]
Graves JM, Abshire DA, Amiri S, Mackelprang JL. Disparities in Technology and Broadband Internet Access across Rurality: Implications for Health and Education. Family & Community Health 2021; 44(4): 257–265.
[83]
Dodoo JE, Al-Samarraie H, Alzahrani AI. Telemedicine use in Sub-Saharan Africa: Barriers and policy recommendations for Covid-19 and beyond. International Journal of Medical Informatics 2021; 151: 104467.
[84]
Kyei KA, Onajah GN, Daniels J. The emergence of telemedicine in a low-middle-income country: challenges and opportunities. ecancermedicalscience 2024; 18.
[85]
Jiang Z, Seyedi S, Haque RU, et al. Automated analysis of facial emotions in subjects with cognitive impairment. PLoS ONE 2022; 17(1): e0262527.
[86]
Morellini L, Izzo A, Rossi S, et al. Emotion recognition and processing in patients with mild cognitive impairment: A systematic review. Frontiers in Psychology 2022; 13: 1044385.
[87]
Liu T, Meyerhoff J, Eichstaedt JC, et al. The relationship between text message sentiment and self-reported depression. Journal of Affective Disorders 2022; 302: 7–14.
[88]
Zhou Y, Yao X, Han W, Wang Y, Li Z, Li Y. Distinguishing apathy and depression in older adults with mild cognitive impairment using text, audio, and video based on multiclass classification and shapely additive explanations. International Journal of Geriatric Psychiatry 2022; 37(11).
[89]
Di Campli San Vito P, Shakeri G, Ross J, Yang X, Brewster S. Development of a Real-Time Stress Detection System for Older Adults With Heart Rate Data. In: Proceedings of the 16th International Conference on PErvasive Technologies Related to Assistive Environments (PETRA); 2023: 226–236. http://dx.doi.org/10.1145/3594806.3594817.
[90]
Shu L, Yu Y, Chen W, et al. Wearable Emotion Recognition Using Heart Rate Data from a Smart Bracelet. Sensors 2020; 20(3). http://dx.doi.org/10.3390/s20030718.
[91]
Chen YC, Hsiao CC, Zheng WD, Lee RG, Lin R. Artificial neural networks-based classification of emotions using wristband heart rate monitor data. Medicine 2019; 98(33): e16863. http://dx.doi.org/10.1097/md.0000000000016863.
[92]
Ismail Z, Agüera-Ortiz L, Brodaty H, et al. The Mild Behavioral Impairment Checklist (MBI-C): A Rating Scale for Neuropsychiatric Symptoms in Pre-Dementia Populations. Journal of Alzheimer’s Disease 2017; 56(3): 929–938.
[93]
Zhuang F, Qi Z, Duan K, et al. A Comprehensive Survey on Transfer Learning. 2019. http://dx.doi.org/10.48550/ARXIV.1911.02685.
[94]
Busse A, Hensel A, Guhne U, Angermeyer M, Riedel-Heller S. Mild cognitive impairment: long-term course of four clinical subtypes. Neurology 2006; 67(12): 2176–2185.
[95]
Rapp SR, Legault C, Henderson VW, et al. Subtypes of Mild Cognitive Impairment in Older Postmenopausal Women: The Women’s Health Initiative Memory Study. Alzheimer Disease & Associated Disorders 2010; 24(3): 248–255.
[96]
Bradfield NI. Mild Cognitive Impairment: Diagnosis and Subtypes. Clinical EEG and Neuroscience 2023; 54(1): 4–11.
[97]
Davis M, O’Connell T, Johnson S, et al. Estimating Alzheimer’s Disease Progression Rates from Normal Cognition Through Mild Cognitive Impairment and Stages of Dementia. Current Alzheimer Research 2018; 15(8): 777–788.
[98]
de Oliveira Silva F, Ferreira JV, Placido J, et al. Stages of mild cognitive impairment and Alzheimer’s disease can be differentiated by declines in timed up and go test: A systematic review and meta-analysis. Archives of Gerontology and Geriatrics 2019; 85: 103941.
[99]
Makizako H, Shimada H, Tsutsumimoto K, et al. Comorbid Mild Cognitive Impairment and Depressive Symptoms Predict Future Dementia in Community Older Adults: A 24-Month Follow-Up Longitudinal Study. Journal of Alzheimer’s Disease 2016; 54(4): 1473–1482.
[100]
Stephan BC, Brayne C, Savva GM, Matthews FE. Occurrence of medical co-morbidity in mild cognitive impairment: implications for generalisation of MCI research. Age and Ageing 2011; 40(4): 501–507.
[101]
Menegon F, De Marchi F, Aprile D, et al. From Mild Cognitive Impairment to Dementia: The Impact of Comorbid Conditions on Disease Conversion. Biomedicines 2024; 12(8): 1675.
[102]
Schwertner E, Pereira JB, Xu H, et al. Behavioral and Psychological Symptoms of Dementia in Different Dementia Disorders: A Large-Scale Study of 10,000 Individuals. Journal of Alzheimer’s Disease 2022; 87(3): 1307–1318. http://dx.doi.org/10.3233/jad-215198.
[103]
Islam M, Mazumder M, Schwabe-Warf D, Stephan Y, Sutin AR, Terracciano A. Personality Changes With Dementia From the Informant Perspective: New Data and Meta-Analysis. Journal of the American Medical Directors Association 2019; 20(2): 131–137. http://dx.doi.org/10.1016/j.jamda.2018.11.004.
[104]
Liang J, He R, Tan T. A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts. 2023. http://dx.doi.org/10.48550/ARXIV.2303.15361.
[105]
Ferrari A, Micucci D, Mobilio M, Napoletano P. Deep learning and model personalization in sensor-based human activity recognition. Journal of Reliable Intelligent Environments 2022; 9(1): 27–39. http://dx.doi.org/10.1007/s40860-021-00167-w.
[106]
Li J, Washington P. A Comparison of Personalized and Generalized Approaches to Emotion Recognition Using Consumer Wearable Devices: Machine Learning Study. JMIR AI 2024; 3: e52171. http://dx.doi.org/10.2196/52171.
[107]
Lipton ZC, Berkowitz J, Elkan C. A Critical Review of Recurrent Neural Networks for Sequence Learning. 2015. http://dx.doi.org/10.48550/ARXIV.1506.00019.