Development of a depression in Parkinson's disease prediction model using machin

Date: 2024-08-31

Haewon Byeon, Major in Medical Big Data, College of AI Convergence, Inje University, Gimhae 50834, Gyeongsangnam-do, South Korea

Abstract

BACKGROUND
It is important to diagnose depression in Parkinson's disease (DPD) as soon as possible and to identify the predictors of depression in order to improve quality of life in Parkinson's disease (PD) patients.

AIM
To develop a model for predicting DPD based on the support vector machine, while considering sociodemographic factors, health habits, Parkinson's symptoms, sleep behavior disorders, and neuropsychiatric indicators as predictors, and to provide baseline data for identifying DPD.

METHODS
This study analyzed 223 of 335 patients with PD who were 60 years or older. Depression was measured using the 30 items of the Geriatric Depression Scale, and the explanatory variables included PD-related motor signs, rapid eye movement sleep behavior disorders, and neuropsychological tests. The support vector machine was used to develop a DPD prediction model.

RESULTS
When the effects of PD motor symptoms were compared using “functional weight”, late motor complications (occurrence of levodopa-induced dyskinesia) were the most influential risk factors for DPD.

CONCLUSION
It is necessary to develop customized screening tests that can detect DPD at an early stage and to continuously monitor high-risk groups, based on the DPD-related factors derived from this predictive model, in order to maintain the emotional health of PD patients.

Key Words: Depression in Parkinson's disease; Supervised machine learning; Neuropsychological test; Risk factor; Support vector machine; Rapid eye movement sleep behavior disorders

INTRODUCTION

Parkinson's disease (PD) is a typical degenerative disease of the elderly, with the second-highest incidence rate after Alzheimer's disease. The incidence of PD is increasing worldwide as the population ages. In South Korea, the Health Insurance Review and Assessment Service (2017)[1] reported that the number of patients with PD increased from 39265 in 2004 to 96499 in 2016, a 2.5-fold increase over 13 years. If the current trend in population aging continues, the number of patients with PD will increase even further.

Although the primary symptom of PD is dyskinesia such as rigidity, non-motor symptoms such as depression or cognitive impairment are highly likely to develop as PD progresses[2]. Among these, depressive symptoms are the most common non-motor symptoms of PD, and previous studies[3-6] have reported that 35% to 75% of patients with PD suffer from depression, which is much higher than the prevalence of depression among adults in the local community (< 10%)[7]. Depression is known to adversely affect the quality of life of PD patients[8], and Parkinson's patients with depression have significantly more anxiety symptoms, pessimism, suicidal thoughts, and self-condemnation than Parkinson's patients without depression[4,9]. Lee et al[10] analyzed 4362 patients with PD and reported that the elderly with PD had a 2-fold higher risk of suicide than the healthy elderly. Even though patients with PD frequently experience depression[6,11], only 1% of them recognize that they have depression. These results suggest that it is necessary to diagnose depression in PD (DPD) as soon as possible[7,8].

Depression induced by PD requires social management and should not be considered a personal matter. Because the symptoms of depression worsen as PD progresses, these symptoms, together with cognitive impairment, not only increase direct costs such as examinations and treatment but also raise indirect costs such as job loss due to disability and the care burden on supporting family members[12,13]. Ultimately, this causes unnecessary social expenditure at the national level. However, there are few studies on the characteristics and related factors of depression, a non-motor symptom of PD, compared with studies on the motor symptoms of PD. Many medical practitioners are still more interested in the motor symptoms of PD than in the non-motor symptoms. Moreover, depression is often misdiagnosed because of the motor and non-motor symptoms associated with PD[5]. It is important to diagnose depression in PD patients as soon as possible and to identify predictors of depression in order to improve quality of life in PD patients.

It has been reported that the duration of PD, Hoehn and Yahr stage, age, activities of daily living, low cognitive function, and sleep behavior disorder affect DPD[5,14,15]. Depression frequently occurs during the early stages of PD[16]. Previous studies suspected that PD might cause depression along with dyskinesia because of dopamine deficiency[6,17]. However, other studies showed that the administration of L-dopa (levodopa), a Parkinson's treatment, did not clearly improve depressive symptoms[18]. Therefore, depression cannot be attributed simply to dopamine deficiency. Many recent scholars have instead argued that depression in PD patients is caused by complex interactions among multiple factors rather than by a single cause[19,20]. Although the previous studies[5,16,17] were effective in exploring individual risk factors, they are limited in identifying risk factors while accounting for multiple factors at once, because each study used different confounding factors or covariates and relied on regression models. Additionally, regression analysis requires data that satisfy many assumptions, such as normality, linearity, and homoscedasticity, and disease data are highly likely to violate these assumptions.

Recent studies have used various machine learning classifiers, such as the support vector machine (SVM) and the decision tree, as statistical classification methods to identify multiple risk factors for diseases such as depression[21]. Of these, the SVM searches for the optimal boundary that linearly divides data into two groups. SVM can also be used to classify nonlinear data, is less likely to overfit than the decision tree model, and has high prediction accuracy even for small sample sizes[22,23]. The objectives of the present study were to develop a model for predicting DPD based on the SVM, while considering sociodemographic factors, health habits, Parkinson's symptoms, sleep behavior disorders, and neuropsychiatric indicators as predictors, and to provide baseline data for identifying DPD.

MATERIALS AND METHODS

Subjects

The present study was conducted by analyzing the Parkinson's Disease Epidemiology (PDE) data provided by the National Biobank of Korea, the Centers for Disease Control and Prevention (CDC), Republic of Korea (No. KBN-2019-005). This study was approved by the Research Ethics Review Board of the National Biobank of Korea (No. 2019-005) and the Korea CDC (No. 2019-1327). The goal of the National Biobank and the structure of the data were described by Lee et al[24]. The PDE data used in this study were collected at 14 university hospitals from January to December 2015, under the supervision of the Korea CDC. The PDE data consisted of health behaviors, sociodemographic factors, motor characteristics related to PD, disease history, neuropsychological test results, and sleep behavior disorders. PD was diagnosed according to the idiopathic Parkinson's disease diagnostic criteria of the United Kingdom Parkinson's Disease Society Brain Bank[25]. This study analyzed 223 of 335 patients with PD who were 60 years or older, after excluding 112 subjects who had at least one missing value in the Geriatric Depression Scale (GDS)[26], a depression screening test.

Measurement

Depression was measured using the 30 items of the GDS[26], and the threshold for depression was 10 points. Explanatory variables included PD-related motor signs (e.g., late motor complications, bradykinesia, tremor, postural instability, and rigidity), rapid eye movement (REM) sleep behavior disorders, the Korean Mini Mental State Examination score[27], the Korean Montreal Cognitive Assessment score[28], the global Clinical Dementia Rating (CDR) score[29], the Korean Instrumental Activities of Daily Living (K-IADL) score[30], the Unified Parkinson's Disease Rating Scale (UPDRS) total score[31], the UPDRS motor score[32], Hoehn and Yahr staging (H and Y staging)[33], and the Schwab and England Activities of Daily Living scale (Schwab and England ADL)[34].
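The outcome definition above can be sketched in a few lines. This is only an illustration under stated assumptions: each of the 30 GDS items is taken to be already coded as 0 or 1 (1 = answer in the depressive direction), a total of 10 or more is treated as depression (the text gives the threshold as 10 points but not whether the boundary is inclusive), and the function name `classify_gds` is hypothetical.

```python
from typing import Optional, Sequence

GDS_ITEMS = 30   # number of items in the GDS version named in the text
GDS_CUTOFF = 10  # threshold stated in the text

def classify_gds(items: Sequence[Optional[int]]) -> Optional[bool]:
    """Return True (depression), False (no depression), or None (excluded).

    Assumptions: items are pre-coded 0/1, and a total >= GDS_CUTOFF
    counts as depression. A subject with any missing item is excluded,
    mirroring the exclusion rule described for the study sample.
    """
    if len(items) != GDS_ITEMS:
        raise ValueError(f"expected {GDS_ITEMS} items, got {len(items)}")
    if any(value is None for value in items):
        return None  # at least one missing value: subject is excluded
    if any(value not in (0, 1) for value in items):
        raise ValueError("items must be coded 0 or 1")
    return sum(items) >= GDS_CUTOFF

# Hypothetical subjects, not study data.
print(classify_gds([1] * 12 + [0] * 18))   # True
print(classify_gds([1] * 4 + [0] * 26))    # False
print(classify_gds([None] + [0] * 29))     # None
```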

SVM was used to develop a DPD prediction model. SVM is a machine learning algorithm that finds an optimal decision boundary, in other words, a hyperplane that optimally separates the classes linearly, after converting the training data to a higher dimension through nonlinear mapping[35]. For example, A = [b,e] and B = [c,f] are not linearly separable in two dimensions. When they are mapped into three dimensions, they become linearly separable. Therefore, when an appropriate nonlinear mapping to a sufficiently high dimension is applied, data with two classes can always be separated by a hyperplane[23].
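The idea that a nonlinear mapping can make data linearly separable can be shown with a toy case. The sketch below is an assumption-laden illustration, not the mapping used in the study: it uses four made-up points in one dimension rather than two, and a hypothetical feature map x -> (x, x^2) into two dimensions.

```python
# Illustrative only: four 1-D points that no single threshold can separate.
points = [(-2.0, "A"), (-1.0, "B"), (1.0, "B"), (2.0, "A")]

def separable_by_threshold(values_with_labels):
    """Return True if one threshold along the axis splits the two classes."""
    ordered = sorted(values_with_labels)
    labels = [label for _, label in ordered]
    # Along a single axis, the classes are separable exactly when the
    # label changes at most once in sorted order.
    changes = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return changes <= 1

def feature_map(x):
    """Hypothetical nonlinear map from 1-D to 2-D: x -> (x, x^2)."""
    return (x, x * x)

# In the original space the classes interleave (A, B, B, A).
print(separable_by_threshold(points))  # False

# After mapping, the second coordinate alone separates the classes
# (x^2 = 4 for A, x^2 = 1 for B), so the line x2 = 2.5 is a separating
# hyperplane in the mapped space.
mapped_second_coord = [(feature_map(x)[1], label) for x, label in points]
print(separable_by_threshold(mapped_second_coord))  # True
```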

SVM is very accurate because it can model complex nonlinear decision boundaries, and it tends to overfit less than other models, which is a major advantage of this method[36]. This study used R version 3.6.1 for statistical analyses. The prediction performance (accuracy) of eight SVM models was compared using four algorithms [i.e., a radial basis function (Gauss function), a linear algorithm, a sigmoid algorithm, and a polynomial algorithm] and two types of SVM [C-SVM (C parameter) and Nu-SVM (Nu parameter)]. The prediction performance of the models was evaluated by considering overall accuracy, sensitivity, and specificity.
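The comparison design and the three evaluation measures can be sketched as follows. The study itself used R; this Python sketch only enumerates the eight model configurations and defines the metrics from a confusion matrix. The function name `binary_metrics`, the label coding (1 = depression), and the example predictions are assumptions for illustration, not study data.

```python
from itertools import product

SVM_TYPES = ("C-SVM", "Nu-SVM")
KERNELS = ("Gaussian", "linear", "sigmoid", "polynomial")

# Two SVM types x four kernels = eight candidate models.
configs = list(product(SVM_TYPES, KERNELS))

def binary_metrics(y_true, y_pred, positive=1):
    """Compute overall accuracy, sensitivity, and specificity."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # share of true cases detected
        "specificity": tn / (tn + fp),   # share of non-cases correctly cleared
    }

# Made-up labels and predictions for one hypothetical model.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1]
print(len(configs))                    # 8
print(binary_metrics(y_true, y_pred))
```

In a full comparison, each of the eight configurations would be fitted and scored with the same function so that the three measures are directly comparable across models.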

RESULTS

General characteristics of the 223 study subjects with PD were analyzed and are shown in Table 1. The mean age of the subjects was 71.7 years (SD = 6.1). The initial age at diagnosis of PD was 70.8 years (SD = 6.3) and the training period was 7.5 years (SD = 5.3). The percentage of non-smokers was 79.7%, that of right-handed subjects was 96.0%, and that of subjects without a family history of PD was 82.5%. It was found that 22.5% of the subjects had diabetes, 41.3% had hypertension, and 13.3% had hyperlipidemia. In terms of cognitive characteristics, 30.9% of the subjects had Parkinson's disease dementia (PDD), 61% had mild cognitive impairment in Parkinson's disease (PD-MCI), and 8.1% had Parkinson's disease with normal cognition (PD-NC). The results of the GDS confirmed that 41.7% of the patients had depression. The distribution of neuropsychological test results is presented using a density plot (Figure 1).

Table 1 shows the general characteristics of the study subjects with depression and the potential influencing factors of DPD. The results of the chi-square test showed that PD patients with depression and PD patients without depression differed significantly in the Korean Mini Mental State Examination, Korean Montreal Cognitive Assessment, global CDR score, sum of boxes in CDR, K-IADL, total score of UPDRS, motor score of UPDRS, H and Y staging, and Schwab and England ADL (P < 0.05).

Comparing the accuracy of the DPD prediction model according to the SVM classification algorithm

The fit of the model varies by the kernel type of the SVM. Therefore, our study compared the prediction accuracy of eight SVM models [(C-SVM or Nu-SVM) × (Gaussian kernel, linear, polynomial, or sigmoid algorithm)] to examine the performance of the models according to kernel type (Table 2). The results of model fitting showed that the Gaussian kernel-based Nu-SVM had the best overall performance, with a sensitivity of 96.0%, a specificity of 93.3%, and the highest mean overall accuracy (95%). On the other hand, although the polynomial-based C-SVM had the highest sensitivity (100%), it had the lowest specificity (20%) and the lowest mean overall accuracy (70%).
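The gap between 100% sensitivity and 70% overall accuracy follows from how accuracy combines the other two measures. On a single evaluation set, accuracy = sensitivity × p + specificity × (1 − p), where p is the share of depressed cases. The sketch below inverts this identity for the reported figures. It assumes each triple of figures was computed on one evaluation set, which the text does not state (the "mean" overall accuracy may be an average over repeated runs), so the result is a consistency check rather than a reported quantity. The function name is hypothetical.

```python
def implied_positive_share(sensitivity, specificity, accuracy):
    """Solve accuracy = sens * p + spec * (1 - p) for p."""
    if sensitivity == specificity:
        raise ValueError("p is not identifiable when sens == spec")
    return (accuracy - specificity) / (sensitivity - specificity)

# Reported figures from the text, expressed as proportions.
p_poly = implied_positive_share(1.00, 0.20, 0.70)    # polynomial C-SVM
p_gauss = implied_positive_share(0.96, 0.933, 0.95)  # Gaussian Nu-SVM

print(round(p_poly, 3))   # 0.625
print(round(p_gauss, 2))  # 0.63
```

Both rows imply a similar share of depressed cases, so under the stated assumption the polynomial C-SVM reaches 100% sensitivity largely by labeling most subjects as depressed, which is what drives its specificity down to 20%.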

This study determined that the Gaussian kernel-based Nu-SVM model, which had the highest overall accuracy, was the optimal model for predicting DPD, and analyzed the importance of variables with it. The Gaussian kernel-based Nu-SVM model used 34 support vectors, and the “functional weight (importance of variables)” is presented in Table 3. Although the functional weight of an SVM is not intended simply for comparing the magnitude of influence across variables or for ranking variable importance, it can be used to compare influence among the levels of a single factor (e.g., the levels of gender). It also indicates whether a predictor acts as a risk factor or a preventive factor for the outcome. The DPD prediction model revealed that the global CDR score, the sum of boxes in CDR, K-IADL, total UPDRS, motor UPDRS, age (≥ 75 years old), gender (female), education level (high school graduate or above), PD family history, smoking (21-40 packs per year), exposure to pesticides, postural instability, late motor complications (occurrence of levodopa-induced dyskinesia), late motor complications (occurrence of “wearing OFF” and “levodopa-induced dyskinesia”), and REM sleep behavior disorders were risk factors for depression. When the effects of PD motor symptoms were compared using “functional weight”, late motor complications (occurrence of levodopa-induced dyskinesia) were the most influential risk factors for DPD.
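The text does not say how the functional weight was computed. A minimal sketch of one common construction is shown below, assuming a weight vector w = Σ c_i · sv_i built from the signed dual coefficients c_i and the support vectors sv_i. This is exact for a linear kernel and only a heuristic for a Gaussian kernel. All names and numbers are hypothetical, and the sign convention assumes depression is the positive class.

```python
def functional_weights(dual_coefs, support_vectors):
    """Sum each support vector scaled by its signed dual coefficient."""
    n_features = len(support_vectors[0])
    weights = [0.0] * n_features
    for coef, vector in zip(dual_coefs, support_vectors):
        for j, value in enumerate(vector):
            weights[j] += coef * value
    return weights

# Hypothetical model with three support vectors and two binary predictors.
feature_names = ["predictor_1", "predictor_2"]
support_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
dual_coefs = [0.75, -0.5, 0.25]

weights = functional_weights(dual_coefs, support_vectors)
for name, w in zip(feature_names, weights):
    direction = "risk" if w > 0 else "preventive" if w < 0 else "neutral"
    print(name, w, direction)
# predictor_1 1.0 risk
# predictor_2 -0.25 preventive
```

Read this way, the sign of a weight gives the direction of association, which matches the risk-versus-preventive interpretation in the paragraph above, while magnitudes are best compared only among levels of the same factor.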

DISCUSSION

Our study developed a depression prediction model using SVM for PD patients. This study used hospital registry data and found that 41.7% of PD patients suffered from depression. Although direct comparison is difficult, this result is similar to those of previous studies[4-6], which showed that about one in two PD patients had depression. Despite the frequent occurrence of depression among patients with PD, the Global Parkinson's Disease Survey Steering Committee (2002)[8] reported that only 1% of patients with PD recognized that they had depression. These results imply that, even though PD patients frequently experience depression, it is highly likely that many PD patients, their caregivers, and their medical practitioners either do not notice depressive symptoms or regard them as symptoms of aging, and consequently the patients do not receive appropriate evaluation or treatment. Therefore, it is necessary to develop an education program for managing depression from the onset of PD so that PD patients and their caregivers can receive correct information about depression and effective treatment.

The results of this study showed that the global CDR score, the sum of boxes in CDR, K-IADL, total UPDRS, motor UPDRS, age (≥ 75 years old), gender (female), education level (≥ high school graduate), PD family history, smoking (21-40 packs per year), exposure to pesticides, postural instability, late motor complications (occurrence of levodopa-induced dyskinesia), late motor complications (occurrence of “wearing OFF” and “levodopa-induced dyskinesia”), and REM sleep behavior disorders were the major predictors of DPD. A number of studies[5,14,15] exploring the risk factors for DPD reported that daily living ability, sleep behavior disorders, cognitive level, and Hoehn and Yahr[33] stage, as well as environmental factors (e.g., social stigma and social participation), were key influencing factors of depression, and these results support the results of our study. In particular, sleep behavior disorder is known to be the most representative factor for predicting the risk of depression[37] and is highly correlated with depression[38]. It is known that sleep behavior disorders are caused by dysfunction of the autonomic nervous system in patients with PD[39]. Bae et al[40] evaluated factors related to DPD using structural equation modeling and reported that sleep problems best predicted DPD.

Table 1 General characteristics of the subjects with depression, n (%)

CDR: Clinical Dementia Rating; K-MMSE: Korean Mini-Mental State Examination; K-MoCA: Korean-Montreal Cognitive Assessment; K-IADL: Korean Instrumental Activities of Daily Living; UPDRS: Unified Parkinson's Disease Rating Scale; H and Y: Hoehn and Yahr; ADL: Activities of Daily Living.

Table 2 Comparing the accuracy of depression in Parkinson's disease prediction model, %

Table 3 Functional weight (importance of variables)

Figure 1 The distribution of neuropsychological tests. CDR: Clinical Dementia Rating; K-MMSE: Korean Mini-Mental State Examination; K-MoCA: Korean-Montreal Cognitive Assessment; K-IADL: Korean Instrumental Activities of Daily Living; UPDRS: Unified Parkinson's Disease Rating Scale; H and Y: Hoehn and Yahr; ADL: Activities of Daily Living.

ADL, which indicates limitation of physical function, is also related to early depression[41], because limited ADL can reduce social relations and thereby lead to depression[42]. If depression persists, the quality of life of PD patients is highly likely to decrease[43], and they have a high risk of suicide[10]. Therefore, it is necessary to develop customized screening tools that can sensitively detect high-risk groups and to conduct continuous monitoring in order to prevent depression and maintain emotional health.

Another finding of our study was that the prediction accuracy of the Gaussian kernel-based Nu-SVM was the highest among the eight SVM classification algorithms compared [(C-SVM or Nu-SVM) × (Gaussian kernel, linear, polynomial, or sigmoid algorithm)]. The performance of a nonlinear SVM is affected by the kernel function and the parameters constituting it[44]. Of these, the Gaussian kernel maps data into a feature space of infinite dimension, and it had high predictive accuracy in a previous study[25]. The results of this study suggest that, for binary disease data with a small sample size, a prediction model based on the Gaussian kernel-based Nu-SVM may achieve higher predictive accuracy than SVM models based on other algorithms.
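For reference, the Gaussian (radial basis function) kernel discussed above has a simple closed form, k(x, y) = exp(−γ‖x − y‖²), which is evaluated directly without ever constructing the infinite-dimensional feature space. The sketch below is illustrative; the default value of `gamma` is an arbitrary assumption, not the parameter used in the study.

```python
import math

def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Identical points have similarity 1; similarity decays with distance.
print(gaussian_kernel([0.0, 0.0], [0.0, 0.0]))            # 1.0
print(round(gaussian_kernel([0.0, 0.0], [1.0, 1.0]), 4))  # 0.3679
```

A larger `gamma` makes the similarity fall off faster with distance, which is one of the kernel parameters that, as noted above, affects the performance of a nonlinear SVM.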

The present study was meaningful because it developed an SVM-based DPD prediction model using national Parkinson's registry data while considering sociodemographic factors, health habits, Parkinson's symptoms, and sleep behavior disorders as predictors in addition to neuropsychological indicators. The limitations of this study are as follows. First, it is difficult to generalize the results because hospital registry data collected by convenience sampling were used. Future studies should apply systematic sampling when recruiting subjects to minimize selection bias. Second, the sample size was small. Third, causality could not be identified because this was a cross-sectional study. Additional longitudinal studies are required to establish causality. Fourth, biomarkers and Parkinson's treatments related to depression were not investigated. In order to predict depression more sensitively, it is necessary to develop a predictive model that includes biomarkers in addition to cognitive and neuropsychological tests.

CONCLUSION

The results of our study can be used as baseline information to prevent DPD and establish management strategies. It is necessary to develop customized screening that can detect DPD in the early stage and continuously monitor high-risk groups based on the factors related to DPD derived from this predictive model in order to maintain the emotional health of PD patients. It is also necessary to develop customized programs for managing depression from the onset of PD.

ARTICLE HIGHLIGHTS

Research background

It is important to diagnose depression in Parkinson's disease (PD) patients as soon as possible and identify predictors of depression to improve the quality of life in PD patients.

Research motivation

It has been reported that the duration of PD, Hoehn and Yahr stage, age, activities of daily living, low cognitive function, and sleep behavior disorder affect depression in PD. Although these previous studies were effective in exploring individual risk factors, they are limited in identifying risk factors while accounting for multiple factors at once, because each study used different confounding factors or covariates and relied on regression models.

Research objectives

The objectives of our study were to develop a model for predicting depression in Parkinson's disease (DPD) based on the support vector machine, while considering sociodemographic factors, health habits, Parkinson's symptoms, sleep behavior disorders, and neuropsychiatric indicators as predictors, and to provide baseline data for identifying DPD.

Research methods

The data used in this study were collected at 14 university hospitals from January to December 2015, under the supervision of the Korea Centers for Disease Control. The data consisted of health behaviors, sociodemographic factors, motor characteristics related to PD, disease history, neuropsychological test results, and sleep behavior disorders.

Research results

When the effects of PD motor symptoms were compared using “functional weight”, late motor complications (occurrence of levodopa-induced dyskinesia) were the most influential risk factors for DPD.

Research conclusions

It is necessary to develop customized screening that can detect DPD in the early stage and continuously monitor high-risk groups based on the factors related to DPD derived from this predictive model in order to maintain the emotional health of PD patients.

Research perspectives

It is also necessary to develop customized programs for managing depression from the onset of PD.

ACKNOWLEDGEMENTS

The author wishes to thank the National Biobank of Korea for providing the raw data.
