
Artificial intelligence and kidney transplantation


Nurhan Seyahi, Seyda Gul Ozcan

Nurhan Seyahi, Department of Nephrology, Istanbul University-Cerrahpaşa, Cerrahpaşa Medical Faculty, Istanbul 34098, Fatih, Turkey

Seyda Gul Ozcan, Department of Internal Medicine, Istanbul University-Cerrahpaşa, Cerrahpaşa Medical Faculty, Istanbul 34098, Fatih, Turkey

Abstract Artificial intelligence and its primary subfield, machine learning, have started to gain widespread use in medicine, including the field of kidney transplantation. We reviewed the literature on the use of artificial intelligence techniques in kidney transplantation. We identified six main areas of kidney transplantation on which artificial intelligence studies focus: Radiological evaluation of the allograft, pathological evaluation including molecular evaluation of the tissue, prediction of graft survival, optimizing the dose of immunosuppression, diagnosis of rejection, and prediction of early graft function. Machine learning techniques provide increased automation, leading to faster evaluation and standardization, and show better performance than traditional statistical analysis. Artificial intelligence leads to improved computer-aided diagnostics and quantifiable personalized predictions that will improve personalized patient care.

Key Words: Artificial intelligence; Kidney transplantation; Machine learning; Neural networks; Deep learning; Support vector machines

INTRODUCTION

Artificial intelligence (AI) is a “buzzword” that has begun to be used increasingly in medicine, and the field of transplantation is not exempt from that. AI vests machines with the ability to perform intelligent and cognitive tasks, and spans numerous active and popular subfields. Machine learning (ML) is one of the most important subfields of AI and has recently seen an increase in interest in several industries, including the healthcare industry, because of advances in Big Data technology and computing power[1].

The process of ML begins with the ability of the program to observe the collected data and compare them with previous data to find patterns and results, and then adjust itself accordingly[2]. There is a plethora of statistics-based ML algorithms that can be used in the context of three overarching categories: Supervised learning, unsupervised learning, and reinforcement learning (Table 1)[2]. Supervised learning comprises learning patterns from labeled datasets and decodes the relationships between input variables (independent variables) and their known outputs (dependent variables). Examples of common algorithms used for supervised learning include regression analysis [linear regression, logistic regression (LR), and non-linear regression], decision trees (DT), k-nearest neighbors, artificial neural networks (ANN), and support vector machines (SVM)[2]. The proper classification of LR is context-dependent: It counts as ML when used for prediction, and as non-ML when used for inferential statistics to evaluate the associations between the independent variable(s) and the dependent variable[3]. In the case of unsupervised learning, the output variables are unlabeled, and this method focuses on analyzing the relationships between input variables and revealing hidden patterns that can be used to create new labels regarding possible outputs[2]. In this way, it is possible to discover patterns in the data that we were unaware of. K-means clustering is an example of an unsupervised learning algorithm. Reinforcement learning is the most advanced category of ML. In this method, a prediction model is built by gaining feedback through random trials of a vast number of possible input combinations and leveraging insight from previous iterations by grading their performance. Finally, Q-learning is an example of a reinforcement learning algorithm[2].
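
To make the first two categories concrete, the following is a minimal illustrative sketch in Python with scikit-learn; the dataset is synthetic and the variables are hypothetical, not drawn from any study discussed here.

```python
# A minimal sketch of the supervised vs unsupervised distinction;
# the data are synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))             # e.g., three clinical input variables
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # labeled outcome (supervised setting)

# Supervised learning: learn input-output relationships from labeled data.
clf = LogisticRegression().fit(X, y)
print("Predicted labels:", clf.predict(X[:5]))

# Unsupervised learning: no labels; reveal hidden structure in the inputs.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("Cluster assignments:", km.labels_[:5])
```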

In this paper, we review the current English-language literature on the use of AI in kidney transplantation.

METHODS

We used the PubMed interface (pubmed.gov) to run a query combining the following two keyword groups. The first group included the keywords “kidney transplant”, “renal transplant”, “kidney transplantation”, and “renal transplantation”, and the second group included “artificial intelligence”, “machine learning”, “deep learning”, and “neural networks”. The keywords within each group were combined using the logical operator “OR”, and the two groups were combined using the logical operator “AND”. We excluded review articles and ran the query in January 2021. We found 114 articles in total and examined them manually. Articles that were not directly related to kidney transplantation, those dealing with other types of renal replacement therapy besides transplantation, those solely using LR as the ML method, reviews, conference reports, and editorials were excluded. We also examined the references of the related articles to locate additional literature. Finally, we found 64 articles that were eligible for the review.
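
The Boolean structure of this query can be sketched as follows; the exact string submitted to PubMed is an assumption reconstructed from the description above.

```python
# A sketch of how the described Boolean query could be assembled; the exact
# string the authors submitted to PubMed is not reported in the text.
transplant_terms = ["kidney transplant", "renal transplant",
                    "kidney transplantation", "renal transplantation"]
ai_terms = ["artificial intelligence", "machine learning",
            "deep learning", "neural networks"]

group1 = " OR ".join(f'"{t}"' for t in transplant_terms)
group2 = " OR ".join(f'"{t}"' for t in ai_terms)
query = f"({group1}) AND ({group2})"
print(query)
```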

We grouped the articles into the following categories: Radiological evaluation (n = 6), pathological evaluation (n = 14), prediction of graft survival (n = 16), optimizing the dose of immunosuppression (n = 7), diagnosis of rejection (n = 6), prediction of early graft function (n = 6), and others (n = 9) (Table 2).

Table 1 Different machine learning categories

Table 2 Machine learning applications used in different kidney transplantation areas

APPLICATION OF AI IN KIDNEY TRANSPLANTATION

Radiological evaluation

The first paper using AI techniques for the imaging-based evaluation of allografts was that of Hamilton et al[4]. The authors used 99mTc-MAG3 captopril renography to evaluate the presence of renal artery stenosis in the allograft. They used a neural network-based classifier, with arteriography as the gold standard. Following the training of the neural network, they found that an accuracy of 95% could be achieved[4].

Some other papers also used AI techniques for the radiological evaluation of allografts with the aim of diagnosing acute rejection. El-Baz et al[5] investigated the early detection of acute rejection using dynamic contrast-enhanced magnetic resonance imaging (MRI). The researchers automated data acquisition from the MRI using a three-step algorithmic approach, and this data feed was linked to a Bayesian supervised classifier to diagnose acute rejection[5]. The authors also studied motion correction models to account for the local motion of the kidney due to patient movement and breathing. They then used the perfusion curves to feed the Bayesian supervised classifier with the aim of distinguishing normal function from acute rejection[6].

Three additional papers from the same group examined the utility of computer-aided diagnostic (CAD) systems for the diagnosis of acute rejection[7-9]. In their first study[7], the authors used deep-learning algorithms, namely, ‘stacked non-negative constrained auto-encoders’, for the prediction of acute rejection. Their data feed was the output of diffusion-weighted MRI (DW-MRI). In their second study[8], in addition to DW-MRI, creatinine clearance and creatinine values were also used as the data feed for convolutional neural network (CNN)-based classifiers. In both papers, the overall accuracy for the correct diagnosis of acute rejection was above 90%. The authors proposed that their results demonstrated the potential of this new CAD system to reliably diagnose renal transplant rejection.

In a third study[9], they again assessed the utility of the CAD system for the diagnosis of acute rejection, using DW-MRI and blood oxygen level-dependent MRI as the image-based sources. The authors also used laboratory data consisting of creatinine and creatinine clearance. In addition, they utilized a deep learning-based classifier, namely, ‘stacked autoencoders’, to differentiate non-rejection from acute rejection in renal transplants[9]. The overall accuracy of the CAD system in the detection of acute rejection was around 90%[9].

Pathological evaluation

AI applications have also been used to assess allograft biopsies, where the data feed for the classification algorithms consisted of histological findings, molecular biomarkers, or a combination of the two.

Kazi et al[10] used 12 histological features to train a Bayesian network with 110 transplant biopsies. Using the Bayesian network, a relatively inexperienced pathologist was able to make the correct diagnosis in 19 out of 21 cases. The researchers suggested that the integration of data with a computer can give a more consistent diagnosis of early acute rejection[10]. In a follow-up study, the same researchers used a simple neural network for the decision process; the authors pointed out that in Bayesian networks the ‘importance’ attached to each histological feature had to be calculated and programmed into the network at the outset, which makes them relatively inflexible[11]. A neural network has the potential for greater flexibility, because the process of ‘training’ a neural network automatically calculates the ‘weight’ that should be allocated to each histological feature. The authors used 12 histological features, 100 transplant biopsies (43 with definite rejection), and 25 additional cases to train a single-layer simple neural network. Eventually, the network was able to correctly classify 19 out of 21 new cases, leading to the conclusion that neural network technology can dramatically improve the accuracy of the histological diagnosis of early acute renal allograft rejection[11].

Marsh et al[12] used deep learning algorithms to evaluate intraoperative donor kidney biopsies with the aim of determining which kidneys were eligible for transplantation. The authors used CNNs as the deep learning algorithm. The primary advantage of CNNs is that the models can automatically discover prominent features from the data alone, without requiring a set of handcrafted parameters or extensive input normalization. Most recently, CNNs have been explored as primary tools for glomeruli detection[13]. Different models were shown to be able to differentiate image patches containing isolated normal glomeruli from non-glomerular structures[13]. Marsh et al[12] trained the network with a total of 870 labeled sclerosed and 2997 labeled non-sclerosed glomeruli. The images were acquired from hematoxylin and eosin (HE)-stained frozen wedge donor biopsies. The fully convolutional model in the study showed a high correlation with percent global glomerulosclerosis (R2 = 0.828). The authors concluded that the performance of the CNN alone was equivalent to that of a board-certified clinical pathologist.
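
As an illustration of this property, the following PyTorch sketch defines a small CNN patch classifier of the kind used to separate sclerosed from non-sclerosed glomeruli; the architecture, patch size, and layer sizes are our own assumptions, not the published model of Marsh et al[12].

```python
# An illustrative CNN patch classifier; architecture and sizes are hypothetical.
import torch
import torch.nn as nn

class GlomerulusCNN(nn.Module):
    def __init__(self, n_classes: int = 2):
        super().__init__()
        # Convolutional layers learn salient features directly from pixels,
        # with no handcrafted feature extraction.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = GlomerulusCNN()
patch = torch.randn(1, 3, 64, 64)  # a dummy 64x64 RGB image patch
logits = model(patch)
print(logits.shape)  # torch.Size([1, 2])
```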

Liu et al[14] examined the diagnosis of T-cell-mediated kidney rejection using a data feed acquired by RNA sequencing. The authors used three ML methods: Linear discriminant analysis (LDA), SVM, and random forest (RF). The molecular signature discovery data set involved five kidney transplant patients with T-cell-mediated rejection (TCMR) and five with stable renal function. The prediction models were tested on 703 biopsies with Affymetrix GeneChip expression profiles available in the public domain. The LDA model predicted TCMR in 55 of the 67 biopsies labeled TCMR and in 65 of the 105 biopsies designated as antibody-mediated rejection (ABMR). The RF and SVM models showed comparable performance. These data illustrated the feasibility of using RNA sequencing for the molecular diagnosis of TCMR.

Halloran et al[15] and Reeve et al[16] used molecular microscopy techniques to evaluate allograft biopsies, including molecular phenotyping with platforms such as microarrays that measure the expression of thousands of genes. To express the likelihood that particular diseases are present in the biopsy, the authors developed TCMR and ABMR scores assigned by classifiers (using weighted equations) built with standard ML methods. The authors also developed the Molecular Microscope Diagnostic System (MMDx), which assesses TCMR and ABMR against a reference set of biopsy samples using ML-derived classifier algorithms[17]. Archetypal analysis and an additional 12 ML methods (individually or in ensembles) were used during the development of the MMDx. Archetypal analysis is a probabilistic, data-driven, unsupervised statistical approach that categorizes separate groups of patients (archetypes)[17]. The ensembles made diagnoses that were both more accurate than the best individual classifiers and almost as stable as the best, in line with previous studies from the ML literature[17]. Human experts had about 93% agreement (balanced accuracy) signing out the reports, while RF-based automated sign-outs showed similar levels of agreement (92% and 94% for predicting the expert MMDx sign-outs for TCMR and ABMR, respectively)[17].

In 451 biopsy samples for which feedback was obtained, clinicians indicated that the MMDx agreed with the clinical decision more often (87%) than histology did (80%) (P = 0.0042)[18]. In another study, the same group of researchers explored the frequency of rejection in areas of interstitial fibrosis and tubular atrophy (i-IFTA) in kidney transplant biopsies using Banff 2015 histology and the MMDx, and concluded that i-IFTA in indication biopsies reflected current parenchymal injury, often with simultaneous ABMR but seldom with TCMR[19].

Hermsen et al[20] used whole-slide images of stained kidney transplant biopsies to develop and validate a CNN for histologic analysis of renal tissue stained with periodic acid-Schiff. The researchers assessed the segmentation performance for different tissue classes and found that the best-segmented class was “glomeruli”, followed by “tubuli combined” and “interstitium”. The network detected 92.7% of all glomeruli in nephrectomy samples, with a false positive rate of 10.4%. The authors also suggested that the CNN may have utility for quantitative studies involving kidney histopathology.

Aubert et al[21] used archetype analysis to identify distinct groups of patients with transplant glomerulopathy. The researchers examined data from 552 biopsy samples taken from 385 patients with transplant glomerulopathy, using unsupervised archetypal analysis that integrated clinical, functional, immunologic, and histologic parameters. The authors identified five archetypes with distinct clinical, histologic, and immunologic features, as well as different outcomes (kidney allograft survival rates). The authors suggested that their approach made it possible to reduce patient heterogeneity and create meaningful groups in terms of morphologic patterns, disease activity/progression, and risk of failure.

Kim et al[22] developed a fully automated CNN-based system to identify regions of interest and to detect C4d-positive and -negative peritubular capillaries in gigapixel immunostained slides. The authors used deep-learning-assisted labeling to enhance the performance of the detection method. Using this approach, they were able to train the CNN with a small number of samples. They suggested that their system was highly reliable, efficient, and effective for the detection of renal allograft rejection.

Finally, Ligabue et al[23] evaluated the role of a CNN as a support tool for kidney immunofluorescence reporting and found that CNNs were 117 times faster than human inspectors in analyzing 180 test images. The accuracy of the CNN was comparable with that of experienced pathologists in the field.

Graft survival

Simic-Ogrizovic et al[24] used data from 27 patients and 33 variables to train an ANN to predict chronic rejection progression, and suggested that ANN seemed more reliable in the prediction of the chronic rejection course than the usual statistical methods.

Lin et al[25] examined single time-point models (LR and single-output ANNs) vs multiple time-point models (Cox models and multiple-output ANNs) to predict kidney transplant outcomes. The authors concluded that single time-point and multiple time-point models can achieve a comparable area under the curve (AUC), except for multiple-output ANNs, which may perform poorly when a large proportion of observations are censored. LR can achieve performance similar to that of ANNs if there are no strong interactions or non-linear relationships among the predictors and the outcomes.

Akl et al[26] developed an ANN model to predict 5-year graft survival in living-donor kidney transplants. Estimates from the validated ANNs were compared with those of Cox regression-based nomograms. The researchers used data from 1581 patients for training and 319 patients for validation. The positive predictive value for graft survival was 82.1% for the ANNs and 43.5% for the Cox regression-based nomogram. The authors concluded that ANNs were more accurate and sensitive than the Cox regression-based nomogram in predicting 5-year graft survival.

Lofaro et al[27] used two different classification trees to predict chronic allograft nephropathy (CAN) within 5 years after transplantation by evaluating 80 renal transplant patients’ routine blood and urine tests collected after 6 mo of follow-up, and concluded that the use of classification trees is an acceptable alternative to traditional statistical models, especially for the evaluation of interactions of risk factors.

Greco et al[28] also used DTs to build predictive models of graft failure and retrospectively studied 194 renal transplant patients with 5 years of follow-up. The primary endpoint was graft loss within 5 years of follow-up. In the classification algorithm, the researchers studied the following parameters: Age, gender, time on dialysis, donor type, donor age, human leukocyte antigen (HLA) mismatches, delayed graft function (DGF), acute rejection episode, CAN, and body mass index. They concluded that the use of DTs in clinical practice may be an acceptable alternative to traditional statistical methods.

For the evaluation of 3-year graft survival in kidney recipients with systemic lupus erythematosus (SLE), Tang et al[29] applied classification trees, LR, and ANNs to data on kidney recipients with SLE retrieved from the United States Renal Data System database. The 95% confidence interval of the area under the receiver-operator characteristic curve (AUROC) was used to quantify the discrimination capacity of the prediction models. The authors concluded that the performance of LR and classification trees was not inferior to that of the more complex ANNs.

Yoo et al[30] assessed the predictive power of ensemble learning algorithms [survival DT, bagging, RF, and ridge and least absolute shrinkage and selection operator (LASSO)] and compared their outcomes with those of conventional models (DT and Cox regression) to predict graft survival in a retrospective analysis of data from a multicenter cohort of 3117 kidney transplant recipients. With the survival DT model, the index of concordance was 0.80, and an episode of acute rejection during the first post-transplant year was associated with a 4.27-fold increase in the risk of graft failure. In conclusion, the authors reported that ML methods may provide flexible and practical tools for predicting graft survival.

In a cross-sectional study, Nematollahi et al[31] examined 5-year graft survival in 717 patients, using multilayer perceptron ANNs (MLP-ANNs), LR, and SVMs to construct prediction models. The authors assessed the validity of the models using different evaluation tools, such as AUC, accuracy, sensitivity, and specificity, and concluded that the SVM and MLP-ANN models could efficiently be used for survival prediction in kidney transplant recipients.

Tapak et al[32] compared the LR and ANN approaches to predict graft survival in their data set from a retrospective study of 378 patients. According to their analysis, the ANN model outperformed LR in the prediction of kidney transplantation failure. The ANN model showed a higher total accuracy (0.75 vs 0.55) and a better area under the ROC curve (0.88 vs 0.75) when compared to LR.

Zhou et al[33] assessed the association of 17 proteins with allograft rejection in a cohort of 47 patients. The researchers used the LASSO variable selection method to select the significant proteins that predict the hazard of allograft loss. Conventional model selection techniques adopt the strategy of best subset selection or one of its stepwise variants. However, such a strategy is computationally infeasible when the number of predictors is large. Moreover, the subset selection method may be numerically unstable, and the resulting model may suffer from poor prediction accuracy. As one of the most popular variable selection methods, LASSO is able to overcome the computational hurdle of the subset selection approach. The authors deduced that KIM-1 and VEGF-R2 had individually significant positive associations with the hazard of renal failure.
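
The following minimal sketch illustrates LASSO-based variable selection; synthetic data stand in for the 17 candidate proteins, and scikit-learn's linear Lasso replaces the Cox-model LASSO actually used in the study.

```python
# A sketch of LASSO variable selection; data and penalty are hypothetical.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.normal(size=(47, 17))              # 47 patients, 17 protein levels (toy)
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.5, size=47)

X_std = StandardScaler().fit_transform(X)
lasso = Lasso(alpha=0.1).fit(X_std, y)

# The L1 penalty shrinks uninformative coefficients exactly to zero,
# performing variable selection without enumerating all subsets.
selected = np.flatnonzero(lasso.coef_)
print("Selected predictor indices:", selected)
```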

In a study conducted to predict future values of the estimated glomerular filtration rate (eGFR) for kidney recipients, Rashidi Khazaee et al[34] developed and validated an ANN-based model (multilayer perceptron network) using three static covariates (the recipient’s gender and the donor’s age and gender) as well as 11 dynamic covariates of the recipients, including current age, time since transplant, serum creatinine, fasting blood sugar, weight, and blood pressures available at each visit. The development and validation datasets included 72.7% and 27.3%, respectively, of the 25811 records from the historical visit data of 675 adult kidney recipients. The ANN-based model dynamically predicted a future eGFR value based on a number of fixed and time-dependent longitudinal data. The authors suggested that such analytical tools may help realize personalized medicine in kidney transplantation.

In another study, Mark et al[35] used an ensemble of methods including random survival forests constructed from conditional inference trees. The benefit of combining diverse models to predict kidney transplant survival is that different models may work better on different cohorts of the data. The dataset was provided by the United Network for Organ Sharing and consisted of recipients who had kidney transplant surgery in the United States from 1987 to 2014[36,37]. The authors used 73 variables from the 163199 observations available during the chosen 10-year time period and proposed that the model achieved better performance than the estimated post-transplant survival model used in the kidney allocation system in the United States.

In a multicenter study, Raynaud et al[38] analyzed 403497 eGFR measurements of 14132 patients using a number of different ML techniques and identified eight distinct eGFR trajectories with latent class mixed models. Using a validation cohort of 9992 individuals, the authors suggested that their results provided the basis for a trajectory-based assessment of kidney transplant patients for risk stratification and monitoring.

In a critical paper, Bae et al[39] examined whether ML techniques are superior to conventional regression analysis. Studying the records of 133431 adult deceased-donor kidney transplant recipients from national registry data, the authors randomly selected 70% of the transplant centers for training and 30% for validation. They used different ML procedures (gradient boosting and RF) and regression analysis, with the aim of predicting DGF, 1-year acute rejection, death-censored graft failure, all-cause graft failure, and death in the training set. After comparing the performances of different models in the validation set, the authors asserted that ML does not outperform the conventional regression-based approaches in predicting various kidney transplant outcomes.

Optimizing the dose of immunosuppression

McMichael et al[40] developed an intelligent dosing system (IDS) for optimizing FK 506 therapy, describing the computerized dosing algorithm for FK 506 as an “expert system” using stochastic open-loop control theory[41]. The IDS was designed to predict drug dosages and levels and was programmed with hundreds of dosing histories, i.e., previous dose, previous level, current dose, and current level. The system was then used as a model to develop an equation relating the current FK 506 dose and level to the desired dose and level. The IDS calculates the FK 506 dose required to achieve the target level. A prospective validation study showed that the model was 95% accurate in describing the relationship between FK 506 dosage and FK 506 plasma level, and that there were no biases in the dosing predictions[40].

Camps-Valls et al[42] used neural networks to personalize the dosage of cyclosporine A (CyA) in patients who had undergone kidney transplantation. The researchers used three kinds of networks [multilayer perceptron, finite impulse response (FIR) network, and the Elman recurrent network], and neural-network ensembles were formed in a scheme of two chained models in which the blood concentration predicted by the first model constituted an input to the dosage prediction model. After using 364 samples from 22 patients for training and 217 samples from 10 patients for testing, the authors concluded that the best model was an ensemble of the FIR and Elman networks. This model yielded an r value of 0.977 in the validation set. The authors also suggested that neural models are well suited to this problem, not only because of the accuracy of their estimations but also because of their precision and robustness.
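
The chained two-model scheme can be sketched as follows; synthetic data and scikit-learn's MLPRegressor stand in for the FIR/Elman network ensemble of the original study.

```python
# A sketch of the chained two-model scheme; data and model choice are
# illustrative stand-ins, not the published architecture.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))                      # patient covariates (toy)
concentration = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=200)
dose = 0.8 * concentration + X[:, 0] + rng.normal(scale=0.1, size=200)

# Model 1: predict blood concentration from covariates.
m1 = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000,
                  random_state=0).fit(X, concentration)

# Model 2: the predicted concentration becomes an extra input for the dose model.
X_chained = np.column_stack([X, m1.predict(X)])
m2 = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000,
                  random_state=0).fit(X_chained, dose)
print("Predicted dose for first patient:", m2.predict(X_chained[:1]))
```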

In the study by Gören et al[43], 654 CyA measurements and 20 input parameters from 138 patients were used to train (473 samples) and validate (181 samples) an adaptive-network-based fuzzy inference system. The model aimed to predict the CyA concentration based on 20 input parameters, which included concurrent use of drugs, blood levels, sampling time, age, gender, and dosing intervals. The authors measured the performance of the developed model using the root-mean-square error, which was 0.057 for the validation set. In conclusion, the researchers suggested that their model could effectively assist physicians in choosing the best therapeutic drug dose in the clinical setting.

In two consecutive papers, Seeling et al[44] described the development of a computer-aided decision system for planning tacrolimus therapy and the subsequent integration of this system into the hospital information system. The authors used data from 492 patients and 13053 examinations, and created a classification model (conditional inference trees) using patient profiles, associated distributions, and intervals of medication adaptation (decrease, increase, or maintain). The theoretical model resulted in 16 classes of patients and associated distributions, which were then translated into a medical logic module. Eventually, a method for determining semi-automated immunosuppressive therapy was created to guide nephrologists.

In their study using data from 1045 renal transplant patients, Tang et al[45] utilized 80% of the randomly selected data to develop a tacrolimus dose prediction algorithm and employed the remaining 20% for validation. Multiple linear regression, ANN, regression tree (RT), multivariate adaptive regression splines, boosted RT, support vector regression, RF regression, LASSO regression, and Bayesian additive RT were applied, and their performances were compared. Among all the ML models, RT performed best in both the derivation [0.71 (0.67-0.76)] and validation cohorts [0.73 (0.63-0.82)]. The authors suggested that ML models used to predict the tacrolimus dose may facilitate the administration of personalized medicine.

In the study by Thishya et al[46], ANN and LR models were used to predict the bioavailability of tacrolimus and the risk of post-transplant diabetes based on ABCB1 and CYP3A5 genetic polymorphism status. Besides the polymorphisms, the authors used age, gender, body mass index, and creatinine data from 129 patients for the input layer of their ANN, and concluded that the ANN and multifactor dimensionality reduction analysis models explored both the individual and synergistic effects of variables in modulating the bioavailability of tacrolimus and the risk of post-transplant diabetes.

Diagnosis of rejection

Hummel et al[47] examined 145 patients who underwent kidney biopsy for the differential diagnosis of nephrotoxicity and acute cellular rejection, using 18 different clinical and laboratory values as input parameters, including tacrolimus dose, serum creatinine, and histocompatibility, to train an ANN. The classification results were considered significant by the experts who evaluated the classifiers. However, the researchers asserted that higher sensitivity would be required to apply the classifier in clinical practice. In a separate paper, the same group of authors used the same database to examine the performance of different AI techniques in screening for the need for biopsy among patients suspected of having nephrotoxicity or acute cellular rejection during the first year after transplantation[48]. They used ANN, SVM, and Bayesian inference (BI) to indicate whether the clinical course of the event suggested the need for biopsy. The technique that showed the best sensitivity as an indicator for biopsy was the SVM, with an AUC of 0.79. The authors suggested that this technique could be used in clinical practice[48].

In the study by Metzger et al[49], SVM-based classification was used to distinguish rejection from non-rejection. The researchers examined 103 patients (39 for training and 64 for validation) with a kidney biopsy and used capillary electrophoresis-mass spectrometry (CE-MS)-based urinary proteome analysis as the data feed. The application of the rejection model to the validation set resulted in an AUC value of 0.91. In total, 16 of the 18 subclinical rejections, all 10 clinical rejections (Banff grades Ia/Ib), and 28 of the 36 controls without rejection were correctly classified.

Pineda et al[50] developed an integrative computational approach leveraging donor/recipient (D/R) exome sequencing and gene expression to predict the clinical post-transplant outcome. The authors statistically analyzed 28 D/R kidney transplant pairs with biopsy-proven clinical outcomes and identified a significantly higher number of mismatched non-HLA variants in antibody-mediated rejection (AMR). They also identified 123 variants associated mainly with the risk of AMR and applied an ML technique to circumvent the issue of limited statistical power. Eventually, using RF, they found a subset of 65 variants that predicted post-transplant AMR with a very low error rate.
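
The following sketch illustrates how an RF can rank and shrink a large variant set in this way; the data, outcome, and subset size are hypothetical stand-ins for the published analysis.

```python
# An illustrative RF-based variant ranking; data and threshold are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)
X = rng.integers(0, 2, size=(28, 123))      # 28 D/R pairs x 123 candidate variants
y = (X[:, :5].sum(axis=1) > 2).astype(int)  # toy AMR outcome

rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)

# Rank variants by importance and keep a reduced predictive subset.
order = np.argsort(rf.feature_importances_)[::-1]
subset = order[:65]
print("Top-ranked variant indices:", subset[:10])
```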

In another study, the same group of authors used RNA sequencing to evaluate 37 biopsy-paired peripheral blood samples from a cohort comprising patients with stable kidney function, AMR, and TCMR[51]. The authors used ML tools to identify the gene signatures associated with rejection and found that 102 genes (63 coding and 39 non-coding) associated with AMR (54 upregulated), TCMR (23 upregulated), and stable kidney function (25 upregulated) clustered perfectly with each rejection phenotype and correlated highly with the main histologic lesions (P = 0.91). Their analysis identified a critical gene signature in peripheral blood samples from kidney transplant patients who underwent AMR, and this signature was sufficient to differentiate them from patients with TCMR and immunologically quiescent kidney allografts.

Wittenbrink et al[52] used a pretransplant HLA antigen bead assay data set to predict the risk of post-transplant acute cellular rejection (ACR). Employing an SVM-based algorithm to process and analyze the HLA data, the model predicted the 38 graft recipients who experienced ACR with an accuracy of 82.7%. The authors reported that this was one of the highest prediction accuracy rates in the literature for pre-transplant risk assessment of ACR.

Prediction of early graft function

Shoskes et al[53] used retrospective data from 100 cadaveric transplants to train an ANN with the aim of predicting DGF. For input, the authors used donor and recipient characteristics and then validated the model in 20 prospective cadaveric transplants. In the validation cohort, the ANN was able to predict DGF with an 80% accuracy. The authors suggested that the use of such a model could help improve donor/recipient selection and perioperative immunosuppression and reduce overall costs.

In the study by Brier et al[54], the researchers used an ANN and LR to predict DGF. Examining 304 cadaveric kidney transplantations, the researchers used data from 198 patients for training and 106 patients for validation. The results of the study showed that LR analysis was more sensitive in predicting ‘no DGF’ (91% vs 70%), while the ANN predicted ‘DGF’ with a higher sensitivity (56% vs 37%). The neural network was 63.5% sensitive and 64.8% specific. In conclusion, the authors deduced that ANNs may be used for the prediction of DGF in cadaveric renal transplants.

Santori et al[55] assessed the ability of a neural network model to forecast a delayed decrease in serum creatinine in pediatric kidney recipients. In this study, the neural network was constructed with a training set of 107 pediatric kidney recipients, using 20 input variables. The model was validated in a second set of 41 patients. The overall accuracies of the neural network for the training set, the validation set, and the whole patient cohort were 89.1%, 76.92%, and 87.14%, respectively. The developed ANN model had a higher sensitivity than LR analysis. The authors inferred that the neural network model could be used to predict a delayed decrease in serum creatinine among pediatric kidney recipients.

In another study, Decruyenaere et al[56] constructed prediction models with eight different ML methods and compared them with LR using data from 475 cadaveric kidney transplantations. Besides LR, the authors employed the following methods: LDA, quadratic discriminant analysis, SVMs using linear, radial basis function, and polynomial kernels, DT, RF, and stochastic gradient boosting. The performance of the models was assessed by computing sensitivity, positive predictive value, and AUROC after 10-fold stratified cross-validation. The authors found that the linear SVM had the highest discriminative capacity (AUROC: 84.3%), outperforming the other methods except for the radial SVM, polynomial SVM, and LDA; however, it was the only method superior to LR. Eventually, the authors asserted that the linear SVM was the most appropriate ML method to predict DGF.
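
The comparison protocol of this study can be sketched as follows, assuming synthetic data: Each candidate model is scored by AUROC under 10-fold stratified cross-validation.

```python
# A sketch of model comparison by 10-fold stratified CV with AUROC;
# the data are synthetic and only a subset of the methods is shown.
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=475, n_features=20, random_state=0)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "LDA": LinearDiscriminantAnalysis(),
    "linear SVM": SVC(kernel="linear"),
    "radial SVM": SVC(kernel="rbf"),
    "polynomial SVM": SVC(kernel="poly"),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    print(f"{name}: AUROC {auc.mean():.3f}")
```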

In the study by Costa et al[57], which evaluated the impact of donor maintenance-related variables (arterial blood gas pH, serum sodium, blood glucose, urine output, mean arterial pressure, vasopressor use, and reversed cardiac arrest) on the development of DGF, data from 443 cadaveric donors were analyzed with ML methods that included DT, neural networks, and SVM to locate donor maintenance-related parameters that were predictive of DGF. However, according to the multivariable LR analysis, the donor maintenance-related variables did not have any impact on DGF occurrence.

In a large-scale study, Kawakita et al[3] aimed to build personalized prognostic models based on ML methods to predict DGF. Using data obtained from the United Network for Organ Sharing/Organ Procurement and Transplantation Network, their development set included a total of 55044 patients and the validation set included 6176 patients. Of the 26 selected predictors, 13 were donor-related, eight were recipient-related, and five were transplant-related. The authors used the development dataset with the selected features to train five ML algorithms: LR, elastic net, RF, extreme gradient boosting (XGB), and ANN. For performance comparison, a baseline model based on LR was developed. After training the ML algorithms, the authors assessed each model on three performance measures, using different metrics: Discrimination, calibration, and clinical utility. All of the algorithms trained with the new predictors performed better than or as well as the baseline model on these characteristics, especially the ANN and XGB. XGB is an ensemble learning method that assembles DTs as its building blocks to build a strong learner able to capture the non-linear relationships between the predictors and the outcome. The authors suggested that ML is a valid alternative approach for predicting DGF and identifying its predictors, adding an important piece of evidence to support the use of ML in driving medical progress.
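
The following sketch illustrates a gradient-boosted tree ensemble of the kind XGB represents; scikit-learn's GradientBoostingClassifier stands in for XGBoost, and the 26 predictors are simulated rather than taken from the registry data.

```python
# A minimal gradient-boosting sketch; data and split are hypothetical.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=26, random_state=0)
X_dev, X_val, y_dev, y_val = train_test_split(X, y, test_size=0.1,
                                              random_state=0)

# Each boosting stage fits a shallow tree to the residual errors of the
# ensemble so far, letting the model capture non-linear relationships.
gbm = GradientBoostingClassifier(n_estimators=200, max_depth=3,
                                 learning_rate=0.1).fit(X_dev, y_dev)
print("Validation AUROC:", roc_auc_score(y_val, gbm.predict_proba(X_val)[:, 1]))
```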

Other areas

In addition to the above-mentioned areas, AI techniques are used in kidney transplantation for various other purposes. We located articles on the following topics: Assessment of the risk of various complications such as cardiovascular risk[58], pneumonia[59,60], and cytomegalovirus (CMV) infection[61]; prediction of changes in lipid parameters[62]; prediction of HLA response[63-65]; and assessment of the risk of kidney transplantation during the coronavirus disease 2019 pandemic[66].

CONCLUSION

AI is used in a large spectrum of studies in kidney transplantation, ranging from pathological evaluation to outcome prediction. These studies pave the way for increased automation, which will increase the standardization and speed of medical evaluations. CAD and quantifiable personalized predictions are developing at a great pace and will enhance precision medicine.
