
Original Article

Analysis of Machine Learning Techniques for Heart Failure Readmissions

Bobak J. Mortazavi, PhD; Nicholas S. Downing, MD; Emily M. Bucholz, MD, PhD; Kumar Dharmarajan, MD, MBA; Ajay Manhapra, MD; Shu-Xia Li, PhD; Sahand N. Negahban, PhD*; Harlan M. Krumholz, MD, SM*

Background—The current ability to predict readmissions in patients with heart failure is modest at best. It is unclear whether machine learning techniques that address higher dimensional, nonlinear relationships among variables would enhance prediction. We sought to compare the effectiveness of several machine learning algorithms for predicting readmissions.

Methods and Results—Using data from the Telemonitoring to Improve Heart Failure Outcomes trial, we compared the effectiveness of random forests, boosting, random forests combined hierarchically with support vector machines or logistic regression (LR), and Poisson regression against traditional LR to predict 30- and 180-day all-cause readmissions and readmissions because of heart failure. We randomly selected 50% of patients for a derivation set, and the remaining patients formed a validation set, validated using 100 bootstrapped iterations. We compared C statistics for discrimination and distributions of observed outcomes in risk deciles for predictive range. In 30-day all-cause readmission prediction, the best performing machine learning model, random forests, provided a 17.8% improvement over LR (mean C statistics, 0.628 and 0.533, respectively). For readmissions because of heart failure, boosting improved the C statistic by 24.9% over LR (mean C statistics, 0.678 and 0.543, respectively). For 30-day all-cause readmission, the observed readmission rates in the lowest and highest deciles of predicted risk with random forests (7.8% and 26.2%, respectively) showed a much wider separation than LR (14.2% and 16.4%, respectively).

Conclusions—Machine learning methods improved the prediction of readmission after hospitalization for heart failure compared with LR and provided the greatest predictive range in observed readmission rates. (Circ Cardiovasc Qual Outcomes. 2016;9:629-640. DOI: 10.1161/CIRCOUTCOMES.116.003039.)

Key Words: computers ◼ heart failure ◼ machine learning ◼ meta-analysis ◼ patient readmission

Received May 26, 2016; accepted October 17, 2016.
From the Section of Cardiovascular Medicine, Department of Internal Medicine (B.J.M., N.S.D., E.M.B., K.D., H.M.K.), Department of Psychiatry and the Section of General Medicine, Department of Internal Medicine (A.M.), and Robert Wood Johnson Foundation Clinical Scholars Program, Department of Internal Medicine, Yale School of Medicine, and Department of Health Policy and Management (H.M.K.), Yale School of Public Health, New Haven, CT; Center for Outcomes Research and Evaluation, Yale-New Haven Hospital, New Haven, CT (B.J.M., N.S.D., E.M.B., K.D., S.-X.L., H.M.K.); and Department of Statistics, Yale University, New Haven, CT (B.J.M., S.N.N.).
Current address for N.S.D.: Brigham and Women's Hospital, Boston, MA. Current address for E.M.B.: Boston Children's Hospital, Boston, MA.
*Drs Negahban and Krumholz contributed equally as senior authors to this work.
This article was handled independently by Javed Butler, MD, MPH, as a Guest Editor. The editors had no role in the evaluation or in the decision about its acceptance.
The Data Supplement is available at https://allaplusessays.com/order
Correspondence to Harlan M. Krumholz, MD, SM, Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, 1 Church St, Suite 200, New Haven, CT 06510.
© 2016 American Heart Association, Inc.

High rates of readmission after hospitalization for heart failure impose tremendous burden on patients and the healthcare system.1–3 In this context, predictive models facilitate identification of patients at high risk for hospital readmissions and, by identifying key risk factors, potentially enable specific interventions to be directed toward those who might benefit most. However, current predictive models using administrative and clinical data discriminate poorly on readmissions.4–8 The inclusion of a richer set of predictor variables encompassing patients' clinical, social, and demographic domains, while improving discrimination in some internally validated studies,9 does not necessarily markedly improve discrimination,10 particularly in the data set considered in this work. This richer set of predictors might not contain the predictive domain of variables required, but it does represent a large set of data not routinely collected in other studies.

Another possibility for improving models, rather than simply adding a richer set of predictors, is that prediction might improve with methods that better address the higher order interactions between risk factors. Many patients may have risk that can only be predicted by modeling complex relationships between independent variables. For example, no single available variable may be adequately explanatory; however, interactions between variables may provide the most useful information for prediction.

Modern machine learning (ML) approaches can account for nonlinear and higher dimensional relationships between a multitude of variables that could potentially lead to an improved explanatory model.11,12 Many methods have emerged from the ML community that can construct predictive models using many variables and their rich nonlinear interactions.13–15 Three widely used ML approaches may provide utility for readmission prediction: random forest (RF), boosting, and support vector machines (SVM). The primary advantage of ML methods is that they handle nonlinear interactions between the available data with similar computational load. RF involves the creation of multiple decision trees16 that sort and identify important variables for prediction.14,17 Boosting algorithms harness the power of weaker predictors by creating combined and weighted predictive variables.18,19 This technique has been applied to preliminary learning work with electronic health records.20 SVM is a methodology that creates clearer separation of classes of variables using nonlinear decision boundaries, or hyperplanes, derived from complex, multidimensional data in possibly infinite dimensional spaces.21–26

In this study, we tested whether these ML approaches could predict readmissions for heart failure more effectively than traditional approaches using logistic regression (LR).7,9,27,28 We tested these strategies using data that included detailed clinical and sociodemographic information collected during the Tele-HF trial (Telemonitoring to Improve Heart Failure Outcomes),29 a National Institutes of Health–sponsored randomized clinical trial to examine the effect of automated telemonitoring on readmission after hospitalization for heart failure.10,29,30 We further tested whether the ML techniques could be
improved when used hierarchically, where outputs of RF were used as training inputs to SVM or LR. We evaluated these approaches by their effect on model discrimination and predictive range to understand ML techniques and evaluate their effectiveness when applied to readmissions for heart failure.

Methods

Data Source
Data for this study were drawn from Tele-HF, which enrolled 1653 patients within 30 days of their discharge after an index hospitalization for heart failure.30 In addition to the clinical data from the index admission, Tele-HF used validated instruments to collect data on patients' socioeconomic, psychosocial, and health status. This study collected a wide array of instrumented data, from comprehensive qualitative surveys to detailed hospital examinations, including many pieces of data not routinely collected in practice, providing a larger exploratory set of variables that might provide information gain.

The primary outcome was all-cause readmission or mortality within 180 days of enrollment.29 A committee of physicians adjudicated each potential readmission to ensure that the event qualified as a readmission rather than another clinical encounter (eg, emergency department visit) and to determine the primary cause of the readmission. The comprehensive manner in which outcomes were tracked and determined across the various sites makes this a well-curated data set that can leverage information other trials may not capture, as readmissions often occur at institutions external to the study network.31 Results of the Tele-HF study revealed that outcomes were not significantly different between the telemonitoring and control arms in the primary analysis.29

Analytic Sample Selection
For this study, we included all patients whose baseline interviews were completed within 30 days of hospital discharge to ensure that the information was reflective of the time of admission. Of the 1653 enrolled patients, we excluded 36 who were readmitted or died before the interview, 574 whose interviews were completed after 30 days from discharge, and 39 who were missing data on >15 of the 236 baseline features, creating a study sample of 1004 patients for the 30-day readmission analysis set. To create the 180-day readmission analysis set, we further excluded 27 patients who died before the end of the study and had no readmission events, leaving 977 patients in the sample.

Feature Selection
We used 472 variables (called features in the ML community) as input. We first gathered the full 236 baseline patient characteristics available in Tele-HF.10,30 This set included data extracted from medical record abstractions, hospital laboratory results, and physical examination information, as well as quality of life, socioeconomic, and demographic information from initial patient surveys. The polarity of qualitative and categorical questions was altered if necessary to ensure that the lowest values reflect a strongly negative answer or missing data, and the highest values correspond to strongly positive answers (Data Supplement). In addition, we created dummy variables for each of the 236 features to indicate whether the value was missing or not (0 and 1 values).

Definition of Outcomes
We developed methods to predict 4 separate outcomes: (1) 30-day all-cause readmission; (2) 180-day all-cause readmission; (3) 30-day readmission because of heart failure; and (4) 180-day readmission because of heart failure. We trained predictive models for each of these outcomes and compared them with each other.

Predictive Techniques
We built models using both traditional statistical methods and ML methods to predict readmission and compared the model discrimination and predictive range of the various techniques. For traditional statistical methods, we used an LR model and a Poisson regression.
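The feature construction described above—236 baseline characteristics plus a 0/1 missingness indicator for each, yielding 472 inputs—can be sketched as follows. This is an illustrative Python/pandas sketch, not the authors' code (their analyses were implemented in R), and the column names are hypothetical stand-ins.

```python
import numpy as np
import pandas as pd

def add_missing_indicators(df: pd.DataFrame) -> pd.DataFrame:
    """For each feature, add a 0/1 dummy flagging whether the value is missing,
    doubling the feature count (e.g., 236 baseline features -> 472 inputs)."""
    indicators = df.isna().astype(int).add_suffix("_missing")
    return pd.concat([df, indicators], axis=1)

# Hypothetical example: two baseline features with some missing values
baseline = pd.DataFrame({"bun": [18.0, np.nan, 25.0],
                         "egfr": [60.0, 45.0, np.nan]})
features = add_missing_indicators(baseline)
print(features.shape)                      # (3, 4)
print(features["bun_missing"].tolist())    # [0, 1, 0]
```

After this step, downstream models receive both the raw value and an explicit signal that the value was absent, so missingness itself can carry predictive information.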
Three ML methods—RF, boosting, and SVM—were used for readmission prediction.

WHAT IS KNOWN
• Prediction models for readmissions in heart failure are often modest, at best, at discriminating between readmitted and nonreadmitted patients.

WHAT THE STUDY ADDS
• This study compares popular machine learning methods against traditional techniques.
• This study introduces models that can select variables automatically from a larger, comprehensive data set.
• This study shows the improvement gained by using machine learning methods and a comprehensively collected data set, helping readers understand factors that contribute to readmission and the limitations of current methods in readmission prediction.

Logistic Regression
A recent study from Tele-HF used random survival forest methods to select from a comprehensive set of variables for Cox regression analysis.10 That article comprehensively evaluated the most predictive variables in the Tele-HF data set, using various techniques and validation strategies. Using the variables selected in that article would provide the most accurate representation of an LR model on the Tele-HF data set for comparison purposes. Therefore, we used the same selected variables in our current study to compare model performance, as the current analysis is concerned with finding improved analytic algorithms for predicting 30-day readmissions rather than primarily with variable selection.

To verify that this LR model, which leverages the comprehensive variable selection technique in previous work,10 was the best model for comparison, we compared its performance against other LR models built using varied feature selection techniques. The first model selected variables for LR by lasso regularization; the next used a forward, stepwise feature selection technique based on each variable's likelihood ratio. All models were validated using 100 bootstrapped iterations; the former could not find a set of predictive variables, and the latter varied in features chosen but selected, on average, only 5 variables. The model built using features selected from the work of Krumholz et al10 outperformed the other techniques when comparing mean C statistics. For a further detailed discussion of the other LR models built, we refer readers to the Data Supplement. The testing and selecting of appropriate weights are also further detailed in the Data Supplement.

Once the derivation and validation sets were created, a traditional LR model was trained. We used SAS and the 5 most important variables as identified previously10: blood urea nitrogen, glomerular filtration rate, sex, waist:hip ratio, and history of ischemic cardiomyopathy. The remaining techniques were created using the full raw data of 472 inputs for each patient and were trained on the different readmission outcome labels.

We trained models to predict either the 30-day or 180-day readmission outcome. Although we supplied the same input data to each method, we varied the outcome information provided. Because the 180-day readmission set contains more outcome information, including the total number of readmission events during the trial, it is possible that 180-day readmission might be easier to predict, given the propensity of some patients to be readmitted multiple times. To provide a range of modeling for the final binary prediction (readmitted/not readmitted), we ran 5 distinct training methods based on the labels provided to the algorithm. For 30-day readmission prediction, we first generated 3 different training models with the same baseline data but 3 different outcomes: 30-day binary outcomes, 180-day binary outcomes, and 180-day counts of readmissions. We then used the predictive models created by these 3 training methods to predict 30-day readmission outcomes in the validation cohort.
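The 3 training-label variants just described—identical inputs, different outcome labels, all scored against the observed 30-day outcome in the validation set—can be sketched as follows. This is an illustrative Python/scikit-learn sketch on synthetic data, not the study's implementation (the trial's models were built in R); the feature matrix, label-generating process, and model settings are all assumptions for demonstration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical stand-in for the 472-feature input matrix
X = rng.normal(size=(400, 20))
signal = X[:, 0] + 0.5 * X[:, 1]
y30 = (signal + rng.normal(size=400) > 1.0).astype(int)   # 30-day binary outcome
y180 = (signal + rng.normal(size=400) > 0.5).astype(int)  # 180-day binary outcome
counts180 = rng.poisson(np.exp(0.3 * signal))             # 180-day readmission counts

X_dev, X_val = X[:200], X[200:]
y30_val = y30[200:]

# One model per training label; every model is evaluated on the same
# 30-day validation outcome, mirroring the label-variation strategy above.
scores = {}
for name, target, model in [
    ("30-day binary", y30, RandomForestClassifier(n_estimators=200, random_state=0)),
    ("180-day binary", y180, RandomForestClassifier(n_estimators=200, random_state=0)),
    ("180-day counts", counts180, RandomForestRegressor(n_estimators=200, random_state=0)),
]:
    model.fit(X_dev, target[:200])
    risk = (model.predict_proba(X_val)[:, 1]
            if hasattr(model, "predict_proba") else model.predict(X_val))
    scores[name] = roc_auc_score(y30_val, risk)

for name, auc in scores.items():
    print(f"{name}: C statistic {auc:.3f}")
```

The point of the design is that richer outcome labels (more events, or event counts) may sharpen the ranking of 30-day risk even though the evaluation target stays fixed at the 30-day outcome.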
Similarly, to test 180-day readmission, we generated 2 different training methods with the same baseline data but different outcomes, namely, 180-day binary outcomes and 180-day counts of readmissions. A detailed description of how these methods vary from each other is available in the Data Supplement.

We ran the models generated on the validation set and calculated the area under the receiver operating characteristic curve (C statistic), which provided a measure of model discrimination. The analysis was run 100× to provide robustness against a potentially poor random split of patients and to generate a mean C statistic with a 95% confidence interval (CI). We also evaluated the risk-stratification abilities of each method. The probabilities of readmission generated over the 100 iterations were sorted into deciles. Finally, we calculated the observed readmission rate for each decile to determine the predictive range of the algorithms.

These models should be usable prospectively, to provide clinicians with decision-making points. For each iteration, we calculated the positive predictive value (PPV), sensitivity, specificity, and f-score, a common measurement in ML,25 which is calculated as

f-score = (2 × PPV × Sensitivity) / (PPV + Sensitivity)*

Comparison Techniques
Given the flexibility of nonlinear methods, the complexity of the desired models might overwhelm the available data, resulting in overfitting. Although all the available variables can be used in ML techniques such as RF and boosting, which are robust to this overfitting, we may require some form of feature selection to help prevent overfitting in less robust techniques like SVM.22 Further explanation of the predictive techniques, as well as the evaluation method, is provided in the Data Supplement and additionally in the textbook by Hastie et al.16

Hierarchical Methods
To overcome the potential for overfitting in LR and SVM, we developed hierarchical methods with RF. Previous hierarchical methods used RF as a feature selection method because it is well suited to a data set of high dimensionality with varied data types, identifying a subset of features to feed into methods such as LR and SVM.22 RF is well known to use out-of-bag estimates and an internal bootstrap to help reduce and select only predictive variables and avoid overfitting, similar to AdaBoost.32 RF is also well known to be able to take varied data types and high-dimensional data and reduce them to a usable subset, which we verify through our bootstrapped cross-validation (through the use of a derivation and validation set).33 However, rather than use the list of variables supplied by RF as the inputs to SVM or LR, the Tele-HF data set allowed us to use the probability predicted by RF as the input to SVM and to LR, creating 2 new hierarchical methods.

The Tele-HF data set has comprehensive outcomes information for all patients. We leveraged this by designing 2 prediction models using all of the aforementioned methods in a hierarchical manner. We used the RF algorithm as a base for this technique. The RF model, trained on all of the available features, produced a probability for many readmission events (eg, 0–12 events in this data set). These probabilities were then given to LR and SVM as inputs from which to build models. A detailed discussion of the method by which RF and SVM, as well as RF and LR, are combined is in the Data Supplement.

Analytic Approach
For each of the predictive techniques above, we iterated the analyses illustrated in Figure 1 100×. To construct the derivation and validation data sets, we split the cohort into 2 equally sized groups, ensuring equal percentages of readmitted patients in each group. To account for a significant difference in the numbers of patients who were readmitted and not readmitted in each group, we weighted the ML algorithms.
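The hierarchical combination described above—an RF trained on all available features, whose predicted readmission probability then becomes the input to LR—can be sketched as follows. This is an illustrative Python/scikit-learn sketch on synthetic data; the authors implemented their models in R and the exact combination procedure is in their Data Supplement. The class weighting shown (scikit-learn's "balanced" option, which up-weights the minority class by roughly the not-readmitted:readmitted ratio) and the use of out-of-bag probabilities for the second stage are assumptions standing in for the weighting and bootstrap behavior described in the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)

# Synthetic stand-in for the feature matrix and binary readmission labels
X = rng.normal(size=(400, 20))
y = (X[:, 0] - X[:, 1] + rng.normal(size=400) > 1.0).astype(int)
X_dev, X_val = X[:200], X[200:]
y_dev, y_val = y[:200], y[200:]

# Stage 1: RF on all features; "balanced" approximates weighting readmitted
# patients by the not-readmitted:readmitted ratio (an assumption here)
rf = RandomForestClassifier(n_estimators=300, oob_score=True,
                            class_weight="balanced", random_state=1)
rf.fit(X_dev, y_dev)

# Stage 2: the RF-predicted probability is the sole input to LR.
# Out-of-bag probabilities are used on the derivation set so the second
# stage is not fit on in-bag (overfit) predictions.
p_dev = rf.oob_decision_function_[:, 1].reshape(-1, 1)
lr = LogisticRegression()
lr.fit(p_dev, y_dev)

p_val = rf.predict_proba(X_val)[:, 1].reshape(-1, 1)
risk = lr.predict_proba(p_val)[:, 1]
c_stat = roc_auc_score(y_val, risk)
print(f"hierarchical RF->LR C statistic: {c_stat:.3f}")
```

Swapping `LogisticRegression` for an SVM with probability outputs (eg, `sklearn.svm.SVC(probability=True)`) would give the analogous RF+SVM variant.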
The weight selected for the readmitted patients was the ratio of not-readmitted patients to readmitted patients in the derivation set.* The data-driven decision threshold for prospective modeling measurements was that which maximized the f-score. Furthermore, to test whether a narrowed focus of prediction could improve discrimination, we conducted the same analyses using heart failure–only readmission instead of all-cause readmission.

All ML analyses were developed in R. The list of R packages used in this study is included in the Data Supplement. The Human Investigation Committee at the Yale School of Medicine approved the study protocol.

*The f-score provides a balance between PPV and sensitivity, similar to a C statistic, but at designated thresholds on the ROC curve.

Results
Baseline characteristics of patients in the 30- and 180-day analytic samples are detailed in Table 1. In both analytic samples, the proportion of women and blacks was ≈40%. Ninety percent of patients had New York Heart Association class II or III heart failure, with ≈70% having left ventricular ejection fraction of
