Principal Investigator
PI Home Department
Source of Support
Total Cost
D. Tancredi, M.S., C. Kuenneth, M.P.H
Ctr. for Health and Technology
02/2005 - 05/2005

The purpose of this study is to examine utilization rates of the UC Davis Emergency Department (ED) over an 18-month period and how those rates vary according to patient characteristics that are already routinely collected in patient records (e.g., patient payor, primary and secondary diagnosis at each visit). This analysis is designed to inform an intervention in which patient navigators work with ED patients whose profile may put them at risk of becoming frequent users of ED services (i.e., four or more visits in 12 months).
Data were gathered from Decision Support Services at the UC Davis Health System for emergency department visits that occurred between July 1, 2003, and December 31, 2004. Two descriptive analyses were conducted: one looking at the rate of return for a second visit after the initial visit in the study period; the other looking at the rate of return for additional visits after any visit in the study period.
For the first analysis, the number of people who had at least one ED visit in the study period was 48,561. Each of these patients was followed after their initial visit until either their next visit in the study period or the end of the study period, whichever came first. The number of patients with a second visit during the study period was 8,557, and the total amount of follow-up time was 35,016 person-years, for an overall rate of return following an initial visit of 0.24 per year of follow-up.
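The rate quoted above is a crude event rate: the number of events divided by the total follow-up time. A minimal sketch of that arithmetic, using only the figures reported in the text (the function name is illustrative, not part of the original analysis):

```python
# Figures reported in the text for the first descriptive analysis.
second_visits = 8_557   # patients with a second visit in the study period
person_years = 35_016   # total follow-up time after the initial visit


def rate_per_person_year(events: int, follow_up_years: float) -> float:
    """Crude event rate: events divided by total follow-up time."""
    return events / follow_up_years


first_return_rate = rate_per_person_year(second_visits, person_years)
print(round(first_return_rate, 2))  # 0.24
```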
In this computation, no account was taken of patients who died or who migrated from the area, so the total amount of follow-up time is overestimated and the rate of return is underestimated. However, the consequences of this bias should be small, given that we are interested in examining relative variations in return rates and we do not expect the magnitude of differential bias due to mortality and migration to be large enough to be of concern.
For the second analysis, each of the 48,561 patients with an initial visit during the study period was followed until the end of the study. The total follow-up time was 41,099 person-years and there were a total of 13,636 revisits during the study period, for an overall rate of return of 0.33 per year of follow-up. As with the first descriptive analysis, no account was taken of mortality or migration. The second analysis differs from the first in that it includes all revisits from a patient, not just the first revisit. Altogether, there were 5,079 ED visits during the study period that were a patient’s third or later visit. The number of patients with 3 or more visits during the study period was 3,322, and 1,580 of these patients had 4 or more visits.
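The figures in the two descriptive analyses can be reconciled with a few lines of arithmetic. The sketch below uses only numbers reported in the text; the variable names are illustrative. The count of third-or-later visits is all revisits minus the first revisits counted in the first analysis.

```python
# Figures reported in the text.
revisits = 13_636       # all revisits during the study period (second analysis)
person_years = 41_099   # total follow-up time (second analysis)
second_visits = 8_557   # patients with a second visit (first analysis)

# Overall rate of return, counting every revisit.
overall_rate = revisits / person_years

# Revisits that were a patient's third or later visit.
third_or_later_visits = revisits - second_visits

print(round(overall_rate, 2))   # 0.33
print(third_or_later_visits)    # 5079
```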
The revisit rates defined in both descriptive analyses were computed separately within various subpopulations to provide insight into patient characteristics associated with notable variations in ED revisit rates. The two descriptive analyses strongly suggest that some patient characteristics are associated with ED utilization and that the amount of ED utilization accounted for by frequent users is greater than would be expected from chance variation in a homogeneous population. Hence, multivariable regression techniques were used to identify characteristics associated with high rates of utilization. The regression technique most closely identified with analyzing rates of utilization is probably Poisson regression, which is well suited to count data (e.g., the number of ED visits in an interval of time). An alternative approach is logistic regression, which analyzes dichotomous outcomes such as an indicator for whether a patient had 4 or more ED visits in a fixed interval of time.
We used both techniques for variable selection and model corroboration, but we elected to feature the results of a logistic regression because it allows us to account for a crucial distinction between two possible patterns of ED utilization in a subpopulation. Subpopulations with a moderate number of heavy ED users should be distinguished from subpopulations with a large number of moderate ED users. Although Poisson regression is well suited to distinguishing among populations on the basis of overall ED utilization, it is less well suited than logistic regression to distinguishing among high-utilization subpopulations according to how the ED visits are distributed among their members. A study that aims to examine overall reduction of ED visits might choose to feature Poisson regression results. However, for a study that aims to profile which subpopulations have more high-volume ED users, multivariable logistic regression is preferred.
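The distinction drawn above can be illustrated with a small sketch. The two groups below are hypothetical (not study data): they have the same overall visit rate, which is what a rate-based summary sees, but very different shares of patients at or above the 4-visit threshold, which is what a dichotomous outcome captures.

```python
# Hypothetical visit counts: both groups have 100 members and 100 visits in total.
few_heavy_users = [10] * 10 + [0] * 90   # 10 patients with 10 visits each
many_light_users = [1] * 100             # every patient visits exactly once


def mean_visits(counts):
    """Average visits per patient (the quantity a rate model targets)."""
    return sum(counts) / len(counts)


def share_frequent(counts, threshold=4):
    """Share of patients with at least `threshold` visits."""
    return sum(c >= threshold for c in counts) / len(counts)


for name, counts in [("few heavy users", few_heavy_users),
                     ("many light users", many_light_users)]:
    print(name, mean_visits(counts), share_frequent(counts))
```

Both groups average one visit per patient, yet only the first contains any patients who would be classified as frequent users.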
The study dataset for the logistic regression analysis consisted of the 14,813 patients who had an ED visit during the six-month burn-in period beginning on July 1, 2003 and ending on December 31, 2003 and who were age 18 years or older on the date of their last ED visit during that burn-in period.
The results of the descriptive analyses and of the Poisson regression analyses were used to help build a logistic regression model that compared people having at least 4 ED visits during the 12-month follow-up period (CY 2004) to those who had three or fewer. The decision to use 4 visits as a threshold for frequent use was partly based on the consideration that if ED utilization occurred at a rate of 0.33 visits per year in a homogeneous population, over 99.9% of the population would have 3 or fewer visits per year, so that having 4 or more visits in a calendar year ought to be a strong signal of “trouble”. There were 221 members of the study sample of 14,813 patients (1.5%) who had 4 or more ED visits in CY 2004. Using the rule of thumb of 10 outcomes per candidate predictor in a logistic regression model, our threshold allowed us to consider a moderately large number of predictors.
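The reasoning behind the 4-visit threshold can be checked numerically. The sketch below assumes, as the text does, that visits in a homogeneous population follow a Poisson distribution with a mean of 0.33 per year; the function and variable names are illustrative. It also works out the share of frequent users and the predictor budget implied by the 10-outcomes-per-predictor rule of thumb.

```python
import math


def poisson_cdf(k: int, mean: float) -> float:
    """P(X <= k) for a Poisson random variable with the given mean."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))


# Probability of 3 or fewer visits in a year at 0.33 visits per year.
p_three_or_fewer = poisson_cdf(3, 0.33)
print(round(p_three_or_fewer, 4))  # 0.9996

# Share of the study sample with 4+ visits in CY 2004, from the reported counts.
frequent_users, sample_size = 221, 14_813
print(round(100 * frequent_users / sample_size, 1))  # 1.5

# Rule of thumb: about 10 outcome events per candidate predictor.
max_candidate_predictors = frequent_users // 10
print(max_candidate_predictors)  # 22
```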
Primary and secondary ICD-9 codes were aggregated into discharge diagnosis groups defined by Appendix B (i.e. discharge diagnoses) of Mandelberg et al. These were included in the model to identify the diagnoses most likely to result in repeat ED visits. A slight revision of Mandelberg’s coding scheme was used for three of the diagnoses (Abdominal Pain, Chest Pain and Renal Failure) so that they would include the ICD-9 codes in corresponding clusters from Appendix A of his paper (i.e. admission diagnoses). Also, the clusters for alcohol use, alcohol withdrawal, drug use, and drug withdrawal were combined into a single cluster. The ICD-9 codes from all of a patient’s records in the burn-in period were used to code variables indicating the presence or absence of each of the discharge diagnosis groups at any time in the burn-in period. For example, if a patient had two visits in the burn-in period and the first record has ICD-9 codes corresponding to Abdominal Pain and Nausea and the second record has ICD-9 codes corresponding to Nausea and Headache, then the indicators for Abdominal Pain, Nausea and Headache would all be coded “Present”, while the other diagnosis indicators would be coded as “Absent”. A variable recording the total number of visits in the burn-in period was also used as a predictor. This variable was trimmed at 10 (i.e. values of 11 or more were recoded to 10).
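The coding rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: each visit is assumed to be already mapped from ICD-9 codes to diagnosis group names (that mapping is not shown), and the group list is a small illustrative subset rather than the full scheme.

```python
# Illustrative subset of diagnosis groups; the full list is not reproduced here.
DIAGNOSIS_GROUPS = ["Abdominal Pain", "Chest Pain", "Headache", "Nausea"]
VISIT_CAP = 10  # values of 11 or more are recoded to 10


def code_burn_in(visits):
    """Return (indicators, capped visit count) for one patient's burn-in visits.

    `visits` is a list with one set of diagnosis group names per visit.
    """
    seen = set().union(*visits)  # groups present at any visit
    indicators = {g: ("Present" if g in seen else "Absent") for g in DIAGNOSIS_GROUPS}
    return indicators, min(len(visits), VISIT_CAP)


# The example from the text: two visits with overlapping diagnoses.
indicators, n_visits = code_burn_in([{"Abdominal Pain", "Nausea"}, {"Nausea", "Headache"}])
print(indicators)
print(n_visits)  # 2
```

Here Abdominal Pain, Nausea and Headache come out as "Present" and Chest Pain as "Absent", matching the worked example in the text.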
Appendix A shows a cross-classification of this variable with the total number of visits in CY 2004 and demonstrates a strong association between the two variables. Other variables, taken from a patient’s latest (most recent) record in the burn-in period, were included to control for confounding. Payor and age were defined as categorical variables, with age 21-40 years serving as the reference group for age and private/other county serving as the reference group for payor. Additionally, interaction terms were created for age and insurance status. Race/ethnicity was not used in the model because almost half the observations in the dataset (n = 27,276) were missing this variable. Gender also appeared to matter little, as the revisit rates for men and women were very close to the average revisit rate.
Model building and model selection were performed in an iterative fashion and included some use of automated variable-selection methods. To offset the biasing effects of these approaches, split-sample validation was used to identify and exclude covariates whose apparent importance might be spurious. Our primary objective in the selection of variables was to identify those associated with a high propensity for a patient to be a frequent ED user.
Tables 1 and 2 show the main covariates (i.e., race, payor, gender, and age) and any combination of covariates describing patient subpopulations with a revisit rate in excess of the average rate (0.24 ED visits per year for the analysis of the next visit after the first visit, 0.33 ED visits per year for the analysis of all visits). Table 1 looks only at the number of patients with one visit and a subsequent revisit. Table 2 looks at all visits and revisits. In each table, excess revisits indicates the number of ED visits that could be averted if the rate for a particular demographic profile were reduced to the average rate. Although a number of patient profiles had negative excess revisits, meaning the revisit rate for the profile was lower than the population average, the tables include only positive values. Note that race is a largely uncoded field in this dataset.
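One plausible reading of the excess-revisit definition is observed revisits minus the revisits expected at the average rate over the same follow-up time. The sketch below uses that reading, which is an assumption about how the tables were computed, and the profile figures in it are hypothetical rather than taken from Tables 1 or 2.

```python
def excess_revisits(observed_revisits: float, person_years: float, average_rate: float) -> float:
    """Revisits above what the average rate would produce over the same follow-up."""
    expected = average_rate * person_years
    return observed_revisits - expected


# Hypothetical profile: 1,500 revisits over 2,000 person-years,
# compared against the all-visits average of 0.33 per year.
excess = excess_revisits(1_500, 2_000, 0.33)
print(round(excess))  # 840
```

A profile whose rate is below the average yields a negative value under this definition, which corresponds to the negative excess revisits mentioned in the text.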
Demographic profiles of patients were analyzed for their impact on excess revisits. Profiles for patients age 41 to 64 and payor status of Medicare, Medi-Cal, or Sacramento County seemed to be the best predictors of excess visits. The following profiles yielded excess revisits of more than 500:
41-64 y.o., male, Medicare: 684 excess visits
41-64 y.o., male, Medi-Cal: 944 excess visits
41-64 y.o., female, Medi-Cal: 592 excess visits
41-64 y.o., male, Sacramento County: 642 excess visits
The logistic regression model results are presented in Tables 3 and 4. The model showed a statistically significant relationship between insurance, namely Sacramento County and UC Davis Risk, and the likelihood of using the ED 4 or more times in CY 2004. Additionally, the interaction between age 41-64 and having Medicare or Medi-Cal was also significant. Also, the number
A number of diagnostic groups showed statistically significant positive associations with repeat ED visits (Table 3): 1) abdominal pain, AOR = 1.52; 2) upper respiratory infection, AOR = 2.33; 3) asthma, AOR = 1.75; 4) pneumonia/bronchitis, AOR = 1.63; 5) seizures, AOR = 2.80; 6) headache, AOR = 3.85; 7) urinary tract infection, AOR = 2.22; 8) diabetes, AOR = 1.78; 9) COPD, AOR = 3.70; 10) hematologic problems, AOR = 2.08; and 11) drug/alcohol use, AOR = 2.58.
Tables 4 and 5 present the results of a simplified logistic regression model and illustrate the potential for using routinely recorded patient data to identify prospectively those patients at high risk of becoming frequent ED users in the near future. This model uses a small number of patient characteristics (Table 4) to produce an ED frequent-user propensity score for each patient. The characteristics are based simply on patient age group, payor, and the diagnostic information accumulated in a 6-month burn-in period. Notably, the diagnostic information is summarized into just two components: the total number of ED visits and the presence or absence of any of the diagnoses listed in Table 3. Table 5 reports how effective various choices of threshold for the propensity score might be in a prospective application. For example, if a propensity score of 0.040 or greater is used to classify patients as being at high risk of becoming a frequent ED user in the upcoming year, then 53.4% of patients who actually became frequent ED users would be identified, although 4% of non-frequent ED users (589 of 14,587) would be misclassified. A higher propensity score threshold of, say, 0.080 would reduce the share of false positives to less than 2% (154 of 14,587), while still allowing over a third of the frequent ED users to be identified.
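The operating characteristics reported for a given threshold are the share of eventual frequent users who are flagged (sensitivity) and the share of non-frequent users who are flagged (false-positive rate). A minimal sketch of that calculation follows; the six scores and outcomes are hypothetical, and only the 589 of 14,587 figure comes from the text.

```python
def operating_characteristics(scores, became_frequent, threshold):
    """Return (sensitivity, false_positive_rate) for the rule score >= threshold."""
    flagged = [s >= threshold for s in scores]
    true_pos = sum(f and y for f, y in zip(flagged, became_frequent))
    false_pos = sum(f and not y for f, y in zip(flagged, became_frequent))
    n_frequent = sum(became_frequent)
    n_other = len(became_frequent) - n_frequent
    return true_pos / n_frequent, false_pos / n_other


# Hypothetical propensity scores and outcomes for six patients (not study data).
scores = [0.01, 0.02, 0.05, 0.03, 0.09, 0.12]
became_frequent = [False, False, False, True, True, True]

sensitivity, false_positive_rate = operating_characteristics(scores, became_frequent, 0.040)
print(sensitivity, false_positive_rate)

# False-positive rate reported in the text at the 0.040 threshold.
reported_fpr = 589 / 14_587
print(round(100 * reported_fpr, 1))  # 4.0
```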
The proper selection of a threshold would depend on how one appraises the relative costs of the two sorts of misclassification and the relative benefits of the two sorts of correct classification. Once supplied with this information, we can customize our recommendations for thresholds to use in this or in similar models.
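As an illustration of how such an appraisal could drive the choice, the sketch below picks the threshold with the lowest total misclassification cost. Everything in it is an assumption for illustration: the candidate operating points and the unit costs are hypothetical, and benefits of correct classification are left out (they could be folded in as negative costs).

```python
# Hypothetical operating points:
# (threshold, frequent users missed, non-frequent users flagged).
candidates = [
    (0.020, 60, 1500),
    (0.040, 100, 600),
    (0.080, 140, 150),
]


def total_cost(missed, false_pos, cost_missed, cost_false_pos):
    """Total cost of the two kinds of misclassification."""
    return missed * cost_missed + false_pos * cost_false_pos


def best_threshold(candidates, cost_missed, cost_false_pos):
    """Threshold whose operating point minimizes total misclassification cost."""
    best = min(candidates, key=lambda c: total_cost(c[1], c[2], cost_missed, cost_false_pos))
    return best[0]


# When a missed frequent user is far costlier than a false alarm,
# a lower threshold is preferred; when the costs are similar, a higher one wins.
print(best_threshold(candidates, cost_missed=100, cost_false_pos=1))
print(best_threshold(candidates, cost_missed=2, cost_false_pos=1))
```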
An even simpler classification rule would be based on the total number of visits in a 6-month burn-in period. As can be seen in Appendix A, a rule that classifies patients according to whether they had 4+ visits in a 6-month burn-in period would catch over 40% of those patients who became frequent users in a subsequent 12-month follow-up period, while misclassifying less than 1.5% of infrequent users in the subsequent period.
These results demonstrate that routinely collected patient data can be useful in distinguishing among patients according to their propensity for becoming frequent future users of the ED. The results of this analysis are consistent with Mandelberg’s report, which showed an association between ED use and chronic diseases such as asthma, COPD, and hematologic problems. This analysis, however, also found an association between ED use and a number of acute conditions, including respiratory infection and pneumonia. Both the Mandelberg study and this analysis identify two to four times as many drug- and alcohol-related diagnoses among frequent ED users as among non-frequent users. Because of the amount of chronic illness and drug/alcohol abuse among frequent ED users, patient navigators may best serve this population by directing them to chronic disease management programs and substance abuse treatment in the community.
In terms of general patient profiles, these analyses show that most frequent ED users are likely to be between the ages of 41 and 64 and insured by Medicare, Medi-Cal, or Sacramento County. One interesting group to study further would be the 41-64-year-olds who are on Medicare. Since this population is younger than age 65, it is possible that the majority of these users are eligible for Medicare on the basis of disability and, as such, may have social needs that differ from those of other frequent ED users.
An attractive and not unexpected finding is that even a few patient characteristics and summaries of recent past utilization can provide useful information regarding the prospective propensity for frequent ED use. Table 5 and Appendix A present examples of the operating characteristics that one may expect from a simple risk-assessment model. Easily available measures can be used to identify patients who have a far higher chance of becoming frequent ED users than other patients have (e.g., a 35+% chance vs. a 1.5% chance or less). In applications in which the cost of misclassifying a non-frequent user is slight compared to the benefit of correctly classifying a frequent user, simple risk-assessment models based on readily available data can prove to be very valuable.
Table 1

Table 2

Table 3

Table 4

Appendix A