Can Machine Learning Target Health Care Fraud? Evidence from Medicare Hospitalizations
By Shubhranshu Shekhar, Jetson Leder-Luis, and Leman Akoglu
Hospitals often face financial incentives to overbill or commit fraud in order to maximize their reimbursements. In their paper, “Can Machine Learning Target Health Care Fraud? Evidence from Medicare Hospitalizations,” authors Shekhar, Leder-Luis, and Akoglu introduce a novel machine learning approach to detect overbilling by hospitals. The method utilizes an unsupervised algorithm, which does not depend on labeled data from prior enforcement actions, to flag hospitals with suspicious diagnosis, procedure, and billing code patterns. To validate the accuracy of their model, the authors cross-reference their findings with data from the Department of Justice (DOJ) on hospitals involved in anti-fraud lawsuits, demonstrating that their method offers an 8-fold improvement in identifying fraudulent hospitals compared to random targeting. The model is also explainable, allowing auditors and investigators to see exactly which codes or billing patterns led to a hospital being flagged as suspicious. This approach has the potential to significantly enhance the ability of investigators to identify and audit suspicious hospitals more efficiently.
Healthcare fraud manifests in various forms, making detection particularly challenging. According to the U.S. Government Accountability Office (GAO), Medicare recorded improper payments totaling $46.2 billion in 2019 alone. Common types of fraud include upcoding, where hospitals falsely assign more severe diagnosis codes to justify higher payments, and billing for services that were not medically necessary. Once fraud is detected, the DOJ can pursue legal action under the False Claims Act, but many fraudulent hospitals likely go undetected. The DOJ data used in the paper represents only a partial ground truth, meaning some hospitals flagged by the model may be fraudulent but not yet caught by enforcement. The authors’ approach is not concerned with identifying the specific method of fraud; rather, it focuses on hospitals whose anomalous billing patterns result in higher payments, which are more likely to signal fraudulent activity.
The authors’ model is based on an ensemble method that integrates three unsupervised detection algorithms to uncover abnormal patterns in hospital billing. The first algorithm identifies irregularities in hospitals’ use of ICD-10 codes, which are used to document patient diagnoses and treatments. These codes directly influence the Diagnosis-Related Group (DRG) codes, which determine hospital reimbursement rates. Hospitals may manipulate these ICD-10 codes to secure more lucrative DRG assignments. Given the large number of ICD-10 codes, the authors employ a feature subspace detector that searches small subsets of codes rather than the full code space, so it can pinpoint localized anomalies and detect fraud that involves only a handful of codes.
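To make the idea of a feature subspace detector concrete, here is a minimal sketch, not the authors’ implementation. It assumes each hospital is described by a row of per-code usage rates, scores hospitals inside many small random subsets of codes, and keeps each hospital’s worst subspace score. The function names, the robust z-score used as the per-code deviation, and the toy data are all illustrative assumptions.

```python
import random
import statistics


def robust_z(column):
    """Absolute deviation from the median, scaled by the median absolute deviation."""
    med = statistics.median(column)
    mad = statistics.median([abs(x - med) for x in column]) or 1e-9  # guard against MAD = 0
    return [abs(x - med) / mad for x in column]


def subspace_scores(rows, n_subspaces=50, subspace_size=2, seed=0):
    """Score each row by its largest mean deviation over random feature subsets.

    Hypothetical sketch: rows[i][j] is hospital i's usage rate for code j.
    """
    rng = random.Random(seed)
    n_features = len(rows[0])
    # Per-code deviations are computed once and reused across subspaces.
    z_cols = [robust_z([r[j] for r in rows]) for j in range(n_features)]
    scores = [0.0] * len(rows)
    for _ in range(n_subspaces):
        feats = rng.sample(range(n_features), subspace_size)
        for i in range(len(rows)):
            s = sum(z_cols[j][i] for j in feats) / subspace_size
            scores[i] = max(scores[i], s)  # keep the most anomalous subspace
    return scores


# Toy data (invented): the last hospital is unusual in a single code only.
rows = [
    [10, 20, 30, 5, 8],
    [11, 19, 31, 6, 9],
    [9, 21, 29, 5, 7],
    [10, 22, 30, 4, 8],
    [12, 20, 28, 6, 9],
    [10, 20, 30, 40, 8],
]
print(subspace_scores(rows))
```

Taking the maximum over subspaces is one simple design choice: an anomaly confined to a few codes is not averaged away by the many codes on which the hospital looks ordinary.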
The second algorithm focuses on analyzing DRG code patterns. It compares the frequency distribution of DRG codes across hospitals that treat similar patient populations. This peer-based comparison helps to flag hospitals that assign unusually expensive DRGs more often than their peers, despite offering similar types of care to similar types of patients. The third algorithm targets hospitals with disproportionately high expenditures that cannot be explained by patient characteristics or medical histories. By identifying hospitals where costs exceed what would be expected based on patient profiles, the model helps detect excessive or unnecessary billing.
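The two comparisons described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper’s models. The first function compares a hospital’s DRG mix with the average mix of a peer group that is assumed to be given, weighting each DRG by a hypothetical payment weight. The second fits a one-variable least-squares line of spending on a hypothetical patient risk index and returns the unexplained residuals. All names and numbers are invented for illustration.

```python
def drg_shares(counts):
    """Convert DRG claim counts into shares that sum to 1."""
    total = sum(counts.values())
    return {drg: n / total for drg, n in counts.items()}


def excess_payment_index(hospital_counts, peer_counts_list, weights):
    """Payment-weighted gap between a hospital's DRG mix and its peers' average mix.

    Positive values mean the hospital bills a costlier mix than its peers.
    Assumes a non-empty peer list and a weight for every DRG.
    """
    own = drg_shares(hospital_counts)
    peers = [drg_shares(c) for c in peer_counts_list]
    drgs = set(own).union(*peers)
    index = 0.0
    for d in drgs:
        peer_avg = sum(p.get(d, 0.0) for p in peers) / len(peers)
        index += (own.get(d, 0.0) - peer_avg) * weights[d]
    return index


def spending_residuals(risk, spend):
    """Residuals from a simple least-squares fit of spend on a single risk index."""
    n = len(risk)
    mean_x, mean_y = sum(risk) / n, sum(spend) / n
    sxx = sum((x - mean_x) ** 2 for x in risk)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(risk, spend)) / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(risk, spend)]


# Hypothetical DRG labels and payment weights.
weights = {"low_severity": 1.0, "high_severity": 2.0}
hospital = {"low_severity": 50, "high_severity": 50}
peers = [{"low_severity": 90, "high_severity": 10}] * 2
print(excess_payment_index(hospital, peers, weights))
print(spending_residuals([1, 2, 3, 4], [10, 20, 30, 80]))
```

A large positive index or residual is only a reason to look closer: a single risk index stands in here for the patient characteristics and medical histories that the summary says the third detector accounts for.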
To create a final ranking of hospitals based on suspiciousness, the authors use an instant-runoff voting method, which aggregates the rankings generated by each of the three detection models. The result is a comprehensive list of hospitals, ranked by the likelihood of fraudulent behavior. While only 1 in 20 hospitals nationwide has been named by the DOJ, the authors’ method identified 21 DOJ-named hospitals within their top 50 ranked hospitals. That is a hit rate of 42 percent against a base rate of 5 percent, or roughly an 8-fold improvement in detection over random targeting. To further validate their findings, the authors present case studies of specific hospitals, highlighting the exact ICD and DRG codes that raised suspicion and providing practical examples of how their model can guide investigations.
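As an illustration of how instant-runoff voting can merge several rankings into one, the sketch below treats each detector as a voter whose ballot is its ranked list of hospitals. It finds an instant-runoff winner, places it next in the final list, removes it, and repeats. This repeat-and-remove scheme and the tie-breaking rule are assumptions made for the example; the summary does not specify the authors’ exact procedure.

```python
from collections import Counter


def irv_winner(ballots, candidates):
    """Instant-runoff winner among `candidates`; every ballot must rank all of them."""
    remaining = set(candidates)
    while True:
        # Each ballot counts for its highest-ranked candidate still in the race.
        tops = Counter(next(c for c in b if c in remaining) for b in ballots)
        leader, votes = max(tops.items(), key=lambda kv: (kv[1], kv[0]))
        if votes * 2 > len(ballots):
            return leader
        # Eliminate the candidate with the fewest top votes; break ties by the
        # worst summed rank position, then by name (an assumed tie-break rule).
        loser = max(
            remaining,
            key=lambda c: (-tops[c], sum(b.index(c) for b in ballots), c),
        )
        remaining.remove(loser)


def irv_ranking(ballots):
    """Full ranking built by repeatedly picking an IRV winner and removing it."""
    candidates = list(ballots[0])
    ranking = []
    while candidates:
        winner = irv_winner(ballots, candidates)
        ranking.append(winner)
        candidates.remove(winner)
    return ranking


# Hypothetical rankings from three detectors, most suspicious first.
ballots = [
    ["H1", "H2", "H3", "H4"],
    ["H1", "H3", "H2", "H4"],
    ["H2", "H1", "H3", "H4"],
]
print(irv_ranking(ballots))
```

With only three voters, many hospitals receive no first-place votes in a given round, so the tie-break rule does much of the work; a production version would need to choose that rule deliberately.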
Overall, this paper presents a highly effective new tool for detecting healthcare fraud in Medicare claims. The unsupervised and explainable nature of the model makes it well-suited for use by policymakers, auditors, and law enforcement agencies, and its scalability means it can easily be applied across large datasets. By dramatically improving the detection rate of fraudulent hospitals, this method can help prioritize investigations and audits, directing limited enforcement resources toward the most suspicious cases. The authors also suggest that their approach could be adapted to detect fraud in other healthcare contexts, such as outpatient claims and doctor’s office visits, and could benefit private insurers facing similar challenges. Ultimately, this research contributes to the ongoing effort to reduce fraud and waste in the U.S. healthcare system by offering a promising machine-learning-based approach to identifying and explaining potentially fraudulent behavior.