Dr. April Liang, a hospitalist at Stanford Health Care, once spent 30 minutes writing a single hospital discharge summary—a detailed document meant to guide a patient's future care. Now, thanks to an AI tool called MedAgentBrief, that task has become faster and, remarkably, less burdensome for doctors already stretched thin.
Hospital discharge summaries are a peculiar kind of torture for physicians. These documents must comprehensively and succinctly distill days or weeks of medical details into information that outpatient providers can use to continue patient care safely. The work is critical—every detail matters—but it comes at a steep cost. "One of the main problems in medicine is the amount of information in the system," said Dr. François Grolleau, a postdoctoral scholar at Stanford's Division of Computational Medicine. "It's just too much for any human to process, yet that's what we ask of physicians. They're overwhelmed."
Stanford researchers built MedAgentBrief to see whether large language models—AI systems trained to process and summarize massive amounts of data—could tackle this specific problem. Last summer, they deployed the tool at the 24-bed Stanford Health Care patient care unit at Sequoia Hospital, where 11 hospitalists worked with AI-generated summaries during a 10-week pilot that began August 1, 2025.
The format was carefully designed. Each morning, physicians received secure emails containing AI-generated discharge summaries that followed a clinician-approved template: a one-liner explaining why the patient came to the hospital, a high-level overview of admission, and a structured summary for each diagnosis. Doctors could ignore the summaries entirely. They didn't. "Historically, it's very hard to deploy technology and have it adopted very quickly, especially in medicine," Grolleau said. "For this tool, that wasn't the case. There was so much demand."
The results, published May 8 in JAMA Network Open, revealed something encouraging: the tool was safe. Feedback on 100 AI-generated summaries showed some omissions (25 percent) and inaccuracies (20 percent), but hallucinations, a concern with large language models, were rare at just 2 percent. Physicians rated 88 summaries as having no potential for harm and 11 as having only mild potential harm. One summary was initially flagged as likely to cause moderate harm because it omitted context about a completed antibiotic course versus a prophylactic prescription, but independent reviewers determined it posed no actual risk. No severe harm was reported.
Time savings were more modest than physicians initially perceived. While doctors believed they were saving more than 10 minutes per summary, actual time log analysis showed savings of roughly three minutes per discharge, and the benefit varied from use to use. Yet the pilot revealed something perhaps more valuable: using MedAgentBrief was associated with lower burnout scores among the physicians who used it.
The findings matter because physician burnout remains a crisis in American medicine, tied to worse patient outcomes and a shrinking workforce. MedAgentBrief won't solve that crisis alone, but it offers a concrete example of how AI, deployed thoughtfully and tested rigorously, can reduce one specific source of exhaustion. The tool didn't just save time—it gave doctors back something harder to measure but easier to recognize: a little less weight on their shoulders.
