Fairuz Shadmani Shishir was poring over electrocardiogram data in a lab at the University of Kansas in Lawrence when he realized the signals held more than just heartbeats—they whispered secrets about a patient’s age, sex, and race. That revelation sparked a mission: to build an AI that could protect those personal details without sacrificing the life-saving insights ECGs provide. The result is PP-VAE, a privacy-preserving model that strips away sensitive biometrics while preserving critical clinical signals like left ventricular ejection fraction (LVEF), a key predictor of heart failure and early mortality.
As AI becomes more embedded in medicine, ECGs—long seen as simple heart monitors—are now recognized as soft biometric fingerprints. Algorithms can identify individuals or infer demographic traits from the rhythm alone, raising urgent privacy concerns, especially when hospitals and research centers share data. Shishir, a doctoral student in electrical engineering and computer science, led a team that included Sumaiya Shomaji from KU and cardiovascular experts Amit Noheria, Christopher Harvey, and Amulya Gupta from KU Medical Center to tackle this challenge. Their solution, published in Scientific Reports, uses a variational autoencoder (VAE) enhanced with adversarial training—a method that actively suppresses demographic signals while amplifying clinically relevant ones.
In testing, PP-VAE proved capable of predicting five-year mortality risk and detecting conditions like left ventricular hypertrophy with accuracy on par with leading machine learning models. Crucially, it reduced the identifiability of age, sex, and race through adversarial training against independent convolutional neural networks, which pushes the model to block those signals. Unlike other models that prioritize prediction at the cost of privacy, PP-VAE strikes a balance: it enables secure data sharing without sacrificing diagnostic power. The team validated their model using both internal data from KU Medical Center and public ECG datasets to test its robustness across different populations.
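For readers curious how a VAE and adversarial training can be combined in principle, here is a minimal sketch in plain Python. It is not the team's implementation: the article does not describe the actual loss terms, weights, or architecture, so every function name, the choice of binary labels, and the weights `beta` and `lam` are assumptions made for illustration. The idea shown is a common one: the encoder is rewarded for reconstructing the ECG and predicting a clinical label, and also rewarded when a separate network trying to guess a sensitive attribute does badly.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, exp(log_var)) to N(0, 1), summed over latent dims."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))

def mse(x, x_hat):
    """Mean squared reconstruction error between a signal and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy for one predicted probability p and a 0/1 label y."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def encoder_objective(x, x_hat, mu, log_var,
                      p_clinical, y_clinical,
                      p_sensitive, y_sensitive,
                      beta=1.0, lam=1.0):
    """Hypothetical combined loss that the encoder would minimize.

    beta and lam are illustrative weights, not values from the paper.
    """
    recon = mse(x, x_hat)                                        # keep the ECG reconstructable
    kl = kl_to_standard_normal(mu, log_var)                      # standard VAE regularizer
    clinical = binary_cross_entropy(p_clinical, y_clinical)      # keep the clinical signal
    adversary = binary_cross_entropy(p_sensitive, y_sensitive)   # adversary's own loss
    # Subtracting the adversary's loss means the encoder does better
    # when the adversary cannot recover the sensitive attribute.
    return recon + beta * kl + clinical - lam * adversary

# Same ECG and clinical prediction; only the adversary's success differs.
leaky = encoder_objective([1.0, 2.0], [1.0, 2.0], [0.0], [0.0], 0.9, 1, 0.99, 1)
private = encoder_objective([1.0, 2.0], [1.0, 2.0], [0.0], [0.0], 0.9, 1, 0.5, 1)
print(private < leaky)
```

In this toy comparison, the latent code from which the adversary can only guess at chance (probability 0.5) scores a lower loss than the one that lets it guess the attribute with 99% confidence, which is the pressure that would steer an encoder away from encoding demographic information. In a real system the adversary would be trained in alternation with the encoder, a step this sketch leaves out.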
Beyond privacy, the model has the potential to reduce bias in cardiac care. Historically, AI models trained on skewed datasets have contributed to the underdiagnosis of women and marginalized racial groups. Shishir’s team intentionally balanced representation across gender and race in their training data—a step toward fairer algorithms. Still, they acknowledge the model’s limitations: trained primarily on regional data, it needs further testing across global populations to ensure equitable performance.
The implications are far-reaching. If adopted widely, PP-VAE could become a cornerstone of ethical AI in cardiology, allowing hospitals to collaborate, researchers to innovate, and patients to retain control over their most personal data. As Shishir puts it, the future of medical AI isn’t just about smarter algorithms—it’s about trustworthy ones.
