Thousands of chemicals surround us every day—woven into the clothes we wear, the food we eat, the cleaning products under our sinks—yet most of them have never been rigorously tested for safety. Now researchers at Texas A&M University's College of Veterinary Medicine and Biomedical Sciences are harnessing artificial intelligence to close a gap that has haunted toxicology for decades.
The problem is deceptively simple: testing chemicals the traditional way takes too long and costs too much. Animal studies require years and significant resources, while human epidemiological research only documents harm after people have already been exposed. "With rodents, there's not enough time or resources to test everything," says Dr. Weihsueh Chiu, the professor leading the effort. "For human studies, people are already getting sick by the time those effects are identified." This mismatch has left thousands of chemicals in commerce with little to no reliable safety data.
Over the past decade, scientists have developed machine learning models called quantitative structure-activity relationship (QSAR) models that use a chemical's molecular structure to estimate safe exposure levels. But these tools had a critical weakness: they often operated as "black boxes," spitting out predictions without explaining their reasoning. Regulators and safety scientists couldn't trust what they couldn't understand.
Chiu's breakthrough came in two stages. First, he and his team redesigned their models to use recognizable, real-world properties—water solubility, biodegradability, toxicity indicators—instead of abstract molecular descriptors. This made predictions transparent and interpretable. But the real innovation went deeper: they added what Chiu calls "uncertainty-aware" machine learning, which estimates how confident the model actually is in each prediction.
This distinction is crucial. Two chemicals might appear equally toxic on paper, but one prediction could rest on solid evidence while the other is a shot in the dark. "We want these machine learning models to not only predict a number but also show how confident they are in that prediction," Chiu explains. When the researchers applied these uncertainty-aware models to more than 126,000 chemicals, they uncovered a revealing pattern: certain chemical groups—metals, polychlorinated compounds, and PFAS—showed much higher uncertainty levels, often because existing data was sparse or the chemicals' behavior too complex to model easily.
Rather than chasing whatever chemical happens to be in the headlines, this approach lets scientists systematically identify the actual gaps in safety knowledge across the entire chemical landscape. The models generate a range of possible outcomes for each prediction, showing both the estimate and how much confidence to place in it. "Just because two chemicals have the same prediction doesn't mean they carry the same worst-case risk," Chiu notes.
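To make Chiu's point concrete, here is a minimal sketch of how two chemicals can share a central prediction yet differ sharply in worst-case risk. It is an illustration only, not the Texas A&M team's actual method: the chemical names, the ensemble values, and the helper functions `summarize` and `prioritize` are all hypothetical. The sketch assumes each chemical has several model predictions of a log10 safe dose, where a lower value means the chemical is more toxic.

```python
from statistics import median, quantiles

# Hypothetical ensemble predictions of log10(safe dose); lower = more toxic.
# These numbers are invented for illustration and are not from the study.
PREDICTIONS = {
    "chemical_A": [-1.1, -1.05, -1.0, -1.0, -1.0, -0.95, -0.9],  # tight spread
    "chemical_B": [-3.0, -2.0, -1.5, -1.0, -0.5, 0.0, 1.0],      # wide spread
}


def summarize(samples, lower_pct=5, upper_pct=95):
    """Return (median, lower bound, upper bound) for a set of predictions."""
    # quantiles(n=100) yields 99 cut points; index k-1 is the k-th percentile.
    cuts = quantiles(samples, n=100, method="inclusive")
    return median(samples), cuts[lower_pct - 1], cuts[upper_pct - 1]


def prioritize(predictions):
    """Order chemicals so the most toxic plausible worst case comes first."""
    # The lower bound of the interval is the worst case on this scale.
    return sorted(predictions, key=lambda name: summarize(predictions[name])[1])


SUMMARY = {name: summarize(vals) for name, vals in PREDICTIONS.items()}
RANKING = prioritize(PREDICTIONS)

if __name__ == "__main__":
    for name in RANKING:
        mid, low, high = SUMMARY[name]
        print(f"{name}: median={mid:.2f}, interval=({low:.2f}, {high:.2f})")
```

Both hypothetical chemicals have the same median, but the second has a far wider interval and a much lower worst-case bound, so it lands at the top of the list for further testing. Ranking by that lower bound rather than by the median is one simple way to let uncertainty, and not just the headline number, drive priorities.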
The implications for regulatory decision-making are substantial. These tools could help officials identify which substances need further testing, stricter rules, or removal from shelves altogether. In a world where we're exposed to thousands of untested chemicals, the ability to prioritize which ones deserve urgent attention, and where research efforts would pay the biggest dividends, represents a fundamental shift in how we approach chemical safety. We're no longer just guessing; we're learning exactly where our knowledge runs out.
