Walking out of the cath lab one night after taking care of a patient with an acute myocardial infarction, a thought crossed my mind. In cardiology, we would never deploy a new device without vetting it. Before a stent ever touches a coronary artery, it undergoes bench testing, animal trials, and human studies to prove safety and efficacy. This process typically takes years, and in some cases, decades. Data points and design decisions are validated and scrutinized. Yet many artificial intelligence systems influencing medical decisions today lack that rigor. From triaging chest pain in the ED to interpreting echocardiograms, from generating clinical notes to predicting readmission risk, AI now touches almost every corner of medicine. Tools such as ambient documentation and diagnostic support systems are becoming increasingly ubiquitous. However, while the technology has advanced at breakneck speed, our frameworks for validation remain archaic, outdated, or non-existent. The result is a widening gap between innovation and trust, and that gap is precisely where physicians must lead.
Why vetting AI matters more than ever
Earlier this year, I spoke at the American College of Cardiology’s Board of Governors meeting about the critical need for structured vetting of AI in clinical medicine. Because here’s the truth: Vetting doesn’t slow innovation; it makes it safe, ethical, and reproducible. Without it, enthusiasm risks outpacing evidence. We should evaluate AI with the same discipline we apply to any clinical tool. What does it mean to evaluate AI with clinical rigor? A framework might include:
- Utility: Is it a solution in search of a problem, or does the technology actually improve outcomes or workflow?
- Technical robustness: Is it accurate and precise? Does it demonstrate reliability across diverse populations and conditions, or does it fail at the margins?
- Ethical integrity: Are we actively testing for bias before deploying it?
- Regulatory transparency: Do we understand its logic well enough to explain it to a patient, or to a jury?
Every new model should address these questions before entering clinical care. AI may be capable of analyzing patterns we can’t see. However, it should still meet the same evidentiary standards as any medical device or drug.
The clinician’s evolving role
I see AI as amplifying the clinician’s role rather than diminishing it. Clinicians are ideally positioned to ensure the integrity and relevance of the generated insights. That requires us to transition from being passive end-users to active clinical stewards of technology. When physicians participate early in dataset design, bias testing, and post-market surveillance, we not only protect patients but also help build better AI. Clinical context is the missing ingredient that many tech companies underestimate. We must ask vendors and developers:
- What data trained this model, and does it reflect my patient population?
- How does it perform on populations like mine, not just on average?
- What is its false-positive rate, and how do I verify its outputs?
- When it fails, how will I know?
If we can’t answer these questions confidently, we shouldn’t use the tool. Physicians are the last line of defense between an algorithm’s confidence and a patient’s consequence.
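To make the middle two questions concrete, here is a minimal sketch, assuming a locally collected validation file with made-up column names, of how a clinical team might check a model’s sensitivity and false-positive rate on its own patients, broken down by subgroup instead of averaged into a single headline figure. Nothing here reflects any specific vendor’s format.

```python
# Minimal sketch: evaluate a vendor model's predictions on OUR patients,
# stratified by subgroup. File name and column names are hypothetical.
import pandas as pd

df = pd.read_csv("local_validation_set.csv")  # columns: label, prediction, subgroup

def rates(group: pd.DataFrame) -> pd.Series:
    # 2x2 counts for this subgroup
    tp = ((group.label == 1) & (group.prediction == 1)).sum()
    fn = ((group.label == 1) & (group.prediction == 0)).sum()
    fp = ((group.label == 0) & (group.prediction == 1)).sum()
    tn = ((group.label == 0) & (group.prediction == 0)).sum()
    return pd.Series({
        "n": len(group),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else float("nan"),
    })

# A single overall number can hide failure at the margins;
# the per-subgroup rows are where those failures show up.
print(df.groupby("subgroup").apply(rates))
```

If the subgroup rows diverge sharply from the overall figure, that is the failure at the margins described above, and a reason to press the vendor for answers before deployment.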
From cath lab to courtroom: Applying medical rigor everywhere
The same principles of vetting clinical AI apply far beyond the hospital walls. In my work developing AI systems for high-stakes decision-making, our team of physicians, engineers, and legal experts faces these challenges daily. The challenges aren’t just technical; they are ethical. When an AI system organizes thousands of pages of medical records for a malpractice case or synthesizes evidence for peer review, accuracy isn’t optional. It’s foundational to fairness. We design our platforms with the same core principles we apply in medicine: traceability, validation, and human oversight. Every output links back to its source document, every finding can be audited, and every user maintains discretion over what is included in the record. We’ve learned that the same discipline of reasoning and transparent provenance we demand in clinical medicine should be applied in every domain where AI intersects with human judgment. Whether it’s a diagnostic decision in the ICU or a case review in a law firm, the principle remains the same: Trust comes from verification.
The real risk isn’t AI; it’s unvetted AI
AI will make mistakes. So do we. The antidote isn’t fear; it’s accountability. That means continuous validation, bias detection, and human-in-the-loop oversight by design. It means holding companies to the same exacting standards we apply to any diagnostic test: sensitivity, specificity, positive predictive value, and negative predictive value. The most dangerous errors occur when neither the clinician nor the developer understands why the system failed. A biased dataset or a poorly generalized model can wreak havoc faster than any human could. The “black box” mindset must go. If an algorithm’s reasoning can’t be explained to a colleague, its utility in patient care should be questioned.
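For readers who want the arithmetic spelled out, here is a small worked example; the counts are invented for illustration, but the calculation is the same 2x2 table we apply to any diagnostic test.

```python
# Hypothetical 2x2 counts from a labeled validation set (illustrative only).
tp, fp, fn, tn = 90, 40, 10, 860

sensitivity = tp / (tp + fn)  # 0.90: share of true cases the model catches
specificity = tn / (tn + fp)  # ~0.96: share of non-cases it correctly clears
ppv = tp / (tp + fp)          # ~0.69: chance a positive flag is a true case
npv = tn / (tn + fn)          # ~0.99: chance a negative call is truly negative

print(f"Sens {sensitivity:.2f}  Spec {specificity:.2f}  PPV {ppv:.2f}  NPV {npv:.2f}")
```

Note how a model with excellent sensitivity and specificity can still carry a modest positive predictive value when the condition is uncommon; that is exactly the kind of number a clinician should see before trusting an alert.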
Leading with clinical integrity
The next generation of AI in professional domains will be judged by its credibility rather than by its complexity. That credibility begins with us. The question isn’t whether AI will transform medicine; it has already done so. The question is whether physicians will shape that transformation or be bystanders as the algorithmic race accelerates. As physicians, we already possess all the necessary skill sets to apply medical reasoning and transparency to the development, validation, and deployment of AI. We understand the stakes; after all, this is the reality of our everyday life. It behooves us to get it right. Responsible AI isn’t about slowing progress; it’s about ensuring that progress serves our patients well. When clinicians guide AI development and adoption, innovation aligns with ethics, and technology becomes an ally. AI doesn’t replace the physician. It tests whether we’re still willing to lead.
Saurabh Gupta is an interventional cardiologist.