FDA Publishes Good Machine Learning Practices Guide For Medical Device Manufacturers
WASHINGTON—In conjunction with Health Canada and the United Kingdom's Medicines and Healthcare Products Regulatory Agency, the U.S. Food and Drug Administration (FDA) has published 10 guiding principles for Good Machine Learning Practice in the development of medical devices. The principles are intended to promote development of safe and effective medical devices that use artificial intelligence and machine learning.
The document is one of the deliverables laid out in the FDA's AI/ML software as a medical device (SaMD) action plan, issued in January as the agency looks to establish a regulatory approach to the fast-developing field. FDA framed the principles as a starting point for international harmonization and is seeking feedback as part of its broader discussion of the regulatory framework for modifications to AI/ML-based SaMD.
The specific principles are:
>Multi-Disciplinary Expertise Is Leveraged Throughout the Total Product Life Cycle
>Good Software Engineering and Security Practices Are Implemented
>Clinical Study Participants and Data Sets Are Representative of the Intended Patient Population
>Training Data Sets Are Independent of Test Sets
>Selected Reference Datasets Are Based Upon Best Available Methods
>Model Design Is Tailored to the Available Data and Reflects the Intended Use of the Device
>Focus Is Placed on the Performance of the Human-AI Team
>Testing Demonstrates Device Performance During Clinically Relevant Conditions
>Users Are Provided Clear, Essential Information
>Deployed Models Are Monitored for Performance and Re-training Risks Are Managed
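To make the principle on independent training and test sets concrete: one common way developers try to keep the two sets apart is to split by patient rather than by individual record, so that no patient contributes data to both. The sketch below illustrates that idea only. The function names, the `patient_id` field and the 20 percent test fraction are assumptions made for illustration and do not come from the FDA document, which also points to other possible sources of dependence that a patient-level split alone would not address.

```python
import hashlib


def assign_split(patient_id: str, test_fraction: float = 0.2) -> str:
    """Deterministically assign a patient to 'train' or 'test'.

    Hashing the patient ID means every record from the same patient
    always lands in the same split, even across repeated runs.
    """
    digest = hashlib.sha256(patient_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number between 0 and 1.
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if bucket < test_fraction else "train"


def split_records(records, test_fraction: float = 0.2):
    """Split records into (train, test) lists grouped by patient."""
    train, test = [], []
    for rec in records:
        if assign_split(rec["patient_id"], test_fraction) == "test":
            test.append(rec)
        else:
            train.append(rec)
    return train, test


# Hypothetical records: several patients, some with more than one scan.
records = [
    {"patient_id": f"P{i:03d}", "scan": scan}
    for i in range(50)
    for scan in ("baseline", "follow_up")
]

train, test = split_records(records)
train_ids = {r["patient_id"] for r in train}
test_ids = {r["patient_id"] for r in test}

# No patient appears on both sides of the split.
assert train_ids.isdisjoint(test_ids)
```

A hash-based assignment is used here instead of a random shuffle so that the split stays stable when new records arrive for an existing patient.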
Collectively, the principles cover concerns about the possible biases of algorithms, their applicability to clinical practice and the potential for them to evolve as they are used in the real world. FDA and its collaborators have expanded on each of the principles, explaining, for example, that developers need to have "appropriate controls in place to manage risks of overfitting, unintended bias or degradation of the model" when their systems are "periodically or continually trained after deployment."
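As a rough illustration of what monitoring a deployed model for degradation could look like in practice, the sketch below tracks accuracy over a rolling window of recent cases and flags when it falls below a baseline by more than a set tolerance. The `PerformanceMonitor` class, its thresholds and the window size are hypothetical and are not drawn from the FDA principles; a real device would need clinically justified metrics and a defined response when a flag is raised.

```python
from collections import deque


class PerformanceMonitor:
    """Tracks rolling accuracy of a deployed model (illustrative only)."""

    def __init__(self, baseline_accuracy: float,
                 tolerance: float = 0.05, window: int = 100):
        self.baseline = baseline_accuracy
        self.tolerance = tolerance
        # Only the most recent `window` outcomes are kept.
        self.outcomes = deque(maxlen=window)

    def record(self, prediction, ground_truth) -> None:
        """Store whether a prediction matched the confirmed outcome."""
        self.outcomes.append(prediction == ground_truth)

    def rolling_accuracy(self):
        """Return accuracy over the current window, or None if empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def degraded(self) -> bool:
        """Flag degradation once the window is full and accuracy
        has dropped below baseline minus tolerance."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.rolling_accuracy() < self.baseline - self.tolerance


# Hypothetical usage with a small window for demonstration.
monitor = PerformanceMonitor(baseline_accuracy=0.90, tolerance=0.05, window=10)

for _ in range(10):
    monitor.record(prediction="positive", ground_truth="positive")
healthy_flag = monitor.degraded()

for _ in range(5):
    monitor.record(prediction="positive", ground_truth="negative")
drifted_flag = monitor.degraded()
```

In this example the monitor stays quiet while all ten recent predictions are correct and raises a flag once half of the window is wrong.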