Good Machine Learning Practices for Medical Device Development: Guiding Principles


On October 27, 2021, in accordance with its Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device Action Plan (Action Plan), the US Food and Drug Administration (FDA) released its Good Machine Learning Practice for Medical Device Development: Guiding Principles (Guiding Principles), developed in collaboration with Health Canada and the United Kingdom's Medicines and Healthcare products Regulatory Agency (MHRA). In the Action Plan, the FDA noted that stakeholders had asked it to encourage the harmonization of good machine learning practices (GMLPs) through consensus standards efforts and other community initiatives. GMLPs are AI/ML best practices (for example, data management, feature extraction, training and evaluation) that are analogous to quality system practices or good software engineering practices.


The FDA also solicited stakeholder comments on GMLPs in its 2019 Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD) – Discussion Paper and Request for Feedback. The 10 Guiding Principles, while neither formal nor binding, provide a useful framework for developers and identify areas in which collaborating bodies and international standards organizations could work to advance GMLPs through the development of policy and formal guidelines.

Guiding principles

  1. Leverage multidisciplinary expertise throughout the product lifecycle
    A thorough understanding of how an ML-enabled medical device will fit into the clinical workflow can help ensure that the device is safe and effective. Developers need to rethink the traditional device development process to include contributions from internal stakeholders such as the information security officer, privacy and data policy staff, and medical staff. Input from these stakeholders may be required earlier in the design and development process than is typical for traditional devices.

  2. Implement good software engineering, data quality assurance, data management and security practices
    These practices include systematic risk management and design processes that capture and communicate design, implementation and risk management decisions and their rationale, and that ensure the authenticity and integrity of data. Developers should also consider the Content of Premarket Submissions for Management of Cybersecurity in Medical Devices guidance and the interoperability of ML-enabled devices within systems or workflows from different manufacturers.

  3. Design clinical studies with participants and datasets representative of the target patient population
    Consistent with FDA's Enhancing the Diversity of Clinical Trial Populations – Eligibility Criteria, Enrollment Practices, and Trial Designs Guidance for Industry, data collection protocols should ensure that the relevant characteristics of the target patient population, use and measurement inputs are sufficiently represented in a sample of adequate size in the clinical study and in the training and test datasets. This allows results to be reasonably generalized to the population of interest and helps to mitigate bias.

  4. Ensure training data sets are independent of test sets
    Developers should consider sources of dependence (for example, patient, data acquisition, and site factors) and ensure that training datasets and test datasets are appropriately independent of each other. This principle suggests that regulators will expect developers to explain how they separated the training and test sets to control for bias and confounders.

  5. Ensure that the selected reference datasets are based on the best available methods
    Developers should use the best available and accepted methods to develop a reference standard, so that the data they collect are clinically relevant and well characterized, and must ensure that they understand the limitations of the reference. When available, developers should use accepted reference datasets for model development and testing. This can present a barrier for ML-enabled devices that address disease states or therapeutic areas for which there is no universally accepted reference standard.

  6. Tailor the model design to the available data and ensure it reflects the intended use of the device
    The design of the model should be suited to the available data and should actively mitigate known risks (for example, overfitting, performance degradation, security risks). The Guiding Principles suggest that regulators may expect developers to provide more detailed information to demonstrate that a product's proposed intended use and indications for use align with its model design in terms of risk mitigation and the demonstration of effectiveness and performance.

  7. Focus on the performance of the Human-AI team
    Where the model has a human element, developers must consider human factors and the interpretability of model outputs. Considerations that inform the development of traditional devices, such as the impact of human factors, the need for specialized training to use the device, the expected effect on clinical outcomes (i.e., improvements) and the impact on clinical workflows and other users, will be equally important for machine learning tools.

  8. Demonstrate device performance by testing under clinically relevant conditions
    Device performance should be evaluated independently of the training dataset. Testing should take into account the target patient population, clinical environment, human users, measurement inputs, and potential confounders.

  9. Provide users with clear and essential information
    Users should be provided with clear and contextually relevant information, including the intended use of the product and its indications for use, the performance of the model in relevant subgroups, the characteristics of the data used to train and test the model, acceptable inputs, known limitations, how to interpret the user interface, and how the model fits into the clinical workflow. Users should also be made aware of device changes and updates arising from real-world performance monitoring, the basis for decision-making, and a way to communicate product concerns to developers.

  10. Monitor deployed models for performance and ensure retraining risks are managed
    Developers should monitor deployed models. Additionally, when models are retrained after deployment, whether on a continuous or periodic basis, developers should ensure that there are appropriate controls to manage the risks of overfitting, unintended bias, or degradation of the model (for example, dataset drift) that could affect the safety or performance of the deployed model. Developers also need to think about how to ensure that the datasets they use to develop and train models do not become stale or obsolete over time. The Guiding Principles suggest that regulators expect developers to consider how changes to real-world clinical assumptions, diagnostics, or treatment standards can affect the performance of the tool over its expected life cycle.
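
To make the independence expectation in Principle 4 concrete, the following is a minimal Python sketch of one common way to separate training and test sets at the patient level, so that records from the same patient never appear on both sides of the split. It is an illustration only, not a method prescribed by the Guiding Principles. The function name `split_by_patient`, the `patient_id` field and the sample records are hypothetical, and a real protocol would also need to account for the data acquisition and site factors mentioned above.

```python
import random
from collections import defaultdict


def split_by_patient(records, test_fraction=0.2, seed=0):
    """Split records so that no patient contributes to both sets.

    `records` is a list of dicts carrying a hypothetical "patient_id" key.
    Whole patients, not individual records, are assigned to train or test.
    """
    by_patient = defaultdict(list)
    for record in records:
        by_patient[record["patient_id"]].append(record)

    # Sort before shuffling so the split is reproducible for a given seed.
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)

    # Hold out at least one patient for testing.
    n_test = max(1, round(len(patient_ids) * test_fraction))

    test = [r for pid in patient_ids[:n_test] for r in by_patient[pid]]
    train = [r for pid in patient_ids[n_test:] for r in by_patient[pid]]
    return train, test


# Illustrative data: 10 hypothetical patients with 3 scans each.
records = [
    {"patient_id": f"P{p:02d}", "scan": s}
    for p in range(10)
    for s in range(3)
]
train, test = split_by_patient(records, test_fraction=0.2, seed=42)

train_patients = {r["patient_id"] for r in train}
test_patients = {r["patient_id"] for r in test}

# The property of interest: no patient appears on both sides of the split.
assert train_patients.isdisjoint(test_patients)
```

Splitting by record rather than by patient would let several scans from one person land in both sets, which tends to inflate test performance. The same grouping idea can be applied to sites or acquisition devices by substituting a different key.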

Although the Guiding Principles provide practical and common sense principles for GMLPs, the concepts are not necessarily new. The most difficult task for regulators and for industry will be to develop concrete practices, policies and procedures for ML tools within or alongside the existing medical device quality system regulatory frameworks in the United States, the UK, the EU and other regions.

The docket for the Guiding Principles, FDA-2019-N-1185, is open for public comment. FDA recently announced that it is considering publishing a draft guidance on marketing submission recommendations for a change control plan for artificial intelligence/machine learning (AI/ML)-enabled device software functions, as guidance development resources permit, during the current 2022 fiscal year.

© 2021 McDermott Will & Emery. National Law Review, Volume XI, Number 302


Margie D. Carlisle