AI in the Lab: The Regulatory Landscape in 2026

This is Part 1 of our three-part series on AI use in research documentation.

AI this, AI that. It feels like the world has gone AI mad, and the life sciences and biotech sectors are no different. Every new startup promises the next breakthrough “harnessing the power of AI”. It's tempting to tune it all out as noise.

But in all seriousness, AI is a big deal. A really big deal. It's genuinely going to change the way we work in the lab - from how we design experiments to how we document them. And it's already happening faster than most people realize.

Scientists are already using AI in the lab. Not the heavily marketed "AI agents" that ELN vendors demo at trade shows, but the quiet, practical kind: asking ChatGPT to draft a protocol summary, using Copilot to clean up a data analysis script, running instrument data through a machine learning model to flag anomalies. It's happening in academic labs, in biotech startups, and in pharma R&D groups, often without any formal policy in place - and often without understanding what the implications could be.

Regulators and industry groups have noticed. In the past two years, the FDA, EMA, MHRA, and ISPE have published draft guidance, reflection papers, policy papers, and industry frameworks that show how they are thinking about AI in regulated life sciences work. None of it was written specifically with the bench scientist in mind - the one wondering whether it's really OK to let ChatGPT draft a notebook entry. But the principles these documents establish have clear implications for exactly that scenario.

In April 2026, the FDA issued a warning letter that appears to be one of the clearest public examples to date of the agency calling out inappropriate AI use in pharmaceutical manufacturing documentation. That changes the conversation.

This is the first post in a three-part series. Here we cover the regulatory landscape - what each major body has published, what it actually says, and the gaps that still remain. Part 2 gets into what this means at the bench, specifically how ALCOA+ data integrity principles apply when AI is writing your notebook. Part 3 is where this becomes actionable: a practical guide to building an AI use policy before the gap between your team's current practice and your documented procedures gets any wider.

Important Disclaimer: This post is for informational purposes only and reflects our understanding of publicly available guidance as of the date of publication. It is not legal or regulatory advice. Requirements vary by jurisdiction, product type, and intended use. If you're making compliance decisions, talk to a qualified regulatory professional who knows your specific situation.

The Regulatory Landscape: From Draft Guidance to Enforcement

Between January 2024 and January 2026, the regulatory picture around AI in drug development changed significantly. What had been limited, fragmented guidance became a visible set of frameworks from the FDA, EMA, MHRA, and ISPE - plus the first-ever joint transatlantic principles from the two largest pharmaceutical regulators on the planet (FDA and EMA).

These documents matter. Not because they're going to land on your desk tomorrow as enforceable requirements, but because they establish the principles that will shape those requirements. And some expectations, particularly around data integrity and documentation, already apply under existing GxP rules. 

It's also worth noting that the technology is developing faster than regulators can write the rulebooks. The agencies are watching closely and publishing guidance when they can, but there are real questions - about model updates, about validation scope, about AI-generated records - where current guidance doesn't give you a clean answer yet. Companies navigating AI adoption today are doing it partly on instinct. The current frameworks help narrow that uncertainty, but they don't eliminate it.

Here is what each major body has published, what it actually says, and where the most relevant practical implications sit.

FDA Draft Guidance (January 2025): The Seven-Step Credibility Framework

On January 6, 2025, the FDA published its first draft guidance specifically addressing the use of Artificial Intelligence in drug and biological product development. This was a long time coming. The FDA had been reviewing submissions with AI components since at least 2016 and by 2023 had seen over 500 such submissions - but until this guidance, there was no published framework for how the agency evaluated them. 

The core of the guidance is a seven-step risk-based credibility assessment framework. It was written for sponsors preparing regulatory submissions, but the logic applies well beyond that context. Any lab thinking seriously about AI in regulated workflows will find it a useful scaffold.

Step 1: Define the question of interest. What specific question, decision, or concern is the AI model addressing? This forces precision. "We're using AI to help with data analysis" is not a defined question. "We're using a machine learning classifier to predict which patients are likely to experience a specific adverse event based on baseline lab values" is more like it. The specificity is the point. If you can't define the question the AI is answering, you can't govern how it answers it.

Step 2: Define the context of use. How exactly will the model's output be used? Is it the sole basis for a decision, or one input among many? Is it informing a regulatory submission directly, or supporting internal decision-making? The context of use determines how much credibility evidence is needed and what level of validation is appropriate.

Step 3: Assess the model risk. The FDA proposes a two-dimensional risk matrix based on model influence (how much the AI output drives the final decision) and decision consequence (what happens if the decision is wrong). A model with high influence on a high-consequence decision - like predicting drug toxicity - needs significantly more validation than one with low influence on a lower-consequence question, like suggesting which samples to re-run based on a QC flag.

Steps 4 through 6 cover planning, executing, and documenting the credibility assessment. Specify what evidence you'll collect, collect it, and record what happened - deviations included. The documentation requirement is doing real work here. Without it, there's no way for anyone reviewing the submission to understand how the AI was used or whether the outputs can be trusted.

Step 7: Determine adequacy. Is the AI model adequate for its intended context of use, given the evidence collected? If not, loop back - refine the model, adjust the risk mitigation strategy, or revisit the credibility assessment framework.
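
To make the Step 3 risk matrix concrete, here is a minimal Python sketch. The three-level scale, the multiplication rule, and the thresholds are our own illustrative assumptions. As far as we can tell, the draft guidance describes the two dimensions qualitatively and does not prescribe a scoring formula, so treat this as a way to reason about proportionality, not as the FDA's method.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-point scale used for both dimensions."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def model_risk(influence: Level, consequence: Level) -> Level:
    """Combine model influence and decision consequence into one risk tier.

    Assumption: the two scores are multiplied and bucketed with
    thresholds we chose for illustration (>= 6 high, >= 3 medium).
    """
    score = int(influence) * int(consequence)
    if score >= 6:
        return Level.HIGH
    if score >= 3:
        return Level.MEDIUM
    return Level.LOW


# The two examples from Step 3: a toxicity prediction that drives a
# high-consequence decision, and a low-stakes re-run suggestion.
toxicity_model = model_risk(Level.HIGH, Level.HIGH)
rerun_suggester = model_risk(Level.LOW, Level.LOW)
print(toxicity_model.name, rerun_suggester.name)  # prints "HIGH LOW"
```

In practice the resulting tier would feed Steps 4 through 6: the higher the tier, the more credibility evidence you plan for, collect, and document.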

The guidance has a defined scope. It doesn't cover AI used purely in drug discovery, or operational tasks that sit outside patient safety, drug quality, or study integrity. Where it does apply - nonclinical, clinical, manufacturing, post-marketing - the standard is the same as for any other analytical tool. AI doesn't get a separate evidentiary bar.

One thing worth noting: the guidance is focused on AI models that generate or process data for regulatory purposes. A scientist using an LLM to draft a notebook entry or summarize results doesn't obviously fall within its scope. That's a real blind spot, and it's one we'll come back to later in this series.

(Source: FDA, "Considerations for the Use of Artificial Intelligence to Support Regulatory Decision-Making for Drug and Biological Products," Draft Guidance, January 2025. Docket No. FDA-2024-D-4689.)

EMA Reflection Paper (September 2024): A European Perspective Across the Full Lifecycle

Finalized by the EMA's CHMP and CVMP in September 2024 after extensive public consultation, this paper provides the European perspective on AI across the entire medicines lifecycle, from drug discovery through post-authorization. It's broader in scope than the FDA's guidance and takes an explicitly human-centered, risk-proportionate approach.

The EMA categorizes AI applications by their potential for high patient risk and high regulatory impact. Applications that are high on both dimensions - AI informing dosing decisions or patient stratification in clinical trials, for example - face the highest bar. Applications that are lower risk and lower regulatory impact get more flexibility.

Several points stand out for labs working in regulated environments.

First, existing GxP requirements apply fully to AI-assisted processes. The EMA does not carve out a special AI category with different rules. If a process is covered by GLP, GCP, or GMP, then any AI system used in that process is covered by the same requirements. This includes documentation, validation, audit trails, and data integrity standards.

Second, the EMA explicitly states that AI models used in clinical trials should be pre-specified - fixed and documented before study unblinding. This is not merely a modeling preference. In regulated submissions, pre-specification is closely tied to data integrity, bias control, and the credibility of the analysis. Changing your analytical model after you've seen the results, even if the change is AI-assisted, undermines the scientific validity of the analysis and is exactly the kind of thing that creates issues in marketing authorization review.

Third, the EMA expects full documentation of AI systems: model architecture, training data, data processing pipelines, validation methodology, and performance monitoring plans. For proprietary commercial AI tools, providing this documentation is genuinely difficult - vendors don't disclose their training data composition, and model architecture details are often proprietary. This creates a practical tension the guidance doesn't fully resolve.

Fourth, if an AI tool hasn't been qualified through the EMA's Qualification of Novel Methodologies pathway, the required documentation may be requested as part of marketing authorization review. Labs planning to include AI-generated analyses in regulatory submissions should factor in that documentation burden early in their planning, not at submission time.

(Source: EMA, "Reflection Paper on the Use of Artificial Intelligence (AI) in the Medicinal Product Lifecycle," EMA/CHMP/CVMP/83833/2023, adopted September 2024.)

Joint FDA–EMA Guiding Principles (January 2026): The Transatlantic Alignment

Published on January 14, 2026, this joint statement represents the most significant regulatory alignment on AI in drug development to date. The FDA and EMA worked together to identify ten principles covering the full drug development lifecycle - and the fact that two regulators with historically different approaches agreed on a shared framework is itself significant.

The ten principles are: human-centric by design; risk-based approach; adherence to standards; clear context of use; multidisciplinary expertise; data governance and documentation; model design and development practices; risk-based performance assessment; life cycle management; and clear, essential information. Read together, they describe a compliance posture that would feel familiar to anyone who has worked in a regulated research environment - because these are fundamentally GxP principles adapted for AI.

A few are worth highlighting specifically.

The first principle - human-centric by design - is the most frequently cited and probably the most immediately relevant for bench scientists. It establishes that AI systems should be designed around human oversight proportionate to risk. In plain English: AI should complement human decision-making, not replace it. For work that touches patient safety or regulatory submissions, that means someone accountable has to check what the AI produced. This is exactly the posture that the Purolea warning letter (discussed below) documents as violated: a quality unit that accepted AI-generated compliance documentation without adequate human review.

The data governance and documentation principle echoes what the ISPE GAMP guide states explicitly: organizations should maintain records of AI system design, development, validation, and deployment sufficient to support regulatory review. For labs using commercial AI tools, this includes records of which tool was used, how it was configured, what it was asked to do, and how the output was verified.

The life cycle management principle addresses a practical challenge that many labs haven't thought about yet. When your cloud-based AI provider silently updates their model - which happens routinely, and which vendors don't always announce prominently - does that constitute a change to a system used in a regulated process? The joint principles don't spell it out explicitly, but the life cycle management principle points that way: organizations should be able to monitor, assess, and manage changes that could affect AI performance in its intended context of use. 

A vendor-initiated model update reasonably fits that description. But what those processes look like in practice is still being worked out.
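
One simple way to start is to record which model identifier was in use when a workflow was last assessed, and to flag any later mismatch for review. The Python sketch below illustrates that idea only; the class, the field names, and the example identifiers are hypothetical and do not come from the joint principles or from any vendor's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssessedBaseline:
    """What was in place when the workflow was last assessed (hypothetical fields)."""
    tool: str
    model_id: str      # identifier the vendor reported at assessment time
    assessed_on: str   # ISO date of the assessment


def change_assessment_needed(baseline: AssessedBaseline, reported_model_id: str) -> bool:
    """Return True when the identifier reported today differs from the baseline."""
    return reported_model_id.strip() != baseline.model_id


# Placeholder values, not real product or model names.
baseline = AssessedBaseline(
    tool="example-llm-service",
    model_id="example-model-2025-06",
    assessed_on="2025-07-01",
)

if change_assessment_needed(baseline, "example-model-2025-11"):
    print("Model identifier changed since last assessment: open a change review.")
```

A check like this only catches changes that show up in the reported identifier. A vendor may change behavior without changing the identifier, so it would likely need to sit alongside periodic checks of the tool's output on a fixed set of reference tasks.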

(Source: FDA and EMA, "Guiding Principles of Good AI Practice in Drug Development," January 2026.)

MHRA: The UK's Principles-Based Approach

The UK's Medicines and Healthcare products Regulatory Agency has taken a somewhat different path, focusing initially on AI in medical devices rather than in lab documentation and drug development processes specifically.

In April 2024, the MHRA published "Impact of AI on the Regulation of Medical Products," outlining how it would implement the UK Government's AI White Paper across its regulatory activities. The framework centers on five principles: safety, security and robustness; appropriate transparency and explainability; fairness; accountability and governance; and contestability and redress. These mirror the broader UK approach to AI governance, which favors principles-based regulation over prescriptive rules.

The most practically innovative element has been the AI Airlock, launched in 2024 as MHRA’s first regulatory sandbox for AI as a Medical Device. The pilot phase completed in March 2025 and published findings that will inform future UK AI medical device guidance. In September 2025, MHRA established a National Commission into the Regulation of AI in Healthcare, and in December 2025 opened a call for evidence running through February 2026. The outputs from that commission will likely shape UK guidance on AI in lab and pharmaceutical contexts through 2027 and beyond.

For lab scientists in the UK, the MHRA's data integrity guidance is the most directly and immediately relevant framework. The MHRA has been explicit that ALCOA+ principles apply equally to AI-generated data and traditional data - meaning any AI output used in a GxP process must be attributable, legible, contemporaneous, original, and accurate, plus complete, consistent, enduring, and available. That's not a new standard created for AI. It's an existing standard being applied clearly.

It's also worth noting that the MHRA's approach post-Brexit has given it some flexibility to diverge from the EU on specifics, but on the fundamentals of data integrity the UK and EU positions are aligned. Labs operating in both markets are unlikely to face contradictory requirements on ALCOA+ compliance for AI outputs.

(Source: MHRA, "Impact of AI on the Regulation of Medical Products," April 2024.)

ISPE GAMP Guide: Artificial Intelligence (July 2025): The Industry Framework

While regulatory agencies set expectations, the International Society for Pharmaceutical Engineering provides the practical framework for meeting them. In July 2025, ISPE published the GAMP Guide: Artificial Intelligence - a 290-page document developed by a team of industry and academic experts that represents the most comprehensive guidance available for validating AI in GxP environments.

The guide builds on the established GAMP 5 framework (second edition) and specifically addresses AI's unique characteristics: quality risk management, data governance including training data provenance, model lifecycle management from concept through decommissioning, dynamic systems that change after deployment, cybersecurity considerations including adversarial attacks on AI models, and AI as or in medical devices.

Two points are especially important for bench scientists and lab managers.

First, the guide is explicit that AI systems in GxP contexts must satisfy all existing regulations for computerized systems. 21 CFR Part 11, EU Annex 11, and ALCOA+ data integrity requirements all apply to AI models and their outputs, just as they apply to any other electronic system. This isn't a new burden placed on AI - it's a clarification that AI doesn't get an exemption from existing requirements.

Second, and this point tends to surprise people, many GxP organizations will need to decide whether prompts, AI inputs, AI outputs, and human review records form part of the controlled record for a given workflow. For AI-assisted documentation, treating those artifacts as traceable records is the more conservative compliance posture. If you're using an LLM to help draft documentation in a GxP context, the prompt you gave it, the output it produced, and the human review performed may all need to be captured or controlled, depending on the workflow and how the output is used. Most labs currently have no mechanism for capturing this, which means they have a traceability gap for every AI-assisted document that has been incorporated into their quality system.
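
As a rough illustration of what capturing those artifacts could look like, the Python sketch below bundles the prompt, the output, and the human review into one record and computes a SHA-256 fingerprint over it, so that later edits to any field are detectable. The record layout and field names are our own assumptions, not a structure defined by the GAMP guide, and a real system would also need access control, audit trails, and retention handling.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AIAssistRecord:
    """One AI-assisted drafting event (hypothetical layout)."""
    tool: str
    model_id: str
    prompt: str
    output: str
    author: str           # who ran the prompt
    reviewer: str         # who verified the output
    review_outcome: str   # e.g. "accepted", "accepted with edits", "rejected"
    reviewed_at: str      # ISO 8601 timestamp, UTC

    def fingerprint(self) -> str:
        """SHA-256 over a canonical JSON form of every field."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


# Placeholder values for illustration only.
record = AIAssistRecord(
    tool="example-llm-service",
    model_id="example-model-2025-06",
    prompt="Summarize the attached protocol in 150 words.",
    output="(text returned by the tool)",
    author="A. Scientist",
    reviewer="Q. Reviewer",
    review_outcome="accepted with edits",
    reviewed_at=datetime.now(timezone.utc).isoformat(),
)
print(record.fingerprint())
```

Storing the fingerprint alongside the final document gives a reviewer a way to confirm that the prompt, output, and review details on file are the ones that were recorded at the time.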

The guide also introduces a practical framework for assessing which AI applications need full GAMP validation versus lighter-touch fitness-for-purpose assessment. Not everything needs to be treated like a critical GMP system. The depth of validation should be proportionate to the risk - and the guide provides a framework for making that determination that aligns with the FDA's credibility assessment approach and the EMA's risk-based categorization.

(Source: ISPE, "GAMP Guide: Artificial Intelligence," published July 2025.)

The FDA’s Clearest AI Enforcement Action So Far

For all the guidance documents published since 2024, the regulatory landscape remained largely theoretical for most lab scientists. Framework documents are important but they don't always feel urgent. That changed on April 2, 2026.

The FDA issued a warning letter to Purolea Cosmetics Lab in Livonia, Michigan. The letter covers a range of CGMP violations - insanitary conditions including insects and filth in manufacturing areas, an inadequate quality unit, absent batch-release testing, and failure to conduct process validation before distributing products. These are serious violations on their own terms. But one section of the letter stands apart from the rest.

Under the heading "Inappropriate Use of Artificial Intelligence in Pharmaceutical Manufacturing," the FDA documented something that, in terms of public enforcement records, is new.

During the inspection, the firm's owner stated to FDA investigators that she had used AI agents to help her firm comply with FDA regulations. Specifically, she used AI to create drug product specifications, procedures, and master production and control records - the core documentation of a pharmaceutical quality system.

The FDA's response is direct. If a firm uses AI as an aid in document creation, it must review the AI-generated documents to ensure they are accurate and actually compliant with CGMP. Purolea's failure to conduct that review was cited as a violation of 21 CFR 211.22(c), the provision governing quality unit responsibilities for approving or rejecting all procedures and specifications.

The most striking moment documented in the letter comes when inspectors informed the owner that the firm had not conducted process validation prior to distributing its drug products, as required under 21 CFR 211.100. The owner's reply: she was not aware of that legal requirement. The reason? The AI agent she had used to create her compliance documentation had never told her it was required.

The FDA's conclusion in that section is clear: any output or recommendations from an AI agent must be reviewed and cleared by an authorized human representative of the firm's quality unit before those outputs are relied upon for compliance purposes.

(Source: FDA Warning Letter to Purolea Cosmetics Lab, MARCS-CMS 722591, April 2, 2026.)

What the Warning Letter Means for the Broader Industry

Purolea was a small operation with significant baseline compliance gaps. Insects in the manufacturing area. No process validation. Products intended to treat medical conditions sold without approval. The AI issue did not exist in isolation; it was one element of systemic quality failures at a facility that should not have been manufacturing pharmaceutical products.

So it would be easy to read the warning letter as a story about a bad actor getting caught, where the AI angle is just one more item on a long list of violations. 

I believe, though, that this reading would miss the point.

The FDA chose to give AI its own named section in this warning letter. They could have addressed the documentation failures as generic quality unit failures, but they didn't. They documented the AI use specifically, named the failure mode specifically - overreliance on AI output without human review - and articulated a standard that now exists in the public enforcement record.

Regulators use warning letters partly to communicate expectations to the broader industry, not just to the recipient. The choice to call out AI explicitly, under its own heading, in a public enforcement document tells regulated industry something about how the FDA is thinking about this issue and what it expects to see in inspections going forward.

The failure mode documented at Purolea - accepting AI-generated compliance documentation without verifying its accuracy or completeness - is not unique to small or poorly run operations. A version of this failure can happen in any lab that has started using AI tools to generate documents, SOPs, specifications, or records and hasn't built adequate review processes around that use. The scale differs, but the underlying dynamic is the same.

There is also a lesson in how the failure developed. The owner told FDA investigators she had used AI agents to help achieve compliance with FDA regulations. She relied on the AI to inform her of regulatory requirements. The AI produced documentation that looked complete, and she treated it as such. The documentation was missing a fundamental requirement - process validation under 21 CFR 211.100 - and the AI never flagged that gap. It probably generated what it was asked to generate and stopped.

This is the core risk with generative AI in compliance contexts: it does not know what it doesn't know. It doesn't have a mechanism for auditing its own gaps against a regulatory framework. A language model that helps create a batch record template will produce something that looks correct. It will not produce a gap analysis comparing the template against 21 CFR Part 211 subparts. That analysis is the human's job.

For any regulated lab using AI tools to draft documentation, create specifications, write SOPs, or generate records of any kind, the Purolea letter prompts a practical question. If an FDA inspector walked into your facility tomorrow and asked to see evidence that every AI-generated document in your quality system had been reviewed and verified by a qualified human before use, could you show it? Not as a policy that says review is required, but as actual records demonstrating it happened?

What's Still Missing - the Open Questions Nobody Has Answered Yet

For all the regulatory activity since 2024, significant gaps remain. No regulatory body has published specific guidance on using generative AI - LLMs like ChatGPT or Claude - for routine lab documentation. The FDA's guidance focuses on AI models that generate or process data for submissions, not on AI that helps write the narrative around that data. The EMA's reflection paper is broad but doesn't address the specific scenario of a scientist using a chatbot to draft a notebook entry. The ISPE GAMP guide comes closest, but it was written primarily for software validation professionals, not bench scientists.

Beyond the generative AI gap, several open questions remain that current frameworks don't fully resolve.

Disclosure requirements. Should all AI-assisted sections be explicitly labeled in regulatory submissions? The FDA's draft guidance asks sponsors to document AI use and provide credibility evidence, but it doesn't mandate that individual paragraphs or data summaries carry an "AI-assisted" label. That question remains open, and companies are currently making their own calls without a clear answer to point to.

Acceptable error thresholds. Current LLMs produce plausible-sounding incorrect information. Not occasionally - routinely. These are called hallucinations and they are a documented characteristic of how LLMs work. But what level of hallucination or mis-summarization is tolerable in operational contexts, assuming humans review and correct the output? The FDA's credibility framework assesses model adequacy; the ISPE GAMP guide identifies hallucination as a risk requiring mitigation. Neither defines a quantitative threshold. Each company is currently making its own judgment call, and different auditors may evaluate those calls differently.

Third-party model provenance. The FDA and EMA frameworks both point toward documentation of the model, data, context of use, and credibility evidence. For proprietary commercial models like GPT-4 or Claude, that can be difficult: vendors treat training data composition and architecture details as proprietary, so sponsors may not have access to them. This creates a genuine practical tension. The EMA's reflection paper also emphasizes data quality and bias assessment, which again presumes access to training data details that commercial vendors don't provide. Labs using commercial AI tools can document the tool they used and the version, but cannot provide the level of model documentation the guidance envisions. How regulators will treat this gap in practice remains to be seen.

Model drift and silent updates. Cloud-based AI providers update their models regularly, and these updates don't always come with explicit changelogs or advance notice. When a model update changes how your AI tool behaves - even subtly - does that constitute a change to a system used in a regulated process? The ISPE GAMP guide covers lifecycle management and change control for AI systems, but the specific scenario of a vendor-initiated silent update to a commercial cloud service isn't cleanly addressed. Labs that have built workflows around specific AI behaviors have real exposure here.

Cross-jurisdictional consistency. The joint FDA–EMA principles from January 2026 show convergence on high-level themes, but operational details remain jurisdiction-specific and continue to evolve. For companies operating globally and submitting to multiple regulators simultaneously, this means building compliance frameworks flexible enough to satisfy regulators who haven't fully aligned on specifics.

All of these are questions your quality team faces when trying to write an AI use policy, and the questions an auditor might raise when reviewing your records. The honest answer to most of them today is that the regulatory landscape hasn't fully settled. Which is precisely why erring on the side of more documentation, more oversight, and more caution makes sense right now.

Where the Regulations Are Heading

Where this goes is fairly predictable, even if the timing isn't. 

The FDA draft guidance from January 2025 will eventually be finalized. The EMA is working toward formal guidelines that build on both its reflection paper and the joint principles published with the FDA in January 2026. The European Commission published a draft GMP Annex 22 on AI in July 2025 - notable mostly for what it doesn't cover. In its draft form it applies only to static, deterministic AI/ML models, which means generative AI and LLMs sit outside its current scope entirely. How regulators will eventually treat those categories is still an open question. The MHRA's National Commission began accepting evidence in December 2025 and is expected to publish recommendations that will feed into UK regulatory expectations for years. All of it is moving in the same direction: toward more specific requirements.

On the enforcement side, the Purolea warning letter is probably not the last of its kind. What it shows is that the FDA is already applying existing quality system regulations - 21 CFR 211.22 in this case - to AI-related compliance failures. Finalized AI-specific guidance isn't a prerequisite for enforcement action. The mechanisms already exist.

Generative AI in routine lab documentation remains the largest unaddressed gap. No published guidance tells you exactly how a scientist should use an LLM to help draft a notebook entry, what they need to document about that use, or what level of review constitutes adequate verification. That guidance will come. In the meantime, labs are operating without a clear standard.

Labs that have documented AI practices, trained staff, and clear human review processes in place before the formal requirements arrive will be in a substantially different position from labs that haven't thought about any of this yet. 

Part 2 of this series gets specific about what all of this means at the bench level - how ALCOA+ principles apply to AI-generated lab documentation, what counts as adequate human review, and where the real compliance risks are hiding in everyday workflows.

The practical message from the regulatory landscape is this: use AI where it helps, document how you use it, review everything it produces, and maintain the scientific judgment that makes your records trustworthy. Platforms like IGOR that were built to support compliance and data integrity can help by providing a strong foundation for your research documentation practices. AI governance is a new challenge; data integrity is not.

References

  1. FDA, "Considerations for the Use of Artificial Intelligence to Support Regulatory Decision-Making for Drug and Biological Products," Draft Guidance, January 2025. Docket No. FDA-2024-D-4689. fda.gov
  2. EMA, "Reflection Paper on the Use of Artificial Intelligence (AI) in the Medicinal Product Lifecycle," EMA/CHMP/CVMP/83833/2023, adopted September 2024. ema.europa.eu
  3. FDA and EMA, "Guiding Principles of Good AI Practice in Drug Development," January 2026. fda.gov
  4. MHRA, "Impact of AI on the Regulation of Medical Products," April 2024. gov.uk
  5. ISPE, "GAMP Guide: Artificial Intelligence," July 2025. ispe.org
  6. FDA, "Artificial Intelligence and Medical Products: How CBER, CDER, CDRH, and OCP are Working Together," March 2024, revised February 2025. fda.gov
  7. FDA Warning Letter to Purolea Cosmetics Lab, MARCS-CMS 722591, April 2, 2026. fda.gov

Anika Weber, DPhil; CEO & Founder at IGOR - Your Personal Lab Assistant