
Why ELSA Is a Step—but Not the Destination—for AI in Regulatory Writing



The FDA’s recent Evaluation of Labeling Submissions with AI (ELSA) pilot program has generated both curiosity and skepticism in the regulatory community. As reported by Regulatory Focus, early responses from reviewers have been mixed: some found ELSA helpful for catching inconsistencies and typos, while others found it redundant or disconnected from their workflows.


While the effort is laudable and signals FDA’s openness to modernizing its own processes, the ELSA pilot also highlights deeper limitations in how general-purpose AI tools perform in complex, regulated environments like clinical labeling.


When Automation Isn’t Enough 

ELSA appears to focus on automating surface-level tasks—identifying discrepancies, formatting issues, and minor textual inconsistencies. But in the nuanced world of regulatory writing, automation alone doesn’t cut it. Labeling is not just about correcting spelling errors or aligning section headers—it’s about conveying clinical and regulatory intent, harmonizing with precedent, and maintaining consistency with supporting modules and study data. That requires domain depth, not just computational power. 

Several design gaps may be contributing to the underwhelming feedback reported: 


1. Lack of Transparency and Explainability 


Reviewers working in regulated spaces need to know why a recommendation was made. If ELSA’s suggestions cannot be traced back to clear logic or sources, trust breaks down. Explainability is a cornerstone of responsible AI, especially in compliance-focused fields. 
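
To make that concrete: a traceable output is, at minimum, a record that carries its own provenance. Here is a minimal sketch in Python (with hypothetical field names, rule IDs, and source references) of the shape such a record might take, so that every flag a reviewer sees can be audited back to the rule that fired and the source it was checked against.

```python
from dataclasses import dataclass

@dataclass
class Suggestion:
    """One reviewable flag, with enough provenance to audit it."""
    rule_id: str    # which check produced the flag
    location: str   # where in the label the issue was found
    source_ref: str # the passage or precedent the check compared against
    rationale: str  # plain-language explanation a reviewer can verify

# Hypothetical example: a flag a reviewer can trace back to its logic and source.
flag = Suggestion(
    rule_id="DOSE-CONSISTENCY-01",
    location="Section 2.1, Dosage and Administration",
    source_ref="Clinical Study Report, Table 14.2 (hypothetical)",
    rationale="Stated dose (10 mg) does not match the dose in the cited table (5 mg).",
)
print(f"[{flag.rule_id}] {flag.location}: {flag.rationale} (see {flag.source_ref})")
```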


2. Overemphasis on Automation, Underemphasis on Augmentation 


ELSA’s emphasis on automation might ignore the reality that reviewers benefit more from augmentation—AI that supports, rather than replaces, expert judgment. Tools that can’t incorporate feedback or adapt to specific reviewer styles risk becoming more of a burden than a benefit. 


3. Poor Grasp of Regulatory Intent and Tone 


ELSA may detect formal inconsistencies, but it lacks awareness of regulatory tone—such as avoiding promotional language, using precedent-aligned risk language, or interpreting guideline-driven phrasing. These subtleties can’t be captured without specific training on context and precedent. 
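
As a toy illustration, even a thin rules layer can encode one slice of tone awareness. The sketch below uses a hypothetical, hard-coded pattern list; a real check would be curated from precedent and guidance, not a handful of regexes.

```python
import re

# Hypothetical examples of promotional phrasing; illustrative only,
# not a real regulatory standard.
PROMOTIONAL_PATTERNS = [
    r"\bbreakthrough\b",
    r"\bbest[- ]in[- ]class\b",
    r"\bremarkabl\w*\b",
    r"\bsuperior\b(?!\s+to\s+placebo)",  # naive exception for a comparator claim
]

def flag_promotional_language(text: str) -> list[str]:
    """Return the promotional phrases found in a labeling passage."""
    hits: list[str] = []
    for pattern in PROMOTIONAL_PATTERNS:
        hits += [m.group(0) for m in re.finditer(pattern, text, re.IGNORECASE)]
    return hits

print(flag_promotional_language(
    "This breakthrough therapy showed remarkable efficacy."
))  # ['breakthrough', 'remarkable']
```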


4. Context Isolation 


Labeling content does not exist in a vacuum. Without the ability to reference other modules (e.g., clinical summaries, nonclinical data), a tool like ELSA can flag content as inconsistent when in fact it's justified by context the tool can’t access. This leads to overflagging and unnecessary reviewer burden. 
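
A context-aware check would look across modules before flagging. The sketch below is a minimal illustration of that idea, with an in-memory dictionary standing in for a real cross-document index; module names and passages are hypothetical.

```python
# Minimal sketch: only flag an apparent inconsistency if no other module
# supports it. `MODULES` stands in for a real cross-document index.
MODULES = {
    "m2.5-clinical-overview": "The 10 mg dose was selected based on Study 101.",
    "m2.7-clinical-summary": "Study 101 evaluated 5 mg and 10 mg once daily.",
}

def supported_elsewhere(claim_keywords: list[str]) -> list[str]:
    """Return the modules whose text mentions all of the claim's keywords."""
    return [
        name for name, text in MODULES.items()
        if all(kw.lower() in text.lower() for kw in claim_keywords)
    ]

support = supported_elsewhere(["10 mg", "Study 101"])
if support:
    print(f"Not flagged: claim is justified by {support}")
else:
    print("Flagged: no supporting context found in linked modules")
```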


5. Workflow Misalignment 


ELSA’s rollout across multiple FDA offices has resulted in uneven feedback, likely due to mismatches with how reviewers actually do their work. If an AI tool disrupts rather than complements established processes, adoption will lag—even if the technology itself is sound. 


6. Lack of Feedback Loops 


Perhaps most critically, ELSA does not appear to incorporate a structured feedback loop to improve over time. Without input from reviewers on what worked and what didn’t, the system can’t evolve or improve its outputs—undermining long-term utility. 
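
A structured feedback loop can start very simply: record each accept or reject decision per rule and watch the acceptance rates. The sketch below (with hypothetical rule IDs) illustrates the idea; rules that reviewers consistently reject become candidates for retuning or retirement.

```python
from collections import defaultdict

# Minimal sketch of a structured feedback loop: reviewers accept or reject
# each flag, and per-rule acceptance rates show which checks to retune.
feedback = defaultdict(lambda: {"accepted": 0, "rejected": 0})

def record(rule_id: str, accepted: bool) -> None:
    feedback[rule_id]["accepted" if accepted else "rejected"] += 1

# Hypothetical reviewer decisions.
for rule_id, accepted in [
    ("DOSE-CONSISTENCY-01", True),
    ("TONE-PROMO-02", False),
    ("TONE-PROMO-02", False),
]:
    record(rule_id, accepted)

for rule_id, counts in feedback.items():
    total = counts["accepted"] + counts["rejected"]
    print(f"{rule_id}: {counts['accepted']}/{total} accepted")
```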


Ontologies and Knowledge Graphs: The Missing Foundation 


None of these issues are insurmountable—but they point to a fundamental flaw in applying generic AI to domain-specific problems. Clinical and regulatory documentation depends on deeply structured knowledge: standardized terminologies, therapeutic area-specific language, regulatory precedent, and ontology-based reasoning. 


That’s why successful AI in this space must be trained not just on documents, but on relationships—the kind captured in knowledge graphs and built into models that reflect how humans interpret complex data. 
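
To show in miniature what “trained on relationships” means: the sketch below encodes a few hypothetical subject–relation–object triples and uses them to reconcile two synonymous terms rather than flagging them as an inconsistency, which is the kind of reasoning a flat text comparison cannot do.

```python
# Toy knowledge graph: (subject, relation, object) triples standing in for
# the ontology-backed relationships a real system would draw on.
TRIPLES = [
    ("somnolence", "is_a", "nervous system disorder"),
    ("somnolence", "synonym_of", "drowsiness"),
    ("drowsiness", "synonym_of", "somnolence"),
]

def related(term: str, relation: str) -> set[str]:
    """Follow one kind of edge out of a term."""
    return {o for s, r, o in TRIPLES if s == term and r == relation}

# A relationship-aware check: "drowsiness" in one section and "somnolence"
# in another are the same finding, not an inconsistency.
if "somnolence" in related("drowsiness", "synonym_of"):
    print("Terms reconciled via the graph; no inconsistency flagged.")
```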


How AgileWriter Solves for These Gaps 


At Synterex, we’ve taken a fundamentally different approach with AgileWriter—our AI-powered platform purpose-built for clinical trial documentation. Rather than relying solely on generative AI, AgileWriter uses a combination model: generative AI handles fluency and restructuring, while rules-based, retrieval-augmented, and source-stabilizing models ensure factual accuracy and consistency across documents. 
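
To be clear, the sketch below is not AgileWriter’s implementation; it is a schematic illustration, with stand-in functions and hypothetical source text, of how a combination architecture can gate generative output behind retrieval and rules checks before anything reaches a reviewer.

```python
# Schematic sketch of a combination architecture (illustrative only):
# a generative step proposes text, then retrieval and rules layers gate it.
def generate_draft(prompt: str) -> str:
    # Stand-in for a generative model call.
    return "Patients received 10 mg once daily in Study 101."

def retrieval_check(draft: str, sources: list[str]) -> bool:
    # Stand-in for retrieval-augmented verification: the draft's facts
    # should be attributable to a source passage.
    return any("10 mg" in s and "Study 101" in s for s in sources)

def rules_check(draft: str) -> bool:
    # Stand-in for the deterministic rules layer (tone, formatting, units).
    return "breakthrough" not in draft.lower()

sources = ["Study 101 evaluated 10 mg once daily (hypothetical CSR excerpt)."]
draft = generate_draft("Summarize the dosing regimen.")
if retrieval_check(draft, sources) and rules_check(draft):
    print("Draft passes both gates:", draft)
else:
    print("Draft held for review.")
```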


This architecture addresses the same limitations that ELSA is now confronting: 

  • Explainability is built in, with traceable outputs and audit trails. 

  • Augmentation-first design allows SMEs to guide, override, and improve the system continuously. 

  • Contextual awareness is supported across document types and regulatory modules. 

  • Training includes ontologies, document templates, and feedback loops from real-world usage. 

  • Workflow integration is flexible, supporting different styles of writing and review. 


If You Don’t Train for the Domain, You Train for Disappointment 


The lesson here isn’t that ELSA is a failure—it’s that AI without domain-specific training will always fall short. Generic AI may be impressive in demos, but it struggles when dropped into the highly structured, high-stakes world of clinical development. 


That’s a theme we’ve explored in several recent Synterex blogs. Those pieces reinforce the idea that AI is only as good as its foundation, and in life sciences, that foundation must be built by people who understand the domain.


Final Thoughts 

ELSA is a milestone worth acknowledging. But to unlock the full potential of AI in regulatory writing, we need more than automation—we need alignment: with the task, with the user, and most of all, with the domain. That’s the difference between an AI that helps—and one that works. 
