On April 9, 2026, Sullivan & Cromwell filed an emergency motion for provisional relief in In re Prince Global Holdings Limited, No. 26-10769, before Chief Judge Martin Glenn of the U.S. Bankruptcy Court for the Southern District of New York. The motion sought to freeze assets and compel discovery in the Chapter 15 proceeding of a Cambodian conglomerate whose founder has been charged in Brooklyn federal court with directing forced-labor compounds and a massive investment fraud. Nine days later, S&C partner Andrew Dietderich filed a five-page letter acknowledging that the motion contained “inaccurate citations and other errors,” some of which were “artificial intelligence (‘AI’) ‘hallucinations.’” Boies Schiller Flexner, representing creditors, had caught the problems and flagged them.
The coverage has focused, understandably, on the letter. Sullivan & Cromwell advises OpenAI on the “safe and ethical deployment” of artificial intelligence, a representation the firm touts on its website. The irony writes itself, and commentators have not been shy about noting it. But the letter is crafted to do a specific job: express contrition, describe safeguards, and frame the failure as an aberration. Schedule A, the three-page, single-spaced errata cataloguing roughly 40 corrections across multiple filings, is where the interesting information lives.
What the errors tell us
Dietderich’s letter does not identify the AI tool, the user, or the stage at which AI was involved. But the pattern of errors in Schedule A is suggestive. The motion contains some fabricated cases, the kind of hallucination that has dominated coverage since Mata v. Avianca. But many of the errors, and the more revealing ones, involve real cases cited for relevant propositions with corrupted citation details.
Consider In re BYJU’s Alpha, Inc., 2024 WL 1455586, a real Bankruptcy Court for the District of Delaware decision that the motion cites four times. The four citations carry inconsistent wrong page references: *8 in one paragraph, 6-7 in two others (without the Westlaw asterisk format a lawyer would use), and *7 in a fourth. The correct references vary by paragraph, but the errors are inconsistent with each other. If a lawyer had looked up the case once and recorded the wrong page, you would expect the same wrong page repeated throughout. Inconsistent wrong references to the same case suggest each was generated independently, which is what a language model does: it produces a plausible-looking page number at each point in the text, with no memory of what it generated three paragraphs earlier. In re Soundview Elite Ltd. shows a similar pattern: the volume number is wrong in both appearances (503 B.R. instead of 543 B.R.), but the page references differ, and the year is wrong in one citation but correct in the other. The motion also attributes quotations to both Soundview Elite and In re Team Systems International that bear only a loose resemblance to what those courts wrote, altering tense and grammatical structure in ways that change the legal point.
What’s interesting about the bulk of the errors is that the AI cited real decisions for propositions those decisions actually address. What it got wrong were the citation details: volume numbers, page references, quoted language. That pattern is consistent with AI being used relatively late in the process, likely during an editing pass rather than initial research or drafting. An LLM asked to polish prose, tighten language, or reformat a document can and does alter text it should leave alone, including citation data. The result is a brief that reads fluently and cites the right authorities for the right propositions with the wrong details: corrupted volume numbers, fabricated page references, rewritten quotations.
If that is what happened here, and the error pattern strongly suggests something like it, the profession’s working model of AI risk has a blind spot. The conventional wisdom sorts AI-assisted legal tasks into high-risk (research, drafting) and lower-risk (editing, formatting, tone). Editing is widely treated as one of the “safe” uses of AI, the kind of task where hallucination risk seems negligible because the human has already done the substantive work. The S&C filing suggests otherwise. The most revealing errors in the Prince Global motion are not the hallucinations that have dominated headlines since Mata. They are the product of AI working with a document that already contained correct law and corrupting what was there.
The compliance model and its limits
Dietderich’s letter describes Sullivan & Cromwell’s AI governance infrastructure in some detail: two mandatory training modules, tracked completion, Office Manual provisions instructing lawyers to “trust nothing and verify everything,” a policy requiring independent verification of all AI-generated citations before any filing leaves the firm. By any reasonable standard, this is a serious compliance program. It is more rigorous than what most firms have. And it failed.
The letter attributes the failure to noncompliance: “The Firm’s policies on the use of AI were not followed in connection with the preparation of the Motion.” This framing positions the problem as a deviation from an otherwise sound system, the institutional equivalent of pilot error. But if a well-resourced firm with a rigorous training program, mandatory AI policies, and a culture of professional excellence cannot ensure that its lawyers follow its own citation-verification protocols, the question is whether the compliance model itself is adequate to the risk.
The policy-and-training approach to AI risk borrows from compliance frameworks designed for other contexts: anti-money laundering, conflicts of interest, data security. Those frameworks work reasonably well because the actions they govern are discrete, procedural, and often automatable. Checking a new client against a sanctions list is a bounded task with a clear decision rule. Citation verification after AI-assisted work is a different kind of cognitive activity. It requires sustained, skeptical attention to text that appears polished and professional, produced by a tool the lawyer chose to use because the lawyer trusted it to produce competent output. The compliance model assumes the failure mode is “the lawyer didn’t know the rule.” The actual failure mode is that the lawyer knew the rule but the AI output looked good enough to skip the step.
This gap between policy and practice is not unique to Sullivan & Cromwell. Every major firm now has an AI use policy. The policies are, in broad strokes, similar: mandatory training, citation verification, human review of all AI-generated work product. The S&C incident suggests that the profession’s current approach to AI governance may be optimizing for the wrong variable: the existence of policies rather than the conditions under which lawyers actually follow them. A policy that instructs lawyers to “trust nothing and verify everything” is sound advice and poor institutional design, because it asks every lawyer, on every filing, to overcome the very cognitive shortcut that made the AI tool attractive in the first place. As I described in a recent post, AI output that is approximately right erodes the cognitive conditions under which a reviewer would catch the ways it is specifically wrong.
The supervision problem the letter does not name
Dietderich’s letter is carefully drafted. It takes personal responsibility (“I take responsibility for the failure”), describes the firm’s policies at length, and frames the incident as a failure of compliance rather than of institutional design. What it does not do is engage with the supervisory obligations that the facts it describes implicate.
Model Rule 5.1 operates at three levels. Rule 5.1(a) requires partners and lawyers with comparable managerial authority to make reasonable efforts to ensure that the firm has “measures giving reasonable assurance that all lawyers in the firm conform to the Rules of Professional Conduct.” The Comment explains that such measures include internal policies and procedures, and that larger firms or firms where difficult ethical problems frequently arise may need “more elaborate measures.” Rule 5.1(b) separately requires any lawyer with direct supervisory authority over another lawyer to make reasonable efforts to ensure that the supervised lawyer conforms to the Rules. And Rule 5.1(c)(2) holds a supervising lawyer responsible for another lawyer’s violation when the supervisor knew of the conduct at a time when its consequences could have been avoided or mitigated but failed to take reasonable remedial action.
The most common reading of Rule 5.1, and the one that appears to have shaped S&C’s response, treats 5.1(a) as the central obligation. Under this reading, firms discharge their duty by establishing policies and training programs. Dietderich’s letter reads like a brief on that standard: mandatory training modules, tracked completion, Office Manual language requiring independent verification, policies that “are both clear and rigorous.” On a conventional 5.1(a) analysis, S&C has arguably done what the rule requires. The firm established measures. It trained its lawyers. It put the policies in writing.
But Rule 5.1(a) does not require firms to adopt policies; it requires firms to adopt measures “giving reasonable assurance” that lawyers will conform to the Rules. A policy that lawyers predictably will not follow under the conditions in which they actually work does not give reasonable assurance of anything. If the profession’s experience with AI use has demonstrated one thing over the past three years, across more than 1,300 documented incidents, it is that telling lawyers to verify AI-generated citations and trusting them to do so does not reliably produce verification. A firm that adopts that policy and treats the obligation as discharged has satisfied the letter of 5.1(a) while failing its purpose. The question 5.1(a) poses is whether the measures work, not whether they exist.
Rule 5.1(b) sharpens the point. The obligation to make “reasonable efforts” to ensure a subordinate lawyer’s compliance is matter-specific; it applies to each filing, each engagement, each supervised task. Dietderich’s letter acknowledges that the firm’s protocols “were not followed” on this filing, which means that whatever supervisory attention the motion received before it was filed with the court did not include the kind of review that would have caught errors a first-year Bluebooking exercise would flag. The letter does not describe what review the motion did receive, or from whom, and the gap is notable in a document otherwise detailed about the firm’s institutional safeguards.
The implications extend beyond Sullivan & Cromwell. Every firm that has responded to AI risk by adopting policies and training and treating the resulting compliance infrastructure as sufficient faces the same 5.1 question: do those measures actually give reasonable assurance that lawyers will conform to the Rules, or do they give the firm a paper trail to point to after the next incident? When the task being supervised involves AI-assisted work product, reasonable supervisory efforts under 5.1(b) must be calibrated to the failure modes AI introduces, which include not just fabricated cases but corrupted citation data, garbled quotations, and the kinds of close-but-wrong errors that a skim will not catch. A supervising lawyer who approves an AI-assisted filing without ensuring that someone verified every citation and quotation against the source has a difficult argument that the supervisory effort was reasonable, regardless of what the firm’s policies say.
Building systems that assume human error
S&C’s lawyers had every resource and every instruction they needed to be careful, and the review process still failed. Any institutional response that begins and ends with telling lawyers to try harder will produce the same result. The review failed because trained professionals working under time pressure will predictably treat polished-looking AI output with less skepticism than it requires, and no amount of training will reliably override that tendency.
The alternative is to design systems that assume lawyers will miss errors and build verification into the workflow at points where it does not depend on an individual lawyer’s vigilance.
Automated citation-verification tools, of which several now exist, can flag fabricated cases and, in some configurations, check pin cites and quoted language against the source. Building these into the document pipeline as a mandatory technical gate, not as a policy that a lawyer may or may not invoke, would catch the kinds of errors that are easiest to detect mechanically. Dual-review protocols, where the person who verifies citations is someone other than the person who drafted or edited the brief, address the cognitive problem more directly: the drafter’s familiarity with the AI-assisted text is precisely what compromises the drafter’s ability to review it critically, so the verification step should go to someone who has not been immersed in the document. Version-control practices that flag citation changes between drafts would be particularly valuable if, as the error pattern here suggests, AI is corrupting citations during editing passes; a diff that shows a volume number changing from 543 to 503 between the pre-edit and post-edit versions of a brief would surface exactly the kind of error that a linear read-through would miss.
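To make the version-control idea concrete, here is a minimal sketch of a pre-filing check that extracts reporter citations from two drafts and flags any that changed between them. It is illustrative only: it assumes plain-text drafts, uses a deliberately simplified citation pattern (it will miss short cites, id. references, and Westlaw star pages), and is not drawn from any existing cite-checking product.

```python
# Minimal sketch of a pre-filing citation diff between two drafts of a brief.
# Assumes plain-text drafts and a simplified "volume Reporter page" pattern;
# real cite-checking tools handle far more (short cites, id., star pages).
import re
import sys
from collections import Counter

# Illustrative pattern: matches cites like "543 B.R. 395" or "171 F.4th 940".
CITE_RE = re.compile(r"\b(\d{1,4})\s+((?:[A-Z][A-Za-z0-9.]*\s?)+?)\s+(\d{1,5})\b")

def extract_citations(text: str) -> Counter:
    """Return a multiset of (volume, reporter, page) tuples found in the text."""
    return Counter((vol, rep.strip(), page) for vol, rep, page in CITE_RE.findall(text))

def diff_citations(before: str, after: str) -> None:
    """Print citations present in only one draft -- candidates for human review."""
    old, new = extract_citations(before), extract_citations(after)
    for cite in old - new:
        print("dropped or altered:", " ".join(cite))
    for cite in new - old:
        print("introduced or altered:", " ".join(cite))

if __name__ == "__main__":
    # Usage: python cite_diff.py pre_edit.txt post_edit.txt
    with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
        diff_citations(f1.read(), f2.read())
```

The point is not this particular script but where it sits: wired into the filing workflow as a gate, a check like this does not depend on any individual lawyer remembering to run it.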
None of these measures is novel, and none is sufficient alone. But they share a design principle that the current policy-and-training model lacks: they treat human inattention as a predictable system input rather than an aberration to be corrected through exhortation. Hospitals learned this decades ago. Surgical teams use timeouts, forced checklists, and independent verification protocols not because surgeons lack training or conscientiousness, but because the patient-safety literature demonstrated that trained professionals operating under time pressure will skip verification steps they believe are unnecessary in the moment. The response was to remove the decision about whether to verify from the individual clinician and make it a structural feature of the process. Wrong-site surgery rates dropped. The legal profession has yet to absorb the equivalent lesson: that the answer to predictable human error is process redesign, not better instructions to the humans who are predictably going to err.
The visible case and the invisible ones
The errors in the Prince Global motion surfaced because Boies Schiller Flexner was on the other side, read the filing carefully, and flagged the problems. Adversarial proceedings have a built-in, if involuntary, quality-control mechanism: opposing counsel has every incentive to scrutinize your citations. And that incentive is hardening into an obligation. In Nuvola, LLC v. Wright, 2025 Minn. Dist. LEXIS 5940 (Minn. Dist. Ct. Nov. 21, 2025), the court sanctioned an attorney for filing AI-generated fake citations, then turned to opposing counsel: “The Court should not be left as the last line of defense against citations to fictional cases,” the court wrote, and reminded both sides “that it is the obligation of counsel on both sides to respond to each other’s arguments, including completing a basic cite-check of the cases cited by the other side.” The Seventh Circuit struck a similar note in Dec v. Mullin, 171 F.4th 940, 947 (7th Cir. 2026), admonishing counsel for hallucinated citations and adding: “That opposing counsel also failed to catch these errors and bring them to our attention also gives us pause, albeit to a lesser degree.” Neither court sanctioned opposing counsel, but the direction of travel is clear: courts are beginning to treat cite-checking your opponent’s brief as part of the job.
But the majority of legal work that involves AI does not take place in adversarial settings. Transactional documents, regulatory filings, opinion letters, compliance memoranda, counseling advice: these are contexts where no opposing party is checking the lawyer’s citations against the reporter. The S&C incident became a story because someone caught it. The more sobering question is how much AI-assisted legal work product is circulating in contexts where no one will.
This post draws on Sullivan & Cromwell’s April 18, 2026 letter to Chief Judge Glenn in In re Prince Global Holdings Limited, No. 26-10769 (Bankr. S.D.N.Y.); Nuvola, LLC v. Wright, 2025 Minn. Dist. LEXIS 5940 (Minn. Dist. Ct. 2025); Dec v. Mullin, 171 F.4th 940 (7th Cir. 2026); coverage from Bloomberg Law, Above the Law, David Lat’s Original Jurisdiction, and Law.com; and Damien Charlotin’s AI Hallucination Cases Database. It extends arguments from prior posts on sycophancy as a failure mode and the delegation of professional judgment.