Courts have said time and again that the fair use doctrine may be “‘the most troublesome in the whole law of copyright.’” See, e.g., Oracle Am., Inc. v. Google Inc., 886 F.3d 1179, 1191 (Fed. Cir. 2018) (internal citations omitted), rev’d on other grounds, 141 S. Ct. 1183 (2021). The emerging cases by authors and copyright owners challenging various generative AI programs for using copyrighted materials are certain to create new troubles for the courts being asked to apply the fair use doctrine to this important new technology. Several such cases have already received considerable publicity, including two class actions by Michael Chabon, Ta-Nehisi Coates and others, Chabon v. OpenAI Inc., No. 3:23-cv-04625 (N.D. Cal.), and Chabon v. Meta Platforms Inc., No. 3:23-cv-04663 (N.D. Cal.); another class action involving several best-selling authors, Authors Guild v. OpenAI Inc., No. 1:23-cv-08292 (S.D.N.Y.); and another class action whose plaintiffs include Sarah Silverman, Kadrey v. Meta Platforms Inc., No. 3:23-cv-03417 (N.D. Cal.).
Perhaps the most troublesome of all so far is the new complaint filed by The New York Times alleging that OpenAI, Microsoft and others committed copyright infringement in training their Generative Pre-Trained Transformer (GPT) systems and wrongfully attributed false information to The Times via the output from systems such as ChatGPT and Bing Chat. The New York Times Co. v. Microsoft Corp., No. 1:23-cv-11195 (S.D.N.Y.). The Times asserts claims for direct as well as vicarious and contributory copyright infringement, unfair competition, trademark dilution, and violation of the Digital Millennium Copyright Act’s provision on removal of copyright management information (17 U.S.C. § 1202). Among other things, the complaint alleges that GPT not only copied published articles verbatim but could also be prompted to offer up content that is normally protected by The Times’ paywall. The complaint cites several examples of nearly verbatim copying of large sections of articles and includes screenshots of GPT delivering the first paragraphs of articles, as well as ensuing paragraphs when prompted to do so. The complaint alleges that the “training” of the program included storing encoded copies of the works in computer memory and repeatedly reproducing copies of the training dataset, such that millions of The Times works were copied and ingested multiple times “for the purpose of ‘training’ Defendants’ GPT models.” The Times further alleges that when OpenAI’s chatbots are not reproducing its content verbatim, they instead (when actual content is not available) fabricate “hallucinations,” inventing content that The Times did not publish and misattributing it to The Times.