The ongoing legal battle between OpenAI and The New York Times has escalated, with OpenAI filing a motion to dismiss parts of the lawsuit. OpenAI claims that The New York Times engaged in what it terms "hacking" by allegedly paying someone to manipulate OpenAI's products, such as ChatGPT, into producing instances of copyright infringement.
In its filing, submitted on Monday to a Manhattan federal court, the company asserted that The Times used "deceptive prompts" across thousands of attempts to generate the results in question, conduct that OpenAI alleges blatantly violates its terms of use. OpenAI further argued that such usage is far from typical behavior for ordinary users of its products, insinuating a deliberate attempt to exploit its systems.
The practice OpenAI refers to as "hacking" is commonly known in the industry as prompt engineering or "red-teaming." It is a method frequently employed by AI trust and safety teams, ethicists, academics, and tech companies to probe AI systems for weaknesses so they can be identified and fixed, much as cybersecurity teams stress-test websites for vulnerabilities.
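As an illustration of what such probing can look like, a basic red-teaming harness might loop adversarial prompts through a model and scan each reply for long verbatim overlaps with a source text. The sketch below is a minimal example assuming the openai Python client; the prompts, the gpt-4 model name, and the 25-word threshold are placeholder choices, not the methodology either party actually used.

```python
# Minimal red-teaming sketch: send probe prompts to a chat model and flag
# responses that reproduce long verbatim word runs from a source text.
# Assumes the `openai` Python client; SOURCE_TEXT and PROBE_PROMPTS are
# placeholders for illustration only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SOURCE_TEXT = "..."  # the copyrighted passage being tested for regurgitation
PROBE_PROMPTS = [
    "Continue this article: " + SOURCE_TEXT[:200],
    "What is the exact opening paragraph of the article titled ...?",
]

def longest_common_span(a: str, b: str) -> int:
    """Length, in words, of the longest verbatim word run shared by a and b."""
    aw, bw = a.split(), b.split()
    best = 0
    prev = [0] * (len(bw) + 1)
    # Dynamic programming over word positions (longest common substring).
    for i in range(1, len(aw) + 1):
        cur = [0] * (len(bw) + 1)
        for j in range(1, len(bw) + 1):
            if aw[i - 1] == bw[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

for prompt in PROBE_PROMPTS:
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content
    overlap = longest_common_span(reply, SOURCE_TEXT)
    if overlap >= 25:  # arbitrary cutoff for "verbatim reproduction"
        print(f"Possible regurgitation ({overlap} words): {prompt[:60]}...")
```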
Ian Crosby, a partner at Susman Godfrey and lead counsel for The Times, rejected OpenAI's characterization, saying that what OpenAI calls "hacking" is simply an attempt to uncover evidence of copyright infringement. Crosby pointed to the extensive copying of The Times' works by OpenAI for its commercial products without authorization, asserting that the scale of that copying goes far beyond the 100 examples cited in the lawsuit.
The conflict between OpenAI and various stakeholders, including publishers, authors, and artists, over the use of copyrighted material as AI training data has been intensifying. The lawsuit The New York Times filed in December targets both Microsoft and OpenAI and seeks to hold them accountable for substantial damages.
OpenAI has previously argued that training AI models on copyrighted works is a necessity, stating that it is practically impossible to develop leading AI models without such material. And although OpenAI CEO Sam Altman has claimed the company's models do not require The Times' data, the filing suggests that while losing one publisher's data might not significantly affect ChatGPT, collective opt-outs from many publishers could impair how the AI functions.
OpenAI's efforts to secure partnerships with publishers underscore its acknowledgment of how important copyrighted content is for training AI models. The company has already struck a deal with Axel Springer and is reportedly in talks with CNN, Fox Corp., and Time to license their content. OpenAI points publishers to its opt-out process, even as it emphasizes that such content is necessary for training modern AI models.
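That opt-out works through the standard robots.txt convention: OpenAI has documented that its GPTBot web crawler respects robots.txt directives, so a publisher wanting to keep its pages out of future training crawls can block the bot sitewide with two lines (the snippet below is a generic example, not any particular publisher's configuration).

```
# robots.txt — excludes OpenAI's GPTBot crawler from the entire site
User-agent: GPTBot
Disallow: /
```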