Maybe DeepSeek did distill OpenAI’s models to train its own, and maybe that is a violation of the terms of service OpenAI has published. But “extracting information and putting it to use” feels like a fair description of what DeepSeek has done here. If DeepSeek’s work truly weren’t possible without the work that OpenAI had already done, perhaps DeepSeek should think about compensating OpenAI in some way?
This kind of hypocrisy makes it difficult for me to muster much sympathy for an AI industry that has treated the swiping of other humans’ work as a completely legal and necessary sacrifice, a victimless crime that provides benefits that are so significant and self-evident that it wasn’t even worth having a conversation about it beforehand.
A last bit of irony in the Andreessen Horowitz comment: There’s some handwringing about the impact of a copyright infringement ruling on competition. Having to license copyrighted works at scale “would inure to the benefit of the largest tech companies—those with the deepest pockets and the greatest incentive to keep AI models closed off to competition.”
“A multi-billion-dollar company might be able to afford to license copyrighted training data, but smaller, more agile startups will be shut out of the development race entirely,” the comment continues. “The result will be far less competition, far less innovation, and very likely the loss of the United States’ position as the leader in global AI development.”
Some of the industry’s agita about DeepSeek is probably wrapped up in the last bit of that statement—that a Chinese company has apparently beaten an American company to the punch on something. Andreessen himself referred to DeepSeek’s model as a “Sputnik moment” for the AI business, implying that US companies need to catch up or risk being left behind. But regardless of geography, it feels an awful lot like OpenAI wants to benefit from unlimited access to others’ work while also restricting similar access to its own work.
Good luck with that!