Tech giant Microsoft has opened an investigation into whether DeepSeek made use of OpenAI’s application programming interface (API) for its model development. The probe comes just hours after claims by Trump’s AI and crypto “czar”, David Sacks, that the Chinese company used OpenAI’s models to train its own. https://lnkd.in/deZRCi8u
Technext’s Post
More Relevant Posts
-
I think my irony meter just exploded. "Microsoft Corp. and OpenAI are investigating whether data output from OpenAI’s technology was obtained in an unauthorised manner by a group linked to Chinese artificial intelligence startup DeepSeek, according to people familiar with the matter. Microsoft’s security researchers in the fall observed individuals they believe may be linked to DeepSeek exfiltrating a large amount of data using the OpenAI application programming interface, or API ..." So let's get this straight. Microsoft and OpenAI (and many others!) trained their AIs by harvesting pretty much the entire Web, regardless of the licenses of the content they harvested, or the wishes of the content owners. And now that it looks like DeepSeek *might* have done the same to them, they're crying foul! I'd say it's unbelievable, but it's actually quite in keeping with the "one rule for us, another rule for them" attitude that has characterised big tech's approach to intellectual property rights in the past. https://lnkd.in/gKychiT5
-
Of course this needs to be vetted and investigated. The thot plickens! From the article... "Just a few hours after David Sacks claimed DeepSeek used OpenAI’s models to train its own models, Bloomberg Law reports that Microsoft is investigating DeepSeek’s use of OpenAI’s application programming interface (API). According to security researchers working for Microsoft, the Chinese company behind the R1 reasoning model may have exfiltrated a large amount of data using OpenAI’s API in the fall of 2024. Microsoft, which also happens to be OpenAI’s largest shareholder, notified OpenAI of the suspicious activity. While anyone can sign up and access OpenAI’s API, the company’s terms of service stipulate that you can’t use the output to train a new AI model. “You are prohibited from […] using Output to develop models that compete with OpenAI,” the company writes in its terms of use. Additionally, the company says that you can’t “automatically or programmatically [extract] data or Output.”
-
Microsoft and OpenAI are investigating whether data output from OpenAI’s technology was obtained in an unauthorised manner by a group linked to Chinese artificial intelligence start-up DeepSeek, according to people familiar with the matter.
-
So OpenAI says it has evidence that DeepSeek used their API to build competing models through "distillation" - basically, developers can obtain better performance from smaller models by training them on outputs from larger ones. OpenAI’s terms of service specifically state that users can’t copy or “use output to develop models that compete with OpenAI”. ToS language is all fine and good, but it’s not what really matters here. The bigger picture is that the AI market became a lot more competitive overnight, and that’s going to reshape our industry. It’s common knowledge in AI circles that start-ups and academics use outputs from human-aligned commercial LLMs to train other models; I expect the big LLM vendors will be more vigilant now. I also suspect that the market will split into two camps: closed, proprietary vs open source - another flavor of iOS vs Android in the mobile operating system market. Finally, I know a lot of tech industry pundits are focused on the irony of OpenAI’s position. A huge part of GPT’s success was based on OpenAI hoovering up the internet without anyone’s consent. So it’s definitely ironic and it makes for a good headline/tweet. Just remember - nobody outside of tech really cares.
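For context on the mechanics: "distillation" here just means harvesting prompt/response pairs from a larger "teacher" model and fine-tuning a smaller "student" model on them. Below is a minimal sketch of the data-collection step, assuming the OpenAI Python SDK; the model name, prompts, and output file are illustrative placeholders, not anything DeepSeek is known to have used.

```python
# Minimal sketch of API-based distillation data collection: send prompts to a larger
# "teacher" model and save the responses as supervised training pairs for a smaller
# "student" model. Model name, prompts, and file name are placeholders for illustration.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompts = [
    "Explain the difference between a process and a thread.",
    "Write a Python function that reverses a linked list.",
]

with open("distillation_pairs.jsonl", "w", encoding="utf-8") as f:
    for prompt in prompts:
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder teacher model
            messages=[{"role": "user", "content": prompt}],
        )
        answer = response.choices[0].message.content
        # Each JSONL line becomes one fine-tuning example for the student model.
        f.write(json.dumps({"prompt": prompt, "completion": answer}) + "\n")
```

Done at scale, this is exactly the kind of "programmatic extraction of Output" that OpenAI's terms of use prohibit, which is why the volume of API traffic is what reportedly drew Microsoft's attention.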
-
Oh, the irony! OpenAI is moaning that a Chinese rival extracted vast amounts of data from OpenAI's models without permission, and is complaining that this violates its terms of service. Reminder: OpenAI's whole multibillion-dollar business model is based on ingesting vast amounts of data without permission. To train its models, it ignored terms of service and copyright (and simple courtesy) to scrape social media, Wikipedia, entire news sites, blogs, subtitles from every YouTube video, and more... everything it could find online. https://lnkd.in/efna43zb
-
Microsoft Probing If DeepSeek-Linked Group Improperly Obtained OpenAI Data - Microsoft Corp. and OpenAI are investigating whether data output from OpenAI’s technology was obtained in an unauthorized manner by a group linked to Chinese artificial intelligence startup DeepSeek - https://lnkd.in/gpRunz_i
-
There are several tactics for taking advantage of market buzz about your competitors, as the two quotes below illustrate (Q1, Q2). Q1: "Microsoft’s security researchers observed individuals they believe may be linked to DeepSeek exfiltrating a large amount of data in the autumn using the OpenAI application programming interface, or API, said the people, who asked not to be identified because the matter is confidential. Software developers can pay for a licence to use the API to integrate OpenAI’s proprietary AI models into their own applications." Q2: "Microsoft, an OpenAI technology partner and its largest investor, notified OpenAI of the activity, the people said. Such activity could violate OpenAI’s terms of service or could indicate the group acted to remove OpenAI’s restrictions on how much data they could obtain, the people said." https://lnkd.in/gnEGUKZv
-
https://lnkd.in/gaUA2-k7 This is an interesting variant in the budding war over IP ownership as it relates to AI training. I think Stack Overflow’s approach of enforcing a usage agreement that a fair number of its users apparently disagree with is a losing strategy in the long term… they can revert answers that users have changed, but they can’t force those users to continue contributing. Over time, if enough users become disenchanted, Stack Overflow will become outdated and irrelevant. The whole thing brings to mind what I think is a very strong dynamic in personal data privacy more broadly. What I’ve observed is that most individuals don’t care that much about their data privacy… at least, many are willing to relinquish it for almost nothing, but, importantly, not actually nothing. What users object to is someone else making money off their data without giving them a fair share (whatever that is). It feels like the same dynamic here… the disgruntled answerers are happy to share their knowledge with the world (and I truly admire that) but get upset when OpenAI looks to profit from it. Tbh it’s hard to disagree with them.
-
OpenAI is threatening to ban everyone who tries to research what their new model ("o1") is actually doing and how it works (there are plenty of plausible theories, but the details still need confirmation). While OpenAI uses a lot of language about "safeguards", it's mostly about keeping intact the illusion that "o1" is a big leap, when in fact it is a marginal patch on what they have been doing for a long time now. It's another iteration that provides marginal gains in specific cases without really moving the needle, while being massively more expensive to use. Which is what we have been seeing for a year now: the shrinking gains in those huge models come with increasingly absurd price tags. But they are looking for money right now and need to keep the hype going.