Microsoft is investigating DeepSeek for possible data theft

DeepSeek is the biggest story in AI since ChatGPT hit the market. It showed up out of the blue and beat the best AI models on the market at a fraction of the cost. However, it’s mostly known for the controversies piling up around it. Adding another one to the pile, it looks like Microsoft is investigating DeepSeek.

In just a couple of days, DeepSeek has gone from rising star to rising security and safety concern. According to a recent report, DeepSeek R1 is a rather dangerous model, as it’s easy to trick into producing harmful content. People have been able to get it to explain how to build certain weapons, brew toxic substances, develop malware, and more.

Along with that, its mere presence triggered a massive near-trillion-dollar crash in AI companies’ stock. Nvidia alone lost nearly $600 billion in a single day. Other controversies and details about this chatbot keep coming out. Be sure to check out our piece on Everything You Need To Know About DeepSeek.

Microsoft is investigating DeepSeek

It was only a matter of time before other AI companies took notice of DeepSeek. Microsoft has reason to be a bit upset at the newcomer: it was one of the top 10 companies to lose value in the crash, shedding $72.2 billion in market value.

However, that’s not the reason Microsoft is investigating DeepSeek. According to the report, Microsoft noticed a large extraction of data from OpenAI’s systems. Upon looking deeper, the company found that a group may have stolen a large amount of OpenAI’s data through its API. Microsoft concluded that the people responsible are possibly connected to DeepSeek.

Allegedly, the group distilled data from ChatGPT into DeepSeek’s model. Distillation is the process of taking the outputs of one AI model and using them to train another. So, it looks like this group took outputs from ChatGPT to train DeepSeek R1. This process sounds pretty shady, and it’s begging for problems, as the model is being fed data that may itself be inaccurate.
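To make the idea of distillation concrete, here is a minimal, purely illustrative sketch: a small "student" model is fit to the outputs of a "teacher" function instead of to original labeled data. The names and models here are assumptions for illustration, not anything from OpenAI's or DeepSeek's actual systems.

```python
import numpy as np

rng = np.random.default_rng(0)

def teacher(x):
    # Stand-in for a large "teacher" model; here just a fixed linear function.
    return 3.0 * x + 1.0

# Step 1: query the teacher to collect input/output pairs
# (analogous to harvesting responses through an API).
inputs = rng.uniform(-1.0, 1.0, size=200)
teacher_outputs = teacher(inputs)

# Step 2: train the student to imitate the teacher's outputs.
# Here the student is a line y = w*x + b, fit by least squares.
A = np.stack([inputs, np.ones_like(inputs)], axis=1)
(w, b), *_ = np.linalg.lstsq(A, teacher_outputs, rcond=None)

print(round(w, 2), round(b, 2))  # student recovers the teacher's behavior
```

The student never sees the "real" training data, only the teacher's answers, which is also why any errors in the teacher's outputs get baked into the student.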

At this point, there’s still a ton of information that we don’t know. None of the companies involved (Microsoft, OpenAI, DeepSeek, High-Flyer) has commented on the ordeal yet. However, if it turns out that DeepSeek stole the data, it could be in for a tough time going forward.
