DeepSeek Is Our AI Wake-Up Call
by Ed Burke and Kelly Burke, Dennis K. Burke Inc.

New artificial intelligence systems prioritize efficiency in training, but what about inference?
In January, we all heard how Chinese AI startup DeepSeek had a model launch that triggered a Wall Street selloff.
DeepSeek says its R1 model was built and trained in two months for less than $6 million, running on less powerful chips than previously released AI systems, and that it is more efficient, requiring less computing power.
DeepSeek offered its free app on the Apple App Store and had a very successful debut. Within days, DeepSeek’s AI Assistant became the most downloaded free app in the U.S., overtaking OpenAI’s ChatGPT.
DeepSeek’s latest model demonstrated a level of efficiency that rattled Wall Street, investors, and the industry, upending assumptions across the AI universe.
The Selloff
Following DeepSeek’s announcement, Alphabet (Google’s parent company), Microsoft, Nvidia, and Oracle experienced a collective market loss of nearly $1 trillion. Investors reacted to concerns that DeepSeek’s advancements could threaten the dominance of U.S. firms in the AI sector.
Several major U.S. tech and AI stocks plummeted in premarket trading the next day, and the selloff spread to just about anything AI related, including nuclear and natural gas power producers in unregulated markets.
Unlike OpenAI’s ChatGPT and Meta’s Llama models, which are trained on expensive high-end semiconductors, DeepSeek has developed an alternative that is allegedly 45 times more efficient than its competitors. Its final training run cost only $5.6 million, a fraction of the sums required for the U.S.-made models.
How Is DeepSeek’s R1 Different?
DeepSeek’s R1 is designed as an open-source “reasoning model,” which means it is meant to perform well on logic, pattern-finding, math, and other tasks that typical generative AI models struggle with. Reasoning models do this using something called “chain of thought.”
Notably, DeepSeek improved reinforcement learning, where a model’s outputs are scored and then used to make the model better, and its team became especially good at automating that scoring.
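As a loose illustration of that scoring loop, here is a minimal sketch in Python. The candidate answers, the score_answer function, and the weight updates are all hypothetical stand-ins, not DeepSeek’s actual training code; the point is only that scored outputs feed back into the model’s preferences.

```python
import random

# Toy illustration of reinforcement learning from scored outputs. The
# candidates, scorer, and weight updates are hypothetical stand-ins,
# not DeepSeek's actual method.

CANDIDATES = ["4", "5", "22"]           # possible model outputs for "2 + 2"
weights = {c: 1.0 for c in CANDIDATES}  # the "model": preference weights

def score_answer(answer: str) -> float:
    """Automated scorer: reward the correct answer, penalize the rest."""
    return 1.0 if answer == "4" else -1.0

def sample() -> str:
    """Sample an output in proportion to the current preference weights."""
    return random.choices(CANDIDATES, weights=[weights[c] for c in CANDIDATES])[0]

for _ in range(1000):
    answer = sample()
    reward = score_answer(answer)
    # Reinforce: nudge the sampled output's weight up when rewarded, down otherwise.
    weights[answer] = max(0.01, weights[answer] + 0.1 * reward)

print(weights)  # "4" ends up with by far the largest weight
```

The key to automating this, as DeepSeek reportedly did, is that no human needs to sit in the loop: the scorer itself does the grading at scale.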
The life cycle of any AI model has two phases: training and inference. Training is the often months-long process in which the model learns from data. The model is then ready for inference, which happens each time users asks it something. Both usually take place in data centers, where they require a lot of energy to run chips and cool servers.
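To make the two phases concrete, here is a toy sketch. The word-count “model” below is a deliberately simple stand-in; in a real LLM, the training phase runs for months on GPU-filled data centers, while inference runs every single time a user submits a prompt.

```python
# Toy illustration of the two phases of an AI model's life cycle.
# The word-count "model" is a hypothetical stand-in for an LLM.

def train(corpus: list[str]) -> dict[str, int]:
    """Training phase: done once, learns statistics from a body of data."""
    model: dict[str, int] = {}
    for document in corpus:
        for word in document.split():
            model[word] = model.get(word, 0) + 1
    return model

def infer(model: dict[str, int], prompt: str) -> str:
    """Inference phase: runs each time a user asks something."""
    known = [w for w in prompt.split() if w in model]
    return max(known, key=lambda w: model[w]) if known else "unknown"

model = train(["the pump moved fuel", "the truck moved fuel"])  # training: once
print(infer(model, "what moved the fuel"))                      # inference: per query
```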
Chain-of-Thought
An unintended outcome of U.S. export controls on high-end AI chips to China is that they forced startups there to “prioritize efficiency” using less powerful chips.
Using “chain-of-thought” allows the AI model to break a task into parts and work through them in a logical order before coming to its conclusion.
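As a simple worked example of what “breaking a task into parts” looks like, consider a tank-fill question a dispatcher might ask. The numbers and steps below are illustrative only, not a trace of R1’s internals.

```python
# Chain-of-thought style: work through explicit steps instead of jumping
# straight to an answer. A hypothetical worked example, not R1's internals.

capacity_gal = 275        # a common heating-oil tank size
fill_fraction = 0.40      # how full the tank is now

# Step 1: gallons currently in the tank
current_gal = capacity_gal * fill_fraction    # 110.0
# Step 2: gallons needed to top it off
needed_gal = capacity_gal - current_gal       # 165.0

print(f"Step 1: {current_gal:.0f} gallons on hand")
print(f"Step 2: {needed_gal:.0f} gallons to fill")
```

Each intermediate step gets generated and checked on the way to the conclusion, which is exactly why these models produce longer, more compute-hungry answers.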
Chain-of-thought models tend to perform better on benchmarks like MMLU, which tests knowledge and problem-solving across 57 subjects. But, as is becoming clear with DeepSeek, they also require significantly more energy to generate their answers.
While it matches rival models from OpenAI and Meta on certain benchmarks, DeepSeek’s model also appears to be more efficient, meaning it requires less computing power to train and run.
Serious Problems Though
There are reports that Microsoft and OpenAI are probing whether a group linked to DeepSeek accessed OpenAI’s data without authorization: Microsoft’s security team reportedly observed the group extracting a large volume of data through OpenAI’s application programming interface (API).
While DeepSeek has proven technically impressive, it has also raised serious red flags. Cybersecurity firms have warned about DeepSeek’s AI tools due to serious vulnerabilities, and DeepSeek poses potential data risks similar to those of Chinese-owned TikTok.
DeepSeek’s success also calls into question the significant electricity demand projections for the U.S. If training can be made this much more efficient, AI companies may use less energy, and DeepSeek’s improvements suggest that further AI performance gains may require less energy-intensive computing than assumed.
On the other hand, the performance of one of DeepSeek’s smaller models suggests it could be more energy intensive when generating responses than the equivalent-size model from Meta. The issue might be that the energy it saves in training is offset by its more intensive techniques for answering questions, and by the much longer answers it produces.
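A back-of-the-envelope sketch shows how that offset can play out. Every number below is invented purely for illustration; none comes from DeepSeek, Meta, or any published measurement.

```python
# Hypothetical arithmetic only: cheaper training can be swamped by
# costlier, longer chain-of-thought answers at inference time.

def lifetime_energy(training_mwh: float, queries: float,
                    kwh_per_answer: float) -> float:
    """Total energy over a model's life: one training run plus all inference."""
    return training_mwh + queries * kwh_per_answer / 1000.0  # result in MWh

conventional = lifetime_energy(training_mwh=10_000, queries=1e9,
                               kwh_per_answer=0.003)
# Cheaper training, but each chain-of-thought answer is longer and costlier.
reasoning = lifetime_energy(training_mwh=2_000, queries=1e9,
                            kwh_per_answer=0.012)

print(f"conventional: {conventional:,.0f} MWh")  # 13,000 MWh
print(f"reasoning:    {reasoning:,.0f} MWh")     # 14,000 MWh
```

In this made-up scenario, the 8,000 MWh saved in training is more than erased by the extra 9,000 MWh spent answering a billion queries.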
If the enthusiasm around DeepSeek continues, companies might feel pressured to put chain-of-thought-style models into everything. But it is clear, based on the architecture of these models alone, that chain-of-thought models use more energy as they arrive at sounder answers.
The Road Ahead
We do seem to be heading toward more chain-of-thought reasoning. In fact, OpenAI recently announced that it would expand access to its own reasoning model.
A more cost-efficient model could actually accelerate adoption across industries, further driving productivity gains and market expansion.
The biggest beneficiaries may not be the AI application companies themselves, but rather the firms building the infrastructure: chip manufacturers, data centers, cloud computing providers, cybersecurity firms, and defense contractors integrating AI into next-generation applications.
AI is already creating significant economic gains. Revenue from AI chatbots and AI art generators soared to nearly $1.3 billion in 2024.
Ed and Kelly Burke are respectively Chairman of the Board and Senior Marketing Manager at fuel distributor Dennis K. Burke Inc. They can be reached at 617-884-7800 or ed.burke@burkeoil.com and kelly.burke@burkeoil.com.