In the fast-evolving landscape of artificial intelligence (AI), major breakthroughs tend to capture global attention, as the explosive emergence of ChatGPT did in November 2022. That event marked a transformative moment that reverberated through the industry and gave rise to the term "ChatGPT Moment." Fast-forward to the end of 2024, and a new phrase has entered the lexicon: "DeepSeek Moment," heralding another pivotal shift in AI history.
As the Lunar New Year of 2025 approached, DeepSeek, a start-up based in Hangzhou, China, unveiled two open-source models, V3 and R1, on December 26, 2024, and January 20, 2025, respectively. These models have garnered immense attention for their performance, notably V3's reported ability to rival proprietary models like OpenAI's GPT-4o and Anthropic's Claude-3.5-Sonnet while surpassing Meta's Llama 3, all achieved with a reported training cost of just $5.576 million. The R1 reasoning model performs at a level approaching that of OpenAI's o1, with API pricing at a mere 3.7% of o1's.
DeepSeek, founded on July 17, 2023, already held over 10,000 Nvidia GPUs before launching its models. The company has been able to develop competitive large models at approximately 7% of the cost incurred by foreign AI giants, igniting fierce competition within the Chinese market and a price war that spread overseas by year's end. Following the company's rapid rise, stocks related to cloud computing and AI fluctuated sharply; Nvidia's share price plummeted nearly 17%, erasing around $600 billion in market capitalization, the largest single-day loss in U.S. stock market history.
DeepSeek's entry onto the scene sent industry leaders scrambling. OpenAI claimed to have found evidence suggesting that DeepSeek had "distilled" its models, while Anthropic's CEO, Dario Amodei, publicly disputed the claims surrounding R1's achievements and called for stricter controls on exports of AI computing hardware to China.
In light of these concerns and the intensifying competition, the primary question emerges: Has DeepSeek been overhyped, and what ripples will it create in the domestic and international AI industries?
DeepSeek's ascent, while celebrated, has been accompanied by immense scrutiny. Lin Zhi, a practitioner in the AI sector, highlighted the company's strengths: it is free to use, and it reveals its thought process during interactions, which encourages users to ask better questions, something proprietary models like o1 do not offer. This openness, coupled with the complete release of its technical papers and models, contributed to DeepSeek's virality.
However, the initial wave of success soon met with setbacks; users experienced server outages and malfunctions due to a significant DDoS attack. While these issues were eventually resolved, the periods of downtime raised questions about scalability and reliability. Despite this, the company's very existence has highlighted a distinct trajectory in the AI industry, drawing attention to the stark contrast between proprietary and open-source models.
Amid this turbulence, doubts about whether DeepSeek is genuinely innovative have persisted. The technical papers accompanying V3 and R1 suggest a commitment to innovation: V3 was trained with in-house techniques such as DeepSeekMoE and MTP (multi-token prediction), while R1 turned away from traditional reinforcement learning from human feedback in favor of direct reinforcement learning. As a result, DeepSeek appears to have demonstrated that high-performance models can be built at a relatively low training cost.
Nonetheless, industry analysts have pointed out that DeepSeek's reported cost primarily covers GPU expenses during model pre-training. Once operational and capital expenditure are factored in, estimates suggest total spending might reach $2.573 billion over four years.
The downward trend in the cost of innovation cannot be overlooked; many industry insiders assert that AI development is becoming significantly cheaper, with AI training costs dropping approximately 75% per year. Moreover, some investors predict that training costs for standard models will fall sharply, potentially to a tenth of their current level.
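To put the quoted decline in perspective, the arithmetic can be sketched in a few lines of Python. This is an illustration only: it assumes the roughly 75% annual drop stays constant and compounds year over year, which the figures above do not guarantee, and the function names and the starting cost of 100 are hypothetical.

```python
import math


def cost_after(initial_cost: float, annual_drop: float, years: float) -> float:
    """Cost remaining after `years` of a constant fractional annual drop."""
    return initial_cost * (1.0 - annual_drop) ** years


def years_to_reach(fraction: float, annual_drop: float) -> float:
    """Years until the cost falls to `fraction` of its starting value."""
    return math.log(fraction) / math.log(1.0 - annual_drop)


if __name__ == "__main__":
    # Hypothetical starting cost of 100 (arbitrary units), 75% drop per year.
    for year in range(4):
        print(year, cost_after(100.0, 0.75, year))
    # Time needed to reach one tenth of the starting cost.
    print(round(years_to_reach(0.10, 0.75), 2))
```

Under that assumption, the cost falls to a quarter of its starting level after one year and to a tenth in well under two years, so the two predictions quoted above are broadly consistent with each other.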
The core of DeepSeek's allure, however, extends beyond its pricing strategy: it embodies a narrative akin to a "dragon-slaying" fairy tale. Before ChatGPT emerged, and while China faced constraints on computing power, DeepSeek founder Liang Wenfeng had already amassed an extensive GPU reserve, rooted in the exploration of quantitative trading he began in 2008. His company, High-Flyer, built up substantial AI capabilities by constructing its own AI clusters, securing a wealth of chips and talent ahead of time.
This story is both unique and compelling, generating widespread enthusiasm and curiosity about DeepSeek and reflecting a collective aspiration for technological advancement. It also raises the next question: who feels threatened by DeepSeek's meteoric rise?
Following DeepSeek's launch, Chinese and American AI companies across the spectrum felt the shockwaves. Chatbot applications in particular faced significant repercussions. DeepSeek's daily active users reportedly soared past 20 million by the eve of the Lunar New Year, eclipsing rivals such as Doubao and Kimi in China. In less than a week, DeepSeek reportedly garnered over 100 million users, a feat that took ChatGPT two months to achieve.
Simultaneously, competitors like Moonshot AI released new models and features, aiming to counteract DeepSeek's swift acquisition of market share. Analysts pointed out that this underscores a concerning trend: users show little loyalty to chatbot models, readily switching when newer, faster alternatives emerge.
However, within the competitive landscape, nuanced differences remain; while some firms have integrated multimodal capabilities, DeepSeek primarily focuses on textual interactions for now.
Looking deeper, the competitive repercussions extend to companies developing their own large models as they try to navigate this disruptive environment. Investors emphasize the growing need for these companies to reassess their strategies, especially concerning training costs and efficiency. Industry insiders, meanwhile, see the chip market turbulence as a period of redefinition: although DeepSeek may appear to be the disruptor, Nvidia will not necessarily suffer lasting harm and may instead undergo a market recalibration.
DeepSeek's emergence signals a broader shift that urges AI efforts to return to first principles and focus on practical applications. This focus facilitates innovation along the AI supply chain and fosters growth in AI-native applications and hardware. Observers have begun to speculate that 2025 could herald a new epoch for AI commercialization.
As the narrative unfolds, the question of how DeepSeek will navigate the options ahead becomes significant. Reports suggest that Alibaba may be planning to invest $1 billion for a stake in DeepSeek at a valuation exceeding $10 billion, which would underscore the growing pressure from larger competitors in the field. However, there is concern that if DeepSeek takes on outside investment, it may follow a trajectory similar to that of its contemporaries, ultimately losing the freedom that has fostered its wholehearted exploration of AGI.
In this new paradigm, the contrast between different strategies of AI development becomes increasingly evident. Businesses are now confronted with pressing concerns about not only falling training costs but also the efficacy of their algorithms. An important realization is emerging: the traditional approach of merely piling on resources to boost performance is no longer tenable.
Despite this shift, there is an underlying sentiment that DeepSeek's story is just beginning, with considerable room for growth and transformation within the AI landscape. Ultimately, DeepSeek's rise stands as a testament to the notion that innovation can take different forms, promoting an approach to AI development that values efficiency alongside performance. As its disruptions continue to ripple through the industry, one thing remains certain: the world will be watching closely to see how DeepSeek navigates the unfolding challenges and opportunities.