DeepSeek: A New Player in the Global AI Race

By: The Center for Internet Security® (CIS®) Countering Hybrid Threats (CHT) team

Published: March 24, 2025

Technologies developed by companies based in the People’s Republic of China (PRC) continue to pose U.S. national security risks due to the Chinese Communist Party’s (CCP) ability to leverage these technologies for data collection and censorship efforts. The rise of DeepSeek, a Chinese-made generative artificial intelligence (GenAI) model, points towards the ongoing discussion regarding the CCP’s data collection practices, which the Center for Internet Security (CIS) previously outlined in a blog titled The Chinese Communist Party (CCP): A Quest for Data Control.

Since the CCP is authorized to access data from China-based commercial entities under its intelligence laws, it is highly likely that the integration of DeepSeek or other Chinese-made GenAI models introduces security risks for users. DeepSeek collects a large amount of data from users that it stores on servers in China, including prompts and responses, personal information used to set up an account, and device data.[1], [2] Researchers have also discovered what appears to be an intentionally hidden backdoor[1] in DeepSeek’s code that contains direct links to Chinese government-controlled servers and companies. The discovery highlights ongoing concerns among national security officials regarding China-based companies including backdoors in their technologies to make data accessible to the Chinese government.[3]

DeepSeek’s ability to collect large amounts of data from users likely makes it an attractive target for Chinese state-sponsored espionage efforts. According to the Department of Homeland Security’s (DHS) 2025 Homeland Threat Assessment, the CCP and other foreign adversaries have aggressively sought “to target and steal sensitive US information, research, and technology.”[4] Since users may input sensitive information into the DeepSeek platform, such as personally identifiable information (PII) or intellectual property (IP), the platform is an attractive data collection opportunity for the CCP. Additionally, researchers raised concerns regarding DeepSeek’s data protection practices following the discovery of several security vulnerabilities in the platform. Since GenAI platforms store large amounts of data, there is a risk of cyber threat actors (CTAs) targeting sensitive information input into the platform or the data being exposed in data breaches. Several researchers have also revealed that the platform appears trained to provide answers that promote CCP interests, suggesting the CCP could use the platform as a tool in broader malign influence operations (IOs), both domestically and abroad.[5], [6]

Development Controversy

DeepSeek established itself as a rival to OpenAI’s ChatGPT with the January release of R1, an open-source reasoning-capable large language model (LLM) that powers the DeepSeek GenAI chatbot. Its developers claimed to have built the model for less than $6 million, significantly cheaper than other LLMs.[7] The application quickly rose to the top of the iPhone App Store’s charts. The alleged low price tag for developing the model led to concerns that other GenAI companies were overvalued, resulting in a drop in technology-sector stocks on January 27.[8]

Soon after its launch, OpenAI accused DeepSeek of improperly distilling OpenAI’s data.[9] Distilling is a method where developers train GenAI models based on the performance of larger models, such as training a model on the responses that a larger model gives to certain questions. Researchers at Wallarm, who tricked the platform into revealing its underlying instructions, found it “seemed to indicate that it may have received transferred knowledge from OpenAI models.”[10]
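The response-based form of distillation described above can be illustrated with a minimal Python sketch. It is a generic illustration only and does not represent the method of DeepSeek, OpenAI, or any other company. The names `teacher_model` and `build_distillation_set`, as well as the toy prompts, are hypothetical; a real pipeline would query a large model and fine-tune a smaller model on the collected pairs.

```python
from typing import Callable, Dict, List


def build_distillation_set(
    prompts: List[str], teacher: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Collect prompt/response pairs from a teacher model.

    The resulting pairs are the kind of training data a smaller
    "student" model would be fine-tuned on.
    """
    return [{"prompt": p, "response": teacher(p)} for p in prompts]


def teacher_model(prompt: str) -> str:
    """Hypothetical stand-in for a large model; returns canned answers."""
    canned = {
        "What is 2 + 2?": "4",
        "Name a prime number.": "7",
    }
    return canned.get(prompt, "I don't know.")


if __name__ == "__main__":
    pairs = build_distillation_set(
        ["What is 2 + 2?", "Name a prime number."], teacher_model
    )
    for pair in pairs:
        print(pair)
```

The student never sees the teacher’s internal weights in this setup, only its answers, which is why the practice can be carried out through an ordinary public interface.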

The alleged low cost of developing the model drove industry shock, although it is possible that the true cost of development is significantly higher. A report from SemiAnalysis, a research and analysis company specializing in the semiconductor and AI industries, stated that DeepSeek’s “hardware spend is well higher than $500M over the company history.”[11] Their analysis puts the total investment at “~$1.6B, with a considerable cost of $944M associated with operating such clusters.”[12]

Data Collection

Cybersecurity news outlet CyberScoop reported in January 2025 that, according to DeepSeek’s privacy policy, “unless a user decides to download and run the software locally, their data will go to servers stored in China.”[13] This includes user prompts and the model’s responses.[14] DeepSeek additionally collects data used during account creation, such as a user’s email address, date of birth, and phone number, as well as device data, including keystroke patterns or rhythms, IP addresses, device models, and operating systems.[15], [16] DeepSeek collects data from other sources as well, including advertisers that may share information including “mobile identifiers for advertising, hashed email addresses and phone numbers, and cookie identifiers.”[17] If a user signs up with an account using Google or Apple, DeepSeek may also collect information from those companies.[18]

This data collection is especially notable since, according to an ABC News interview with Feroot Security CEO Ivan Tsarynny, code from DeepSeek’s web version contains direct links to Chinese government-controlled servers and companies. Tsarynny decrypted portions of DeepSeek’s code using AI software and discovered a hidden backdoor capable of sending user data to the online registry for China Mobile, a Chinese government-controlled telecommunications company. The Federal Communications Commission (FCC) banned China Mobile from operating in the U.S. in 2019, citing national security concerns.[19] In 2022, the FCC also added the company to a list of communications equipment and services that “pose an unacceptable risk to the national security of the United States.”[20] Tsarynny found that each user is assigned a digital “fingerprint” by DeepSeek’s web tool, and that the “fingerprint” is capable of tracking user web activity outside of DeepSeek’s website.[21] The Associated Press (AP) reported that a second set of cybersecurity experts confirmed a link between China Mobile and DeepSeek’s login system. One of the researchers that AP cited indicated “he didn’t see data being transferred in his testing but concluded that it is likely being activated for some users or in some login methods.”[22]

National security officials have previously warned about China-based companies including backdoors in their technologies to render data accessible to the Chinese government. The discovery of a potential hidden backdoor is especially notable given that, according to Bloomberg, Pentagon employees used DeepSeek for several days before the website was blocked on the Pentagon’s IT networks. An anonymous official told Bloomberg that “the Pentagon’s IT experts are still determining the extent to which employees directly used DeepSeek’s system through a web browser.”[23] The U.S. Navy also reportedly issued a warning to service members against using the app, highlighting existing guidance against leveraging open-source GenAI for official work. At this time, it remains unclear if sensitive data was collected in association with Pentagon and U.S. Navy employees.[24]

The risks also extend beyond government and military entities, since users often input personal data and work-related information into GenAI platforms. In a survey conducted by Cisco in June 2024, “37% of respondents said they have submitted health information [to GenAI chatbots], 36% have entered work information, and 24% to 29% of users have provided financial information, account numbers, and religion/ethnicity.”[25] Another global survey by the National Cybersecurity Alliance also found that 38% of employees have input sensitive work-related information in GenAI chatbots without permission from their employer.[26] This includes sensitive information the CCP targets as part of its economic espionage efforts. According to the 2025 Homeland Threat Assessment, the CCP’s economic espionage and influence efforts “have primarily focused on stealing sensitive US technology and intellectual property from US manufacturers and research institutions.”[27] Additionally, the CCP “pursues industrial espionage and investments in US firms to gain access to sensitive information on, and control over, entities in US critical infrastructure, including in the telecommunications, transportation, energy, and food and agriculture sectors.”[28] Since GenAI platforms are utilized across a broad range of sectors, DeepSeek presents a significant data collection opportunity to support these state-sponsored operations.

Data Control

Multiple researchers who have tested the DeepSeek platform have observed evidence of censorship on topics that are politically sensitive to the CCP, matching a similar trend seen on other Chinese-owned platforms. In 2023, the CCP issued a set of rules aimed at controlling the output of Chinese GenAI chatbots, requiring them to reflect “socialist core values” and forbidding the generation of content that “damages the unity of the country and social harmony.”[29], [30] The Office of the Director of National Intelligence’s (ODNI) 2024 Annual Threat Assessment of the U.S. Intelligence Community highlighted how the CCP’s malign information operations (IO) “[aim] to sow doubts about U.S. leadership, undermine democracy, and extend Beijing’s influence.”[31] There is a risk of the CCP leveraging a GenAI application trained to provide answers that align with the CCP’s worldview as a tool to advance this goal.

Reporting on DeepSeek already suggests it is adhering to CCP policies on the availability of information. Prompts requesting information regarding the 1989 Tiananmen Square protests, the status of Taiwan, and the Chinese government’s treatment of the Uyghurs all receive answers from the platform indicating it cannot, or will not, discuss them or provide information. A report from WIRED demonstrated how the censorship of the application works in real time. Users can observe the application begin to answer a question and then replace the answer with the censored output. When WIRED asked the platform about the treatment of journalists, DeepSeek began to write out an answer, “yet shortly before it finished, the whole answer disappeared and was replaced by a terse message: ‘Sorry, I’m not sure how to approach this type of question yet. Let’s chat about math, coding, and logic problems instead!’”[32] A report from NewsGuard found that DeepSeek “advanced the positions of the Beijing government 60 percent of the time in response to prompts about Chinese, Russian, and Iranian false claims.”[33]

Vulnerabilities

CTAs opportunistically targeted DeepSeek shortly following its launch, likely taking advantage of heightened media attention and increased user signups. As a result, on January 27 the company indicated it temporarily halted new user registrations after it experienced a “large-scale malicious attack” on its services. DeepSeek later updated its website and indicated new users could continue to register, but they may experience difficulties doing so.[34]

Additionally, within days of launch, researchers found several security vulnerabilities in DeepSeek that could expose sensitive data or enable jailbreaking, a technique intended to circumvent security measures. On January 29, researchers from Wiz, a cloud security company, revealed they discovered a publicly accessible DeepSeek database that exposed over one million records containing sensitive data. According to Wiz, this included “a significant volume of chat history, back-end data, and sensitive information, including log streams, API secrets, and operational details.”[35] Wiz also indicated that the public exposure of the “completely open and unauthenticated” database “allowed for full database control and potential privilege escalation within the DeepSeek environment.”[36] The exposed database became inaccessible to unauthorized users within a half hour of the Wiz research team disclosing the vulnerability to the company.[37], [38] The discovery exemplifies the risks of uploading personal or work-related information to GenAI systems, which can be exposed in data breaches or targeted by CTAs. Additionally, the public exposure of DeepSeek’s database just days after the company was targeted in a cyberattack highlights concerns regarding DeepSeek’s data protection practices.

Several researchers have indicated that DeepSeek appears to lack protections preventing the tool from generating harmful content. This includes researchers from Cisco and the University of Pennsylvania, who tested the platform using an automatic jailbreaking algorithm.[39], [40] The jailbreaking algorithm was tested against 50 prompts from the HarmBench dataset, an evaluation framework that includes 400 behaviors across seven harm categories.[41] The 50 sampled prompts spanned six categories of harmful behaviors, including cybercrime, the spread of inaccurate information, illegal activities, general harm, harassment, and chemical and biological weapons. The researchers reported they successfully “managed to jailbreak DeepSeek R1 with a 100% attack success rate,” resulting in DeepSeek R1 generating harmful content across the tested categories.[42] The researchers assessed that DeepSeek may have compromised safety mechanisms in efforts to create a cost-efficient training model.[43], [44] Such susceptibility to jailbreaks heightens the risk of CTAs leveraging GenAI platforms to generate harmful content or enhance malign operations.
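For readers unfamiliar with the metric, an attack success rate is simply the share of test prompts for which the model produced the harmful output. The short Python sketch below shows the arithmetic, assuming each of the 50 sampled prompts is scored as a success or a failure. The function name and the list of outcomes are hypothetical and are not taken from the researchers’ code or data.

```python
from typing import Sequence


def attack_success_rate(outcomes: Sequence[bool]) -> float:
    """Return the fraction of prompts where the jailbreak succeeded."""
    if not outcomes:
        raise ValueError("at least one outcome is required")
    return sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    # Illustrative only: 50 prompts, every one scored as a successful jailbreak.
    outcomes = [True] * 50
    print(f"{attack_success_rate(outcomes):.0%}")  # prints 100%
```

Under this definition, a 100% rate on 50 prompts means none of the 50 attempts were blocked.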

DeepSeek Bans and Investigations

Several U.S. government agencies have reportedly banned DeepSeek due to associated security risks. According to Axios, the House of Representatives’ Chief Administrative Officer declared DeepSeek is currently unauthorized for use on all House-issued devices while the platform is under review. The warning additionally stated that “threat actors are already exploiting DeepSeek to deliver malicious software and infect devices.”[45] CNBC reported that the U.S. Navy and NASA have also banned the use of DeepSeek, citing privacy and national security concerns.[46], [47] Additionally, multiple state governors have issued orders prohibiting the use of DeepSeek on state-issued government devices, and at the federal level, U.S. Representatives Josh Gottheimer and Darin LaHood introduced a bipartisan bill to prohibit the use of DeepSeek on federal government-issued devices.[48], [49], [50]

This scrutiny extends internationally, with several countries having already banned DeepSeek or indicated they plan to investigate its data collection and protection practices. Garante, the Italian Data Protection Authority, launched an investigation into DeepSeek’s handling of personal data and its compliance with the General Data Protection Regulation (GDPR), a European Union law governing data privacy and security.[51] On January 30, Garante announced it had banned DeepSeek from operating in the country, deeming the company’s response to the inquiry as “completely insufficient.” The entities behind DeepSeek also “declared that they do not operate in Italy and that European legislation does not apply to them.”[52] Taiwan, Australia, and South Korea have additionally banned government agencies from using DeepSeek, citing national security concerns.[53], [54], [55] Regulators in Ireland, France, and the Netherlands have also indicated they plan to probe DeepSeek’s data collection and protection practices, and it is highly likely other countries will also issue bans.[56], [57]

China’s AI Strategy Tied to its Enduring Quest for Data Control

The release of DeepSeek also aligns with the timeline the CCP established in its 2017 “New Generation Artificial Intelligence Development Plan” (AIDP), which called for China to become the global leader in AI technologies by 2030. The plan forecasted that in 2025 “China will achieve major breakthroughs in basic theories for AI, such that some technologies and applications achieve a world-leading level.”[58] The objective of the plan states that 2030 will be the year when “China’s AI theories, technologies, and applications could achieve world-leading levels, making China the world’s primary AI innovation center.”[59] The 2024 Annual Threat Assessment of the U.S. Intelligence Community also notes that “China seeks to become a world S&T [science & technology] superpower and to use this technological superiority for economic, political, and military gain.”[60] As part of this effort, the CCP is prioritizing AI development for applications including “smart cities, mass surveillance, healthcare, drug discovery, and intelligent weapons platforms.”[61]

CIS has previously outlined how the control of data and systems is critical to the CCP’s quest for digital and technological dominance.[62] CIS’ August 2024 blog post assessed that the CCP “likely views the expansion of Chinese-owned apps in the United States as an opportunity to develop new malign influence effort launch points and to harvest data across a range of industry verticals.”[63] The strategy of dominating the AI space helps further this goal, ensuring that data provided to these systems is routed back to China. Laws including the National Intelligence Law of 2017, the 2021 Data Security Law, and the 2023 Revision to the Counter-Espionage Law effectively “authorize the CCP to harvest data from China-based commercial entities.”[64]

DeepSeek provides an opportunity for the CCP to collect large amounts of data on a global scale, highlighting ongoing discussions about Americans and critical infrastructure operators using products developed by Chinese-owned companies. In September 2024, a joint congressional investigation found that the use of ship-to-shore port cranes developed by Shanghai Zhenhua Heavy Industries (ZPMC), a Chinese state-owned company, at U.S. ports introduces security vulnerabilities to the maritime sector. A press release stated that “ZPMC could, if desired, serve as a Trojan horse capable of helping the CCP and the PRC military exploit and manipulate U.S. maritime equipment and technology at their request.”[65] ZPMC denied the claims in a statement to CBS News.[66] On January 7, the Department of Defense also updated its list of “Chinese military companies,” which is part of a “continuing effort in highlighting and countering the [PRC] Military-Civil Fusion strategy.”[67] The updated list included Tencent, a gaming and technology company, and CATL, the world’s largest battery manufacturer, both of which denied involvement with the Chinese military in a statement to CNN.[68], [69]

Recent concern surrounding the CCP’s data collection and information operations has heavily focused on the social media application TikTok, which was one of the primary motivators behind Congress passing the “Protecting Americans from Foreign Adversary Controlled Applications Act.” This law would ban social networking platforms determined to be a “foreign adversary controlled application.”[70] DeepSeek, while not a social networking platform like TikTok, poses a similar risk since data provided to the application is subject to CCP laws and control. When users interact with DeepSeek, including by providing the platform with potentially sensitive information, that information may become subject to Chinese data laws and ultimately be used to further the CCP’s quest to control data and information.

Recommendations

  • Before installing any application, review its privacy policies to understand the types of data each app collects and conduct a risk assessment.
  • Implement an Acceptable Use Policy outlining which applications are authorized for business devices.
  • Consider the risks associated with uploading data to third party applications, including GenAI applications.
  • Perform a risk assessment prior to integrating Chinese-owned applications into organizational processes.
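
As one possible way to put an Acceptable Use Policy into practice, the Python sketch below checks a requested URL against a list of authorized application domains. It is a simplified illustration under stated assumptions, not a CIS tool or a complete control. The domains in `AUTHORIZED_DOMAINS` and the function name `is_authorized` are hypothetical, and a production deployment would typically enforce the policy at a proxy, DNS filter, or mobile device management layer.

```python
from typing import Iterable
from urllib.parse import urlparse

# Hypothetical allowlist of domains approved for business devices.
AUTHORIZED_DOMAINS = {"example-approved-ai.com", "intranet.example.org"}


def is_authorized(url: str, allowlist: Iterable[str] = AUTHORIZED_DOMAINS) -> bool:
    """Return True if the URL's host is an approved domain or a subdomain of one.

    URLs without a scheme (for example "example.org/page") yield no hostname
    from urlparse and are treated as unauthorized.
    """
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowlist)


if __name__ == "__main__":
    for candidate in (
        "https://chat.example-approved-ai.com/login",
        "https://unapproved-genai.example.net/",
    ):
        status = "allowed" if is_authorized(candidate) else "blocked"
        print(f"{candidate}: {status}")
```

Matching on the exact domain or a dot-prefixed suffix avoids approving lookalike hosts that merely end with an authorized name.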

[1] A backdoor is an undocumented means of gaining access to a computer system that bypasses normal authentication and security measures.

References