Commentary from the Center for Innovation, Trade, and Strategy

China’s Data Ambitions
Strategy, Emerging Technologies, and Implications for Democracies

by Lindsay Gorman

August 14, 2021

Lindsay Gorman details China’s data ambitions with a particular eye to how they relate to emerging technology goals associated with AI. She then discusses how these efforts complicate democratic values in cyberspace and analyzes options for how democracies can address these threats.

China seeks to become the global leader in technologies emerging from advancements in artificial intelligence (AI) and data analytics. This strategy has the dual objectives of accelerating the transformation of China’s own economy and building the nation into a cyber power.[1] To achieve these goals, the country has combined national policy planning and aggressive data-retention policies with an outgoing effort to export data-based technologies.

This essay details China’s data ambitions with a particular eye to how they relate to emerging technology goals associated with AI. It then discusses how these efforts complicate democratic values in cyberspace and analyzes options for how democracies can address these threats.

CHINA’S “BIG DATA” STRATEGY

China’s big data strategy was officially launched in 2014 when it was included in a government work report for the first time.[2] Between 2014 and 2017, this “national big data strategy” grew to support industry development. In 2015, the State Council issued its first top-down strategic planning document on big data, the “Action Plan on Promoting Big Data Development,” which called for the creation of databases on China’s population, corporations, natural resources, and geography and integrated data systems for transportation and tourism, medical information, and education management. At the subnational level, the action plan called on city-level governments and above to implement government affairs and public services applications, track economic data, examine agricultural trends, and utilize smart cities to collect citizen data for services and control.[3]

A year later, the National People’s Congress adopted the 13th Five-Year Plan (2016–20) with a goal to “implement the national big data strategy.” Big data received a boost from Xi Jinping himself at the 19th National Congress in October 2017 when he highlighted the need to “promote the deepened integration of Internet, big data, and artificial intelligence with the real economy.” By the end of 2018, 31 provincial units had released big data action or implementation plans.[4]

This continued prioritization of data was reflected in strategic planning documents as recently as last year. In 2020 the Chinese Communist Party (CCP) Central Committee and State Council added “data” to land, labor, capital, and technology as a new factor of production in its “field-based allocation system and mechanism.” This elevation of data is indicative of a vision for a future economy where data drives development. Indeed, according to the Chinese Academy of Information and Communications Technology, a key distinction between data and the traditional production factors is in the multiplier effect—that data can amplify other factors of production such as labor and capital and produce even more significant economic gains.

As this national vision has gained coherence within China, policymakers remain keenly aware of the global operating environment. In particular, Chinese strategic thinking mirrors the perceived priority that the West has placed on data governance. These planning documents reveal a detailed understanding of Europe’s landmark data protection framework, the General Data Protection Regulation (GDPR), as well as U.S. federal data strategy and even state-level legislation, such as in California and New York. As the United States and China increasingly see technological competition as a defining feature of their relationship, the evolution of data in Chinese strategic planning provides a window into how China aims to win that contest.

CHINA’S DATA AND AI INDUSTRIES

In some regards, China’s data ambitions are inseparable from its aims in AI applications. State-of-the-art machine-learning models that fuel these advances benefit from large quantities of data as training input.

Statista estimates that China’s AI market size in 2019 was over 55 billion yuan.[5] Financial services, healthcare, and government affairs form the three largest industry verticals for Chinese big data companies, but others have emerged in education, transportation, e-commerce, internet, supply chain and logistics, agriculture, manufacturing, sports, environmental meteorology, and energy. Of particular interest to democracies are big data communications technologies. Techniques leveraging AI and personal data, such as precision marketing and “smart promotion,” could strengthen the ability of states to influence citizens through greater control and manipulation of the digital environment. AI-enabled facial recognition technologies integrate with smart cities and could turn safe cities into surveilled cities. Finally, in the military domain, the development and proliferation of weapons systems with increasing autonomy or “intelligentization” could threaten U.S. and allied security interests.[6]

CHINA’S DATA POLICIES: SQUARING INTERNAL CONTROL WITH ECONOMIC INNOVATION

Since the introduction of the European GDPR, recent years have seen increased attention to data governance globally. For China, policies to build out a data governance regime “with Chinese characteristics” support two central goals: industrial competitiveness and information control. Whether these goals prove mutually reinforcing or end up at odds may determine the success of China’s data policies.

China’s emerging data governance regime has been characterized by a concept of “local storage, outbound assessment”: stringent localization requirements mean that swaths of data must be stored within China, and elaborate restrictions on cross-border data flows present barriers to the outbound transfer of data.[7] “Local storage” is codified in China’s seminal 2017 Cybersecurity Law requiring that personal data—as well as a vaguely defined class of “important data”—generated in China stay in China. The policy stands in contrast to the open model that democracies have sought to champion.

In the last year, two new draft data regulations have sought to advance the CCP’s national security interests around data while bolstering data-driven economic innovation: the draft Personal Information Protection Law and the draft Data Security Law. The former nods to Europe’s GDPR by introducing a notion of consumer privacy. In practice, it strengthens data localization by requiring security and risk assessments by the state for cross-border personal data transfers. The Personal Information Protection Law also extends this control by including the foundation for a blacklist that would ban certain overseas data controllers and processors.[8]

China’s draft Data Security Law seeks to harness the corporate sector in support of state goals. It introduces a national data classification system categorizing data “according to the degree of importance to economic and social development; and according to impact on national security, the public interest, or the lawful rights and interests of citizens or organizations.”[9] It also introduces new compliance requirements for companies to conduct security risk assessments on data pursuant to the classification scheme and signals an intent to create and regulate domestic markets for “data transactions.”

The Data Security Law, which builds a top-down national classification of data, situates data squarely in the geopolitical contest with democracies, including the United States. The law includes export controls on data and a basis for “reciprocity” against alleged “discriminatory measures” against China. If countries choose to exclude Chinese products, as recent U.S. efforts have done, their companies might find themselves on the Chinese blacklist. The Data Security Law signals the possibility for retaliation even if democracies act through investment-screening mechanisms to prevent bulk collection of personal data by Chinese companies.

Amid this increasingly complex network of data requirements, the success of such policies depends on whether the CCP’s national security goals can be reconciled with and leveraged toward China’s innovation aims.[10] Leadership in the data-driven industries of tomorrow will confer increased economic and surveillance power.

CHINA’S OUTGOING DATA EXPANSION EFFORTS

China marries the internal data control efforts discussed above with an external push to expand the footprint of its model and acquire data from the countries in which its companies operate. Chinese technology giants are at the center of a large-scale effort to vacuum up global data, leveraging opportunities through the Digital Silk Road. Using investments in internet infrastructure and information and communications technology, Chinese firms have exported surveillance technologies to over 63 countries worldwide, and with them a coalescing model of authoritarian governance in the digital arena.[11] This expansion of the CCP’s so-called digital authoritarianism has been well documented across continents.[12] But at the same time, such efforts have AI and big data analytics at their core and present myriad collection opportunities for Chinese data surveillance platforms.

As countries in the “global South” adopt Chinese data-driven technologies, they readily assume a familiar place as sites of resource extraction, with data as the critical resource of the digital age. While data fuels analytical and AI systems, obtaining diverse datasets expands functionality. Huawei boasts that its smart city and “safe city” technology platforms are used in 160 cities spanning more than 100 countries and regions.[13] In Bahia, Brazil’s most ethnically Afro-Brazilian state, Huawei intelligent systems are used to “anticipate crime” and make preventative arrests. In Uzbekistan, Huawei is upgrading Tashkent’s camera network to “digitally manage political affairs.” Huawei’s offerings also include a big data approach to “smart tourism” and cultural heritage information.

The Covid-19 pandemic has shone a spotlight on China’s bulk collection of global biometric information. Chinese biotech giant BGI Group has called for international health researchers to send in virus data and patient samples to be shared through China’s National GeneBank.[14] According to Michael Brown, director of the Defense Innovation Unit at the U.S. Department of Defense, China has more genetic sequencing data on the U.S. population than the United States does.[15] The U.S. National Counterintelligence and Security Center assessed that U.S. health data may be “particularly attractive and valuable to China because of the ethnic diversity of the U.S. population.” Obtaining diverse datasets may also be a partial motivation for collection efforts across the developing world.

Of particular concern are the partnerships between China’s tech giants and the CCP’s information and propaganda apparatus. As analyst Samantha Hoffman has assessed, China is building a “massive and global data-collection ecosystem” using state-owned enterprises, Chinese technology companies, and partnerships with foreign actors and institutions, including universities. Both Huawei and Alibaba Cloud, for example, have partnership agreements with Global Tone Communication Technology (GTCOM), a subsidiary of the China Publishing Group, which is a state-owned enterprise under the direct supervision of China’s Central Propaganda Department. GTCOM bills itself as “the world’s leading company in big data and artificial intelligence” and collects global data in over 65 languages through online translation services, multilingual machine translation, and speech- and video-recognition software. Some of that data is sent directly back to Chinese government servers.[16] In a partnership agreement between GTCOM and Huawei touting “the strategic synergy between the two in the field of AI big data,” Huawei promised GTCOM “powerful global data transmission and marketing support in order to expand the scenario-based application of AI, big data technologies in different fields by deeply mining and exerting its impressive resource value.”[17]

HOW CHINA’S DATA AMBITIONS CHALLENGE DEMOCRATIC VALUES IN CYBERSPACE

Democracies and autocracies are in persistent competition over the information environment. This information contest spans the intersecting domains of content, data, and information architecture. At the core of this competition is a values-based distinction between the way democracies view information and the way authoritarian states do. Democracies see open and verifiable information as essential to public discourse for the selection of political leaders and a free press as a check on abuses of power. By contrast, autocracies see information as a tool to be weaponized to entrench their control, and they fear open access to information as a threat to regime survival. In addition to the ways that China’s internal data policies and outgoing data-collection efforts increase its competitiveness in key future industries like AI and biotechnology, they also challenge democratic values in cyberspace through limits placed on freedom of expression, state access to citizen data, and the capacity to conduct targeted influence campaigns. To achieve these ends, China combines data collection and information manipulation with direct digital censorship, self-censorship, punitive measures for speech, and targeted influence operations.

First, China’s efforts to expand its global data collection threaten freedom of expression and rights to peaceable assembly. Data surveillance is central to information suppression because it allows the state to link online comments to offline individual behavior and to respond accordingly with punitive measures. Within China, this activity is part of the CCP’s broader strategy for information control. WeChat is one of its most effective tools. The app not only censors communication on topics the state finds unfavorable but also has been linked to arrests of journalists and average citizens alike. The mere threat of such punishment creates an environment of self-censorship in which Chinese citizens avoid speaking freely on sensitive topics.

Recent legal and technological developments also lay the foundation for the extraterritorial expansion of China’s data surveillance and information suppression apparatus. Article 2 of the Data Security Law establishes legal liability when “organizations or individuals outside of the mainland territory of the People’s Republic of China engage in data activities that harm the national security, the public interest, or the lawful interests of citizens or organizations of the People’s Republic of China.” This provision raises concerns about the state’s interests in quelling free expression outside China.

Second, China’s data ambitions risk normalizing concepts of state access to citizen data absent independent legal due process. While the Personal Information Protection Law introduces the concept of privacy, it is understood to exist in reference to the private sector, not the state. As Rebecca MacKinnon observes, “Chinese companies are powerless to protect users from digital rights violations by one of the most powerful—and unaccountable—governments in the world.”[18]

Finally, the global expansion of Chinese data and information platforms presents an opportunity for sophisticated information and influence operations. On communications platforms owned by U.S. companies like Facebook and YouTube, opaque algorithms determine what content is shown to what user at what time. Due to business models based on the sale of online advertisements, such platforms seek to drive engagement. The result is massive collection of user data to understand online behavior with the goal of influencing users to spend more time on the platforms.

Beyond the threats to democracy inherent in this model, foreign actors have sought to weaponize these platforms to promote division and undermine democracies. The additional risk in a platform such as ByteDance’s TikTok is that it is ultimately accountable not to a democratic judicial system or even a board of corporate stakeholders but to the CCP. As such, its black box algorithms may selectively promote or demote content based on a desire for political influence in democracies. During the pro-democracy protests in Hong Kong, for example, hashtags related to the protests were widespread across major social media platforms, but very few appeared on TikTok. Similarly, there are reports that content related to the oppression of Uighur Muslims in Xinjiang has been selectively suppressed by the algorithm.[19] Because an algorithm’s inner workings are considered trade secrets, systematic scrutiny of supposed censorship is extremely challenging.

CONCLUSION: RECOMMENDATIONS AND NEXT STEPS FOR DEMOCRACIES

Striking a balance between preserving an environment of open communication within democracies and guarding against the ways authoritarian regimes exploit that openness to undermine democracy presents a significant challenge. Already, multinational businesses with headquarters, servers, or engineering operations in China find themselves caught between a rock and a hard place. This tension has come squarely into focus during the Covid-19 pandemic with the greater reliance on Zoom for videoconferencing. The company is headquartered in the United States but operates globally, including in China. In June 2020, Zoom shut down the account of a California-based activist for participation in a virtual memorial of Tiananmen Square, citing compliance with Chinese law.[20] As awareness of the CCP’s repressive policies increases, companies are under even more pressure not to continue to enable the party’s abuses.

When it comes to Chinese technology companies operating in democracies, open-market principles are the starting place. Yet a lack of additional scrutiny in key areas like critical communications infrastructure poses risks. The U.S. National Counterintelligence and Security Center has issued the following warning, for example: “The combination of stolen PII [personal identifying information], personal health information, and large genomic data sets collected from abroad affords the PRC vast opportunities to precisely target individuals in foreign governments, private industries, or other sectors for potential surveillance, manipulation, or extortion.”

One approach to addressing these threats, while preserving openness, seeks to raise the bar on all information platforms without regard for country of origin. A federal data privacy law in the United States that would legally limit the ability of all technology platforms to share data without user consent, the argument goes, would provide a means to stop abuses like improper data exfiltration. Such a solution would have the benefit of including additional protections on user data. But relying on user consent requires the general public to contemplate and value national security risks. Such an approach is also highly dependent on effective enforcement, which may be elusive. In the case of influence operations, enforcement becomes even more challenging given algorithmic opacity.

A second approach introduces the idea of “covered countries” for which additional scrutiny of information platforms—particularly those widely used or taking on critical information provisions—is warranted. China would be among the countries warranting additional scrutiny. Criticisms of this approach warn of authoritarian mimicry—emulation of the closed models of cyber sovereignty that democracies have opposed. A key difference between this approach and the authoritarian model of information control is that it remains fully open to technologies and platforms from all but a select handful of authoritarian states. When democracies convened at the Prague 5G Security Conference in 2019, the result was a deeper and clearer understanding of the risks posed by vendors coming from a jurisdiction with weak rule of law, and the ability for national governments to make more informed decisions. Implementing a model that provides increased safeguards against countries designated as high-risk could provide a viable path for increased protections against bad actors across the information technology ecosystem.

Democracies can succeed by embracing an approach that is risk-based and multilateral when evaluating threats and determining responses to China’s growing digital ambitions. At the same time, they need to compete in their own right, including by prioritizing data in strategic matters and developing frameworks around data that advance democratic values, practices, and procedures. As bilateral and plurilateral technology cooperation initiatives solidify, innovation sandboxes and joint R&D investments could provide just such an opportunity.

Lindsay Gorman was the Emerging Technologies Fellow at the German Marshall Fund’s Alliance for Securing Democracy and a consultant for Schmidt Futures. She has spent over a decade at the intersection of technology development and national security policy, including in the office of Senator Mark Warner, the White House Office of Science and Technology Policy, and the National Academy of Sciences. Her research focuses on understanding and crafting a transatlantic response to China’s techno-authoritarian rise, from 5G and the future internet to information manipulation and censorship.

This essay was written in the author’s capacity as an Emerging Technologies Fellow at the German Marshall Fund’s Alliance for Securing Democracy and does not reflect the views of her current employer.

Endnotes

[1] China Academy of Information and Communications Technology, White Paper on Big Data (Beijing, 2019), http://www.caict.ac.cn/english/research/whitepapers/202003/P020200327550643303469.pdf.

[2] Ibid.

[3] Derek Grossman et al., Chinese Views of Big Data Analytics (Santa Monica: RAND Corporation, 2020), https://www.rand.org/pubs/research_reports/RRA176-1.html; and China Academy of Information and Communications Technology, White Paper on Big Data.

[4] China Academy of Information and Communications Technology, White Paper on Big Data.

[5] “Artificial Intelligence (AI) Market Size in China from 2015 to 2019,” Statista, https://www.statista.com/statistics/898150/china-artificial-intelligence-market-size.

[6] Elsa B. Kania, “‘AI Weapons’ in China’s Military Modernization,” Brookings Institution, Global China, April 2020, https://www.brookings.edu/wp-content/uploads/2020/04/FP_20200427_ai_weapons_kania_v2.pdf.

[7] Jinhe Liu, “China’s Data Localization,” Chinese Journal of Communication 13, no. 1 (2019): 84–103, https://www.tandfonline.com/doi/abs/10.1080/17544750.2019.1649289?journalCode=rcjc20.

[8] “China’s Draft ‘Personal Information Protection Law’ (Full Translation),” trans. by Rogier Creemers et al., New America, October 21, 2020, https://www.newamerica.org/cybersecurity-initiative/digichina/blog/chinas-draft-personal-information-protection-law-full-translation; and Alexa Lee, “Personal Data, Global Effects: China’s Draft Privacy Law in the International Context,” New America, January 4, 2021, https://www.newamerica.org/cybersecurity-initiative/digichina/blog/personal-data-global-effects-chinas-draft-privacy-law-in-the-international-context.

[9] Samm Sacks, Qiheng Chen, and Graham Webster, “Five Important Takeaways from China’s Draft Data Security Law,” New America, July 9, 2020, https://www.newamerica.org/cybersecurity-initiative/digichina/blog/five-important-take-aways-chinas-draft-data-security-law.

[10] Ibid.

[11] Arjun Kharpal, “China’s Surveillance Tech Is Spreading Globally, Raising Concerns about Beijing’s Influence,” CNBC, October 8, 2019, https://www.cnbc.com/2019/10/08/china-is-exporting-surveillance-tech-like-facial-recognition-globally.html.

[12] See, for example, “The New Big Brother: China and Digital Authoritarianism,” Democratic Staff Report prepared for U.S. Senate, Committee on Foreign Relations, July 21, 2020, https://www.foreign.senate.gov/imo/media/doc/2020%20SFRC%20Minority%20Staff%20Report%20-%20The%20New%20Big%20Brother%20-%20China%20and%20Digital%20Authoritarianism.pdf.

[13] “Smart City,” Huawei, https://e.huawei.com/us/solutions/industries/government/smart-city.

[14] Kirsty Needham, “Special Report: COVID Opens New Doors for China’s Gene Giant,” Reuters, August 5, 2020, https://www.reuters.com/article/us-health-coronavirus-bgi-specialreport/special-report-covid-opens-new-doors-for-chinas-gene-giant-idUSKCN2511CE.

[15] Tim Hinchliffe, “U.S., China Are Neck and Neck for High-Tech Supremacy: Defense Innovation Unit,” Sociable, October 30, 2019, https://sociable.co/technology/us-china-are-neck-and-neck-for-high-tech-supremacy-defense-innovation-unit.

[16] Samantha Hoffman, “Engineering Global Consent: The Chinese Communist Party’s Data-Driven Power Expansion,” Australian Strategic Policy Institute, Policy Brief, October 2019.

[17] “GTCOM Signs a Strategic Contract with Huawei to Build an Application Ecology on the Basis of AI Big-Data Technology,” GTCOM, May 9, 2019.

[18] Rebecca MacKinnon, “Chinese Tech Giants Can Change: But the State Is Still Their Number One Stakeholder,” Ranking Digital Rights, 2020.

[19] Raymond Zhong, “TikTok Blocks Teen Who Posted about China’s Detention Camps,” New York Times, November 26, 2019, https://www.nytimes.com/2019/11/26/technology/tiktok-muslims-censorship.html.

[20] Lindsay Gorman, “Companies Like Zoom Must Choose: America or China,” Newsweek, June 19, 2020, https://www.newsweek.com/companies-like-zoom-must-choose-america-china-opinion-1511645.