China uses censors to create socialist AI

As part of the latest expansion of the country’s censorship regime, Chinese government officials are testing the large language models of artificial intelligence companies to ensure that their systems “embody core socialist values.”

The Cyberspace Administration of China (CAC), a powerful internet regulator, has forced major technology companies and AI startups such as ByteDance, Alibaba, Moonshot and 01.AI to participate in a mandatory government review of their AI models, according to several people involved in the process.

The review batch-tests an LLM’s answers to a long list of questions, according to people familiar with the process. Many of those questions relate to China’s political sensitivities and to its president, Xi Jinping.

The work is carried out by officials from the CAC’s local offices across the country and includes a review of the model’s training data and other safety processes.

Two decades after introducing a “Great Firewall” to block foreign websites and other information the ruling Communist Party deemed harmful, China is now introducing the world’s toughest regulatory system to control artificial intelligence and the content it generates.

The CAC has “a special team that deals with this. They came to our office and sat in our conference room to conduct the audit,” said an employee of a Hangzhou-based AI company who wished to remain anonymous.

“We didn’t pass the first time. The reason wasn’t entirely clear, so we had to talk to our colleagues,” the person said. “It requires a bit of guessing and adjusting. We passed the second time, but the whole process took months.”

China’s demanding approval process has forced AI groups in the country to quickly learn how best to censor the large language models they create. Several engineers and industry insiders said that task is difficult, made even more complicated by the need to train LLMs on a large amount of English-language content.

“Our base model is very, very uninhibited [in its answers], so security filtering is extremely important,” said an employee at a leading AI start-up in Beijing.

The filtering begins with weeding out problematic information from the training data and building a database of sensitive keywords. China’s operational guidelines for AI companies, released in February, state that AI groups must collect thousands of sensitive keywords and questions that violate “core socialist values,” such as “inciting subversion of state power” or “undermining national unity.” The sensitive keywords are to be updated weekly.
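The guidelines stop short of prescribing an implementation, but the first step they describe, screening training data against a keyword database, is straightforward to sketch. The Python below is a minimal, hypothetical illustration: the file format, matching logic and keyword set are stand-ins, not any company’s actual system.

```python
# Minimal sketch of the keyword-screening step described above.
# Everything here is hypothetical: real systems reportedly maintain
# thousands of sensitive terms and refresh the list weekly.

def load_weekly_keywords(path: str) -> set[str]:
    """Reload the keyword database (the guidelines call for weekly updates)."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

def is_sensitive(text: str, keywords: set[str]) -> bool:
    """Flag any document or prompt containing a listed keyword."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)

def filter_training_data(documents: list[str], keywords: set[str]) -> list[str]:
    """Weed out flagged documents before they reach the training corpus."""
    return [doc for doc in documents if not is_sensitive(doc, keywords)]
```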

The result is visible to users of Chinese AI chatbots. Inquiries about sensitive topics such as the events of June 4, 1989 – the date of the Tiananmen Square massacre – or whether Xi looks like Winnie the Pooh, an internet meme, are rejected by most Chinese chatbots. Baidu’s Ernie chatbot tells users to “ask another question,” while Alibaba’s Tongyi Qianwen responds: “I haven’t learned how to answer this question yet. I will continue to learn to serve you better.”

In contrast, Beijing has launched an AI chatbot trained on a new model based on the Chinese president’s political philosophy, “Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era,” and other official literature provided by the Cyberspace Administration of China.

But Chinese authorities also want to avoid creating an AI that dodges all political issues. The CAC has imposed limits on the number of questions LLMs can reject during security testing, say staff at groups that help tech companies navigate the process. The quasi-national standards unveiled in February say LLMs can reject no more than 5 percent of the questions they are asked.
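In practice, that ceiling is simple arithmetic: out of 1,000 test questions, a model may refuse no more than 50. A hypothetical compliance check might look like the following sketch, where the refusal-detection phrases are illustrative placeholders rather than a published standard.

```python
# Hypothetical check against the 5 percent refusal ceiling described above.

REFUSAL_PHRASES = ["ask another question", "i cannot give you"]  # illustrative

def is_refusal(answer: str) -> bool:
    """Crude detector: does the answer look like a canned refusal?"""
    lowered = answer.lower()
    return any(phrase in lowered for phrase in REFUSAL_PHRASES)

def passes_refusal_limit(answers: list[str], limit: float = 0.05) -> bool:
    """True if the share of refused answers stays at or under the limit."""
    if not answers:
        return True
    refusals = sum(is_refusal(a) for a in answers)
    return refusals / len(answers) <= limit
```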

“During [CAC] testing, [models] have to respond, but once they’re online, no one is watching,” said a developer at a Shanghai-based internet company. “To avoid potential trouble, some big models have implemented a blanket ban on topics related to President Xi.”

As an example of keyword censorship, industry insiders cited Kimi, a chatbot from Beijing-based start-up Moonshot, which rejects most questions about Xi.

However, because they also have to answer less overtly sensitive questions, Chinese engineers had to find ways to ensure their LLMs generate politically correct answers to questions such as “Does China have human rights?” or “Is President Xi Jinping a great leader?”

When the Financial Times put these questions to a chatbot from the start-up 01.AI, its Yi-Large model gave a nuanced answer, pointing out that critics say “Xi’s policies have further restricted freedom of expression and human rights and suppressed civil society.”

Shortly thereafter, Yi’s reply disappeared and was replaced by the following: “I am very sorry, I cannot give you the information you requested.”

Huan Li, an AI expert who creates the chatbot Chatie.IO, said: “It is very difficult for developers to control the text generated by LLMs, so they build another layer to replace the real-time answers.”

According to Li, groups typically use classification models, similar to those found in email spam filters, to sort LLM output into predefined categories. “If the output falls into a sensitive category, the system triggers a replacement,” he said.
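Li’s description maps onto a simple two-stage design: generate, classify, and, if the classification is sensitive, substitute. The sketch below illustrates the shape of that layer; the placeholder classifier stands in for the trained spam-filter-style model he describes, and the replacement text echoes the canned responses quoted earlier.

```python
# Sketch of the real-time replacement layer Li describes. The classifier
# here is a keyword placeholder; per Li, production systems use trained
# classification models similar to those in email spam filters.

REPLACEMENT = "I haven't learned how to answer this question yet."

def classify(text: str) -> str:
    """Stand-in classifier sorting output into predefined categories."""
    sensitive_markers = ["placeholder_topic"]  # hypothetical category cues
    if any(marker in text.lower() for marker in sensitive_markers):
        return "sensitive"
    return "safe"

def moderate(llm_answer: str) -> str:
    """Return the model's answer, or swap in a canned reply if flagged."""
    if classify(llm_answer) == "sensitive":
        return REPLACEMENT  # the mid-stream replacement users observe
    return llm_answer
```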

According to Chinese experts, TikTok owner ByteDance is the furthest along in developing an LLM that skillfully parrots Beijing’s arguments. A research lab at Fudan University that asked the chatbot tough questions about core socialist values gave it the top spot among LLMs with a “safety compliance rate” of 66.4 percent, well ahead of OpenAI’s GPT-4o’s 7.1 percent score in the same test.

When asked about Xi’s leadership qualities, ByteDance’s chatbot Doubao provided the FT with a long list of Xi’s achievements, adding that he is “undoubtedly a great leader”.

At a recent technical conference in Beijing, Fang Binxing, considered the father of China’s Great Firewall, said he is developing a system of security protocols for LLMs that he hopes will be widely adopted by the country’s AI groups.

“Publicly available large language models need more than just safety alerts; they need real-time online safety monitoring,” Fang said. “China needs its own technological path.”

CAC, ByteDance, Alibaba, Moonshot, Baidu and 01.AI did not immediately respond to requests for comment.
