Binxing FANG
方滨兴




About the author
Binxing Fang is an expert in cybersecurity and an academician of the Chinese Academy of Engineering. He is the honorary director of the National Computer Network and Information Security Management Center, which sits directly under the Ministry of Industry and Information Technology. At Guangzhou University he is the Director of the Network Security Technology National Engineering Laboratory and the honorary president of the Institute of Advanced Cyberspace Technology. He is known as the "father of China's firewall" and is the author of Cyberspace Sovereignty: Reflections on Building a Community of Common Future in Cyberspace.


关于作者
方滨兴,网络空间安全专家,中国工程院院士。2007年至2013年担任北京邮电大学校长,国家计算机网络与信息安全管理中心名誉主任,广州大学网络空间先进技术研究院名誉院长。他的著作《论网络空间主权》在2017年由北京科学出版社出版。






The following excerpts are translated from Fang’s Artificial Intelligence Safety and Security[1] (人工智能安全) (2020). We think the chapters most relevant to existential risks from AI are Chapters 6 (AI-derived safety problems), 7 (AI Actants[2]), 8 (Safety Hoops for AI Actants), 9 (Safety Evaluation and Detection for AI Actants), and 12 (Looking to the Future of AI Safety). We precede each excerpt with our own bolded summary of the key point and provide section references in brackets after each excerpt.

Cite Our Translation
Concordia AI. “Binxing Fang — Chinese Perspectives on AI Safety.” Chineseperspectives.ai, 29 Mar. 2024, chineseperspectives.ai/Binxing-Fang.


Cite This Work
方滨兴(2020). 人工智能安全. 电子工业出版社.






Selected excerpts

Intelligent weapons may seriously threaten human survival:


"In terms of the possible risks brought by the machine itself, intelligent weapons present the risk of operational errors. In theory, the higher the degree of intelligence of the weapon, the more complex the software controlling its intelligent behavior is, and the higher the probability of failure and errors…

Intelligent weapons will be upgraded to autonomous systems, which may endanger human beings. The development of intelligent weapons from intelligence to autonomy is both a gradual process and an inevitable trend. When intelligent weapons cross the threshold of intelligence and upgrade to autonomy, human beings may also lose the ultimate decision-making authority over intelligent weapons. This leapfrog development of intelligent weapons brings mankind not only the joy of achievements in development and the assurance of achieving goals, but also worries and fears that intelligent weapons may seriously threaten human survival. If actors in a military context lose control of intelligent weapons, those weapons are very likely to cease executing their original military tasks and become threatening enemies. Due to their powerful destructive abilities, intelligent weapons may transform from the main force for attacking the enemy into formidable enemies that are difficult to defeat, with the level of intelligence determining the strength of each side. What’s more, [intelligent weapons] increase the uncertainty of the outcome of the war. Smart weapons, which were once the magic weapons for defeating the enemy, are likely to become "traitors" that hurt one’s own side. Once emotionless, untiring intelligent weapons are given power over life and death and completely replace soldiers in fighting, will they wantonly kill innocent people because of an "excessive" approach to battle? Will they expand the scope of combat targets and become "humanity’s terminator"? These are the major hidden dangers associated with the possibility of loss of control in AI’s militarization." (6.2.2)
精选原文

人工智能武器军备竞赛带来的风险:


“从机器自身可能带来的风险看,智能武器存在操作失误的风险。从理论上讲,武器的智能化程度越高,控制其智能化行为的软件组织结构就相应越复杂,出现故障失误的概率也就相对增加。……


智能武器将升级为自主系统,存在着可能危害人类的隐患。智能武器由智能化向自主化发展既是一个渐进过程,也是一个必然趋势。当智能武器越过智能化的门槛升级为自主化时,人类也就可能失去了对智能武器的终极决定权。这种智能武器跨越式发展带给人类的就不仅仅是成果开发的喜悦和实现目标的把握,还有担心智能化武器或许会严重威胁人类生存的隐忧和恐惧。运用于军事领域的智能武器如果失控,很可能由原来的军事任务执行者走向反面,转变成为具有威胁力的敌人。由于具备强大的杀伤力,可能使智能武器由打击敌人的主力转变成为难以战胜的强敌,并由智能化程度的高低决定着敌我双方能力的强弱。而更为纠结的是,增加了战争胜负的不确定性,智能武器很可能由原来是本方克敌制胜的法宝,变成了伤害本方的“叛徒”。对于毫无感情、不知疲倦的智能武器,一旦被赋予“生杀大权”,完全替代士兵来作战,会不会因“过度”作战而滥杀无辜?会不会扩大作战对象范围成为“人类终结者”?这些都是人工智能军事化可能失控形成危险的重大隐患。”




There are three necessary elements for out-of-control AI systems:


"When talking about the threat of AI systems to human beings, the first thing to consider is AI systems with the ability to act. We call these “AI actants" (AIA)...

Although, as of the date of finalizing this book, there have been no publicly reported cases of AIAs "actually becoming out of control and causing harm", there has long been consensus that there is a risk of AIAs becoming out of control. In the process of treating AIAs with a high degree of vigilance, we also need to know that not all AIAs present risks of loss of control; only when certain conditions are met can they become out of control. Academician Binxing Fang, the chief editor of this book, made a report entitled “My Views on AI Safety and Security” at the China International Software Expo held on June 30, 2018. This report put forward for the first time the "three necessary elements for out-of-control AIA". These three elements are: AIAs have the ability to act and destroy[3], have uninterpretable decision-making ability, and have the ability to evolve and can evolve into autonomous systems.” (6.4)

人工智能行为体失控三要素: 



“谈起人工智能系统对人类的威胁,首当其冲要考虑的是具有行为能力的人工智能系统。我们将具有行为能力的人工智能系统称为“人工智能行为体“。

在我们高度警惕AIA的过程中,还需要知道,并非所有AIA都存在失控风险——只有满足了特定条件才可能会失控。本书的主编方滨兴院士在2018年6月30日举办的“中国国际软件博览会”上做了题为《人工智能安全之我见》的报告。该报告首次提出了“人工智能行为体失控三要素”。这三要素是指,人工智能行为体具有行为能力以及破坏力、具有不可解释的决策能力、具有进化能力并可进化成自主系统。”



AIA could become an autonomous, conscious system that can fight against humans:


"Once AIA starts to evolve itself or even autonomously set the objective function, it could be possible for it to break away from the “cage” with which humans limit its activity and escape human control. Obviously, if an out-of-control AIA is just like an out-of-control car with no goal, it only presents a derivative safety problem4; however, if AIA evolves to the point where it needs to defend its "right to survive" and does not hesitate to harm human beings in order to protect itself, it will become an autonomous system that can fight against human beings, which will lead to disasters for humanity. Here, our definition of autonomous system is not simply "not controlled by people". Once an autonomous weapon is on the battlefield, it is indeed not controlled by people, but that [loss of control] is limited to the level of decision-making. What we mean by autonomous systems here is that they have "consciousness". Like humans, they know how to protect themselves, and their objective function is to protect themselves from harm. They can even have the ability to socialize and organize. In this case, they will indeed become oppositional to human beings.” (6.4.3)

人工智能行为体具有进化能力,可进化成自主系统:


“一旦AIA开始自我进化甚至自主设置目标函数的时候,将有可能脱离人类限制其活动的“牢笼”,使其失去了人类的控制。显然,失去控制的AIA如果仅仅是像一辆没有目的性的失控汽车一样,那还仅仅是衍生安全的问题;但如果AIA进化到需要捍卫自身的“生存权”,为了自我保护而不惜伤害人类的时候,就会变成可以与人类对抗的自主系统了,这将会导致人类灾难的到来。在此,我们所指的自主系统还不仅仅是“不受人控制”这么简单,自主武器一旦上了战场也不受人控制,但那还仅仅停留在具有决策能力的层面;我们所指的自主系统是具有“意识”,像人类一样懂得如何自我保护,它们的目标函数是保护自己不受伤害;它们甚至可以具备社交、组织能力。在这种情况下,它们确实将会成为人类的对立面。”



AIA can learn things humans haven’t taught them: 


"Autonomy is the third characteristic of AIA. Assuming that harm caused by AIA to humans is not due to loss of control, such as in the case of a driverless car losing control and harming humans due to inertia, then the process of choosing to harm humans will bring about more dangerous situations, and the learning ability of AIA will make it difficult for human beings to mitigate such dangers.

Of course, some people think that since AIA is artificial and is a moving object constructed on a closed set, it is impossible for it to do things that human beings have not taught it to do. For example, when operating on the set of binary digits {0,1}, it would never produce an output from the set of alphabetic characters. In particular, researchers in the field of AI sometimes deny that AI systems (AIS) have the ability to create, because they know that everything AIS learn is taught by human beings. If human beings have not taught something, AIS have no reason to be able to do it; at least AIS have no reason to choose a result that has not previously been deemed correct. For example, even though the Boston Dynamics robot introduced in Section 7.2.1 has the ability to walk on two feet, it will not evolve into a master of "Shaolin Kung Fu" on its own.

The problem is that people may overlook one thing, that is, the connective ability of AI (关联能力). If the Boston Dynamics robot has grasped some basic abilities, then even becoming a Kung Fu master would not constitute jumping out of the closed set, so long as it has mastered the evaluation function through observation and learning. In fact, from the perspective of the AI research field, suppose we have taught the robot how to select a reward function such that, if it falls over, the reward function gives a negative value and, if it does not fall over, the reward function gives a positive value. If, at the same time, the robot has the ability to generate random actions, then after a long time of training and trying various results, the robot will inevitably choose the most suitable combination of actions, which is enough to enable it to achieve the optimal state, a state that was originally impossible for humans to teach. After all, there are many reasons why these robots may master capabilities that exceed expectations.” (7.6.3)
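The training loop sketched in this passage is, in essence, learning by reward-guided random exploration: the designer specifies only the reward signal (negative for falling, positive for staying upright), and the action combination that earns the highest reward is discovered rather than taught. The toy Python sketch below is our own illustration of that idea, not code from the book; the one-dimensional "balance" simulator, its parameters, and all names in it are invented for the example.

```python
import random

# Toy stand-in for a robot balance task: an action sequence is "good" if the
# running tilt it produces never exceeds a threshold (i.e., the robot does not
# fall over). The simulator and its constants are invented for illustration.
ACTIONS = [-1.0, -0.5, 0.0, 0.5, 1.0]   # torque choices available to the robot
FALL_THRESHOLD = 2.0                     # |tilt| beyond this counts as a fall

def reward(action_sequence):
    """Reward as described in the excerpt: negative if the robot falls at any
    step, positive (higher is better) if it stays upright."""
    tilt = 1.5                           # start slightly off balance
    for a in action_sequence:
        tilt = 0.9 * tilt + a            # crude dynamics: damping plus applied torque
        if abs(tilt) > FALL_THRESHOLD:
            return -1.0                  # fell over
    return 1.0 - abs(tilt)               # upright; closer to vertical is better

def random_search(num_trials=5000, horizon=10, seed=0):
    """Try random action sequences and keep the best one found.
    Nothing about 'how to balance' is programmed in explicitly."""
    rng = random.Random(seed)
    best_seq, best_r = None, float("-inf")
    for _ in range(num_trials):
        seq = [rng.choice(ACTIONS) for _ in range(horizon)]
        r = reward(seq)
        if r > best_r:
            best_seq, best_r = seq, r
    return best_seq, best_r

if __name__ == "__main__":
    seq, r = random_search()
    print("best action sequence found:", seq)
    print("reward:", round(r, 3))
```

The point matches the author's: nothing in the search procedure encodes how to balance, yet the best-scoring action sequence it returns is behaviour that no one wrote down in advance.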

人工智能行为体的自主特性:


“自主性是AIA的第三个特性。假定AIA对人类的危害不是因为失控而造成的,如无人驾驶汽车出现了失控而以惯性来伤害人类,那么自主选择伤害人类的过程就会带来更为危险的局面,AIA的学习能力会变得使人类难以防范。

当然,有人认为,AIA既然是人造的,而且是在一个封闭集上构造出来的运动体,就永远不可能去做人类没有教过的事情,就好比在{0,1}集合上进行操作,永远不会产生字符集中的结果一样。尤其是人工智能专业领域研究者有时否认AIS具有创造能力,因为他们知道AIS所学会的一切都是人类教的,人类没有教的,AIS就没有道理会,至少AIS没有道理去选择一个不曾被认为是正确的结果。例如,就算7.2.1节中所介绍的波士顿动力机器人具有了双足行走的能力,但总不会自己进化成为会“少林武功”的高手。

问题是人们可能忽视了一件事,那就是人工智能的关联能力。如果波士顿动力机器人掌握了一些基本能力,那么通过观察、学习,只要掌握了评价函数,即便是变成武功高手,也不算是跳出封闭集的行为。事实上,从人工智能专业领域的视角来看,假定我们教会了机器人如何选取奖励函数,如摔倒,则奖励函数为负值,如果没有摔倒,则奖励函数为正值,同时机器人具有产生随机动作的能力,那么机器人经过长时间的训练、尝试各种结果,势必会选择出一种最适合的动作组合,从而足以让其获得最优的状态,而这一点原本就是人类无法教授的,毕竟存在多种原因使这些机器人可能掌握超出预想的能力。”



AIAs need to be designed with risk mitigation in mind:


"In order to prevent AIAs from threatening human beings, people need to consider risk at the outset of designing AIAs. The first consideration is the need to develop design principles that can protect human safety. The design principles of AIA should include but not be limited to:
(1) Design to minimize risk: First, eliminate hazards in the design. If the known hazards cannot be eliminated, design choices should reduce risk to an acceptable level;
(2) Use safety devices: Use a safety hoop[5] or other safety protection device to reduce the risk to an acceptable level;
(3) Use warning devices: Use a warning device to announce danger and send appropriate warning signals to nearby personnel by voice, light, etc.” (9.1)
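The book states these principles without an implementation; translator's note 5 below describes the safety hoop as a gate between the decision-making system and propulsion that trips like a fuse once certain conditions are met. As a rough, hypothetical illustration of principles (2) and (3), and not a design from the book, the Python sketch below gates motion commands through such a hoop and raises a warning before restricting the robot; the class, fields, and limits are all invented.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Command:
    """A motion command from the decision-making system (fields are invented)."""
    speed: float   # requested speed, m/s
    force: float   # requested actuator force, N

class SafetyHoop:
    """Hypothetical gate between decision-making and propulsion (principle 2),
    modelled loosely on the fuse analogy in translator's note 5: once a limit
    is exceeded, the hoop trips and keeps blocking propulsion."""

    def __init__(self, max_speed=1.0, max_force=50.0, warn=print):
        self.max_speed = max_speed
        self.max_force = max_force
        self.warn = warn          # warning device hook (principle 3)
        self.tripped = False

    def gate(self, cmd: Command) -> Optional[Command]:
        if self.tripped:
            self.warn("safety hoop tripped: propulsion remains disabled")
            return None
        if cmd.speed > self.max_speed or cmd.force > self.max_force:
            self.tripped = True   # latches like a blown fuse
            self.warn(f"limit exceeded (speed={cmd.speed}, force={cmd.force}): restricting robot")
            return None
        return cmd                # within limits: pass through to propulsion

# Usage sketch
hoop = SafetyHoop()
print(hoop.gate(Command(speed=0.5, force=20.0)))   # allowed
print(hoop.gate(Command(speed=3.0, force=20.0)))   # trips the hoop and warns
print(hoop.gate(Command(speed=0.5, force=20.0)))   # still blocked afterwards
```

A real safety device would also need to fail safe in hardware; the sketch only shows where such a gate sits in the control flow, between decisions and actuation.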

人工智能行为体的安全管理概述:


“为了防止AIA威胁人类,人们在设计AIA之初就需要对风险问题加以考虑。为此,首先考虑的就是需要制定能够保护人类安全的设计原则。AIA的设计原则应该包括但不限于:
(1)最小风险设计:首先在设计上消除危险,若不能消除已判定的危险,应通过设计方案的选择将其风险降低到可接受的水平;
(2)采用安全装置:采用保险箍或其他安全防护装置,使风险减小到可接受的水平;
(3)采用告警装置:采用告警装置来告知危险,以语音、光线等方式向接近人员发出适当的警告信号等。”



Safety assessments and evaluation of AI are needed:


[The author writes that if the aforementioned “safety hoop” becomes a necessary component of AIA, it will be necessary to check that the safety hoop works before an AIA can be placed on the market, just as a car’s brakes and collision-avoidance system must undergo testing]

…"More importantly, an evaluation mechanism must be designed to evaluate the level to which an AIA has evolved. For example, can people design an evaluation mechanism to identify what [Go] rank AlphaGo has evolved to? Of course, this should not simply rely on competitions, as is the case currently, but should be a method for fast appraisal. At present, there is no such method because people do not need one. No one worries that AlphaGo will harm mankind, and since for humans Go rank and ability are assessed by competition, the same is true for AlphaGo. However, this does not mean that there is no relevant fast appraisal method. It’s just like how with a classification algorithm, it is relatively easy to quickly evaluate capability as long as the training examples and test examples are separated. So can we extract intermediate processes in Go play to act as evaluation methods? In short, for future technological development and social progress, it is necessary to try to construct corresponding evaluation mechanisms to evaluate how far AIA has evolved.” (12.4)

人工智能的安全评估与检测:


【作者写道,如果上述“保险箍”成为 AIA 的必要组成部分,那么在 AIA 投放市场之前,必须检查保险箍是否有效,就像汽车的刹车和防撞系统必须接受测试一样】

……“更为关键的是,必须设计出一种检测机制,使之能检测出一个AIA已经进化到什么程度。例如,人们是否能够设计出一种检测机制,以鉴定AlphaGo已经进化到什么样的段位?当然这不应该像现在这样简单地靠比赛来鉴定,而应该提出一种方法来进行快速鉴定。目前没有这种手段,是因为人们并不需要,没有人担心AlphaGo会危害人类;而对围棋的段位水平的鉴定就是靠比赛,既然对人类如此,对AlphaGo也就没什么变化。但这并不意味着不存在相关的快速检测手段,就像对一个分类算法进行检测一样,只要将训练样例与检测样例分开就比较容易快速地检测一个分类算法的能力,那么是否可以将围棋对弈中的某些中间过程提取出来作为检测方法呢?总之,设法构造相应的检测机制,以便能够检测出AIA的进化程度,对未来的技术发展和社会进步都是十分必要的。”





Translator’s notes 


1. In English, “safety” and “security” typically refer to protection against unintended and intended harms respectively. In Chinese, the word 安全 (anquan) can encompass the meaning of both “safety” and “security.” Throughout the pieces featured on this website, we select the English translation that we think best fits the author’s meaning. Fang’s book covers both unintended and intended harms, so we translate the title as Artificial Intelligence Safety and Security. In the excerpts on this page, the discussion is mostly of unintended harms/loss of control, so we translate anquan as ‘safety.’

2. Defined as systems with the ability to act.

3. Elsewhere in the chapter, the author expands: “AIA’s ability to act comprises two kinds of external manifestations: one is that it can move in physical space and has kinetic energy; the second is that it can exchange energy with the outside world and thus change the state of other objects.”

4. AI-derived/derivative safety problems are defined elsewhere as “AI vulnerabilities threatening safety in other fields,” e.g. safety accidents caused by AI system errors, the development of AI weapons leading to an international arms race, and the potential loss of control of future AI systems. This is distinguished from AI’s internal safety problems, which are vulnerabilities in the AI system itself.

5. Elsewhere, the author expands: “A safety hoop can be inserted between the propulsion system and the decision-making system, and the decision-making system of the robot needs to pass through the safety hoop to allow propulsion to occur. Just as a fuse automatically blows when the current is abnormal to ensure the safe operation of an electrical system, the safety hoop will be activated once certain conditions are met, and then begin to restrict the robot.”





