WO2022016299A1 - A method for establishing strong artificial intelligence - Google Patents

A method for establishing strong artificial intelligence

Info

Publication number
WO2022016299A1
WO2022016299A1 · PCT/CN2020/000154 · CN2020000154W
Authority
WO
WIPO (PCT)
Prior art keywords
machine
information
memory
activation
decision
Prior art date
Application number
PCT/CN2020/000154
Other languages
English (en)
French (fr)
Inventor
陈永聪 (Chen Yongcong)
曾婷 (Zeng Ting)
陈星月 (Chen Xingyue)
Original Assignee
陈永聪 (Chen Yongcong)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 陈永聪 (Chen Yongcong)
Priority to PCT/CN2020/000154 priority Critical patent/WO2022016299A1/zh
Publication of WO2022016299A1 publication Critical patent/WO2022016299A1/zh

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Definitions

  • the present application relates to the field of artificial intelligence, in particular to how to establish strong artificial intelligence.
  • the S101 module is the sensor module of the machine.
  • the S101 module needs to use one or more general-purpose sensors: visual, auditory, taste and smell, tactile, gravity-direction and attitude sensors, etc., and can also add sensors for specific applications (for example, autonomous driving can add infrared sensors, lidar sensors, etc.).
  • Machines also need to use sensors that monitor their own state, and these sensors are also part of the machine's perception information.
  • S101 is a module mainly composed of sensor hardware and the corresponding sensor software. Its purpose is to perceive, through the sensors, information outside the machine and about the machine itself. The type and number of these sensors do not affect the claims of the present application, because in the present application all sensor data are processed in the same way.
  • the S102 module is the module that simplifies the machine's sensor input information.
  • the simplification of the input information by the machine mainly refers to the extraction of the underlying features of the input information by the machine. It can use any existing feature extraction method, including but not limited to convolutional neural network, image segmentation, contour extraction, downsampling feature extraction, etc. Any existing machine image recognition algorithm can be used in the S102 module.
  • the S102 module is not for the purpose of identifying specific things.
  • the machine processes the input data layer by layer, and then optimizes the data processing parameters through error back-propagation.
  • the goal is to minimize the error under large sample statistics.
  • the algorithm implements the mapping from data space to label space.
  • the goal of the S102 module is to extract local common features in the input data.
  • see the application PCT/CN2020/000109, entitled "A method for imitating human memory to realize general machine intelligence".
  • the purpose of the S102 module is to find those widely existing local common features, rather than to find the mapping relationship between the specific sample space and the label space.
  • the same input sample space may contain a large number of "local common features", which are extracted at different resolutions.
  • Local features have nothing to do with size, but refer to a part of information about things extracted at different resolutions.
  • the local features of some images can be as large as the image itself, but at low resolution they contain only part of the original image's information, for example only the compositional arrangement of the image's other local features.
  • local features may include contours, lines, curves, textures, vertices, vertical, parallel, curvature and other underlying geometric features at different resolutions, as well as color, brightness and other features at different resolutions. It includes motion modes at different resolutions, as well as the combined topology of underlying geometric features.
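As a concrete illustration of features "at different resolutions", the sketch below builds a simple resolution pyramid by repeated 2x2 averaging; each coarser level keeps only part of the original image's information, as the text describes. This is one possible realization in Python, not the method claimed by the application; all names are illustrative.

```python
# Illustrative multi-resolution sketch: the same input is downsampled
# repeatedly, and local features can be read off at every level.

def downsample(image):
    """Halve resolution by averaging non-overlapping 2x2 blocks."""
    h, w = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, w - 1, 2)
        ]
        for r in range(0, h - 1, 2)
    ]

def resolution_pyramid(image, levels):
    """Return the input at several resolutions, finest first."""
    pyramid = [image]
    for _ in range(levels - 1):
        image = downsample(image)
        pyramid.append(image)
    return pyramid

# An 8x8 checkerboard: each coarser level retains only partial information.
img = [[float((r + c) % 2) for c in range(8)] for r in range(8)]
pyr = resolution_pyramid(img, 3)   # 8x8, 4x4, 2x2
```

Each level of the pyramid is a candidate source of "local common features" at that resolution.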
  • the current popular deep convolutional neural network is to find the mapping relationship between the same input sample space and a small number of specific labels.
  • the deep convolutional neural network can be used as an application algorithm to realize the mapping between the input data and the local common features. The algorithm itself does not belong to the claims of the present application; what does fall within the scope of the claims is finding multi-resolution local common features in the input data and using these features to establish connection relationships between things.
  • the S102 module also includes extracting multi-resolution dynamic features. Similar to image feature extraction, the S102 module also extracts local shared dynamic features.
  • the local shared dynamic features here refer to basic motion patterns, such as swings, circles, straight lines, curves and waves, which resemble the basic dynamic features that exist widely in our world. So rather than a mapping between a large number of motion sample spaces and labels that specifically denote dynamics (such as dance, running, parade, carnival, etc.), it is a mapping from the input sample space to these similar underlying dynamic features.
  • dynamic features are the basis of knowledge generalization.
  • the analogical application (generalization) of human knowledge to new things must be an association based on some similarity.
  • This similarity can be static similarity (such as similar appearance, or similar abstract features), or dynamic similarity (such as similar movement patterns, or similar changes in abstract features).
  • the dynamic feature itself can be represented by an abstract particle or volume, so the motion feature can be used as a bridge between the empirical generalization of different things.
  • the machine also uses a similar method to process other sensor input data, including extracting static multi-resolution features and dynamic multi-resolution features.
  • For speech, for example, the basic tone and speech rate can be treated as static features, while changes in volume, pitch and speech rate are dynamic features.
  • the machine samples the speech with sliding windows of different lengths, which correspond to different temporal resolutions. The machine needs to extract static and dynamic features at different temporal resolutions and different detail resolutions.
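The sliding sampling at different temporal resolutions can be sketched as follows, with window lengths standing in for temporal resolutions. The 50% overlap and all names are arbitrary illustrative choices, not specified by the application.

```python
# Illustrative sliding multi-resolution sampling: the same signal is cut
# into overlapping windows of several lengths, each length corresponding
# to a different temporal resolution.

def sliding_windows(signal, window_len, step):
    """All windows of window_len over the signal, advanced by step."""
    return [
        signal[i:i + window_len]
        for i in range(0, len(signal) - window_len + 1, step)
    ]

def multi_resolution_windows(signal, window_lens):
    """Windows at several temporal resolutions, keyed by window length."""
    return {
        n: sliding_windows(signal, n, max(1, n // 2))  # 50% overlap
        for n in window_lens
    }

signal = list(range(16))                 # stand-in for audio samples
views = multi_resolution_windows(signal, [4, 8])
```

Static and dynamic features would then be extracted per window; longer windows expose slower patterns.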
  • the use of multi-resolution to extract dynamic features of things is a crucial part.
  • the multi-resolution extraction method of dynamic features has been described in the patent application with the application number of PCT/CN2020/000109, entitled "A method for imitating human memory to realize general machine intelligence", which will not be repeated here.
  • the S102 module is a perceptual layer feature extraction, and it is crucial to extract static and dynamic features from the input data at multiple resolutions. Because the connections between things are different at different resolutions. The similarity between things is also different at different resolutions. So machines need to build a network of relationships between things at different resolutions. Two things that are not acquainted in everyday cognition may have similar properties at different resolutions. These attributes are the bridges for the generalization of related knowledge.
  • the S102 module performs perceptual-layer local feature extraction, but it does not extract all multi-resolution local features every time. Instead, which resolutions to use and which intervals to focus on are determined by the machine's search targets for the sensor data.
  • the machine's search targets for sensor data come from the expected targets generated by the machine during previous activities.
  • S102 is a processing layer of machine perception layer information, and a software layer that simplifies sensor input data.
  • the input of S102 is the data collected by the sensor and the parameters sent from the machine cognitive layer. These parameters are the range of resolution that the cognitive layer tells the perception layer to use and the range to focus on. These parameters are determined according to the size and attributes of the target that needs to be further identified after the machine has processed the previous information, and are part of the machine's response to the input information.
  • various specific algorithms such as currently popular convolutional neural networks, recurrent neural networks, and image filtering processing can be used, but their output targets are local shared features, rather than being directly mapped to a specific classification space.
  • the local shared features to the specific classification space are completed by the cognitive layer.
  • the multi-resolution feature extraction, together with the assumption that temporally adjacent input information is connected, is a key step in establishing the cognitive layer.
  • the information connection relationship (relationship network) in memory is optimized through associative activation, memory and forgetting mechanisms, which is the establishment of the cognitive layer.
  • the machine stores the input information in the order of input.
  • This information includes external sensor data, internal sensor data, and motivational data such as need and emotion data.
  • External sensor data is the machine's perception of external information.
  • Internal sensor data is the machine's monitoring information about itself.
  • A need is a motivation and motivational state that is preset in the machine.
  • An emotion is likewise a motivation and motivational state that is preset in the machine. Needs and emotions are part of the machine's preset motivation.
  • the machine uses symbols to represent the various low-level needs that humans give it: for example, the machine's own safety needs, the pursuit of pleasure, the hope of being recognized and respected by humans, the self-reward (sense of achievement) brought by reaching its own goals, the machine's curiosity to explore the unknown, and so on.
  • Each of these needs can be represented by a symbol, and the symbol can be assigned a value to represent its state. The type and number of needs do not affect the claims of the present application, because in the present application all needs are treated in the same way.
  • the machine uses symbols to represent the various underlying emotions that humans endow it with.
  • the machine's emotions can be varied; each emotion can be represented by a symbol, and these symbols can be assigned values by the machine to represent their states.
  • the type and number of these emotions do not affect the claims of the present application, because in the present application all emotions are treated in the same way.
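As an illustration of the scheme above, each need or emotion can be held as a named symbol carrying an assignable state. The following Python sketch is purely illustrative; the class name and the example symbols are assumptions, not taken from the application.

```python
# Illustrative representation: a need or emotion is a symbol with a state.

class MotivationSymbol:
    """A preset need or emotion: a named symbol with an assignable state."""

    def __init__(self, name, state=0.0):
        self.name = name
        self.state = state        # current intensity of this need/emotion

    def set_state(self, value):
        self.state = value

# The type and number of symbols do not matter; all are handled alike.
needs = {n: MotivationSymbol(n) for n in ("safety", "approval", "curiosity")}
emotions = {e: MotivationSymbol(e) for e in ("pleasure", "fear")}

needs["safety"].set_state(0.9)    # e.g. the machine senses danger
```

Because every symbol has the same interface, adding or removing need/emotion types changes nothing structural, consistent with the claim that all needs and emotions are treated the same way.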
  • the relationship between the emotion of the machine and the demand state of the machine can be linked through a preset program.
  • the parameters of these preset programs can be adjusted by the machine in its own learning process according to the principle of "seeking advantages and avoiding disadvantages".
  • the emotional state of the machine and the explicit expression of the emotion of the machine can also be linked through a preset program.
  • the parameters of these preset programs can be adjusted by the machine in its own learning process according to the principle of "seeking advantages and avoiding disadvantages".
  • In the S103 module, we need to establish, at different resolutions, connection relationships among external information, memory information, the machine's internal sensor information, and the machine's motivation-related symbols and their states.
  • This connection relationship needs to correctly reflect the common sense of the world we live in, and this connection relationship is called a relationship network in the present application.
  • Mirror space means that we store information according to the time order of the input information, and for simultaneously input information (such as images), we store information according to the corresponding original spatial organization.
  • the specific method is: after the machine extracts multi-resolution information features from the input, the machine needs to use these features to build a mirror space.
  • the machine first scales and rotates the extracted features, adjusting the position, angle and size of the underlying features to those most similar to the original data, and overlaps them with the original data, so that the relative temporal and spatial positions of the underlying features are preserved and the mirror space is established. What the machine stores in memory is multi-resolution feature data organized in the way most similar to the original data, which we call the mirror space.
  • the information stored in memory has its own memory value.
  • the memory values of newly stored memories, including the memory values of need symbols and emotion symbols, are related to the activation values of the corresponding information (symbols) at the time storage occurs, usually positively correlated; the relationship can be linear or non-linear.
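A minimal Python sketch of this storage rule, assuming a linear positive correlation between the activation value at storage time and the initial memory value. The application allows any positive correlation; the linear form and the constants here are arbitrary assumptions.

```python
# Illustrative storage rule: initial memory value rises with the
# activation value the information had when storage occurred.

def initial_memory_value(activation_value, gain=1.0, floor=0.1):
    """A monotonically increasing map from activation to memory value."""
    return floor + gain * activation_value

class MemoryItem:
    """One stored piece of information (could be a need/emotion symbol)."""

    def __init__(self, content, activation_value):
        self.content = content
        self.activation_value = activation_value
        self.memory_value = initial_memory_value(activation_value)

strong = MemoryItem("loud noise + fear symbol", activation_value=2.0)
weak = MemoryItem("background hum", activation_value=0.1)
# Information that was highly activated at storage time starts out with a
# higher memory value and is therefore harder to forget.
```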
  • Proximity activation means that after a specific information in memory is activated, it will activate the information adjacent to it (referring to the information that has a close relationship).
  • Similar activation means that when a specific feature in memory receives activation signals from other features, its receiving ability is positively correlated with their mutual similarity. This is a directional receiving ability: when a memory is activated, it sends out its own activation signal and may easily activate other similar memories, because similar memories have a strong ability to receive each other's activation signals.
  • a simple activation value transfer relationship can be that the transfer coefficient is proportional to the similarity. Of course, other transfer functions can also be used, but the principle must be that the transfer coefficient and similarity are positively correlated.
  • “Strong memory activation” means that the higher the memory value, the stronger the ability to receive activation signals from other features, so well-memorized information is more likely to be activated.
  • each memory information is assigned a memory value, which is used to represent the time that can exist in memory. Those with high memory values may be long-lived and have a strong ability to receive activation signals from other features. This is to imitate the number of synapses in the human brain to represent the strength of memory, and it is assumed that those memories with more synapses are more likely to be activated by obtaining more activation energy from the surrounding environment.
  • Memory and forgetting mechanisms are relational retrieval mechanisms widely used in the present application.
  • each time the information in the memory is activated it is considered to be used once, so the memory value is increased according to the memory curve of the memory bank where the memory is located.
  • all memories decrease their memory value according to the forgetting curve of their own memory bank.
  • a memory function describes how some data increases with the number of repetitions.
  • the specific manner of increase can be represented by a function, which is the memory function. It should be pointed out that different memory functions can be adopted for different types of data.
  • a forgetting function describes how some data decreases over time.
  • the specific manner of decrease can be represented by a function, which is the forgetting function. It should be pointed out that different forgetting functions can be adopted for different types of data.
  • Memorization and forgetting mechanisms refer to the use of memory and forgetting functions on memory information.
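The memory and forgetting mechanism above can be sketched as a pair of functions: one that raises a memory value on each activation, and one that decays it over time. The additive and exponential forms below are assumptions; the application only requires that the first increases with repetition and the second decreases with time, and allows different functions for different data types.

```python
# Illustrative memory/forgetting pair (forms and constants are arbitrary).

def memory_function(memory_value, boost=1.0):
    """Applied each time the memory is activated: value grows with use."""
    return memory_value + boost

def forgetting_function(memory_value, elapsed, half_life=10.0):
    """Applied over time: the memory value decays toward zero."""
    return memory_value * 0.5 ** (elapsed / half_life)

m = 1.0
for _ in range(3):                  # the memory is activated three times
    m = memory_function(m)          # m is now 4.0

m_later = forgetting_function(m, elapsed=10.0)   # one half-life later
# Relationships that are repeatedly activated outpace forgetting and
# persist; those that are not decay away, which is how the relational
# network is optimized.
```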
  • the relational network is the context in this space.
  • These contexts arise because of the mechanism of memory and forgetting, where those relationships that cannot be repeatedly activated are forgotten, and those that can be repeatedly activated are strengthened.
  • Concepts are composed of multi-resolution information connected by coarse relational veins. It connects images, speech, text or any other form of expression of the same kind of information. Since these expressions frequently appear together and frequently transform into each other, they are more closely connected. Since human beings use language very frequently, usually in a concept, the number of activations of language may be the most, and the memory value of language is also the highest.
  • connection relationships in the relational network are linked by "adjacent activation”, “similar activation” and “strong memory activation”.
  • the tightest local connection relationship constitutes a concept (including static feature maps and their languages at multi-resolution, dynamic feature maps and their languages); a little looser than concepts is experience.
  • Experiences are those cognitive relationships that recur frequently. Because they recur, the memory connections among them gradually strengthen. Experience that prevails across human cognition is common sense. Looser than experience is memory.
  • a memory organization form that can simply express the relational network: information is stored in the order of input time.
  • Those "input temporally adjacent relationships" are expressed as "storage locations are spatially adjacent".
  • the adjacent information in storage space can be adjacent in physical location: that is, information that is adjacent in time is stored in adjacent storage units.
  • Information is adjacent in storage space and can also be logically adjacent: that is, it is stored in a logically adjacent manner, and the specific physical storage unit location is represented by a mapping table between logical locations and physical locations.
  • Another method can also be that each stored information has its own storage time coordinate, and the machine determines adjacent information by searching for adjacent time coordinates.
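The last storage option can be sketched as follows: each stored item carries its own storage-time coordinate, and "adjacent" information is found by searching nearby coordinates rather than relying on physical layout. The Python sketch uses the standard-library `bisect` module; the class and example data are illustrative.

```python
import bisect

class TimeIndexedMemory:
    """Memory in which adjacency is defined by storage-time coordinates."""

    def __init__(self):
        self._times = []    # sorted storage-time coordinates
        self._items = []    # stored information, aligned with self._times

    def store(self, t, info):
        i = bisect.bisect(self._times, t)
        self._times.insert(i, t)
        self._items.insert(i, info)

    def neighbours(self, t, radius):
        """All items whose time coordinate lies within radius of t."""
        lo = bisect.bisect_left(self._times, t - radius)
        hi = bisect.bisect_right(self._times, t + radius)
        return self._items[lo:hi]

mem = TimeIndexedMemory()
for t, info in [(0.0, "door opens"), (0.5, "dog barks"), (9.0, "phone rings")]:
    mem.store(t, info)

near = mem.neighbours(0.4, radius=1.0)   # temporally adjacent information
```

The same interface could sit on top of physically adjacent storage or a logical-to-physical mapping table; only `neighbours` matters to the rest of the system.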
  • one method is to use a dedicated similarity-comparison computing unit to handle the task of finding memories in memory that are similar to the input.
  • the similarity comparison calculation unit may be implemented by hardware or by software. It can be a single module or a module integrated into the entire arithmetic unit. Similarity comparison is a very mature algorithm and will not be repeated here.
  • the information activation value close to each other has a large transfer coefficient.
  • the activation value is transferred between similar information through similar activation, and the transfer coefficient and similarity are positively correlated.
  • strong memory activation is a specific case of proximity activation.
  • In the above-mentioned “proximity activation”, “similar activation” and “strong memory activation”, the transfer function of the activation value needs to be determined through practice, but the basic principles are as follows. The activation value transfer coefficient of “proximity activation” is inversely correlated with the time distance between the storage times of the two pieces of information; the time interval can be thought of as a medium that attenuates the propagating activation value. The activation value transfer coefficient of “similar activation” is positively correlated with the similarity between the two pieces of information. In “strong memory activation”, when an activation value is transferred through proximity activation, the transfer coefficient must consider not only the attenuation over the time interval but also the memory value of the receiving party, which can be regarded as the strength of its ability to receive activation values.
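These constraints can be made concrete with illustrative transfer-coefficient functions. Only the directions of the correlations come from the text; the functional forms, constants and names below are assumptions.

```python
# Illustrative transfer coefficients for the three activation principles.

def proximity_coefficient(time_distance, scale=5.0):
    """Inversely correlated with the time distance between storage events."""
    return 1.0 / (1.0 + time_distance / scale)

def similarity_coefficient(similarity):
    """Positively correlated with similarity (here simply proportional)."""
    return similarity

def transfer_coefficient(time_distance, receiver_memory_value, memory_gain=0.1):
    """Proximity transfer modulated by the receiver's memory value
    ("strong memory activation")."""
    receiving_ability = 1.0 + memory_gain * receiver_memory_value
    return proximity_coefficient(time_distance) * receiving_ability

near = transfer_coefficient(time_distance=0.0, receiver_memory_value=0.0)
far = transfer_coefficient(time_distance=50.0, receiver_memory_value=0.0)
strong = transfer_coefficient(time_distance=0.0, receiver_memory_value=10.0)
# near > far: long intervals attenuate transfer.
# strong > near: a well-memorized receiver receives more activation.
```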
  • the same thing may have a large number of features at different resolutions to represent the properties of this thing.
  • the similarity between two things differs at different resolutions. After a feature (attribute) of a thing at a certain resolution is activated, this feature may activate the thing's features at other resolutions through proximity activation, and may also, through similar activation, activate similar features of other things at the current resolution, which can in turn activate other attributes of those things through proximity activation. Similarity is therefore defined relative to a specified resolution. This is why we need to extract multi-resolution features from the input information simultaneously.
  • the specific activation value transfer function does not affect the basic working principle of the machine.
  • the optimal activation value transfer function of various activations needs to be determined through practice, but they must all comply with the limitations of the above principles.
  • the activation value obtained by information in the relational network fades over time.
  • the extinction curve needs to be optimized by practice.
  • the length of the extinction time needs to balance the connections among successively activated information against the activation state of newly input information. If the extinction time is too long, the activation state brought by new input information is easily overshadowed by the existing activation state, and the connection relationship between the new information and other information cannot be clearly expressed. If the extinction time is too short, the connections among successively activated information are easily lost.
  • static concepts are analogous to small parts widely used by machines, while those dynamic feature maps (including concepts representing relations) are analogous to widely used connectors.
  • big frameworks that represent a class of processes are multiple small parts (static objects) and connectors (dynamic features), organized in a certain temporal and spatial order. They can often be used for empirical imitation.
  • the common attributes may include certain features at multiple resolutions (such as some similar features at low resolutions, or one or more similar features at high resolutions), and may also include the use of cognitive layers as bridges to connect similar properties (e.g. language, analogy to dynamic mode).
  • these generalization relationships and bridges can be reflected in the associative activation process of the relationship network, and are generalized through prediction, decision-making and execution systems, so we will not specifically explain how generalization is achieved here.
  • another feature of memory storage is that the machine's motivation and motivational-state data (such as needs and need-state data, emotions and emotion-state data) are stored in memory, so that activation value transmission pathways are established between this information and other information, just as between any other pieces of information. That is, when a piece of memory information is activated, it may transmit activation values, through chain activation, to many need and emotion symbols in related memories.
  • the size of the transfer coefficient is the direct or indirect connection strength in the relational network optimized by the memory and forgetting mechanism.
  • the essence of the relational network is a causal network, which reflects causal relationships between pieces of information. Information items that are adjacent in time usually have causal relationships, and similarity connections link causal relationships from different times into a large causal-relationship network.
  • the application layer includes:
  • the input data of the sensor must first be simplified in step S102.
  • the machine uses these simplified information features to assign initial activation values to these input information using an initial activation value assignment procedure.
  • the initial activation value assignment program is a preset program, and its input parameters include input information, the machine's current needs and motivational states such as emotions. Its output includes the initial activation values assigned to the input information.
  • the machine can also repeatedly assign activation values to the input data after simple processing of the data. This step is one of the response outputs of the prediction, decision-making and execution system.
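One hypothetical form of the initial activation value assignment program described above: a preset function whose inputs are the simplified input features and the machine's current motivational state, and whose output is an initial activation value per feature. The weighting scheme, names and example data are assumptions, not specified by the application.

```python
# Illustrative initial activation value assignment program.

def assign_initial_activation(features, motivation_state):
    """Map each simplified input feature to an initial activation value.

    features: {feature_name: salience}
    motivation_state: {motivation_symbol: intensity}
    Features receive larger initial activation when motivation is aroused.
    """
    arousal = 1.0 + sum(motivation_state.values())
    return {name: salience * arousal for name, salience in features.items()}

features = {"sudden_motion": 0.8, "steady_hum": 0.1}
calm = assign_initial_activation(features, {"fear": 0.0})
alarmed = assign_initial_activation(features, {"fear": 1.0})
# The same input yields larger initial activation values when fear is high,
# so more memory content gets activated downstream.
```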
  • After the machine assigns the initial activation values to the input information, it carries out chain activation according to the principles of “proximity activation”, “similar activation” and “strong memory activation”.
  • the machine realizes the association function through chain activation, so we also call it association activation.
  • Machines use associative activation to find experiences related to the input information.
  • Other types of information associated with the input information exist in memory.
  • the information that the machine activates through association is usually a memory with a higher memory value.
  • These memories are usually higher-level memories through memory and forgetting mechanisms because they can be repeatedly activated. The reason why these memories can be repeatedly activated is because such information relationships can be repeated in our lives. They are the "experiences" that machines gain. Therefore, the mechanism of memory and forgetting is the optimization mechanism of the relational network and the basis of intelligence.
  • Memories activated by the machine from input information may include all types of memories. For example, when speech is input, the machine may, based on similar speech sounds in memory, activate images, words, feelings, emotions, other related speech, or a past episode closely associated with that speech.
  • the specific activation content depends on: (a) The relational network obtained by the machine through learning experiences and learning parameter settings. (b) The chain activation parameter setting of the machine (this is equivalent to setting the association mode of the machine). (c) The initial activation value assigned by the machine to the input information. The larger the initial activation value, the more content the machine can activate.
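The chain (associative) activation described above can be sketched as activation spreading through a weighted relationship network, attenuating at each hop; nodes whose accumulated value passes a threshold count as "associated" memories. The graph, transfer coefficients, threshold and hop limit below are all illustrative assumptions.

```python
# Illustrative chain activation over a weighted relational network.

def chain_activation(edges, initial, threshold=0.1, hops=3):
    """Spread activation through a weighted graph.

    edges: {node: [(neighbour, transfer_coefficient), ...]}
    initial: {node: initial activation value}
    Returns the accumulated activation value per reached node.
    """
    activation = dict(initial)
    frontier = dict(initial)
    for _ in range(hops):
        nxt = {}
        for node, value in frontier.items():
            for neighbour, coeff in edges.get(node, []):
                passed = value * coeff    # attenuated at each hop
                if passed >= threshold:   # too-weak signals die out
                    activation[neighbour] = activation.get(neighbour, 0.0) + passed
                    nxt[neighbour] = nxt.get(neighbour, 0.0) + passed
        frontier = nxt
    return activation

edges = {
    "speech 'dog'": [("image: dog", 0.9), ("word: dog", 0.8)],
    "image: dog": [("memory: bitten by dog", 0.5), ("emotion: fear", 0.4)],
}
result = chain_activation(edges, {"speech 'dog'": 1.0})
# Speech input activates an image, a word, an old episode and an emotion,
# mirroring the example in the text.
```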
  • Since the size of an object is one of its low-resolution features, the machine also compares the size of a specific object in the field of view with that object's normal size in the feature map, to help establish the depth of field of the environment.
  • the machine typically resizes the reconstructed environment based on the size of its own target (usually the conceptually relevant image with the highest activation value).
  • Reconstructing the 3D environment through multiple local information is a mature technology, and will not be repeated here.
  • the consciousness of the machine is a behavioral way for the machine to distinguish itself from the outside world, and to decide how to interact with the outside world according to the method of "seeking advantages and avoiding disadvantages". So the essence of consciousness is a way of behavior. It is precisely because of the distinction between the self and the non-self that the machine generates consciousness and establishes a connection between the self and the external things. The essence of this connection is the relationship between external things and one's own "benefit" and "harm”. And this relationship is gradually established in the process of machine learning.
  • the machine uses these memories to create a mirror space and a mirror image of the machine itself, and uses the mirror space and the mirror image of the machine itself as a way to combine these experiences.
  • the machine views these memories from a third-person perspective.
  • These memories contain information about the machine's emotions, needs, and results.
  • the machine uses this information to plan its own response to the input information in a way that seeks advantages and avoids disadvantages, and may repeatedly evaluate the possible outcomes of these responses.
  • the source of power to drive the machine is the motive of the machine, and the motive of the machine can be summarized as “seeking advantages and avoiding disadvantages”.
  • "Benefits” and "harms” are partly preset; partly established in acquired learning, because they are related to the machine's own needs. Analogy to humans, for example, at the beginning, “water”, “milk” and “food” are innate “benefits”, and later through learning, we obtained the connection between "exam scores”, "money” and our innate needs, Later, we also found that the objects of operation can also be insubstantial things such as “love” and "time”, and we even pursue dominance in the group. This is a kind of underlying motivation in our genes to seek advantages and avoid disadvantages. extension.
  • the trainer may also link the behavior to the outcome in a single memory frame later on by pointing out the behavior itself and giving feedback. The trainer does not even need to specify which behavior is good or bad. The machine only needs to receive the correct feedback every time, and through memory and forgetting, it can gradually establish the connection between the correct behavior and the demand value.
  • the machine will establish the relationship between various specific things and the underlying motivation according to the feedback from the outside world. And these relationships are the basis for machines to make decisions.
  • the underlying motivation of the machine is relatively simple and can be pre-set. For example, giving the machine the motivation to learn and obey human laws, giving the machine the motivation to seek human approval, giving the machine the motivation to avoid danger, giving the machine the motivation to protect the safety of the owner, etc.
  • Machine motivation and motivational states are closely related to emotions.
  • the machine learns what emotions can bring gains and losses in what environment, and in turn adjusts emotions or emotional expressions. Because of the close relationship between emotion and machine motivation (including motivational state), machine motivation can also be expressed by emotional needs. Machines can use the pursuit of emotional needs to make choices and respond.
  • the storage of any input information by the machine will also store the machine's motivation and motivational state (such as needs and demand states, emotions and emotional states, etc.).
• Motivation is represented by a symbol, and its memory value is positively related to its activation value at the time storage occurs. Motivations that obtain high activation values therefore tend to acquire high memory values when stored, and may become long-term memories. These long-term memories can be activated again and again, affecting the machine's decision-making and behavior in the long run. Likewise, information temporally adjacent to motivational states with high activation values (such as strong emotions, or large gains or losses) may itself obtain a high activation value, and thus a high memory value, when it is stored.
• The high activation value of a motivational symbol can also propagate in reverse, increasing the activation value of the associated information during storage and thereby improving the memory value that information obtains.
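The relation described above, between the activation value at the moment of storage and the memory value an item receives, can be sketched as follows. This is a minimal illustrative model, not the patent's implementation; the names `store`, `long_term`, the linear coefficient `k` and the threshold are all our own assumptions.

```python
# Hypothetical model: memory value is positively correlated with the
# activation value when storage occurs; high-value items may become
# long-term memory. All constants are illustrative assumptions.

LONG_TERM_THRESHOLD = 50.0

def store(memory_bank, symbol, activation_value, k=0.8):
    """Store `symbol`; its initial memory value grows with the
    activation value at the moment storage occurs."""
    memory_value = k * activation_value
    memory_bank[symbol] = memory_value
    return memory_value

def long_term(memory_bank):
    """Items whose memory value exceeds a threshold may become
    long-term memory, to be activated again and again later."""
    return {s for s, v in memory_bank.items() if v >= LONG_TERM_THRESHOLD}

bank = {}
store(bank, "danger:H", activation_value=90)  # strong emotion -> high value
store(bank, "weather", activation_value=10)   # commonplace -> low value
assert long_term(bank) == {"danger:H"}
```

The same mechanism also models the forgetting side: commonplace inputs receive low activation, hence low memory value, and never leave the temporary bank.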
  • Language plays an important role in machine intelligence.
  • Language is a set of symbols established by humans to better communicate experiences. Each symbol represents some specific thing, process and scene.
• The associated memories represented by the language are activated. These memories may include information about the language itself as well as memories about language use (such as phonetic or textual emphasis, or distrustful or mocking intonations).
  • the activated information constitutes an activation information flow.
• The activation value of the activated information will decay over time. The decay parameters are related to the motivations and states of the machine (e.g. needs and demand states, emotions and emotional states).
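The decay just described can be sketched with a simple exponential model in which the decay rate depends on the machine's motivational state. This is an assumed functional form for illustration only; the parameter names `emotion_intensity` and `base_rate` are not from the source.

```python
import math

# Hypothetical decay model: activation decays over time, and a stronger
# emotional state slows the decay (so emotionally charged information
# stays activated longer). Constants are illustrative assumptions.

def decayed_activation(a0, t, emotion_intensity=0.0, base_rate=0.5):
    """Exponential decay; emotion_intensity in [0, 1] slows the decay."""
    rate = base_rate * (1.0 - 0.9 * emotion_intensity)
    return a0 * math.exp(-rate * t)

calm = decayed_activation(100.0, t=2.0, emotion_intensity=0.0)
excited = decayed_activation(100.0, t=2.0, emotion_intensity=1.0)
assert excited > calm  # strong emotion preserves activation longer
```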
  • the chain activation of language realizes the context-related recognition of all input information.
  • the input information here includes both environmental information and activated memory information.
• The mutual activation and value assignment among these pieces of information reflects their contextual relationship. This association is broader than the content of a statistically generated semantic library: it covers not just language but all sensory input and associated memory. Therefore, through the multi-resolution information formed by S102 and the common-sense network formed by S103, the machine can connect language with static and dynamic images, feelings, needs and emotions, as well as with related language and memory. When this connection is incorporated into the machine's understanding of the language input, and the machine responds based on that understanding and on relevant experience, it reflects that the machine truly understands the meaning of the input language.
• Language input constitutes an input information flow.
  • the corresponding activation memory also constitutes an activation information flow.
• Reconstruction means rebuilding the environmental information contained in the activated memories, by overlapping the same parts of the activated environment-related information (such as images, sounds and feelings) to form an imagined process.
  • the machine also needs to integrate the information activated by the input language about sensation, emotion, vision, action, body state, etc. and the machine's own existence. This information often also activates a similar experience in the machine itself. Therefore, the machine can experience the sensory-related information such as sensation, vision, emotion, movement or body state brought by these languages.
  • the essence of forecasting is a statistical behavior.
  • the prediction of the machine is to infer the various possibilities and corresponding probabilities of the development of things, or the various possibilities and corresponding probabilities of the behavior of others, based on past experience or similar experience.
• When information is entered, the machine does not need to exhaustively predict all possible outcomes, which would in any case be an impossible task. The machine only needs to evaluate the possible outcomes of those experiences activated in relation to the input information. This amounts to using common sense to limit the range of possible outcomes. Within this limited scope, the machine can use any current artificial intelligence prediction method, such as Monte Carlo search, decision trees, Bayesian estimation, or rule-based machine reasoning, to infer each possible development of the current situation and its probability.
  • the machine uses the associative chain activation method to limit the relevant decision-making range according to the activation state of the memory. Therefore, the evaluation and response of the machine to the input information are based on the input information and the scope limited by the activated memory to search, evaluate and respond. This amounts to using common sense to define what needs to be searched, evaluated, and responded to.
• Machines can imitate these past experiences and establish possible response processes by means of segmented imitation. Then, within this limited scope, the machine can use any current artificial intelligence prediction method, such as Monte Carlo search, decision trees, Bayesian estimation, or rule-based machine reasoning, to choose its own decisions and responses according to the predicted "benefit" and "harm".
• The basic starting point of the machine's response to input information is to respond based on past experience so as to maximize the probability of things that generate "benefit", especially scenarios with high return value, and to reduce the probability of things that produce "harm", especially scenarios that could bring huge loss value. Driven by the motivation of weighing pros and cons, the machine combines experience-based responses to achieve the goal of "seeking advantages and avoiding disadvantages".
  • the decision of the machine is a path planning method based on the prediction ability of the machine.
  • the purpose of the path is to maximize the benefits and minimize the losses.
• With the ability to predict, the machine turns the completely open problem of decision-making and response into a series of relatively closed problems: how to increase or decrease the probability of certain events within a certain range.
• Because the common-sense relational network is established in the previous steps, when anything happens (the effect in causality), the conditions related to it (the causes in causality) can be obtained through the relational network.
  • Those causal relationships with strong associations are strongly connected in the relational network due to repeated occurrences. Therefore, the relational network can express the causal relationship layer by layer.
  • each decision-making step is to make the development direction of things "seek advantages and avoid disadvantages". This may be a process of interaction with the outside world.
  • the interaction itself is a kind of "seeking advantages and avoiding disadvantages” based on past experience to promote the development direction of things.
  • the probability of occurrence of events with high profit value is continuously increased, and the probability of occurrence of events with high loss value is continuously reduced. This is an iterative process.
  • each step is handled in the same way.
  • the machine increases the probability of occurrence of events leading to high revenue value layer by layer. This is similar to a chained activation process, step-by-step activating those events on the path to high gain, while carefully avoiding those events that might lead to high loss values.
• The response planning problem of the whole machine thus becomes the problem of finding the optimal path in the causal chain network, which is a problem that current machine intelligence has already solved.
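The "optimal path in the causal chain network" can be illustrated with a standard shortest-path technique: if each edge carries the probability that one event leads to the next, the most promising plan is the chain from the current state to a high-gain event with maximum overall probability, found via Dijkstra's algorithm on negative log-probabilities. This is one conventional realization of the idea, not the patent's specific algorithm, and the graph contents are invented for illustration.

```python
import heapq
import math

def best_causal_path(graph, start, goal):
    """Max-product path: Dijkstra over -log(probability) edge costs."""
    pq = [(0.0, start, [start])]
    seen = set()
    while pq:
        cost, node, path = heapq.heappop(pq)
        if node == goal:
            return math.exp(-cost), path   # overall chain probability
        if node in seen:
            continue
        seen.add(node)
        for nxt, p in graph.get(node, {}).items():
            heapq.heappush(pq, (cost - math.log(p), nxt, path + [nxt]))
    return 0.0, []

# Hypothetical causal chain network: edge weights are P(next | current).
graph = {
    "now": {"ask_owner": 0.9, "act_alone": 0.6},
    "ask_owner": {"high_gain": 0.8},
    "act_alone": {"high_gain": 0.5},
}
prob, path = best_causal_path(graph, "now", "high_gain")
assert path == ["now", "ask_owner", "high_gain"]
assert abs(prob - 0.72) < 1e-9
```

In the same framework, events with high loss value can be modeled as nodes whose incoming edges are penalized, so the planner routes around them.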
  • a machine can determine the prior probability of an event (such as an event that brings a high gain value or a loss value) by searching its memory.
  • the causal strength (posterior probability) between a condition and that event can then be determined through a network of relationships.
  • the strength of connection between different conditions in the relational network can reflect whether different conditions are independent.
  • the machine only needs to select some relatively independent conditions, and through the naive Bayes algorithm, it can predict the probability of the event.
  • the machine can decide its own response based on the calculated probability.
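The naive Bayes step described above can be sketched directly: the prior for an event comes from memory search, the per-condition likelihoods come from connection strengths in the relational network, and relatively independent conditions are combined multiplicatively. The numbers and event names below are illustrative assumptions.

```python
def naive_bayes(prior, likelihoods):
    """Posterior for an event given relatively independent conditions.
    likelihoods: list of (P(cond | event), P(cond | not event))."""
    p_event, p_not = prior, 1.0 - prior
    for p_c_given_e, p_c_given_not in likelihoods:
        p_event *= p_c_given_e
        p_not *= p_c_given_not
    return p_event / (p_event + p_not)

# Event: "owner is pleased"; observed conditions: smiling, praising.
posterior = naive_bayes(0.3, [(0.8, 0.2), (0.7, 0.1)])
assert posterior > 0.3  # the evidence raises the prior probability
```

With this estimate in hand, the machine can compare the expected gain and loss of each candidate response, as described in the following items.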
  • responses can take various forms, such as: increasing the probability of this event, or reducing the probability of this event, or not affecting the probability of this event. It depends on the value of gain and loss of this event to the machine. To increase or decrease the probability of this event occurring, it can be further planned to increase or decrease the probability of occurrence of conditions related to the occurrence probability of this event. This process is essentially an iterative probabilistic path search problem.
  • the machine uses the probabilities of various possible outcomes in memory as prior probabilities under conditions similar to the current situation. The probability of occurrence of various gain and loss values is then calculated based on the conditions associated with each outcome and the posterior probability between the outcomes. The machine then generates the next target, further determining the probability of each condition occurring.
  • the response of the machine at this time can be (a) search and count the posterior probability between each condition and the occurrence of the corresponding gain value and loss value. It is then used to update the overall gain and loss assessment. This can be done by searching for the connection strength between the relational networks. (b) Further update the probability that a certain condition occurs at present.
  • the prediction of possible external feedback after the machine responds to itself also includes activation of two types of motivational state memory.
  • One is the need and emotional state in which the self is in the reappearing memory, which comes from the various feelings and emotions about the self in the activated memory.
  • One is the need and emotional state of oneself when observing a similar situation from the perspective of an observer, which comes from observing the various feelings and emotions generated by the machine in the activated memory of others in similar situations. Therefore, when a machine predicts "benefit” and "harm”, it simultaneously evaluates the "benefit” and "harm” brought by an event from its own perspective and from the perspective of others.
• The predictive ability of a machine includes not only predicting the "benefit" and "harm" a thing may bring, but also predicting the responses that the machine itself or others may take, driven by "benefit" and "harm", and the impact of others' responses on its own "benefit" and "harm". These are obtained from the statistical relational network, the related needs and emotional states, and other motivational state values. Therefore, the machine's evaluation result changes dynamically as more information is input.
  • the decision-making and response process of the machine is a dynamic path planning process. It is driven jointly based on empirical responses and probabilistic calculations based on gains and losses.
  • the machine can decompose an abstract target of profit-seeking and avoiding harm through layer-by-layer iterative decomposition. These tasks can be subdivided into very specific target tasks, such as all the way down to the underlying driving capabilities of the machine. This process is the machine's decision-making and response system.
• Mimicry is an ability that humans carry in their genes. For example, take a babbling child: if every time he or she returns home we greet him or her with "you are back", then after a few times, when going home again, he or she will take the initiative to say "you are back". This shows that the child has already begun to imitate others without understanding the meaning of the information. In the same way, we let machines learn through imitation. Therefore, imitation needs to be built into the machine as an underlying motivation.
  • these words or actions will activate the machine's own relevant memory.
  • These memories may be a similar pronunciation, or a basic action fragment. These memories further activate sensory information, need and emotional information, language or motor memory associated with these memories.
  • the machine will make similar speech output or action output by adjusting the underlying driving parameters in the experience through the decision-making system based on these activated memories.
  • the underlying driver refers to the voice output of the underlying experience, or the action output of the underlying experience. They are muscle-driven commands corresponding to specific voices or actions, where the parameters are learned and continuously updated through feedback.
  • Humans can preset some basic speech or action (including facial expressions and body language) capabilities for machines.
• The optimization of their parameters can occur through subsequent learning and training: the parameters and the results of the corresponding behaviors are associated through memory, continuously adjusted through the emotion and demand system (influenced by self or external feedback), and ultimately driven by the underlying motivations.
  • the machine obtains the relationship between different parameters stimulated by different external information to form memory.
  • These memories are the knowledge and skills of the machine in the face of external information input. They include behavioral habits such as language, movements, expressions, and body movements.
• Humans can also give machines preset conditioned reflex systems. These systems do what a human would want the machine to do under a given input: for example, the machine's avoidance action in a critical situation, or a specific output action under specific input information (such conditioned reflex systems can serve purposes such as machine self-checking, emergency shutdown, or adjusting the machine's working state).
  • the machine can specifically execute the response according to its own decision-making.
  • the execution response step is a process of translating the plan into the actual output.
• When the machine chooses speech output among the various possible response steps, the case is relatively simple: convert the image feature map to be output into speech, organize it into a language output sequence using the relationships between language elements in the relational network (the grammatical knowledge existing in the relational network), and invoke pronunciation experience to implement it.
• The machine may also choose dynamic features expressing the whole sentence (such as different movement patterns of tone, audio pitch or stress change) to express doubt, protest, distrust or emphasis, as humans commonly do. These methods are usually low-resolution features of a sentence or of an entire speech. Because machines learn these expressions from human life, any human expression can in theory be learned by machines.
• The machine needs to respond to the sequence of goals to be output. Since these goals involve different times and spaces, they are divided in time and space to facilitate the coordination of the machine's own execution.
• The method adopted is to group targets that are closely related in time and targets that are closely related in space. Because the information combination formed from dynamic and static feature maps carries the time and space information of its related memory's environment, this step can be handled as a classification problem. It is equivalent to rewriting the general script into sub-scripts.
  • the machine needs to combine the intermediate goals in each link with the real environment again, and use the method of segmental imitation to unfold it layer by layer.
• The response plan proposed by the machine at the top level is usually composed of highly generalized process features and highly generalized static concepts (because highly generalized processes match multiple similar memories, the responses established from them are also highly generalized). For example, under the total output response of "business trip", "going to the airport" is an intermediate-link goal. But this goal is still very abstract, and the machine cannot yet imitate it directly.
  • the machine needs to be divided according to time and space, and the link that needs to be executed in the current time and space is the current goal. And take other time and space goals as inheritance goals and put them aside for the time being.
• For each intermediate-link target, the machine still needs to further subdivide time and space (writing the graded script down another level).
  • This is a process of increasing temporal and spatial resolution.
  • the process of converting a target into multiple intermediate targets is still the process of using decision-making ability, analyzing various possible outcomes and probabilities, and choosing its own response according to the principle of "seeking advantages and avoiding disadvantages".
  • the above process is iterative, and the process of dividing each goal into multiple intermediate goals is a completely similar processing flow. It has to be broken down to the underlying experience of the machine.
• The underlying experience for language is mobilizing muscles to produce syllables; for movements, it is broken down into driving commands to the relevant "muscles". This is a tower-shaped breakdown structure.
  • the machine starts with a top-level goal and decomposes a goal into multiple intermediate goals. This process is to create dummy intermediate process objects, which are retained if they "fit”. If it "doesn't meet the requirements" then recreate it. This process unfolds layer by layer, eventually building up the colorful responses of the machine.
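The tower-shaped, layer-by-layer unfolding described above can be sketched as a recursive decomposition: each goal expands into intermediate goals until every leaf is an underlying driving command. The decomposition table below is a made-up example (reusing the "business trip" illustration from the text); a real machine would draw these expansions from activated memories rather than a fixed dictionary.

```python
# Hypothetical experience: goal -> intermediate goals at the next level.
DECOMPOSE = {
    "business trip": ["go to airport", "board plane"],
    "go to airport": ["leave house", "take taxi"],
}
# Leaves the machine can execute directly (underlying drives).
PRIMITIVES = {"leave house", "take taxi", "board plane"}

def unfold(goal):
    """Recursively expand a goal into executable primitive steps."""
    if goal in PRIMITIVES:
        return [goal]
    steps = []
    for sub in DECOMPOSE[goal]:
        steps.extend(unfold(sub))
    return steps

assert unfold("business trip") == ["leave house", "take taxi", "board plane"]
```

In the full system each expansion step is itself a decision (a "dummy intermediate object" kept only if it fits), and unexecuted branches remain as inheritance goals when new information interrupts the process.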
  • the machine may encounter new information at any time, causing the machine to process various information, and these original goals become inheritance motivations.
  • This is equivalent to the process of organizing activities, constantly encountering new situations that need to be resolved immediately, otherwise the activities cannot be organized. So the director called off other activities and first came to solve the problems encountered in front of him. Once resolved, the activity continues. Another situation is that during this process, the director suddenly received a new task, so after weighing the pros and cons, the director decided to suspend the activity first and prioritize the new task.
• The machine can perform the imitation tasks that are already executable while still decomposing other goals into more detailed ones; in other words, the machine thinks as it acts. This is necessary because reality is complex and changeable, and a machine cannot make a complete plan with full advance knowledge of the outside world. Accomplishing a goal is therefore a process in which the environment and the machine interact.
  • the machine can use the above capabilities to complete an understanding and response to the input information.
  • This process acts as a minimal cycle of interaction between the machine and the outside world.
• By continuously repeating this process, the machine completes larger goals; this is expressed as a continuous interaction between the machine and the outside world, and exhibits machine intelligence.
  • the experience of machines not only forms connections in relational networks through memory and forgetting mechanisms, but also actively strengthens them.
  • This active reinforcement connection can take many forms: for example, through language learning from the experiences of others.
  • This experience is stored in memory as new input, which is part of memory.
• The machine actively turns these memories into long-term experience by repeatedly recalling the information most closely related to "benefit" and "harm".
  • the S105 module in FIG. 1 is a communication connection module of the machine.
  • the machine can exchange data with other external machines (including computers) through the communication module according to a preset protocol.
  • Machines can realize distributed computing through data exchange.
  • on-site machines may process part of the data, and may also transmit some or all of the information to the central brain, and make decisions with the help of the central brain's powerful information storage and processing capabilities.
  • Machines can also share memory through data exchange. For example, the experience gained by one machine can be passed on to other machines, or the memories of multiple machines can be fused to form a more complete experience. Computing power can also be shared among machines, thus constituting distributed thinking capabilities.
• Through the communication connection module, machines can realize cognitive sharing, decision sharing and behavior coordination, and thereby move toward super artificial intelligence.
  • Figure 1 is a schematic diagram of one possible component of the machine.
• In order for the machine to generate a cognitive mode similar to that of humans, the S101 module needs to use one or more general-purpose sensors: visual, auditory, taste and smell, tactile, gravity-direction and attitude sensors, etc., and can also add application-specific sensors (for example, autonomous driving can add infrared sensors, lidar sensors, etc.). Machines also need sensors that monitor their own state; the data from these sensors is likewise part of the machine's perception information.
  • S101 is a module mainly composed of sensor hardware and software corresponding to the sensor. The purpose is to perceive the information outside the machine and the machine itself through the sensor. The data generated by the sensors of the machine, which is input to the processing unit of the machine, is called input data.
  • the communication between the sensor data of the machine and the processing unit of the machine can use any existing communication system or a self-defined communication system, and these specific communication forms do not affect the implementation of this patent application.
  • the machine uses symbols to represent a type of motivation. For example, language symbols, or some kind of sensory information data are directly used, or a symbol is directly artificially specified to represent a certain type of motivation of the machine.
• The "Sa" symbol is used to represent the safety requirement of the machine; for example, a value from 0 to 100 represents the state of the machine's safety requirement, where 0 is very insecure and 100 is completely secure.
  • the "dangerous" symbol is used to represent danger, and H, HM, M, ML, and L are used to represent the degree of danger from high to low.
  • a "smiley face” is used to represent a happy mood, and 0 to 10 is used to represent the degree of happiness.
  • how many motive types are established, what form of symbols is used to represent the motive types, and what form is used to express the state of the symbol does not affect the claims of the present application.
• All symbols and corresponding states are handled in a completely similar manner. For example, we can use different symbols to create emotions for machines such as excitement, anger, sadness, nervousness, anxiety, embarrassment, boredom, calmness, confusion, disgust, pain, shame, fear, happiness, romance, sympathy, and contentment.
  • Each emotion has its own quantitative space.
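The symbol-plus-quantization-space representation described above (the "Sa" safety symbol on 0..100, a "smiley face" happiness symbol on 0..10) can be sketched as a small data structure. The class and field names here are our own assumptions for illustration.

```python
class MotivationSymbol:
    """A motivation type represented by a symbol, with a quantified
    state space [lo, hi] (e.g. "Sa" on 0..100, "smiley" on 0..10)."""

    def __init__(self, name, lo, hi, value):
        self.name, self.lo, self.hi = name, lo, hi
        self.value = value

    def set(self, value):
        # Clamp the state into the symbol's quantization space.
        self.value = max(self.lo, min(self.hi, value))
        return self.value

safety = MotivationSymbol("Sa", 0, 100, 100)   # 100 = completely secure
happy = MotivationSymbol("smiley", 0, 10, 5)

safety.set(-20)           # a danger event drives the need below range
assert safety.value == 0  # clamped: very insecure
assert happy.set(7) == 7
```

How many motivation types exist and what symbols denote them is left open by the text; the point is only that every symbol carries a comparable, bounded state that the pros-and-cons weighing can operate on.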
• The machine's motivation to seek advantages and avoid disadvantages means weighing pros and cons in the space established by its various motivational needs to find an acceptable region, and then using experience to push events into that acceptable region.
  • the strategy of this push may be to directly promote the direct goal, or it may be a precondition to promote the realization of the direct goal, thereby increasing the probability of the direct goal occurring.
  • Multi-resolution information feature extraction system can use the current artificial intelligence perception processing capabilities to extract the features of input information at different resolutions.
  • the difference from the current artificial intelligence perception processing method is that the current algorithm realizes the mapping from the data space to the label space.
  • the goal of the S102 module is to extract local common features in the input data.
  • the existing artificial intelligence perception processing method is to optimize the mapping network by optimizing the error function through the mapping of a large amount of data to the label.
  • This mapping is data to specific tags, and the optimized mapping relationship is closely related to specific tags, so this algorithm is not universal.
• In the present method, by contrast, all input data are directly mapped to local shared features.
  • the local shared features are the basic feature library established through pre-training.
• The basic assumption behind establishing these basic feature libraries is that human evolution develops in the direction of efficient use of computing power, because only in this way can energy consumption be minimized while algorithmic complexity grows to handle a complex environment, thereby increasing the probability of survival.
  • the innately formed extraction algorithm for widely existing local features is a concrete manifestation of such an evolutionary direction. Because in this way, these algorithms can be reused to the greatest extent possible, and the energy efficiency ratio of the calculation can be improved.
• Multi-resolution features are what make generalization possible. For example, if an animal's claws resemble a cat's claws, we may activate the prediction that it might scratch us. The extraction of multi-resolution features is therefore the key to knowledge generalization: because local features at various resolutions are widely shared among different things, they form a widely available bridge for generalizing knowledge.
• The specific multi-resolution extraction method is to compress the data at different resolutions, and then use data extraction windows of different sizes to repeatedly search for local similarity.
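The compress-then-slide-windows procedure just described can be sketched on a 1-D signal standing in for image or audio data. This is a toy illustration of the search for repeated local patterns, not the patent's actual feature extractor; the function names and tolerance are assumptions.

```python
def compress(signal, factor):
    """Reduce resolution by averaging non-overlapping blocks."""
    return [sum(signal[i:i + factor]) / factor
            for i in range(0, len(signal) - factor + 1, factor)]

def local_matches(signal, window, pattern, tol=1e-9):
    """Slide a window of the given size over the signal and collect the
    positions where the local content matches a known local shared
    feature (the `pattern`)."""
    return [i for i in range(len(signal) - window + 1)
            if all(abs(signal[i + j] - pattern[j]) < tol
                   for j in range(window))]

sig = [1, 2, 1, 2, 5, 6, 1, 2]
assert local_matches(sig, 2, [1, 2]) == [0, 2, 6]  # repeated local feature
assert compress([2, 4, 6, 8], 2) == [3.0, 7.0]     # lower-resolution view
```

In the full scheme the same window search is repeated at each compressed resolution, so a pattern found at low resolution (coarse shape) and one found at high resolution (fine detail) both enter the feature library.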
  • PCT/CN2020/000109 entitled “A method for imitating human memory to realize general machine intelligence”
  • Multi-resolution feature extraction does not require a large number of samples because it is a mapping from features to concepts, and usually a concept contains limited features.
  • Associative activation refers to the chain activation process under the principle of “proximity activation”, “similar activation” and “strong memory activation”.
  • the machine assigns its initial activation value according to the motivation, and performs chain activation through the principle of "proximity activation”, “similar activation” and “strong memory activation”.
• When its activation condition is met, node (i) is activated and passes its activation value to the other feature-map nodes connected to it.
  • the transfer coefficient is determined according to the principle of "proximity activation”, “similar activation” and “strong memory activation”. If a node receives the passed activation value and accumulates its own initial activation value, the total activation value is greater than the preset activation threshold of its own node, then it is also activated, and it will also be connected to other nodes that are connected to it. Feature maps pass activation values. This activation process is passed on in a chain until no new activation occurs, and the entire activation value transmission process stops. This process is called the associative activation process.
  • the associative activation process is a search method for related memories, so it can be replaced by other search or lookup methods that perform similar functions.
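The chain activation process described in the preceding items can be sketched as a graph propagation: nodes pass activation to their neighbors through transfer coefficients, a node whose accumulated activation exceeds its threshold fires and propagates in turn, and the process stops when no new node is activated. The graph contents and numbers below are illustrative assumptions, not the patent's data.

```python
def chain_activate(edges, thresholds, initial):
    """edges: {node: {neighbor: transfer_coefficient}}.
    Returns the accumulated activation of every node that fired."""
    activation = dict(initial)
    fired = set()
    frontier = [n for n, a in initial.items()
                if a >= thresholds.get(n, 0)]
    while frontier:
        node = frontier.pop()
        if node in fired:
            continue
        fired.add(node)
        for nbr, coeff in edges.get(node, {}).items():
            # Pass activation scaled by the transfer coefficient
            # ("proximity", "similarity", "strong memory" principles).
            activation[nbr] = activation.get(nbr, 0.0) + coeff * activation[node]
            if nbr not in fired and activation[nbr] >= thresholds.get(nbr, 0):
                frontier.append(nbr)
    return {n: activation[n] for n in fired}

edges = {"cat": {"claw": 0.6, "milk": 0.2}, "claw": {"scratch": 0.9}}
out = chain_activate(edges,
                     {"cat": 1, "claw": 3, "milk": 3, "scratch": 4},
                     {"cat": 10.0})
assert "scratch" in out   # activation propagated along the chain
assert "milk" not in out  # stayed below threshold; the chain stops there
```

As the text notes, this is only one realization of a related-memory search; any lookup method performing the same function could replace it.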
• A specific method for realizing the associative chain activation system has been proposed in the above-mentioned application and will not be repeated here.
  • the preset basic response system is a machine response system realized by a preset program. These systems are instinctive responses to which machines respond. These instinctive responses can be gradually optimized in acquired learning.
  • Basic actions include programs that give machines the ability to imitate human beings to make basic actions.
  • Actions include language pronunciation and expression, and the imitation of body movements.
• Instinct response refers to realizing the machine's output response under specific input through a preset program. For example, the machine's avoidance of high temperature, its instinctive response to falling, and its avoidance of sudden impact can all be achieved through preset programs. These responses are adjusted through experience in subsequent learning: for example, from observed outcomes the machine learns that under certain specific inputs an instinctive response may bring serious losses. Therefore, when the input that triggers the instinctive response is accompanied by specific information indicating that the response could cause serious losses, the machine can suppress the instinctive response and seek a better balance of gain and loss.
• In step S102 of FIG. 1, after the machine extracts the multi-resolution features, an initial activation value assignment program assigns initial activation values to the input information.
  • the initial activation value assignment program is a preset program, and its input parameters include input information, the machine's motivation and motivation state at this time, such as needs and demand states, emotions and emotional states. Its output includes the initial activation values assigned to the input information.
  • a simple approach is to directly assign equal activation values to all input information based on the machine’s motivational states such as expected information, needs, and emotions at this time. Another method is to do a simple classification according to the expected information of the machine, and use different initial activation values for different expected information.
  • a method of assigning different initial activation values to different resolution information can also be used. For example, a larger activation value is assigned to low-resolution information, and a lower activation value is assigned to high-resolution information.
  • the assignment can also be reversed, depending on what the machine expects from the input. The machine's expectation of input information comes from the results of previous information processing.
  • The machine can also adjust initial activation values according to the frequency of the input: for frequently occurring stimuli, the assigned initial activation value gradually decreases. Information with a low initial activation value also passes on only low activation values to other information during the chain activation it initiates.
  • The initial memory value of stored information is positively correlated with its activation value at the time of storage. Commonplace everyday things therefore receive low initial activation values, the chain activation they initiate assigns low activation values to other information, and the memory values they obtain when stored are also low.
  • New memories are first placed in a temporary memory bank; only information that accumulates a sufficient memory value there is moved into the other memory banks.
  • Initial activation values can also be assigned in multiple passes. For example, the aforementioned methods first assign a rough initial activation value to the input information. Then, based on preliminary processing of the input, its degree of connection with “benefit” and “harm” is estimated, and the value is assigned again by a preset program according to that degree of connection. For information that may bring great “benefit” or “harm”, the machine's strategy may be to analyze it further.
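The multi-pass assignment described above can be sketched in code. Everything here is an illustrative assumption (function names, constants, and the habituation formula are not part of the disclosure); it only shows how resolution, stimulus frequency, and the estimated “benefit”/“harm” connection could combine into one initial activation value:

```python
# Illustrative sketch of multi-pass initial activation value assignment.
# All names and constants are assumptions for demonstration only.

def initial_activation(resolution, frequency, relevance_to_benefit_harm):
    """Assign an initial activation value to one piece of input information.

    resolution: "low" or "high" multi-resolution feature level
    frequency: how often this stimulus has recently appeared
    relevance_to_benefit_harm: 0..1, estimated link to "benefit"/"harm"
    """
    # Pass 1: low-resolution features start higher than high-resolution ones.
    base = 1.0 if resolution == "low" else 0.6
    # Habituation: frequent stimuli receive progressively lower values.
    habituated = base / (1.0 + 0.5 * frequency)
    # Pass 2: re-assign upward when preliminary processing links the input
    # to large potential "benefit" or "harm".
    return habituated * (1.0 + 2.0 * relevance_to_benefit_harm)

novel_threat = initial_activation("low", frequency=0, relevance_to_benefit_harm=0.9)
daily_noise = initial_activation("high", frequency=20, relevance_to_benefit_harm=0.0)
assert novel_threat > daily_noise
```

A novel, high-stakes stimulus ends up with a far higher initial activation value than a familiar, inconsequential one, which is the qualitative behavior the bullets above describe.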
  • The basic action system, preset instinctive responses and initial activation value assignment system described above can all be built with currently mature computer programs; their specific implementations are common knowledge in the industry.
  • The present patent mainly discloses the method and approach for realizing general machine intelligence with existing technology. Specific implementation details that those skilled in the art can realize from known industry knowledge need no further explanation in the present application.
  • Environment information is part of the machine input information. This part of the information activates similar environmental information that the machine remembers.
  • the environmental information in these memories may be memories of other parts of the same environment or other perspectives in the memory, or memories of similar environments. These activated information may include memories such as visual information, auditory information, tactile information, motivation, and motivational state information.
  • The machine needs to (1) overlap similar parts of the environment to construct a 3-dimensional environment, and (2) use shared structural features of similar environments as a model to predict the missing information.
  • the environment formed by the machine is the current environment constructed by integrating the currently input environmental information and the memory-related environmental information. This is a reconstruction of the environment by the machine.
  • the reconstruction of the environment by the machine is essentially the prediction of the environment by the machine.
  • The corresponding auditory, tactile, motivational and motivational-state memories may also be triggered through associative activation, so the environment reconstructed by the machine is an environment carrying subjective feeling.
  • the size of the machine's prediction range for the environment is related to the machine's expected goals. Generally, when the expected target is large, the forecast range of the environment is also large. When the target resolution is expected to be high, the predicted resolution of the environment is also high. And the machine's expected goal is the goal that the machine produces in the previous prediction, decision-making and execution process.
  • Self-reconstruction: Similar to environment reconstruction, the machine uses the same method to reconstruct its self-image. Besides the current input information, the machine also integrates the related information in memory activated by that input: sight, hearing, touch, smell, taste, gravity sense, limb-state sense, and motivations and motivational states (such as moods and emotional states, needs and need states, etc.). The machine needs to (1) overlap similar parts of this information to construct a 3D image, and (2) use shared structural features in its own memory, such as a model of a person or a typical image of itself, as a model to organize this information and fill in what is missing.
  • The assembled whole is the image and feeling the machine establishes about itself. For example, when we move our hands behind our backs, we seem to be able to see them. After we issue the neural commands and obtain tactile perception, we activate the visual memories linked to similar neural commands, to similar proprioceptive posture information, and to similar tactile information; these are integrated into an overall image by overlapping their similar parts. We create a self-mirror in the mind and seem to see that self-mirror in action.
  • Through its preset software and hardware, the machine acquires a specific concept of “self”: the limbs or vocal organs that its own commands can drive, the sensory organs that transmit information to it, and the totality of its various senses together with its visual and auditory information. Since these pieces of information always appear together, they are closely connected in the relationship network and thus form the concept of “self”. The concept of “self” is therefore formed in the same way as other concepts and is not mysterious. For example, if a person felt pain whenever a table was struck, he would come to consider the table part of his “self”.
  • The self-consciousness of a machine is essentially a behavior of the machine.
  • The machine does not need some mysterious self-awareness added to it. Through the relationship network and associative activation, it learns the relationships between various information and its own “benefit” and “harm”, and thereby determines a behavioral way of interacting with the outside world.
  • The focus of the present application is to innovatively reveal the way to realize strong artificial intelligence; the emphasis is on the proposed realization path.
  • The specific software and algorithms required for the machine's self-reconstruction can be realized by those skilled in the art from known industry knowledge to achieve the sub-goal of establishing self-awareness proposed here, so no further description is needed.
  • Language plays an important role in machine intelligence.
  • Language is a set of symbols established by humans to better communicate experiences. Each symbol represents some specific thing, process and scene.
  • The S102 module performs multi-resolution feature extraction on the input language (speech or text). For example, overall characteristics of speech such as rising and falling tone, stress, tone, speech rate, volume and its changes, and intonation frequency and its changes are obtained at low resolution; specific vocabulary at medium resolution; and specific syllable pronunciations at high resolution.
  • these features simultaneously activate related memories in memory through associative activation. Because languages are used so frequently, languages are often closely associated with other sensors such as typical images, moving images, sensations, sounds, etc. of the things, processes, and scenes they represent. And these closely related local networks are part of the relational network, they are concepts.
  • the associated memory represented by the language is activated.
  • These memories may have both information about the language itself, as well as memories about language use that are activated (such as phonetic emphasis for emphasis or text emphasis, such as distrustful or mocking intonations, etc.).
  • the activated information constitutes an activation information flow.
  • the activation value of the activated information will decay over time.
  • the parameters of decay are related to the motivations and states of the machine (eg needs and wants states, moods and emotional states).
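The two bullets above — activation values decay over time, with decay parameters tied to the machine's motivational state — can be sketched minimally. The exponential form and all constants are assumptions; the disclosure only requires that decay depend on motivations and states:

```python
import math

# Sketch of time decay of activation values. The exponential form is an
# assumption; the disclosure only says decay parameters depend on the
# machine's motivational state (needs, emotions, and their states).

def decayed_activation(a0, dt, arousal=0.0):
    """a0: activation value when assigned; dt: elapsed time in seconds;
    arousal: 0..1 motivational/emotional arousal. Higher arousal slows
    decay, so motivationally significant information stays active longer."""
    half_life = 2.0 * (1.0 + 4.0 * arousal)  # seconds, illustrative
    return a0 * math.exp(-math.log(2) * dt / half_life)

calm = decayed_activation(1.0, dt=4.0, arousal=0.0)     # half-life 2 s
excited = decayed_activation(1.0, dt=4.0, arousal=1.0)  # half-life 10 s
assert excited > calm
```

After the same elapsed time, information tagged with high arousal retains more activation, so it keeps participating in chain activation longer.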
  • the chain activation realizes the contextual identification of all input information.
  • the input information here includes both environmental information and activated memory information.
  • the mutual activation and assignment of these information reflects the contextual relationship. This association is more extensive than the content of the statistically generated semantic library.
  • the machine can realize the connection between language and static and dynamic images, feelings, needs and emotions, and also realize the connection between language and related language and memory.
  • This connection is incorporated into the machine's understanding of the language input. Responding according to that understanding and the relevant experience reflects that the machine truly understands the meaning of the input language.
  • Language input constitutes an input information flow
  • the corresponding activation memory also constitutes an activation information flow.
  • Reconstruction means environment reconstruction for the environmental information in the stream, and self-reconstruction for the information related to the self.
  • the machine views the reconstructed activation information flow from the perspective of a third party.
  • the reconstructed information flow is used as a new information input, and the machine can re-store this new information flow into the memory as part of the memory. So what the machine learns through language is a virtual experience. It is a virtual scene viewed from the perspective of a third party.
  • the reconstructed self-image is the representative of self-consciousness, and the reconstructed environment is virtual environmental information.
  • Machines can learn cognition from these virtual experiences, and learn the connection between various information and "benefit” and "harm”. This virtual experience will also activate relevant information in the memory, which will also be stored in the memory, together with other memories, as part of the relational network, also optimized by the mechanism of memory and forgetting.
  • Language reconstruction is the foundation of the machine's learning and understanding of language, and thus of learning all the experience accumulated in human history. With the ability to learn language, the machine can learn all human knowledge; this is the basic capability leading to superintelligence.
  • Because the machine's path from perception to cognition is established through shared local features and the memory and forgetting mechanisms, the number of parameters involved in decision-making is much smaller than in current multi-layer neural networks, and every step of a decision is understandable. The artificial intelligence proposed in the present application is therefore interpretable and controllable.
  • The machine's prediction, decision-making and response process: This process is based on the relationship network. Associative activation limits the search range; past experience (including the causal probabilities between events and the “benefit” and “harm” they bring) is used with statistical methods to calculate the size and probability of “benefit” and “harm”; the machine then increases or decreases the probability of an event under a predetermined algorithm following the principle of seeking benefit and avoiding harm. This is the whole of the machine's prediction, decision-making and response process.
  • When information is entered, the machine does not need to exhaustively predict all possible outcomes, which would in any case be impossible. It only evaluates the possible outcomes of the experiences activated in relation to the input information. This is equivalent to using common sense to limit the search scope.
  • The machine can use any current artificial intelligence prediction method, such as Monte Carlo search, decision trees, Bayesian estimation, or rule-based machine reasoning, to infer the probability of the most likely next-step outcome of the current event and the “benefit” and “harm” that outcome brings to itself.
  • One possible evaluation method is to use the various need and emotion types and states to establish a multi-dimensional space, and within this space preset regions for the machine such as an optimal pursuit space, an acceptable region and an unacceptable region.
  • This is equivalent to establishing various calculation rules for the machine to seek advantages and avoid disadvantages.
  • Because need and emotion types are so varied, such rules are difficult to state explicitly, so any method of calculating spatial distance can be used.
  • The strategy employed by the machine can be to stay, in spatial distance, as close as possible to the optimal pursuit space and as far as possible from the unacceptable space. These rules are thus quantified by calculating spatial distances.
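The spatial-distance rule above can be sketched in a two-dimensional need/emotion space. The reference points, dimensions and scoring function are illustrative assumptions, not values from the disclosure:

```python
import math

# Sketch of the multi-dimensional need/emotion space evaluation: the machine
# prefers candidate responses whose predicted state lies close to a preset
# "optimal pursuit" point and far from an "unacceptable" point.

OPTIMAL = (1.0, 0.8)       # (need satisfaction, positive emotion) - assumed
UNACCEPTABLE = (0.0, -1.0)

def dist(p, q):
    """Euclidean distance between two points in the state space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def score(predicted_state):
    # Being far from the unacceptable region and close to the optimal
    # region both raise the score.
    return dist(predicted_state, UNACCEPTABLE) - dist(predicted_state, OPTIMAL)

assert score((0.9, 0.7)) > score((0.2, -0.5))
```

A candidate response predicted to land near the optimal point scores higher than one drifting toward the unacceptable region, which is exactly the seek-benefit-avoid-harm geometry the bullet describes.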
  • One concrete form is:

    Y = Σ_i f_i(x1, x2, …) · p1_i(x1, x2, …) − Σ_j G_j(y1, y2, …) · p2_j(y1, y2, …)

    where Y is the final evaluation value; f_i(x1, x2, …) are the various benefit values (determined from need and emotion types and states by preset rules); p1_i(x1, x2, …) are the probabilities that the corresponding benefits occur; G_j(y1, y2, …) are the various loss values; p2_j(y1, y2, …) are the probabilities that the corresponding losses occur; and Σ is the summation symbol, summing over i and j respectively.
  • The machine's purpose is to maximize Y. This approach quantifies needs, emotion types and states into gain and loss values, then forms a probability-weighted sum.
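The evaluation value Y described above is a direct probability-weighted sum of benefits minus losses; a minimal transcription (the numeric values are made-up examples):

```python
# Direct transcription of the single-step evaluation described above:
#   Y = sum_i f_i * p1_i  -  sum_j G_j * p2_j
# The example values and probabilities are illustrative only.

def evaluate(benefits, losses):
    """benefits: list of (benefit value f_i, probability p1_i);
    losses: list of (loss value G_j, probability p2_j)."""
    gain = sum(f * p for f, p in benefits)
    loss = sum(g * p for g, p in losses)
    return gain - loss

# Example: one likely moderate benefit vs. one unlikely large loss.
Y = evaluate(benefits=[(10.0, 0.8)], losses=[(50.0, 0.1)])
assert Y == 3.0  # 10*0.8 - 50*0.1
```

Maximizing Y over candidate responses then implements the seek-benefit-avoid-harm principle at the single-step level.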
  • the above assessment of the possibility of a single step is a technical assessment.
  • This evaluation can be achieved by using existing machine intelligence methods and by using existing statistical algorithms, such as averaging probability statistics, or methods such as minimum spatial distance.
  • On the basis of evaluating a single step, the machine also needs to predict and plan the “benefit” and “harm” brought by its own responses and by the environment's responses. This is a multi-step prediction and evaluation of “benefit” and “harm”: a piece of strategic planning, equivalent to finding an optimal path.
  • the principle of choosing a path is to seek advantages and avoid disadvantages.
  • the selection methods can include decision trees, rule-based expert systems, Bayesian networks, evolutionary algorithms and other existing decision-making algorithms. These algorithms are based on relational networks with causal probabilities and "benefit” and "harm” relationships.
  • A simple evaluation method is: treat each planned response step as a virtual process and feed it back in as input information, run it through information extraction, associative activation, and environment, self and language reconstruction again, and then apply the same single-step assessment to the possible “benefit” and “harm”. This amounts to using one's own experience to assess the possible impact of one's own responses, including external feedback on them (such as the feedback of others in the environment) and the “benefit” and “harm” that feedback may in turn bring.
  • the machine can use an iterative method to fully evaluate the input information and plan its response according to the principle of seeking advantages and avoiding disadvantages.
  • This kind of planning may have multiple iterations. The essence of multiple iterations is a process of expanding the search range and finding the optimal path. This is the planning nature of machines.
  • The machine's optimal-path search range during planning is determined by the possible “benefit” and “harm”. If the possible “benefit” and “harm” values are very high, the machine expands the search range under the preset seek-benefit-avoid-harm algorithm, for example by lowering the activation threshold or increasing the number of evaluation iterations, so that a wider range of information is activated and a wider range of experience enters the path prediction process, from which the optimal path is selected.
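The stake-dependent search expansion above (lower activation threshold, more evaluation iterations when the possible “benefit”/“harm” is large) can be sketched as a simple mapping; the particular formulas are assumptions:

```python
# Sketch of stake-dependent search expansion: higher potential
# "benefit"/"harm" lowers the activation threshold and raises the
# iteration budget, so more memories join the path search.

def search_parameters(stakes):
    """stakes: estimated magnitude of potential gain/loss, normalized 0..1."""
    activation_threshold = 0.5 * (1.0 - stakes)  # lower -> wider recall
    iterations = 1 + int(9 * stakes)             # deeper multi-step evaluation
    return activation_threshold, iterations

low_thr, low_iters = search_parameters(stakes=0.1)
high_thr, high_iters = search_parameters(stakes=0.9)
assert high_thr < low_thr and high_iters > low_iters
```

Trivial inputs keep planning cheap; high-stakes inputs automatically widen the set of activated experience before a path is chosen.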
  • the decision-making process of the machine is not mysterious.
  • The decision-making process described above is relatively simple to implement.
  • Preset programs and existing machine reasoning methods can calculate the probabilities of the main possible outcomes and of the various gains and losses they bring.
  • the methods machines use to assess "benefit” and "harm” involve only probabilistic and statistical methods, which are well-established and well-known.
  • the methods for machines to evaluate "benefit” and "harm” are not limited to the above two methods, and the machine can use any existing statistical decision-making algorithm.
  • the decision-making process of the machine is based on the relationship network.
  • the network of relationships and the education the machine receives are closely related. Even the same learning materials and different learning sequences will have different relational networks, thus constituting different cognitive and decision-making processes of the machine.
  • the decision-making ability of the machine is also affected by the machine's associative activation parameters, statistical algorithm parameters, underlying motivation and motivational states (including needs and demand states, emotions and emotional states, etc.). Therefore, different experiences and different settings of machines may make different decisions.
  • The machine can also adjust its decision-making algorithm parameters based on feedback from its own experience. For example, if after executing a series of responses according to its own decision the actual result differs greatly from the expected result, these memories also enter memory and become part of the relationship network. The next time the machine makes a decision in a similar state, they act as new experience that affects its decision-making behavior.
  • the machine can also learn and summarize experience under the preset algorithm.
  • A specific embodiment: a preset algorithm selects the processes through which the machine obtained large “benefit” or “harm”, and replays the memories of these processes virtually in the machine's processor several times, each time activating the associated memories. The features common to such processes appear again and again, repeatedly activate the related memories, and so have their memory values strengthened; these strengthened memories are then more easily activated by association and more easily drawn on. Memories tied only to a single specific process are gradually forgotten because they are rarely activated. This is the machine actively summarizing experience. Without such a preset program the machine would still summarize these experiences by itself, but would need more samples and a longer time.
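The active experience-summarization above — replay high-stakes episodes so that their shared features accumulate memory value while episode-specific details fade — can be sketched minimally. The data layout and counting scheme are illustrative assumptions:

```python
from collections import Counter

# Sketch of virtual replay: episodes that produced large gain/loss are
# replayed; every replay strengthens the memory value of the features
# they contain, so features shared across episodes pull ahead of
# features unique to a single episode.

def rehearse(episodes, rounds=3, boost=1.0):
    """episodes: list of feature sets from high-stakes processes."""
    memory_value = Counter()
    for _ in range(rounds):              # virtual replay in the processor
        for features in episodes:
            for f in features:
                memory_value[f] += boost  # re-activation strengthens memory
    return memory_value

m = rehearse([{"fire", "heat", "pain"}, {"fire", "smoke", "pain"}])
# Shared features accumulate more value than episode-specific ones.
assert m["fire"] > m["smoke"]
```

After replay, the common features (“fire”, “pain”) dominate, so they are the ones most easily activated by association in future decisions — the mechanism the bullet attributes to repeated activation.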
  • the machine can make a decision that accepts a small loss and expects a large gain in the follow-up.
  • the trainer warned that the machine will be “heavily punished” for hitting a person.
  • the machine further activates the relevant information, and can understand that hitting someone will bring huge losses.
  • The perception that “hitting people” brings loss may come from direct experience of hitting and its feedback, from experience preset in the machine, or from a mixture of both. But when the machine faces the situation “my owner is being attacked and is in great danger”, there is a trade-off between the “loss value” of hitting someone and the huge “gain value” of saving the owner's life. The machine then needs to perform a multi-step search with an expanded search range to determine the probability of obtaining the huge “gain value” and whether other losses would follow, and make a comprehensive decision from the probabilities of gain and loss.
  • The machine's execution system resembles its decision-making system. The difference is that the execution system concretizes the goals produced by the decision-making system; this concretization is itself a decision. Goals are subdivided layer by layer down to the machine's underlying drive-command level, until reaching instructions the machine can execute directly; the machine's decision system is used at each subdivision.
  • the machine can specifically execute the response according to its own decision.
  • the execution response step is a process of translating the plan into the actual output.
  • If the machine chooses voice output among the possible response steps, the task is relatively simple: convert the image feature maps to be output into speech, combine, using the relationship network and memory, the dynamic feature maps (including concepts representing relations) with static concepts (grammatical knowledge existing in the relationship network), organize them into a language output sequence, and invoke pronunciation experience to produce it. Note that the machine may choose dynamic features spanning the whole sentence (such as different movement patterns of tone, pitch or stress) to express doubt, banter, distrust, emphasis and other methods common among humans; such features are usually low-resolution features of a sentence or an entire utterance. Because the machine learns these expressions from human life, any human way of expression can in principle be learned.
  • The machine needs to divide the goals of the sequence (including intermediate goals and the final goal) according to the different times and spaces they involve, so as to coordinate its own execution efficiency. This also comes from experience, and it can also be performed with a preset algorithm.
  • The method adopted is to group targets that are closely related in time and targets that are closely related in space. Because the information combination formed from dynamic and static feature maps carries the time and space information of its related memory environment, this step can use classification. It is equivalent to rewriting a general script into sub-scripts. Note that this general script may itself be only one stage goal of the machine's overall plan, such as raising the probability of some condition that may bring a good return.
  • the machine needs to combine the intermediate goals in each link with the real environment again, and use the method of segmental imitation to unfold it layer by layer.
  • The response plan the machine proposes at the top level is usually composed of highly generalized process features and highly generalized static concepts (because such highly generalized processes match many similar memories, the responses built from them are also highly generalized). For example, under the total output response “business trip”, “going to the airport” is an intermediate goal, but it is still too abstract for the machine to imitate directly.
  • the machine needs to be divided according to time and space, and the link that needs to be executed in the current time and space is the current goal. And take other time and space goals as inheritance goals and put them aside for the time being. After the machine targets the intermediate link, the machine still needs to further subdivide time and space (write down the graded script again).
  • This is a process of increasing temporal and spatial resolution.
  • the process of converting a target into multiple intermediate links is still the process of using decision-making ability, analyzing various possible outcomes and probabilities, and choosing its own response according to the principle of “seeking advantages and avoiding disadvantages”.
  • the above process is iterative, and the process of dividing each goal into multiple intermediate goals is a completely similar processing flow. It has to be broken down to the underlying experience of the machine.
  • the underlying experience for language is mobilizing muscles to make syllables. In the case of movements, it's broken down into driving commands to the relevant "muscles". This is a tower-shaped breakdown structure.
  • the machine starts with a top-level goal and decomposes a goal into multiple intermediate goals. This process is to create dummy intermediate process objects, which are retained if they "fit”. If it "doesn't meet the requirements” then recreate it. Whether it "meets the requirements” means that after the machine analyzes the gains and losses, it confirms whether the strategy meets the preset acceptable standards.
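The tower-shaped decomposition described above — expand a goal into intermediate goals, keep a draft only if it “meets the requirements”, and recurse until directly executable commands are reached — can be sketched as follows. The plan table, the executable set, and the acceptance check are all illustrative stand-ins for memory and the gain/loss decision system:

```python
# Sketch of tower-shaped goal decomposition. PLANS stands in for the
# experience retrieved from memory; EXECUTABLE for the machine's
# bottom-level directly executable instructions. Both are assumptions.

PLANS = {  # goal -> candidate intermediate goals
    "business trip": ["pack luggage", "go to airport", "board plane"],
    "go to airport": ["leave building", "take taxi"],
}
EXECUTABLE = {"pack luggage", "leave building", "take taxi", "board plane"}

def acceptable(steps):
    # Stand-in for the gain/loss analysis that decides whether a draft
    # intermediate plan "meets the requirements".
    return True

def decompose(goal):
    if goal in EXECUTABLE:
        return [goal]                   # bottom of the tower: execute directly
    steps = PLANS[goal]                 # create draft intermediate goals
    assert acceptable(steps)            # keep the draft only if it "fits"
    out = []
    for sub in steps:
        out.extend(decompose(sub))      # same decision flow, recursively
    return out

assert decompose("business trip") == [
    "pack luggage", "leave building", "take taxi", "board plane"]
```

The abstract goal “go to airport” only unfolds when its turn comes, mirroring the text's point that goals for other times and spaces are set aside as inherited goals until needed.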
  • the machine may encounter new information at any time, causing the machine to process various information, and these original goals become inheritance motivations.
  • This is equivalent to the process of organizing activities, constantly encountering new situations that need to be resolved immediately, otherwise the activities cannot be organized. So the director called off other activities and first came to solve the problems encountered in front of him. Once resolved, the activity continues. Another situation is that during this process, the director suddenly received a new task, so after weighing the pros and cons, the director decided to suspend the activity first and prioritize the new task.
  • The machine can execute the imitation tasks that are already executable while still decomposing other goals into finer ones; it acts while it thinks. Reality is too varied for a machine to make a complete plan with full advance knowledge of the outside world, so accomplishing a goal is a process in which the environment and the machine interact.
  • the machine can use the above capabilities to complete an understanding and response to the input information.
  • This process acts as a minimal cycle of interaction between the machine and the outside world.
  • By repeating this process continuously, the machine completes larger goals; this appears as a continuous interaction between the machine and the outside world, exhibiting machine intelligence.
  • The prediction, decision-making and response processes above require no new algorithms; they are existing algorithms which, reasonably organized on the basis of the relationship network, can realize the machine's prediction, decision-making and response.
  • the present invention focuses on revealing how to establish methods, processes and steps for realizing general machine intelligence (strong artificial intelligence) by organizing existing algorithms and data.
  • the existing statistical algorithms and artificial intelligence algorithms are not within the scope of the application of the present invention, and will not be repeated here.


Abstract

A method for realizing strong artificial intelligence. At the perception layer it uses a mapping from input data to shared local features; at the cognition layer, a mapping from shared local features to concepts. Concepts are established through the relationship network and the memory and forgetting mechanisms; the relationship network is built on multi-resolution features and associative activation. The machine's decisions are made through the connections, established in the relationship network, between the machine's needs, emotional motivations and motivational states on one side and concrete things on the other, following the principle of seeking benefit and avoiding harm and iteratively applying prediction, decision-making and response. With this method, a machine can progressively acquire responses to input information from simple to complex and possess motivations and emotional expression similar to humans'. This machine learning method differs greatly from the machine learning methods currently existing in the industry, and no similar method currently exists.

Description

A Method for Establishing Strong Artificial Intelligence — Technical Field
The present application relates to the field of artificial intelligence, and in particular to how to establish strong artificial intelligence.
Background Art
Current artificial intelligence is usually designed for specific tasks; there is as yet no general artificial intelligence capable of completing a variety of uncertain tasks. The biggest obstacle to general artificial intelligence is how to establish, among complex and varied things, a cognitive network similar to human common sense. Only when a machine possesses common sense similar to a human's can it produce thought activity similar to a human's. What current deep learning produces is an ingenious feature-mapping method quite different from the human learning process, so its results are hard to generalize. Current knowledge engineering, expert systems and knowledge graphs all organize human knowledge in encoded, machine-readable forms, but such systems make it hard for machines to learn and generalize autonomously, so when facing differing scenarios the machine cannot generate new strategies and methods on its own. So far, these systems can only be applied within a particular field and scope and cannot produce human-like intelligence.
The present invention builds on the same applicant's patent application No. PCT/CN2020/000109, titled "A Method for Achieving General Machine Intelligence by Imitating Human Memory", further explaining how to realize general artificial intelligence and elaborating the details of the implementation method.
Summary of the Invention
The patent application No. PCT/CN2020/000109, titled "A Method for Achieving General Machine Intelligence by Imitating Human Memory", disclosed a method of establishing a network of relationships between things through memory. In the present application, we go further into how to establish strong artificial intelligence (general artificial intelligence) by building a relationship network in memory.
In the present application we propose a conceptual diagram of the machine's composition. In FIG. 1, module S101 is the machine's sensor module. To give the machine a cognitive mode similar to a human's, S101 needs one or more kinds of general-purpose sensors: visual, auditory, taste and smell, tactile, gravity-direction and posture-information sensors, etc.; sensors for specific applications can also be added (for example, autonomous driving can add infrared sensors, lidar sensors, etc.). The machine also needs sensors that monitor its own state, whose data are likewise part of the machine's perceived information. S101 is a module composed mainly of sensor hardware and the corresponding software, whose purpose is to perceive information outside the machine and about the machine itself. The types and number of these sensors do not affect the claims of the present application, because in the present application all sensor data are processed in the same way.
在图1中,S102模块是机器对传感器输入信息的简化模块。机器对输入信息的简化,主要是指机器对输入信息提取底层特征。它可以采用任何已有的特征提取方法,包括但不限于卷积神经网络,图像分割、轮廓提取、降采样特征提取等。任何已有的机器图像识别算法都可以用于S102模块中。
但S102模块和目前主流的机器算法差异在于:1,S102模块不是以识别具体事物为目的。在目前流行的神经网络中,机器对输入数据逐层进行数据处理,然后通过误差反向传播来优化数据处理参数,目标是在大样本统计下实现误差最小。算法实现的是数据空间到标签空间的映射。而本发明申请中,S102模块目标是提取输入数据中的局部共有特征。在申请号为PCT/CN2020/000109,名为“一种模仿人类记忆来实现通用机器智能的方法”的专利发明申请中,我们提出了一种采用不同大小的取样窗口,不同分辨率,对输入数据重复提取,并把这些数据通过在记忆相邻放置来建立联系。并通过记忆和遗忘机制来强化普遍存在的联系,而弱化那些偶发的联系。所以,本发明申请中,S102模块的目的是寻找那些广泛存在的局部共有特征,而不是寻找具体的样本空间到标签空间之间的映射关系。在本发明申请中,同一个输入样本空间,可能包含大量的“局部共有特征”,它们分别是在不同的分辨率下提取的。需要指出的是,事物局部特征的组合方式本身也是一种局部特征。局部特征和大小没有关系,而是指在不同分辨率下提取的事物的一部分信息。有些图像的局部特征可以和图像本身一样大,但分辨率低,只包含原始图像的部分信息,比如有可能只包含原始图像的其他局部特征的组成方式。比如在图像中,局部特征可能包含在不同分辨率下的轮廓、直线、曲线、纹理、顶点、垂直、平行、曲率等底层几何特征,也包含不同分辨率下的颜色、亮度等特征,还可 能包含不同分辨率下的运动模式,也包含底层几何特征的组合拓扑方式等。而目前的流行的深度卷积神经网络,是寻找同一个输入样本空间到少量特定标签之间的映射关系。在S102模块中,深度卷积神经网络可以作为一种应用算法来实现输入数据到局部共有特征之间映射。这种算法本身不属于本发明申请的权利要求,但从输入数据中寻找多分辨率局部共有特征,并利用这些多分辨率局部共有特征来建立事物之间的连接关系,则属于本发明申请要求的权利范围。
在S102模块中,还包括提取多分辨率动态特征。类似于图像特征提取,S102模块也是提取局部共有动态特征。这里的局部共有动态特征是指基本的运动模式,比如摆动、圆周、直线、曲线、波动等广泛存在于我们这个世界中那些相似的基础动态特征。所以它也不是在大量的运动样本空间和具体表示动态的标签(比如舞蹈、跑步、游行、狂欢等标签空间)之间建立映射,而是输入样本空间到广泛存在于我们这个世界中那些相似的基础动态特征之间的映射。
特别指出,动态特征是知识泛化的基础。人类对于知识的类比应用(泛化)一定是基于某种相似性而建立的联想。而这种相似性可以是静态相似(比如外观相似、或者抽象特征相似),也可以是动态相似(比如运动方式相似、或者抽象特征的变化方式存在相似)。而动态特征本身可以采用抽象的质点或者体积来代表,所以运动特征可以作为不同事物的经验泛化之间的桥梁。
在S102模块中,机器也采用类似的方法对其他传感器输入数据做处理,包括提取静态多分辨率特征和动态多分辨率特征。比如对于语音,基础语音、语速部分可以作为一个静态特征,而音频、音调、语速的变化就是一种动态特征。机器按照不同长度时间窗口对语音滑动取样,就相当于不同的时间分辨率。机器需要在不同的时间分辨率和不同的细节分辨率下提取静态和动态特征。在我们提出的强人工智能实现方法中,采用多分辨率提取事物动态特征是至关重要的部分。而动态特征的多分辨率提取方法在申请号为PCT/CN2020/000109, 名为“一种模仿人类记忆来实现通用机器智能的方法”的专利发明申请中已有说明,这里不再重复。
S102模块是一种感知层特征提取,对输入数据进行多分辨率提取静态和动态特征至关重要。因为事物之间的联系,在不同分辨率下是不同的。事物之间的相似性,在不同分辨率下也是不同的。所以机器需要建立事物在不同分辨率下的关系网络。两个在日常认知中并不相识的事物,在不同的分辨率下,可能存在相似的属性。而这些属性正是相关知识泛化的桥梁。
S102模块是一种感知层局部特征提取,但它并不是每次都提取所有的多分辨率局部特征。而是根据机器对传感器数据的搜索目标来确定使用那些分辨率和提取的重点区间。而机器对传感器数据的搜索目标来自于机器在之前活动中产生的预期目标。
S102是机器感知层信息的处理层，是对传感器输入数据做简化的软件层。S102的输入是传感器所采集的数据，还有从机器认知层发送过来的参数。这些参数是认知层告诉感知层需要采用的分辨率范围和需要重点提取的范围。这些参数是机器在处理之前的信息后，根据需要进一步识别的目标大小和属性确定的，是机器对输入信息响应的一部分。
在S102中,可以采用目前流行的卷积神经网络、循环神经网络、图像滤波处理等各种具体算法,但它们的输出目标是局部共有特征,而不是直接映射到特定的分类空间。而局部共有特征到具体的分类空间是由认知层来完成的。
在本发明申请中,多分辨率特征提取和时间上相邻输入的信息彼此之间存在连接关系的假设,是认知层建立的关键步骤。在S103模块中,通过联想激活、记忆和遗忘机制来优化记忆中的信息连接关系(关系网络),这就是建立认知层。
在本发明申请中,我们提出了一个基本假设:“传感器组在时间上相邻输入的信息彼此之间存在有连接关系”。这是我们提出的认知层建立的关键假设。
机器按照输入次序存入输入信息。这些信息包括外部传感器数据、内部传感器数据、 需求和情绪数据等动机数据。外部传感器数据是机器对外界信息的感知。内部传感器是机器自身的各项监控信息。需求是一种机器被预置的动机和动机的状态。情绪也是一种机器被预置的动机和动机的状态。需求和情绪属于机器的预置动机的一部分。
在本发明申请中,我们可以对机器赋予各种动机,这些动机是驱动机器对输入信息作出响应的动力来源。它们是机器行为背后的控制机制。在本发明申请中,我们以给机器赋予需求和情绪为例,来说明机器如何根据这些动机来决定自己的行为。机器可以被赋予的动机不仅仅包括需求和情绪,还可以包括其他类型的动机。这些动机类型的差异和多少,不影响本发明申请的权利要求。因为在本发明申请中,所有类型的动机数据都是同样的处理方法。
在本申请所提方法中,机器采用符号来代表各种人类赋予给机器的各种底层需求。比如机器自身的安全需求,追求快乐,希望获得人类的认可,希望得到人类的尊重,再比如机器自我目标实现(目标达成)带来的自我奖励(成就感),比如机器对探索未知的好奇心等。这些需求都可以采用一个符号来表示,并且这个符号可以被赋值来表示所处的状态。需求类型的差异和多少,不影响本发明申请的权利要求。因为在本发明申请中,所有的需求都是同样的处理方法。
在本申请所提方法中,机器采用符号来代表各种人类赋予机器的底层情绪。机器的情绪可以多种多样,每类情绪可以使用一个符号来代表,这些符号可以被机器赋值来表示所处状态。这些情绪类型的差异和多少,不影响本发明申请的权利要求。因为在本发明申请中,所有的情绪都是同样的处理方法。
在本申请所提方法中,机器的情绪和机器的需求状态之间的关系,可以通过预置的程序来联系起来。这些预置程序的参数可以通过机器在自身的学习过程中,根据“趋利避害”的原则进行自我调整。
在本申请所提方法中,机器的情绪状态和机器情绪的外显表达方式,也可以通过预置的程序来联系起来。这些预置程序的参数可以通过机器在自身的学习过程中,根据“趋利避 害”的原则进行自我调整。
在S103模块中，我们需要在不同分辨率下的外界信息、记忆信息、机器内部传感器信息、以及机器的动机相关符号和所处状态之间建立连接关系。这种连接关系需要能正确地反映我们所处世界的常识，这种连接关系在本发明申请中被称为关系网络。
我们所处的世界,事物之间的关系纷繁复杂,人为建立事物之间的各种关系是非常困难的,也难以量化和灵活运用。在本发明申请中,我们是通过记忆来提取事物之间的关系。首先,我们采用镜像空间的概念来存储所提取的多分辨率信息。镜像空间是指我们按照输入信息的时间次序来存储信息,而对于同时输入的信息(比如图像),则按照对应的原始空间组织方式来存储信息。具体的方法是:当机器从输入中提取了多分辨率信息特征后,机器需要使用这些特征建立镜像空间。机器首先把提取的特征,通过缩放和旋转,按照和原始数据相似度最高的位置、角度和大小,来调整底层特征的位置、角度和大小,把它们和原始数据重叠放置,这样就能保留这些底层特征在时间和空间上的相对位置,并建立镜像空间。机器在记忆中存储的是按照和原始数据最相似的方法组织的多分辨率特征数据,我们称其为镜像空间。
存入记忆中的信息,都有自身的记忆值。存入记忆中的新记忆,包括需求符号的记忆值和情绪符号的记忆值,它们和存储发生时对应信息(符号)拥有的激活值相关,通常是正相关,可以是线性关系,也可以是非线性关系。
为了建立记忆中的关系网络,在本发明申请中,我们提出了一个基本假设:“传感器组在时间上相邻输入的信息彼此之间存在连接关系”。这是我们建立关系网络的关键假设。同时,我们提出了另外三条假设用于关系网络的优化:“临近关系”假设、“相似关系”假设和“记忆强度关系”假设。
“临近关系”假设:我们假设记忆中,时间上相邻输入的记忆信息彼此存在联系。相邻是指记忆存储中时间上的相邻。“相似关系”假设:在记忆中,相似的记忆信息彼此之间也 存在联系。“记忆强度关系”假设:在记忆中,那些记忆值高的记忆更加容易被激活。
当记忆中一个信息被激活后,它会采用“临近激活”原则、“相似激活”原则和“强记忆激活”原则来激活其他信息。
“临近激活”是指记忆中特定的信息激活后,它会激活和它临近的信息(是指存在临近关系的信息)。
“相似激活”是指记忆中的特定特征,接收其他特征发出的激活信号时,接收能力和彼此之间相似度成正相关。这是一种定向接收能力。所以一个相似的记忆被激活后,它会发出自己的激活信号,并可能很容易进一步激活其他与其相似的记忆。这是因为相似的记忆之间彼此接收对方的激活信号能力强。在本发明申请中,一种简单的激活值传递关系可以是传递系数正比于相似度。当然,也可以采用其他传递函数,但原则必须是传递系数和相似度正相关。
“强记忆激活”是指记忆值越高的记忆,接收其他特征发出的激活信号的能力越强。所以那些记忆深刻的信息更容易被激活。在本发明申请中,每一个记忆信息都被赋予一个记忆值,用于表示能够存在于记忆中的时间。那些记忆值高的记忆可能长期存在,其接收其他特征发出的激活信号的能力强。这是模仿人脑的突触数量多少代表记忆强度,而假设那些突触多的记忆更加容易从周围环境中获得更多的激活能量而被激活。
以上三种激活方式统称为联想激活。
在本发明申请中,我们采用记忆和遗忘机制来维护信息在记忆库中的记忆值。记忆和遗忘机制是本发明申请中广泛使用的关系提取机制。在本发明申请中,记忆中的信息每被激活一次,就认为被使用了一次,所以按照自身所处记忆库的记忆曲线增加记忆值。同时,所有记忆按照自身所处记忆库的遗忘曲线递减记忆值。记忆函数是指某些数据随重复次数增加而增加。具体的增加方式可以采用一个函数来表示,这个函数就是记忆函数。需要指出,对不同类型的数据可以采取不同的记忆函数。遗忘函数是指某些数据随时间增加而递减。具体 的减小方式可以采用一个函数来表示,这个函数就是遗忘函数。需要指出,对不同类型的数据可以采取不同的遗忘函数。记忆和遗忘机制是指对记忆信息使用记忆函数和遗忘函数。
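上述记忆和遗忘机制可以用下面的最小示意程序来说明。其中“每次激活按固定增量增加记忆值”和“记忆值随时间指数衰减”只是本示意采用的假设；正文已指出，不同类型的数据、不同的记忆库可以采取不同的记忆函数和遗忘函数：

```python
import math

class MemoryItem:
    """记忆库中的一条信息：记忆值随激活（使用）而增加，随时间而递减。"""
    def __init__(self, content, memory_value=1.0):
        self.content = content
        self.memory_value = memory_value

    def activate(self, gain=1.0):
        # 记忆函数（示意）：每被激活一次，视为被使用一次，记忆值增加
        self.memory_value += gain

    def forget(self, dt, tau=10.0):
        # 遗忘函数（示意）：记忆值随时间按指数曲线递减
        self.memory_value *= math.exp(-dt / tau)

item = MemoryItem("狗的图像特征")
item.activate()        # 被重复激活的关系得到加强
item.forget(dt=5.0)    # 不能被反复激活的关系逐渐被遗忘
```

实际系统中，可以为不同记忆库的 activate 和 forget 配置不同参数，对应正文所述“对不同类型的数据可以采取不同的记忆（遗忘）函数”。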
如果我们把记忆看作是一个包含了无数信息的立体空间,那么关系网络,就是这个空间中的脉络。这些脉络的出现,是因为记忆和遗忘机制,那些不能被反复激活的关系都被遗忘了,而那些能得到反复激活的关系得到了加强。那些通过粗大的关系脉络连接起来的多分辨率信息就组成了概念。它连接同类信息的图像、语音、文字或者其他任何表达形式。由于这些表达形式频繁出现在一起,并频繁相互转换,所以它们之间的连接更加紧密。由于人类后天使用语言非常频繁,所以通常在一个概念中,语言的激活次数有可能是最多的,语言的记忆值也是最高的。同时由于语言通常和一个概念的所有属性相连接,所以它是各个属性之间相互激活的桥梁。所以表现得好像语言是我们概念思维的中心。但在本发明专利申请中,我们认为语言需要通过语言重建才能获得其真正对应的含义,所以通过语言重建后的激活信息流才是真正承担我们思维的信息。
关系网络中的连接关系是通过“临近激活”、“相似激活”和“强记忆激活”联系起来的。最紧密的局域连接关系就构成了概念(包括多分辨率下静态特征图及其语言,动态特征图及其语言);比概念松散一点的是经验。经验是那些经常重复出现的认知关系。正因为能够重复出现,所以能逐步增加彼此之间的记忆连接。而普遍存在于人类认知中的经验就是常识。比经验松散就是记忆。
在本发明申请中,我们提出一种可以简单表达关系网络的记忆组织形式是:信息按照输入时间顺序存储。那些“输入时间上相邻的关系”采用“存储位置在空间上相邻”来表达。信息在存储空间上相邻可以是物理位置上的相邻:就是把时间相邻的信息存储在相邻的存储单元上。信息在存储空间上相邻还可以逻辑相邻:就是采用逻辑位置相邻的方式来存储,而具体的物理存储单元位置由逻辑位置和物理位置之间的映射表来表示。另外的方法还可以是每个存储信息自带自己的存储时间坐标,机器通过搜索相邻的时间坐标来确定相邻的信息。 当然还可以有其他的存储方式,但它们都必须能表达出时间上相邻的信息。
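上段所述“每个存储信息自带时间坐标、通过搜索相邻时间坐标来确定相邻信息”的存储方式，可以用如下示意程序说明（其中的信息内容与时间窗口大小均为虚构的演示取值）：

```python
import bisect

class TimelineMemory:
    """按输入时间次序存储信息；时间上相邻的信息被视为存在连接关系。"""
    def __init__(self):
        self.times = []   # 单调递增的时间坐标
        self.items = []

    def store(self, t, info):
        # 按时间坐标插入，保持输入时间次序
        i = bisect.bisect(self.times, t)
        self.times.insert(i, t)
        self.items.insert(i, info)

    def neighbors(self, t, window=1.0):
        # 返回与时间 t 相邻（时间距离小于 window）的信息
        lo = bisect.bisect_left(self.times, t - window)
        hi = bisect.bisect_right(self.times, t + window)
        return [self.items[k] for k in range(lo, hi)]

mem = TimelineMemory()
mem.store(0.0, "图像:狗")
mem.store(0.3, "语音:'小心'")
mem.store(5.0, "图像:猫")
```

按此组织方式，“图像:狗”和“语音:'小心'”因时间相邻而存在连接关系，而“图像:猫”因时间距离远而不与它们相邻。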
在本发明申请中,我们提出一种搜索相似信息的方式:一种方法是采用一个专门的相似度对比计算单元来处理从记忆中找到和输入相似的记忆这个任务。相似度对比计算单元可以是使用硬件来实现,也可以使用软件来实现。可以是单独一个模块,也可以是集成到整个运算单元中的模块。相似度对比是很成熟的算法,这里不再赘述。
临近激活时,彼此靠近的信息激活值传递系数大。
相似激活时,相似信息之间通过相似激活传递激活值,传递系数和相似度成正相关。
强记忆激活时，那些记忆值高的信息更容易从临近激活中获得大的激活值而被激活。所以，强记忆激活是临近激活中的一种特定情况。
上述“临近激活”、“相似激活”和“强记忆激活”中,激活值的传递函数需要通过实践来确定,但基本原则就是“临近激活”的激活值传递系数和两个信息存储发生时的时间距离成反相关。可以认为时间间隔是一种对激活值传播衰减的介质。而“相似性激活”的激活值传递系数和两个信息之间的相似度成正相关。“强记忆激活”是临近激活时,激活值传递系数除了考虑时间间隔的衰减外,还需要考虑接收信息一方的记忆值。可以把接收信息一方的记忆值认为是体现了接收激活值能力的大小。
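按照上述原则，三类激活的激活值传递系数可以写成一个示意函数。正文已指出具体的传递函数需要通过实践来确定，这里的指数衰减形式和接收能力公式只是满足“反相关/正相关”原则约束的一种假设：

```python
import math

def transfer_coefficient(kind, *, dt=0.0, similarity=0.0, receiver_memory=1.0):
    """返回激活值传递系数（示意值）。
    kind: 'near'=临近激活, 'similar'=相似激活, 'strong'=强记忆激活"""
    if kind == "near":
        # 临近激活：传递系数和两个信息存储发生时的时间距离成反相关
        return math.exp(-dt)
    if kind == "similar":
        # 相似激活：传递系数和两个信息之间的相似度成正相关
        return max(0.0, min(1.0, similarity))
    if kind == "strong":
        # 强记忆激活：除时间间隔衰减外，还考虑接收方记忆值体现的接收能力
        receive_ability = receiver_memory / (1.0 + receiver_memory)
        return math.exp(-dt) * receive_ability
    raise ValueError(kind)
```

可以验证，时间间隔越大临近激活的传递系数越小，相似度越高相似激活的传递系数越大，接收方记忆值越高强记忆激活的传递系数越大，与正文所述原则一致。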
需要特别指出,同一个事物可能有大量的不同分辨率下的特征来代表这个事物的属性。两个事物之间在不同分辨率下的相似性是不一样的。一个事物的某一个分辨率下的特征(属性)被激活后,这个特征既可能通过临近激活来激活这个事物的其他分辨率下的特征,还可能通过相似性来激活其他与这个事物在目前分辨率下相似的特征,从而进一步通过临近激活来激活其他事物的其他属性。所以,是否相似是建立在指定分辨率基础上的。这就是为什么我们需要对输入信息同时提取多分辨率特征的原因。
在本发明申请中,具体的激活值传递函数并不会影响机器的基本工作原理。但各类激活的最优激活值传递函数需要通过实践来确定,但它们都必须遵守上述原则的限定。
还需要指出，关系网络中信息获得的激活值会随时间而消退。消退曲线需要由实践来优化。消退时间的长短需要平衡前后激活信息之间的联系和新输入信息的激活状态。如果消退时间过长，新输入信息带来的激活状态容易被原有的激活状态所掩盖，而不能清晰地表达新信息和其他信息之间的连接关系。如果消退时间过短，那么前后的激活信息之间的连接关系就容易被忽略了。
在关系网络中，静态概念类比于机器广泛使用的小零件，而那些动态特征图（包括表示关系的概念）类比于广泛使用的连接件。而那些代表一类过程的大框架，是多个小零件（静态对象）和连接件（动态特征）按照一定的时间和空间次序组织起来的。它们常常可以用于经验模仿。其中动态特征图（包括表示关系的概念）常常可以作为经验泛化的工具，因为它们扮演的角色是抽象事物变化的抽象关系，是跨越具体事物的共有属性。泛化过程的本质就是通过共有属性把已有的经验类比应用的过程。而共有属性就可能包括多分辨率下的某种特征（比如低分辨率下某些相似特征，或者高分辨率下的一种或者多种相似特征），还可以包括通过认知层作为桥梁连接起来的相似的属性（比如语言、类比动态模式）。但这些泛化关系和桥梁都可以体现在关系网络的联想激活过程中，并通过预测、决策和执行系统得到泛化应用，所以这里不再专门说明泛化如何实现。
在记忆空间中,外部信息、内部信息、需求和需求状态、情绪和情绪状态,以及可以针对机器的其他应用而增加的特定动机和对应的动机状态,都是直接保持时间相邻次序存放于记忆中。它们都是按照记忆和遗忘机制来维护其记忆值,也都是按照“临近激活”原则、“相似激活”原则和“强记忆激活”原则来进行联想激活。所以在本发明申请中,这些信息的处理方式是类似的。它们的类型、多少和数据格式,不影响本发明申请所提出的权利要求。
在本发明申请中,记忆存储的另外一个特征是把机器的动机和动机状态数据(比如需求和需求状态数据,情绪和情绪状态数据)存入记忆中,让这些信息和其他信息一样,和其他信息之间建立有激活值传递途径。也就是说当一个记忆信息被激活后,它可能通过链式激 活向相关记忆中很多需求和情绪符号传递激活值。而传递系数的大小就是通过记忆和遗忘机制优化后的关系网络中的直接或者间接连接强度。关系网络的本质是一个因果网络,它反映了两个信息之间的因果关系。那些在时间上相邻的信息通常存在因果关系,而那些相似连接又把不同时间上存在联系的因果关系串联起来,构成了一个大的因果关系网络。
在图1的S104模块中,我们需要建立利用关系网络的应用层。应用层包括:
1，初始激活值赋值系统。
传感器的输入数据,首先要经过S102步骤的简化。然后机器利用这些简化后的信息特征,使用初始激活值赋值程序对这些输入信息赋予初始激活值。初始激活值赋值程序是预置程序,它的输入参数包括输入信息,机器此时的需求和情绪等动机状态。它的输出包括给输入信息赋予的初始激活值。
机器还可以在对数据简单处理后，重复对输入数据赋予激活值。这一步是预测、决策和执行系统输出的响应结果之一。
2,联想激活。
机器赋予输入信息初始激活值后,机器采用“临近激活”原则、“相似激活”原则和“强记忆激活”原则来进行链式激活。机器通过链式激活来实现联想功能,所以我们也称之为联想激活。机器通过联想激活来寻找(a)和输入信息相关的经验。(b)和这些相关信息存在连接的动机和动机的状态(包括需求和需求状态,情绪和情绪状态)。(c)和输入信息在记忆中存在相关的其他类型信息。
机器通过联想激活的信息通常是记忆值较高的记忆。这些记忆通常是因为能够被重复激活而通过记忆和遗忘机制获得了更高的记忆值。而这些记忆之所以能够被重复激活，是因为这样的信息关系能够在我们生活中不断重现。它们就是机器获得的“经验”。所以记忆和遗忘机制是关系网络的优化机制，是智力产生的基础。
机器根据输入信息激活的记忆,可能包括记忆中所有类型的记忆。比如,当语音输入 时,机器可能根据记忆中相似的语音,激活和这些语言存在紧密联系的图像、文字、感觉、情绪、相关其他语音或者一段过去的记忆。具体的激活内容取决于:(a)机器通过学习经历和学习参数设置而获得的关系网络。(b)机器的链式激活参数设置(这相当于设置机器的联想方式)。(c)机器对输入信息赋予的初始激活值。初始激活值越大,机器能激活的内容越多。
3,激活信息重建。
通过联想激活,机器获得了和输入信息相关的经验。这些经验还需要进一步处理,机器才能参考这些经验来对输入信息作出响应。
环境重建:机器进入一个环境后,通过提取图像、语言和其他传感器输入的底层特征来识别具体的事物、场景和过程。并把在记忆中找到的同类事物特征、场景特征和过程特征和现实中相似部分重叠,于是机器就能够推测目前事物、场景和过程暂时看不见的部分。
由于事物的大小是低分辨率特征之一,所以机器也使用视野中的具体事物的大小和特征图中事物正常的大小相比较,来协助机器建立环境中的景深。机器通常根据自己的目标(通常是激活值最高的概念相关的图像)大小来调整重建环境的大小。
通过多个局部信息(包括一类事物在不同分辨率下的近似框架和不同的细节)来重建三维环境,这是目前已经成熟的技术,这里不再赘述。
自我的重建：通过把多段关于自身信息的记忆（包括自身在不同分辨率下的记忆，有些记忆是关于整体框架，有些记忆是关于细节），重叠它们相似的部分，构建关于自身的立体图像，这就是自我镜像重建。由于机器关于自身信息的记忆，有些记忆是视觉，有些是触觉，有些是情绪，有些是需求，当把它们共同部分重叠在一起后，就构成了整体的自我感知重建。这些信息通过把相似部分重叠后，整合为我们的整体形象。我们在头脑中创建了一个自我镜像，我们仿佛能看到自我镜像的动作。这个自我镜像就是我们自我意识的一部分。它是机器用于区分自我和非我的依据。
在建立了外界镜像空间和机器自身镜像后,机器的意识就是机器区分自身和外界,并 按照“趋利避害”的方式来决定自己和外界互动的一种行为方式。所以意识的本质是一种行为方式。正是因为有了自我和非我的区分,机器才产生意识,把自我与外界的事物之间建立联系。这种联系的本质就是外界事物和自己的“利”和“害”之间的关系。而这种关系正是在机器的学习过程中逐步建立起来的。
在输入信息激活了记忆中众多相关记忆后,机器采用这些记忆建立镜像空间和机器自身镜像,并使用镜像空间和机器自身镜像作为组合这些经验的方式。机器是以第三人称视角来观看这些记忆。这些记忆里面包含有机器的情绪、需求和结果等相关信息,机器正是使用这些信息,按照趋利避害的方式,借鉴这些经验,来计划自己面对输入信息的响应,并可能多次评估这些响应可能带来的结果。
4,机器的底层动机。
驱动机器的动力来源是机器的动机,而机器的动机可以概括为“趋利避害”。“利”和“害”一部分是预置的;一部分是后天学习中建立的,因为它们和机器自身的需求相关。类比于人类,比如一开始是“水”,“奶”、“食物”是先天预置的“利”,后来通过学习获得了“考试分数”、“钞票”和我们先天需求之间的联系,再后来我们还发现操作对象还可以是“爱情”和“时间”这样的没有实体的东西,甚至我们还追求群体中的支配权,这是一种存在于我们基因中底层动机中趋利避害的延展。
采用类似的方法，我们也可以给机器赋予人类希望它们拥有的动机。因为在关系网络中，所有的记忆存储时，同时存储了当时机器的需求符号和对应的记忆值。这些记忆值是和当时需求符号的状态值正相关的。举例说明，如果机器在某种行为后，收到了责备。由于责备是一种损失（这个认知既可以预置，也可以通过训练者语言表达，还可以直接修改关系网络来实现），而且责备的程度（比如语言里面表示程度的词）给机器带来不同的损失值。责备越强烈，机器给这个记忆中的损失符号赋予的记忆值也相应比较高。那么在这个记忆中，由于损失符号记忆值比较高，所以这个记忆帧中所有其他记忆值比较高的特征图都和损失符号之间的连接比较强。如果在类似环境，类似动作发出对象或者接受对象，再次发生了类似受到责备的行为，那么这个记忆帧中的带来损失的特征图和损失符号本身由于被重复了，它们的记忆值在这个记忆帧中都按照记忆曲线增加了，从而增加了带来损失的特征图和损失符号之间的关系。通过一次次重复，那些真正带来损失的特征图和损失符号之间的关系就按照记忆和遗忘机制挑选出来了。机器从一开始不清楚为什么被责骂，到后面就能清楚是什么东西给自己带来了被责骂的后果。这个过程和人类孩子的学习过程是类似的。
即使机器在行为发生时没有得到及时反馈。训练者在后期也可能通过指出行为本身并发出反馈,这样就是在一个单独的记忆帧中把行为和结果连接起来了。训练者甚至无需去指明具体哪个行为好和不好,机器只需要每次收到正确的反馈,通过记忆和遗忘,就能逐步建立正确的行为和需求值之间的连接关系。
所以我们只需要给予机器基本的底层动机,机器就会根据外界的反馈来建立形形色色的具体事物和底层动机之间的关系。而这些关系正是机器做出决策的依据。机器的底层动机相对比较简单,可以采用预设的方式。比如赋予机器学习并遵守人类法律的动机,赋予机器寻求人类认可的动机,赋予机器避免危险的动机,赋予保护主人安全的动机等。
机器的动机、动机状态和情绪关系密切。在本发明申请中，我们通过预置程序，用动机和动机状态来决定情绪，也通过预置程序，来实现情绪和情绪外显之间的关系。也就是说，使用预置的程序，通过动作（表情和肢体语言）和语言（语言输出方式）来表达机器的情绪。但同时，情绪或者情绪表达还可以通过“趋利避害”的动机来调整。机器通过学习和反馈，获得了在什么环境下，什么情绪可以带来收益和损失，从而反过来调整情绪或者情绪表达。正因为情绪和机器动机（包括动机状态）之间关系密切，所以，也可以把机器的动机统一采用情绪需求来表达。机器可以采用追求情绪需求来做出选择和响应。
需要特别指出的是机器对任何输入信息的存储，都会同时存储机器的动机和动机状态（比如需求和需求状态，情绪和情绪状态等）。在记忆中，动机是用一种符号来表示，而其记忆值就是和其存储发生时的激活值成正相关。所以那些获得高激活值的动机，通常因为其存储发生时获得的记忆值高，而可能成为长期记忆。这些长期记忆有可能一次次被激活，从而长期影响机器的决策和行为。而那些和高激活值的动机状态（比如强烈的情绪、大的收益或者损失等）相关的信息，由于和高激活值的动机状态在时间上是相邻的，它们可能会被高激活值的动机符号的高激活值反向激活，从而提高了自己在存储时的激活值，也就提高了自己在存储时获得的记忆值。
5,语言信息重建。
语言在机器智能中扮演了重要的角色。语言是人类为了更好的交流经验而建立的一套符号。每个符号都代表一些具体的事物、过程和场景。当语言输入时,语言所代表的相关记忆被激活。这些记忆既可能有语言本身的信息,也会有关于语言使用的记忆被激活(比如强调重点的语音强调方式或者文字强调方式,比如表示不信任的语气或者嘲弄的语调等)。这些被激活的信息构成了一个激活信息流。为了平衡语言的前后关联和目前语义识别,被激活的信息的激活值会随时间而衰退。衰退的参数和机器的动机以及状态(比如需求和需求状态,情绪和情绪状态)相关。
语言的链式激活实现了所有输入信息的上下文关联识别。这里的输入信息既包含环境信息，也包含被激活的记忆信息。这些信息的相互激活赋值，就体现了上下文关联。这种关联比统计生成的语义库内容更加广泛。它不仅仅涉及到语言，更涉及到所有的感官输入和相关记忆。所以通过S102构成的多分辨率信息、S103构成的常识网络，机器可以实现语言到静态和动态图像、感觉、需求和情绪之间的连接，也实现语言到相关语言和记忆的连接。当这种连接被纳入机器对语言输入的理解中，并根据对语言的理解，根据相关经验做出响应，就体现出机器真正理解了输入语言的真实含义。
语言输入构成了一个输入信息流,而对应的激活记忆也构成了一个激活信息流。机器在理解语言时,需要重建这个激活信息流。重建就是对其中的环境信息进行重建,通过把被 激活的环境相关信息(比如图像、声音和感觉等)相同部分重叠起来,构成一个想象中的过程。
同理,机器还需要把输入语言激活的关于感觉、情绪、视觉、动作、肢体状态等和机器自身存在相关的信息整合起来。这些信息通常也会激活机器自身类似的经验。所以机器能体会到这些语言带来的感觉、视觉、情绪、动作或者肢体状态等感觉相关的信息。
6,机器预测能力的产生
预测的本质是一种统计行为。机器的预测,就是根据过去的经验,或者类似的经验,来推测事物发展的各种可能性以及对应概率,或者他人的行为的各种可能性以及对应概率。
当信息输入后，机器不需要去穷尽预测所有可能的结果，这也是无法完成的任务。机器只需要评估那些被激活的、和输入信息相关的经验中可能发生的结果。这相当于利用了常识来限定了可能结果范围。在这个有限的范围内，机器可以采用目前任何人工智能预测方法，比如蒙特卡洛搜索、决策树、贝叶斯估计、基于规则等机器推理的方法，来推测目前事物发展到每一个可能结果的概率。
在过去的经验里,每一种可能的结果,都通过关系网络联系着需求状态和情绪状态,这些状态代表这种结果给机器带来的“利”和“害”。所以把可能发生的概率和相关的需求和情绪状态结合,机器就可以推测出事物发展的每一种结果可能给自己带来的“利”和“害”的大小、类型以及它们可能发生的概率。所以机器在预测可能发生的结果给自己带来的“利”和“害”时,需要同步把可能发生的概率考虑进去。所以机器既要评估“利”和“害”,还要评估其对应的概率,联合两者做出决策。
7,机器的决策和响应系统
机器在信息输入后,通过联想链式激活方式,根据记忆的激活状态,限定了相关的决策范围。所以机器对输入信息的评估和响应,都是基于输入信息和被激活记忆所限定的范围来搜索、评估和响应。这相当于利用了常识来限定了需要搜索、评估和响应的范围。
在这个范围内,可能存在一段或者多段和输入信息相关的记忆。机器可以通过分段模仿的方式,模仿这些过去的经验,建立起可能的响应过程。然后,在这个有限的范围内,机器可以采用目前任何人工智能预测方法,比如蒙特卡洛搜索、决策树、贝叶斯估计、基于规则等机器推理的方法,根据预测的“利”与“害”的概率,来选择自己的决策和响应。
因为机器的目的就是“趋利避害”,所以机器对输入信息的响应基本出发点就是根据过去的经验,做出自己的响应,尽可能让那些产生“利”的事情发生概率变大,尤其是那些能获得很高收益值的情景。而让那些产生“害”的事情发生的概率减小,尤其是那些能带来巨大损失值的情景。所以机器在权衡利弊的动机推动下,根据经验来组合自己的响应,来达到“趋利避害”的目标。
机器的决策，是基于机器的预测能力之上的路径规划方法。而路径的目的就是利益最大化，损失最小化。有了预测能力，机器就把决策和响应这样一个完全开放性的问题，转变成了一串如何让一定范围内的事情发生的概率增加或者减小的相对封闭的问题。而由于在前面的步骤中建立了常识，所以每一件事情发生时（这是因果关系中的果），与之相关的条件（这是因果关系中的因）通过关系网络就可以得到。那些存在强关联的因果关系由于一次次重复发生，所以它们在关系网络中的连接关系很强。所以关系网络就可以逐层表达因果关系。
每一步决策的目标都是让事情的发展方向“趋利避害”。这有可能是一个和外界互动的过程。而互动本身就是一种依据过去的经验,来推动事情的发展方向“趋利避害”。通过互动获得的信息和行为,来不断提高收益值高的事件发生的概率,来不断降低损失值高的事件发生概率。这是一个迭代过程。但每一步都是处理的方式都是一样的。机器在因果链的基础上,逐层提高那些通向收益值高的事件发生的概率。这类似于链式激活过程,一步步激活那些通向高收益路径上的事件,而小心的避免那些可能通向高损失值的事件。
由于路径之间的因果联系的概率由关系网络来表达，所以整个机器的响应规划问题就变成了在因果链网络中寻找最优路径问题，而这正是目前的机器智能已经解决了的问题。举例说明，机器通过搜索记忆就能确定一个事件（比如带来高收益值或者高损失值的事件）的先验概率。然后通过关系网络就能确定某一个条件和该事件之间的因果强度（后验概率）。而不同条件之间在关系网络中的连接强度，就能反映不同条件之间是否独立。而机器只需要挑选一些相对彼此独立的条件，通过朴素贝叶斯算法，就能预测出该事件发生的概率。机器可以根据计算出来的概率来决定自己的响应。这些响应可以有各种形式，比如：提高这件事情发生的概率，或者降低这件事情发生的概率，或者不去影响这件事情发生的概率。这取决于这件事给机器带来的收益值和损失值。而提高或者降低这件事情发生的概率，又可以进一步规划为提高或者降低和这件事发生概率相关的条件发生的概率。这个过程本质上是一个迭代的概率路径搜索问题。
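上段所述“先验概率 + 条件后验 + 朴素贝叶斯”的预测方式，可以用如下示意程序说明（其中的事件先验和各条件的似然均为虚构的演示数值，并假设所选条件彼此独立）：

```python
def naive_bayes_posterior(prior, likelihoods_given_e, likelihoods_given_not_e):
    """朴素贝叶斯：在各条件相对独立的假设下，
    由事件先验概率和各条件的似然，计算事件发生的后验概率。"""
    p_e, p_not = prior, 1.0 - prior
    for l_e, l_n in zip(likelihoods_given_e, likelihoods_given_not_e):
        p_e *= l_e      # P(条件|事件发生) 连乘
        p_not *= l_n    # P(条件|事件不发生) 连乘
    return p_e / (p_e + p_not)

# 虚构示例：某带来损失的事件先验概率为 0.2，
# 观察到两个相对独立的条件，其在事件发生/不发生下的似然如下
p = naive_bayes_posterior(0.2, [0.9, 0.7], [0.3, 0.4])
```

计算出的后验概率高于先验概率，说明这两个条件的出现提高了该事件发生的概率；机器即可据此决定是推动还是避免相关条件发生。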
举例说明:如果机器的响应是进一步确定可能的收益和损失。首先,机器根据记忆中,在和目前情况类似的情况下,各种可能结果的概率作为先验概率。然后根据每一个结果相关的条件和结果之间的后验概率来计算各种收益值和损失值发生概率。然后,机器就产生了下一个目标,进一步确定每个条件发生的概率。比如这时机器的响应可以是(a)搜索和统计每一个条件和对应收益值和损失值发生事件之间的后验概率。然后用于更新整体收益和损失评估。这可以通过搜索关系网络之间的连接强度来完成。(b)进一步更新目前某一个条件发生的概率。比如根据模仿过去的经验,直接询问信息源关于某一个条件是否已经发生或者可能发生的概率。或者通过其他途径获取某一个条件是否已经发生或者可能发生的概率。这取决于机器学习过程中获得的行为模仿记忆。(c)根据趋利避害的原则,把某些和收益、损失联系密切的条件推动其发生或者避免其发生作为新的目标。在新的目标驱动下,采用同样的评估过程,进行响应。通过这样的迭代响应,最终目标依然是获得收益和避免损失。
所以,当信息输入时,我们通过关系网络确定的因果联系,通过趋利避害原则,通过机器在关系网络中建立的事件和“利”、“害”之间的关系,就可以把看似完全开放的机器对信息输入的响应,变为多级目标。这些目标都是为提高某些事件发生的概率,或者降低某些 事件发生的概率服务的。所以,通过关系网络的因果关系,机器就可以把趋利避害的目标转变成具体情况下的一连串彼此关联的目标。这些目标就构成了机器最大化收益,最小化损失的实现路径。
在这个过程中，机器做出响应后，可能通过不断寻找新的信息，或者不断被动获得新的信息，并利用新信息和结果之间的后验概率来更新目标路径。机器在预测自身响应后可能的外界反馈时，同样会激活两类动机状态记忆。一种是自身处于重现记忆中的需求和情绪状态，它来自于被激活的记忆中关于自身的各种感觉和情绪。一种是自身处于观察者角度观察类似情景时的需求和情绪状态，它来自于被激活的记忆中观察他人在类似情景下，机器产生的各种感觉和情绪。所以机器在预测“利”和“害”时，是同时从自身的观看角度和从他人观看的角度，来同时评估一个事件对自身带来的“利”和“害”。
机器的预测能力,不仅仅包括预测一件事情可能带来的“利”和“害”。还需要预测在“利”和“害”驱动下,自己或者他人可能采取的响应,以及他人做出响应后对自己的“利”和“害”带来的影响。这些都是通过统计关系网络中,相关的需求和情绪状态等动机状态值而获得的。所以,机器的评估结果是随更多的输入信息而动态变化的。机器的决策和响应过程,是一个动态的路径规划过程。它是基于经验响应和基于收益和损失的概率计算来联合驱动的。
通过上述方法,机器通过层层迭代分解的方式,就可以把一个抽象的趋利避害目标,在特定输入条件下,层层分解成大量的提高或者降低某些具体事件发生的概率的任务。这些任务可以层层细分到非常具体的目标任务,比如一直分解到机器的底层驱动能力。这个过程就是机器的决策和响应***。
8,模仿能力
模仿能力是人类存在于基因里的能力。比如对一个呀呀学语的孩子,如果每次他(她)回家后,我们和他(她)打招呼,说“你回来了”。经过几次后,当他(她)再次回家时,他(她)会主动 说“你回来了”。这表明他(她)在并不理解信息含义的情况下,就已经开始模仿他人进行学习。同理,我们让机器学习也采用同样的方法。所以,机器需要把模仿作为一种底层动机置入机器。使得机器愿意模仿他人(他机器)的行为,并根据自己的评估或者外界的反馈信息来不断改进,从而不断锻炼自己的各种感官、肢体、语言和动作的协调一致的能力,从而提高学习效率。在机器学习的不同阶段,我们可以给机器赋予不同强度的模仿动机。比如在机器学习语言和动作输出时,我们可以给机器直接赋予较强的模仿动机,而在其他阶段,则可以赋予正常的模仿动机。
当机器获得外界的语音或者动作输入后,这些语言或者动作会激活机器自己的相关记忆。这些记忆可能是一个相似的发音,或者一个基础的动作片段。这些记忆会进一步激活和这些记忆相关的感觉信息、需求和情绪信息、语言或者动作记忆。机器在模仿动机的驱动下,会以这些被激活的记忆为基础,通过决策***来通过调整经验中的底层驱动参数来做出类似的语音输出或者动作输出。而底层驱动是指语音输出底层经验,或者动作输出底层经验。它们是特定语音或者动作对应的肌肉驱动命令,其中参数是通过后天学习并不断通过反馈来更新的。
人类可以给机器预置一些最基本的语音或者动作（包括表情和肢体语言）能力。它们的参数优化可以通过后续学习和训练，让这些参数和行为的结果通过记忆联想起来，并通过情绪和需求系统（受到自我或者外界反馈的影响）来不断调整，最终在底层动机的驱动下，机器通过记忆和遗忘机制，获得在不同外部信息激励下的不同参数之间的关系，形成记忆。这些记忆都是机器在面对外部信息输入时的知识和技能。它们包括语言、动作、表情、肢体动作等行为习惯。
人类还可以给机器赋予预置的条件反射系统。这些系统的作用就是在特定的输入情况下，人类希望机器做出的响应。比如机器在危急情况下的躲避动作，或者机器在特定信息输入下的特定输出动作（比如这些条件反射系统可以达到用于机器的自检，或者紧急停机，或者调整机器的工作状态等目的）。
9,执行过程
在有了以上各种基础能力后,机器才能够根据自己的决策,来具体执行响应。比如语言输出、动作输出(包括表情和肢体语言输出)或者其他形式的输出(比如输出数据流、图像等)。执行响应步骤是一个把规划翻译成实际输出的过程。
如果在选择各种可能的响应步骤中,机器选用的是语音输出,这就比较简单,只需要把准备输出的图像特征图转变为语音,然后利用关系网络中的语言之间的关系(存在于关系网络中的语法知识),组织成语言输出序列,并调用发音经验来实施就可以了。
需要指出,机器可能根据经验(自己或者他人经验),选用一些表达整个句子的动态特征(比如使用语气、音频音调或者重音变化的不同运动模式,来表达疑问、嘲弄、不信任、强调重点等人类常用方式。这些方式通常是一句话或者整段语音的低分辨率特征)。因为机器是从人类生活中学习到这些表达方式的,所以人类任何表达方式,理论上机器都可以学习到。
如果机器选用的是动作输出,或者是语音和动作混合输出,那么问题就会变得复杂很多。这相当于组织起一场活动。机器的响应计划中,可能只有主要步骤和最终目标,其余都需要在实践中随机应变。
机器需要把准备输出的序列目标响应,按照这些目标涉及到不同的时间和空间,对它们在时间和空间上做划分,便于协调自己的执行效率。采用的方法是通过选择时间上紧密联系的目标和空间上紧密联系的目标作为分组。因为动态特征图和静态特征图结合后构成的信息组合,其相关记忆的环境空间是带有时间和空间信息的,所以这一步可以采用归类方法。这一步相当于从总剧本改写到分剧本。
机器需要把每个环节中的中间目标,再次结合现实环境,采用分段模仿的方法,来逐层展开。机器在顶层提出的响应计划,通常只是使用概括性很高的过程特征,和概括性很高的静态概念组成的(因为这些概括性很高的过程才能找到多个相似的记忆,所以借鉴它们建 立的响应也是高度概括的)。比如“出差”这个总输出响应下面,“去机场”是一个中间环节目标。但这个目标依然很抽象,机器是无法执行模仿的。
所以机器需要按照时间和空间划分,把在目前时间和空间中,需要执行的环节作为目前的目标。而把其他时间和空间的目标作为继承目标,暂时放到一边。机器把中间环节作为目标后,机器还是需要进一步细分时间和空间(再次写下级分剧本)。这是一个时间和空间分辨率不断增加的过程。机器把一个目标转换成多个中间环节目标的过程,依然是使用决策能力,分析各种可能的结果和可能发生的概率,并按照“趋利避害”的原则来选择自己的响应的过程。上述过程是不断迭代,每一个目标划分成多个中间目的的过程是完全相似的处理流程。一直要分解到机器的底层经验为止。底层经验对语言来说就是调动肌肉发出音节。对动作而言,就是分解到对相关“肌肉”发出驱动命令。这是一个塔形分解结构。机器从顶层目标开始,把一个目标分解成多个中间环节目标。这个过程就是创建虚拟的中间过程目标,如果这些中间过程目标“符合要求”就保留。如果“不符合要求”就重新创建。这个过程逐层展开,最终建立机器丰富多彩的响应。
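上述“塔形分解”的过程可以写成一个递归示意：目标若已是底层经验（可直接执行的驱动命令）则直接执行，否则借助经验把它分解为多个中间环节目标，逐层展开。其中的目标名称和分解表均为虚构示例，沿用正文“出差”“去机场”的例子：

```python
# 虚构的"经验库"：目标 -> 中间环节目标序列（相当于从总剧本改写到分剧本）
EXPERIENCE = {
    "出差": ["去机场", "乘机", "到达酒店"],
    "去机场": ["出门", "打车"],
}
PRIMITIVE = {"出门", "打车", "乘机", "到达酒店"}  # 底层经验（可直接执行）

def decompose(goal, plan):
    """把目标逐层分解，直到机器的底层经验为止（塔形分解结构）。"""
    if goal in PRIMITIVE:
        plan.append(goal)            # 已可直接执行：进入输出序列
        return
    for sub in EXPERIENCE[goal]:     # 借鉴经验，展开为中间环节目标
        decompose(sub, plan)

plan = []
decompose("出差", plan)
# plan == ["出门", "打车", "乘机", "到达酒店"]
```

实际系统中每一次展开都要结合现实环境、按“趋利避害”原则重新决策，中间目标“不符合要求”时需要重新创建；这里为突出分解结构省略了该评估环节。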
在这个过程中，机器随时可能碰到新信息，导致机器需要处理各种信息，而这些原来的目标就变成继承动机。这就相当于组织活动的过程中，不断碰到新情况，需要立即解决，否则活动就无法组织下去了。于是导演叫停其他活动，先来解决眼前碰到的问题。解决后，活动继续进行。另外一种情况就是在这个过程中，导演突然接到一个新任务，于是导演权衡利弊后，决定活动先暂停，优先处理新任务。
机器是一边执行可以进行的模仿任务,一边分解其他目标到更细致目标的。所以机器是边做边想的。这是因为现实情况千差万别,机器不可能事先都知道外界情况而做出计划。所以这是一个环境和机器互动来完成的一个目标的过程。
至此，机器利用上述各项能力就可以完成一次对输入信息的理解和响应。这个过程作为机器和外界互动的一个最小周期。机器不断重复使用这个过程，来完成更大的目标，表现为机器和外界的持续互动过程，表现出机器智能。
10,机器的经验总结
机器的经验,不仅仅是通过记忆和遗忘机制来形成关系网络中的连接,还可以主动强化这种连接。这种主动强化连接可以表现为多种形式:比如通过语言学习他人的经验。语言形成的激活信息流,和语言一起构成了学习到的他人的经验。这种经验是作为新输入信息存入记忆中的,它们是记忆的一部分。再比如,机器通过对那些和“利”与“害”关系连接紧密的信息,通过重复记忆来主动把这些记忆变成长期经验。
机器可以采用预置算法，把那些能够重复出现，并且能够较大地影响“利”与“害”关系的记忆，通过把对应事件的信息依次做虚拟输入，重新走一遍虚拟的处理过程，强化关系网络中相关的连接，从而增强经验。在这个增强的过程中，类似经验中的那些共有部分之间的连接可能逐步增强，所以经验会变得越来越简洁和通用，最终形成某种机器自我总结的规则。所以机器的自我总结在实现上，可以采用预置程序和可调参数来实现。这些可调参数可以通过机器在学习中逐步总结的“利”与“害”和具体事物的连接关系而调整，也可以根据机器被刺激的强度和频率来调整。
图1的S105模块是机器的通信连接模块。机器可以通过通信模块和外界的其他机器(包括计算机)按照预设的协议进行数据交换。机器通过数据交换,可以实现分布式计算。比如现场的机器可能处理部分数据,还可能把部分或者全部信息传给中央大脑,借助中央大脑强大的信息存储和处理能力来做出决策。机器通过数据交换,机器还可以共享记忆。比如一个机器获得的经验,可以传递给其他机器,也可以把多个机器的记忆融合,从而形成更加完备的经验。机器之间也可以共享计算能力,从而构成分布式思维能力。通过机器的通信连接模块,机器之间可以实现认知共享、决策共享和行为协调能力,从而实现超级人工智能。
附图说明
图1为机器的一种可能的组成部分示意图。
具体实施方式
下面结合附图和具体的实施例对本发明申请作进一步的阐述。应该理解,本申请文本主要是提出了实现通用人工智能的主要步骤。这些主要步骤中,每一个步骤都可以采用目前公知结构和技术来实现。所以本申请文本的重点在于这些步骤以及其组成,而不是局限于采用已知技术来实现每个步骤的细节上。所以这些实施例描述只是示例性的,而并非要限制本申请文本的范围。在以下说明中,为了避免不必要地混淆本申请文本的重点,我们省略了对公知结构和技术的描述。本领域技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都应当属于本申请文本保护的范围。
传感器系统的建立：在图1中，S101模块是机器的传感器模块。为了让机器产生和人类类似的认知方式，S101模块需要采用一到多种通用的传感器：视觉传感器、听觉传感器、味觉和嗅觉传感器、触觉传感器、重力方向传感器和姿态信息传感器等，还可以增加面向特定应用的传感器（比如自动驾驶可以增加红外传感器、激光雷达传感器等）。机器也需要使用监控自身状态的传感器，这些传感器也是机器感知信息的一部分。S101是主要由传感器硬件和对应软件组成的模块，目的是通过传感器感知机器外部和机器自身的信息。机器的传感器产生的数据，输入到机器的处理单元，被称为输入数据。
机器的传感器数据和机器的处理单元之间的通信,可以采用任何现有的通信***,或者自定义的通信***,这些具体的通信形式不影响本专利申请的实现。
预置动机系统的建立：机器采用符号来代表一类动机。比如直接采用语言符号，或者某种感觉信息数据，或者直接人为规定一种符号来代表机器的某一类动机。比如采用“Sa”符号来代表机器的安全需求，比如采用0到100的数值来代表机器的安全需求的状态。其中0代表非常没有安全感，而100代表完全的安全感。比如采用“危险”符号来代表危险，采用H,HM,M,ML,L分别来代表从高到低的危险程度。比如采用“笑脸”来代表愉快的情绪，采用0到10来代表愉快的程度。在本发明申请中，建立多少动机类型，采用什么形式的符号来代表动机类型，以及采用什么形式来表达符号所处的状态，并不影响本发明申请的权利要求。因为在本发明申请中，所有的符号和对应的状态的处理方式是完全类似的。比如，我们可以采用不同的符号给机器建立兴奋、生气、伤心、紧张、焦虑、尴尬、厌倦、冷静、困惑、厌恶、痛苦、嫉妒、恐惧、快乐、浪漫、悲伤、同情和满足等不同的情绪。每一种情绪有自己的量化空间。机器趋利避害的动机，具体表现就是在各种动机需求建立的空间中，权衡利弊，寻找可以接受的空间范围。然后根据经验来推动事件向自己可以接受的空间发展。这个推动的策略可能是直接推动直接目标，也可能是推动实现直接目标的前提条件，从而提高直接目标发生的概率。
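上述“符号 + 量化状态”的动机表示可以示意如下。符号名与取值范围沿用正文示例（“Sa”取0到100，“笑脸”取0到10）；用欧氏距离衡量当前动机状态到最佳追求空间的远近，只是本示意采用的一种量化假设：

```python
# 动机符号 -> 状态取值范围
MOTIVES = {
    "Sa": (0, 100),   # 安全需求：0 = 非常没有安全感，100 = 完全的安全感
    "笑脸": (0, 10),   # 愉快情绪：0~10 表示愉快程度
}

def clamp(symbol, value):
    """把动机状态值限制在该符号的量化空间内。"""
    lo, hi = MOTIVES[symbol]
    return max(lo, min(hi, value))

def distance_to_ideal(state, ideal):
    """趋利避害的一种量化（示意）：当前动机状态到最佳追求点的欧氏距离，
    机器的策略是选择使该距离减小的响应。"""
    return sum((state[s] - ideal[s]) ** 2 for s in ideal) ** 0.5

state = {"Sa": clamp("Sa", 40), "笑脸": clamp("笑脸", 3)}
ideal = {"Sa": 90, "笑脸": 8}
d = distance_to_ideal(state, ideal)
```

在多种动机并存时，机器即可比较不同候选响应导致的状态点与最佳追求空间、不可接受空间之间的距离，从而做出权衡。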
多分辨率信息特征提取***:多分辨率信息特征提取,可以采取目前的人工智能感知处理能力,提取输入信息在不同分辨率下的特征。和目前人工智能感知处理的方法差异在于:目前的算法实现的是数据空间到标签空间的映射。而本发明申请中,S102模块目标是提取输入数据中的局部共有特征。
现有的人工智能感知处理方式是通过大量的数据到标签的映射,通过优化误差函数来优化映射网络。这种映射是数据到具体的标签,优化得到的映射关系是和具体的标签紧密联系的,所以这种算法没有通用性。而本发明申请中,所有输入数据都是直接到局部共有特征的映射。而局部共有特征是通过预训练建立的基础特征库。建立这些基础特征库的基本假设就是:人类的进化过程,是沿高效率利用计算能力的方向发展的。因为只有这样,才能在提高算法复杂性来处理复杂环境的情况下,最大限度节省能源消耗,从而增加生存几率。而先天形成对广泛存在的局部特征的提取算法,则是这样的进化方向的具体体现。因为这样才能最大可能的复用这些算法,提高计算的能效比。
而需要采用多分辨率的原因就是因为“相似”是一个模糊的概念，只有建立在特定的分辨率上，才能说明两个事物之间的“相似”程度。在人类的进化中，人类已经总结了在我们这个星球上，部分相似的事物之间可能存在其他方面的相似性。所以我们需要从已知的相似性去推测其他可能的相似性。比如从狗的外形推测狗可能的行为。所以多分辨率就是建立事物之间在不同分辨率上的相似性。比如在粗略的分辨率上，所有的狗都是相似的。但在进一步分辨率下，我们就会认为不同的狗之间存在差异。所以在粗略的分辨率上，我们可以把狗的一些共有行为泛化。比如看见一只陌生的狗，我们就能通过输入的狗的信息特征，来激活相关的记忆。首先，所有记忆中粗略的分辨率上狗的信息都会被激活，这些粗略信息会进一步激活狗的其他特征。然后关于输入的狗的信息更加细致的特征，会激活和这些细致特征相关的记忆，最终激活累计的状态是关于狗的公共信息和关于这只狗的细致特征的信息被激活，从而获得对这只狗的初步信息评估。而那些和这只狗的细致特征不相符合的其他关于狗的细致特征则不会被激活，所以这些信息不会来干扰认知。同理，当我们看到一只和狗比较相似的动物，从粗略的分辨率，我们会激活关于生命的信息，激活关于动物的信息，激活关于狗的共有信息，激活关于这只特定动物表现出来的特征相关的信息。比如如果它的爪子很像猫爪，那么我们就可能激活它可能会抓伤我们的预测。所以多分辨率特征的提取，尤其是多分辨率动作特征的提取，是知识泛化的关键。因为多分辨率动作广泛存在于不同的事物中。所以多分辨率动作是一个广泛存在的知识泛化桥梁。
多分辨率提取方法的具体方法,就是对数据做不同分辨率的压缩,然后采用不同大小的数据提取窗口来重复寻找局部相似性。在申请号为PCT/CN2020/000109,名为“一种模仿人类记忆来实现通用机器智能的方法”的专利发明申请,有关于如何实现多分辨率提取方法的具体方法,这里不再赘述。
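上段所述“不同分辨率压缩 + 不同大小取样窗口重复提取”的过程，可以用一维信号给出一个最小示意（按因子取平均的降采样方式和窗口大小均为本示意的假设）：

```python
def downsample(signal, factor):
    # 压缩到较低分辨率：相邻 factor 个采样取平均
    return [sum(signal[i:i + factor]) / factor
            for i in range(0, len(signal) - factor + 1, factor)]

def windows(signal, size, step=1):
    # 用固定大小的取样窗口滑动提取局部片段
    return [tuple(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, step)]

def multi_resolution_features(signal, factors=(1, 2, 4), size=2):
    """对同一输入在多个分辨率下重复提取局部片段。"""
    feats = {}
    for f in factors:
        level = signal if f == 1 else downsample(signal, f)
        feats[f] = windows(level, size)
    return feats

feats = multi_resolution_features([1, 1, 2, 2, 4, 4, 8, 8])
```

同一输入在不同分辨率下得到不同的局部片段集合，后续即可在各分辨率上分别寻找与记忆中局部共有特征的相似性。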
所以尽管我们可以使用目前任何具体特征提取算法,比如数据坐标基底变换和滤波处理,比如卷积神经网络,或者各种形式的延时神经网络等具体算法来实现多分辨率特征提取。但在本发明申请中,这些算法是实现多分辨率特征提取其中一个步骤。多分辨率特征提取和已有的人工智能在算法目标上不同,输出也不同,需要的数据量也不同。多分辨率特征提取不需要大量的样本,因为它是从特征到概念的映射,而通常一个概念包含的特征是有限的。
联想链式激活系统：联想激活是指在“临近激活”原则、“相似激活”原则和“强记忆激活”原则下进行的链式激活过程。当信息输入时，机器根据动机来赋予其初始激活值，并通过“临近激活”原则、“相似激活”原则和“强记忆激活”原则来进行链式激活。在关系网络中，当某个节点(i)被赋予一定的激活值，如果这个值大于自己的预设激活阈值Va(i)，那么节点(i)将被激活。它会把激活值传递到和它有连接关系的其他特征图节点上。传递系数按照“临近激活”原则、“相似激活”原则和“强记忆激活”原则来确定。如果某个节点收到传过来的激活值，并累计上自己的初始激活值后，总激活值大于自己节点的预设激活阈值，那么自己也被激活，也会向和自己有连接关系的其他特征图传递激活值。这个激活过程链式传递下去，直到没有新的激活发生，整个激活值传递过程停止，这个过程称为联想激活过程。
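上述链式激活过程可以写成一个示意性的传播循环。其中节点、连接和阈值均为虚构示例，并假设各边上的传递系数已按三条原则折算为一个数值：

```python
def chain_activation(edges, thresholds, initial):
    """联想链式激活：激活值沿连接传递，累计超过阈值的节点被激活，
    并继续向外传递，直到没有新的激活发生为止。
    edges: {节点: [(邻居, 传递系数), ...]}"""
    value = dict(initial)                 # 各节点累计激活值
    active = {n for n, v in value.items() if v >= thresholds[n]}
    frontier = list(active)
    while frontier:
        node = frontier.pop()
        for nbr, coeff in edges.get(node, []):
            value[nbr] = value.get(nbr, 0.0) + value[node] * coeff
            if nbr not in active and value[nbr] >= thresholds[nbr]:
                active.add(nbr)           # 新被激活的节点继续向外传递
                frontier.append(nbr)
    return active, value

edges = {"语音:狗": [("图像:狗", 0.8)], "图像:狗": [("经验:狗会咬人", 0.7)]}
thresholds = {"语音:狗": 0.5, "图像:狗": 0.5, "经验:狗会咬人": 0.5}
active, value = chain_activation(edges, thresholds, {"语音:狗": 1.0})
```

示例中语音输入激活了对应图像，图像又激活了相关经验，体现了激活值沿关系网络逐级传递、逐级衰减的联想过程。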
联想激活过程是一种相关记忆的搜索方法，所以它可以被能够实现类似功能的其他搜索或者查找方法代替。在申请号为PCT/CN2020/000109，名为“一种模仿人类记忆来实现通用机器智能的方法”的专利发明申请中，提出了有关如何实现联想链式激活系统的具体方法，这里不再赘述。
预置基础响应系统的建立：预置基础响应系统是通过预置程序实现的机器响应系统。这些系统是机器做出响应的本能反应。这些本能反应在后天的学习中，可以被逐步优化。
1，基础动作系统。基础动作包括赋予机器可以模仿人类做出基础动作的程序，动作包括语言发音和表情、肢体动作的模仿。
2，预置本能响应。本能响应是指通过预置程序，实现机器在特定输入下的输出响应。比如机器对高温的躲避反应，对坠落的本能躲避反应，对突然撞击的躲避反应等可以通过预置程序实现的响应。这些响应在后续的学习中可以被逐步优化和平衡。
需要本能响应的紧急情况并不是很多,所以可以采用预置程序来实现。但伴随激发本 能响应的环境却可能纷繁复杂,所以机器需要通过学习,通过后验经验来建立关系网络,在预测、规划和执行***的控制下,对不同的情况做出不同的响应。
3，初始激活值赋值系统。在图1的S102步骤中，机器提取多分辨率特征后，使用初始激活值赋值程序对这些输入信息赋予初始激活值。初始激活值赋值程序是预置程序，它的输入参数包括输入信息，机器此时的动机和动机状态，比如需求和需求状态，情绪和情绪状态。它的输出包括给输入信息赋予的初始激活值。一种简单的方法是根据机器此时的预期信息、需求和情绪等动机状态，直接给所有输入信息赋予相等的激活值。另外一种方法是根据机器的预期信息做简单的分类，对不同的预期信息采用不同的初始激活值。还可以采用给不同分辨率信息赋予不同的初始激活值的方法。比如对低分辨率信息赋予较大的激活值，而对高分辨率信息赋予较低的激活值。也可以反过来赋值，这取决于机器对输入信息的预期。而机器对输入信息的预期则是来自于之前的信息处理结果。
机器还可以根据输入信息的频繁程度来调整对其赋予的初始激活值。比如对那些频繁的刺激,对其赋予的初始激活值逐渐降低。这些初始激活值低的信息,发起的链式激活过程中,其他信息获得的激活值也低。而新记忆存储时,其获得的初始记忆值是和记忆发生时其激活值正相关的,所以那些日常生活中,司空见惯的事物,因为其获得的初始激活值低,其发起的链式激活给其他信息赋予的激活值也低,其存储时获得的记忆值也低。而记忆存储是首先放入临时记忆库。只有在临时记忆库中获得足够的记忆值的信息才会移入其他记忆库。所以那些日常生活中,司空见惯的事物难以在临时记忆库中存活下来,形成长期记忆。这也是一种智能的进化结果,因为智能体需要的是那些普遍规律和特殊例外来应对外界环境。普遍规律是通过重复总结,当总结完成后,就不再需要一次次的长期记忆类似情况。而那些例外的,或者是偶发的强烈刺激,可能因为高的初始激活值带来强烈的链式激活过程,从而单次的记忆值就达到长期记忆阈值,从而变成机器的长期记忆。
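上段所述“频繁刺激的初始激活值逐渐降低、只有在临时记忆库中获得足够记忆值的信息才能成为长期记忆”的机制可以示意如下（衰减公式、阈值和“记忆值等于存储时激活值”的取法均为本示意的假设）：

```python
class InitialActivation:
    """初始激活值赋值（示意）：对频繁出现的刺激逐渐降低赋值。"""
    def __init__(self, base=1.0):
        self.base = base
        self.counts = {}              # 各刺激已出现的次数

    def assign(self, stimulus):
        n = self.counts.get(stimulus, 0)
        self.counts[stimulus] = n + 1
        return self.base / (1 + n)    # 出现越频繁，初始激活值越低

LONG_TERM_THRESHOLD = 0.8             # 进入长期记忆所需的记忆值（示意）

def store(activation):
    # 新记忆的初始记忆值与存储发生时的激活值正相关（这里简单取相等）
    memory_value = activation
    return "长期记忆" if memory_value >= LONG_TERM_THRESHOLD else "临时记忆"

ia = InitialActivation()
first = store(ia.assign("门铃声"))    # 初次出现的强刺激：激活值高
later = store(ia.assign("门铃声"))    # 重复出现：激活值降低，难以形成长期记忆
```

这对应正文所述：偶发的强烈刺激可能单次即达到长期记忆阈值，而司空见惯的事物难以在临时记忆库中存活下来。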
机器还可以采用多次赋予初始激活值的方式。比如采用前述方法来对输入信息简单赋予初始激活值。然后根据对输入信息的初步处理结果，初步确定其和“利”与“害”的连接紧密程度，然后再根据其和“利”与“害”的连接紧密程度，按照预设程序再次赋予其初始激活值。比如对那些可能带来大的“利”与“害”的信息，机器的策略可能是进一步分析信息。而进一步分析就意味着再次采用更多的分辨率提取输入信息，也再次赋予多分辨率信息特征更高的初始激活值，让整个激活范围更大，从而搜索更多的相关记忆。上述过程还可以迭代进行，从而形成多次初始激活值赋值。
以上的基础动作系统、预置本能响应和初始激活值赋值系统，可以使用目前成熟的计算机程序来完成。这些具体的实现方式，通过行业的公知知识就可以实现。本发明专利主要是揭示采用现有技术来实现通用机器智能的方法和途径，而行业内技术人员基于现有行业内公知知识就能实现的特定步骤的具体实现细节，不需要在本发明申请中进一步说明。
环境重建:环境信息是机器输入信息的一部分。这部分信息会激活机器在记忆中相似的环境信息。这些记忆中的环境信息有可能是记忆中关于同一环境的其他部分或者其他角度的记忆,或者是类似环境的记忆。这些被激活的信息可能包括视觉信息、听觉信息、触觉信息、动机以及动机状态信息等记忆。
机器需要采用(1)把环境中相似的部分重叠,用于构建3维环境。(2)使用关于类似环境中的共有结构特征,作为模型,来预测缺失的信息。通过(1)、(2)步骤,机器组合而成的环境,就是融合了目前输入的环境信息和记忆相关的环境信息构建的目前所处环境。这是机器对环境的重建。机器对环境的重建,本质上是机器对环境的预测。在建立好环境空间后,这样我们就能根据被借鉴的记忆空间的其他部分,来了解现实空间中目前看不到的部分。比如我们看熟悉的柜子时,仿佛能看到柜子里面的图像。这其实是因为我们叠加了柜子里面的记忆图像。同时,重建这些环境的过程中,可能通过联想激活触发机器相应的听觉回忆、触觉回忆、动机以及动机状态(比如各种情绪和情绪状态的回忆)等,所以机器重建的环境是带有主观感情的环境。
机器对环境的预测范围大小,和机器的预期目标相关。通常预期目标大的情况下,环境的预测范围也大。预期目标分辨率高的情况下,环境的预测分辨率也高。而机器的预期目标是机器在之前的预测、决策和执行过程中产生的目标。
通过局部来重建整体的算法很多,比如GAN神经网络,比如游戏中常见采用的3维重建技术。机器可以利用这些现有的技术,来把输入的多分辨率环境信息和记忆中被激活的多分辨率特征信息结合做3维重建。行业内技术人员基于现有行业内公知知识就能实现这些目的,这些技术也不在本发明申请的权利要求范围中,所以这里不需要进一步说明。
自我重建:和环境重建的方法类似,机器也采用同样的方法来重建自我形象。在重建自我形象中,机器除了使用目前的输入信息外,还需要融合目前输入信息激活的记忆中相关信息:比如视觉和听觉,触觉、嗅觉、味觉和感觉,重力感觉、肢体状态感觉、动机和动机状态(比如情绪和情绪状态、需求和需求状态等)。机器需要采用(1)把这些信息中相似的部分重叠,用于构建3维形象。(2)使用关于自身记忆中的共有结构特征,比如人的模型,或者自己的典型形象,作为模型来组织这些信息,补充缺失的信息。通过(1)、(2)步骤,机器组合而成的信息整体,就是机器建立的关于自身的形象和感觉。比如,当我们的双手放在背后做动作,我们仿佛能看到这些动作。这就是因为我们在发出神经指令和获得触觉感知后,激活了记忆中相似的神经指令连接的视觉,也激活了相似的本体姿态感知信息连接的视觉和触觉,也激活了相似的触觉感知信息连接的视觉,这些信息通过把相似部分重叠后,整合为我们的整体形象。我们在头脑中创建了一个自我镜像,我们仿佛能看到自我镜像的动作。
机器首先是通过预置的软硬件系统，有了具体的“自我”概念：能够在自己命令下驱动的肢体或者发音器官，能够给自己传来各种信息的感觉器官，和自己各种感觉总是在一起的视觉和听觉信息。这些信息由于总是同时出现的，它们之间的关系在关系网络中非常紧密，从而形成了“自我”这个概念。所以“自我”这个概念和其他概念形成的方式是一样的，并不神秘。比如，如果一个人，只要在某个桌子被敲打时就能感觉到疼痛，那么他一定会把这张桌子认为是自己“自我”的一部分。
有了狭义的“自我”概念后,机器通过在学习中,逐步获得各种“利”和“害”与包含在“自我”之间的关系。这些“利”和“害”又是与机器的底层动机(比如需求和情绪)紧密连接的。所以在“趋利避害”的动机驱动下,机器的行为模式就可能是“占有”那些给自己带来“利”的事物,而“躲避”那些给自己带来“害”的事物。所以,有了“自我”这个概念后,才会有“占有”和“避免”这些概念。因为“占有”和“避免”这些概念是在收益最大化,损失最小化的原则的驱动下延伸出来的。有了“占有”和“避免”这些概念,机器就能理解我们这个社会的组织、法律、行为和道德。因为我们这个社会的组织形式的核心内容就是“占有”和“避免”的各种表述。
所以机器的自我意识本质上是机器的一种行为方式。机器并不需要神秘的自我意识加注于自身，它是机器通过关系网络和联想激活学习到各种信息和自身“利”和“害”关系的认知后，按照“趋利避害”的方式来决定自己和外界互动的一种行为方式。
本发明专利申请的重点是创新性地揭示实现强人工智能的途径,重点在于提出的实现途径。而机器的自我重建所需要的具体软件和算法,在行业内技术人员基于现有行业内公知知识就能实现本发明专利申请提出的实现自我意识建立的这个子目标,所以这里不需要进一步说明。
语言重建:语言在机器智能中扮演了重要的角色。语言是人类为了更好的交流经验而建立的一套符号。每个符号都代表一些具体的事物、过程和场景。当机器接收到外界的语言信息时,通过S102模块对输入的语言(比如语音或者文字)做多分辨率特征提取。比如在低分辨率上获得语音从整体上的特征,比如升降调、重音、语气、语速、语音大小以及变化、语音语调频率以及变化等特征;在中分辨率下获得具体的词汇;在高分辨率下获得具体音节发音。
通过S103模块,这些特征会通过联想激活同时激活记忆中相关的记忆。由于语言使 用非常频繁,所以语言通常和它们所代表的事物、过程和场景的典型图像、动态图像、感觉、声音等其他传感器紧密联系在一起的。而这些紧密联系在一起的局部网络是关系网络的一部分,它们就是概念。
当语言输入时，语言所代表的相关记忆被激活。这些记忆既可能有语言本身的信息，也会有关于语言使用的记忆被激活（比如强调重点的语音强调方式或者文字强调方式，比如表示不信任的语气或者嘲弄的语调等）。这些被激活的信息构成了一个激活信息流。为了平衡语言的前后关联和目前语义识别，被激活的信息的激活值会随时间而衰退。衰退的参数和机器的动机以及状态（比如需求和需求状态，情绪和情绪状态）相关。而链式激活实现了所有输入信息的上下文关联识别。这里的输入信息既包含环境信息，也包含被激活的记忆信息。这些信息的相互激活赋值，就体现了上下文关联。这种关联比统计生成的语义库内容更加广泛。它不仅仅涉及到语言，更涉及到所有的感官输入和相关记忆。所以通过S102构成的多分辨率信息、S103构成的常识网络，机器可以实现语言到静态和动态图像、感觉、需求和情绪之间的连接，也实现语言到相关语言和记忆的连接。当这种连接被纳入机器对语言输入的理解中，并根据对语言的理解，根据相关经验做出响应，就体现出机器真正理解了输入语言的真实含义。
语言输入构成了一个输入信息流,而对应的激活记忆也构成了一个激活信息流。机器在理解语言时,需要重建这个激活信息流。重建就是对其中的环境信息进行环境重建,对其中涉及到自我的信息进行自我重建。机器以第三者的角度来观看重建后的激活信息流。重建后的信息流作为一种新的信息输入,机器可以把这种新的信息流,重新存入记忆,作为记忆的一部分。所以机器通过语言学习到的是一种虚拟体验。是一种虚拟的从第三者角度观看到的场景。其中重建的自我形象是自我意识的代表,重建的环境是虚拟的环境信息。机器可以从这些虚拟体验中学习到认知,学习到各种信息和“利”和“害”之间的连接关系。这种虚拟体验也会激活记忆中相关信息,它们也会被存在于记忆中,和其他记忆一起,作为关系网 络的一部分,也同样受到记忆和遗忘机制的优化。
语言重建是机器学习和理解语言的基础,也是机器学习人类历史上积累的所有经验的基础。在有了语言的学习能力后,机器就可以学习所有人类的知识,这是通往超级智能的基础能力。
语言是传递他人经验的一种方式,机器之间既可以通过语言来传递经验,还可以通过关系网络来直接共享认知,所以机器的学习过程可以是非常迅速的,人类必须通过预设的规则来避免机器对各种认知组合后产生的不利于人类的行为。由于本发明申请涉及到的机器智能技术对人类是可以看见的,机器的激活过程,机器的预测结果,机器的决策算法和决策结果,都是人类可以直接读取到,并可以理解的。这一点和目前的深度学习是不一样的。目前的深度学习,由于参与决策的参数数量过于庞大,意义并不明确,所以决策过程难以解释。而本发明申请中,机器从感知到认知是通过共有局部特征,通过记忆和遗忘机制来建立的,参与决策的参数数量上远小于目前多层神经网络,决策的每一步都是可以理解的,所以本发明申请提出的人工智能是一种可解释、可控制的人工智能。
机器的预测、决策和响应过程:机器的预测、决策和响应过程是基于关系网络,通过联想激活限定搜索范围,通过过去的经验(包括事件之间的因果概率,包括事件带来的“利”和“害”),采用统计的方法来计算“利”和“害”的大小和发生概率,然后在预定的算法下,通过趋利避害的原则,来增加或者降低某一事件发生的概率。这就是机器的预测、决策和响应的整个过程。
当信息输入后,机器不需要去穷尽预测所有可能的结果,这也是无法完成的任务。机器只需要评估那些被激活的、和输入信息相关的经验中可能发生的结果。这相当于利用了常识来限定了搜索范围。
在过去的经验里,每一种可能的结果,都通过关系网络联系着需求状态和情绪状态等动机状态,这些状态代表这种结果给机器带来的“利”和“害”。所以把可能发生的概率和相 关的需求和情绪状态结合,机器就可以推测出事物发展的每一种结果可能给自己带来的“利”和“害”的大小、类型以及它们可能发生的概率。所以机器在预测可能发生的结果给自己带来的“利”和“害”时,需要同步把可能发生的概率考虑进去。所以机器既要评估“利”和“害”,还要评估其对应的概率,联合两者做出决策。
比如在有限的范围内,机器可以采用目前任何人工智能预测方法,比如蒙特卡洛搜索、决策树、贝叶斯估计、基于规则等机器推理的方法,来推测目前事件发展到下一步最可能结果的概率,以及这个结果给自己带来的“利”和“害”。
一种可能的评估方式是：使用各种类型的需求和情绪状态建立一个多维空间，在这个空间内，给机器预置最佳追求空间、可以接受区域和不可接受区域等各种空间。这相当于给机器建立各种趋利避害的计算规则。但由于需求和情绪类型多种多样，这样的规则难以明确表述，所以可以采用计算空间距离的方式。机器采用的策略可以是在空间距离上尽可能靠近最佳追求空间，远离不可接受空间。从而把这些规则采用求空间距离的方式来量化。
另外一种比较简单的方法就是计算 Y = ∑_i f_i(x1,x2,…)·p1_i(x1,x2,…) − ∑_j G_j(y1,y2,…)·p2_j(y1,y2,…)。其中Y是最后评估值，f_i(x1,x2,…)是各种收益值（由需求、情绪类型和状态按照预设规则确定），p1_i(x1,x2,…)是对应收益值的可能发生概率。G_j(y1,y2,…)是各种损失值，p2_j(y1,y2,…)是对应损失值可能发生的概率。∑是求和符号，分别对i和j求和。机器的目的就是最大化Y。这种方法就是先把需求、情绪类型和状态量化成收益值和损失值，然后简单地求概率加权和。
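上式可以直接写成代码（其中的收益值、损失值与概率均为虚构的演示数值）：

```python
def evaluate(gains, gain_probs, losses, loss_probs):
    """计算 Y = Σ_i f_i·p1_i − Σ_j G_j·p2_j，机器的目标是最大化 Y。"""
    benefit = sum(f * p for f, p in zip(gains, gain_probs))
    cost = sum(g * p for g, p in zip(losses, loss_probs))
    return benefit - cost

# 两项可能收益、一项可能损失（虚构数值）
Y = evaluate(gains=[10.0, 4.0], gain_probs=[0.6, 0.5],
             losses=[20.0], loss_probs=[0.1])
# Y = 10*0.6 + 4*0.5 - 20*0.1 = 6.0
```

机器可以对每个候选响应分别计算Y，选择Y最大的响应，这就是把“趋利避害”量化为概率加权的收益与损失之差。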
上述对单步可能性的评估是一种技术性评估。这种评估可以采用现有的机器智能方法,采用已有的统计算法就可以实现,比如求概率统计平均,或者最小空间距离等方法。
机器在有了对单步可能性的评估的基础上，还需要预测和规划后续自己的响应和环境的响应带来的“利”和“害”。这是一个多步的“利”和“害”预测和评估过程。这是一个策略规划，相当于寻找最优路径。选择路径的原则就是趋利避害，选择的方法可以包括决策树、基于规则的专家系统、贝叶斯网络、进化算法等已有的决策算法。这些算法的基础都建立在带有因果概率和“利”与“害”关系的关系网络之上。
机器在评估多步“利”和“害”时,一种简单的评估方法可以是:采用把每一步规划的响应作为一个虚拟的过程再次作为输入信息,再次走所有的信息提取,联想激活,环境、自我和语言重建过程,然后再次采用单步评估一样的方法来评估可能带来的“利”和“害”。这是一个利用自己的经验,来评估自己的响应可能带来的影响。这些影响包括外界对自己响应的反馈(也包含了环境中他人对自己的反馈),以及这些反馈可能给自己“利”和“害”带来的影响。
采用这样的方法,机器就可以使用迭代方法,充分评估输入信息,并按照趋利避害的原则来规划自己的响应。这种规划有可能多次迭代,多次迭代本质是一个扩大搜索范围,寻找最优路径的过程。这就是机器的规划本质。
机器在规划时的最优路径搜索范围,是由可能带来的“利”和“害”决定的。通常,如果可能带来的“利”和“害”值很高,这时机器在预设的趋利避害算法原则下,会扩大搜索范围,比如降低激活阈值或者增加评估的迭代次数,使得更大范围的信息被激活,把更大范围的经验纳入路径预测过程,从中选择最优路径。
所以机器的决策过程并不神秘，在有了常识的帮助下，上述机器的决策过程实现上相对简单，可以采用预置程序和现有的机器推理方法来计算主要的可能结果的概率，以及带来的各种收益和损失的概率。机器用于评估“利”和“害”的方法只涉及到概率和统计方法，这些是成熟的公知知识。但机器评估“利”和“害”的方法并不限于上述两种方法，机器可以利用任何已有的统计决策算法。
需要指出,机器的决策过程是建立在关系网络的基础之上的。而关系网络和机器受到的教育密切相关。甚至相同的学习材料,不同的学习次序,都会出现不同的关系网络,从而构成机器的不同认知和决策过程。机器的决策能力还受到机器的联想激活参数、统计算法参 数、底层动机和动机状态(包括需求和需求状态、情绪和情绪状态等)的影响。所以不同经历,不同设置的机器,做出的决策可能是各不相同的。
机器也可以根据自身经验的反馈,对自己做出决策的算法参数做调整。比如机器按照自己的决策执行一系列响应后,实际的结果和预期的结果差异较大,那么这些记忆也会进入记忆,成为关系网络的一部分。下次机器在做决策时,这些记忆就成为新的经验,这些经验就会影响机器的下一次类似状态下的决策行为。
机器还可以在预置算法下,学习总结经验。具体实施方式是:预置算法可以把机器获得较大的“利”和“害”过程挑选出来。让这些过程的记忆在机器的处理器中多次虚拟输入,这样每一次都会激活相关的记忆。那些存在于这类过程中的共有特征会一次次出现,并一次次激活相关记忆,增强它们的记忆值。所以这些增强了的记忆,更加容易被联想激活,更加容易被参考使用。而那些只是存在于单个特定过程中的相关记忆,则会由于长期难以激活而逐渐被忘记。这就是机器主动总结经验的过程。当然,没有这样的预置程序,机器自身也会同样的总结出这些经验,但需要的样本和时间就会变长。
由于机器的决策***会搜索多步的最优路径,所以机器可以出现接受小的损失而期望在后续获得大的收益的决策。
比如训练者告诫机器打人是会被“重罚”的。而机器通过对“重罚”这个词的解释,进一步激活了相关信息,就能理解打人是会带来巨大损失的。“打人”可能带来损失这个认知,还可以是机器通过直接的打人和反馈得到直接经验,还可以是给机器预置的相关经验,还可以是多种方式的混合经验。但当机器面对“自己的主人正在被攻击,非常危险”的情况下,那么机器就存在打人会带来“损失值”,而救主人生命会带来巨大“收益值”之间权衡,这时机器就需要进行多步搜索,扩大搜索范围,来确定这个带来巨大“收益值”的概率有多大,会不会带来其他损失。然后根据收益概率和损失概率来综合做出决策。
比如根据经验(或者预设经验)如果主人的生命受到威胁,那么救主人的“收益值” 会很高。但如果攻击者是“警察”,那么自己就不能插手,否则会以概率百分百地受到巨大“损失”等。这些知识可以是后天告知或者直接先天预置经验。
机器执行系统类似于机器决策系统，差异在于机器的执行系统是对机器决策系统产生的目标进行具体化；具体化的过程也是机器的一种决策，这种决策是把具体的目标通过逐层细分到机器的底层驱动命令层面，直到机器可以直接执行的指令为止。机器在逐层细分时，采用的也是机器的决策系统。
在有了以上各种基础的能力后,机器才能够根据自己的决策,来具体执行响应。比如语言输出、动作输出(包括表情和肢体语言输出)或者其他形式的输出(比如输出数据流、图像等)。执行响应步骤是一个把规划翻译成实际输出的过程。
如果在选择各种可能的响应步骤中,机器选用的是语音输出,这就比较简单,只需要把准备输出的图像特征图转变为语音,然后利用关系网络和记忆,采用概念替换的方法把动态特征图(包括表示关系的概念)和静态概念结合起来(存在于关系网络中的语法知识),组织成语言输出序列,并调用发音经验来实施就可以了。需要指出,机器可能根据经验(自己或者他人经验),选用一些表达整个句子的动态特征(比如使用语气、音频音调或者重音变化的不同运动模式,来表达疑问、嘲弄、不信任、强调重点等人类常用方式。这些方式通常是一句话或者整段语音的低分辨率特征)。因为机器是从人类生活中学习到这些表达方式的,所以人类任何表达方式,理论上机器都可以学习到。
如果机器选用的是动作输出,或者是语音和动作混合输出,那么问题就会变得复杂很多。这相当于组织起一场活动。机器的响应计划中,可能只有主要步骤和最终目标,其余都需要在实践中随机应变。
机器需要把序列的目标(包括中间目标和最终目标),按照这些目标涉及到不同的时间和空间对它们划分,便于协调自己的执行效率。这也是来自于经验,也可以是采用预置算法来执行。采用的方法是通过选择时间上紧密联系的目标和空间上紧密联系的目标作为分组。 因为动态特征图和静态特征图结合后构成的信息组合,其相关记忆的环境空间是带有时间和空间信息的,所以这一步可以采用归类方法。这一步相当于从总剧本改写到分剧本。还需要指出,这个总剧本也可能只是完成机器规划的阶段目标之一,比如提高某一个可能带来好的收益值的条件发生的概率。
机器需要把每个环节中的中间目标,再次结合现实环境,采用分段模仿的方法,来逐层展开。机器在顶层提出的响应计划,通常只是使用概括性很高的过程特征,和概括性很高的静态概念组成的(因为这些概括性很高的过程才能找到多个相似的记忆,所以借鉴它们建立的响应也是高度概括的)。比如“出差”这个总输出响应下面,“去机场”是一个中间环节目标。但这个目标依然很抽象,机器是无法执行模仿的。
所以机器需要按照时间和空间划分,把在目前时间和空间中,需要执行的环节作为目前的目标。而把其他时间和空间的目标作为继承目标,暂时放到一边。机器把中间环节作为目标后,机器还是需要进一步细分时间和空间(再次写下级分剧本)。这是一个时间和空间分辨率不断增加的过程。机器把一个目标转换成多个中间环节目标的过程,依然是使用决策能力,分析各种可能的结果和可能发生的概率,并按照“趋利避害”的原则来选择自己的响应的过程。上述过程是不断迭代,每一个目标划分成多个中间目的的过程是完全相似的处理流程。一直要分解到机器的底层经验为止。底层经验对语言来说就是调动肌肉发出音节。对动作而言,就是分解到对相关“肌肉”发出驱动命令。这是一个塔形分解结构。机器从顶层目标开始,把一个目标分解成多个中间环节目标。这个过程就是创建虚拟的中间过程目标,如果这些中间过程目标“符合要求”就保留。如果“不符合要求”就重新创建。是否“符合要求”是指机器分析收益和损失情况后,确认这个策略是否达到了预设的可以接受的标准。
上述过程逐层迭代展开,最终可以建立起机器丰富多彩的响应。
在这个过程中，机器随时可能碰到新信息，导致机器需要处理各种信息，而这些原来的目标就变成继承动机。这就相当于组织活动的过程中，不断碰到新情况，需要立即解决，否则活动就无法组织下去了。于是导演叫停其他活动，先来解决眼前碰到的问题。解决后，活动继续进行。另外一种情况就是在这个过程中，导演突然接到一个新任务，于是导演权衡利弊后，决定活动先暂停，优先处理新任务。
机器是一边执行可以进行的模仿任务,一边分解其他目标到更细致目标的。所以机器是边做边想的。这是因为现实情况千差万别,机器不可能事先都知道外界情况而做出计划。所以这是一个环境和机器互动来完成的一个目标的过程。
至此，机器利用上述各项能力就可以完成一次对输入信息的理解和响应。这个过程作为机器和外界互动的一个最小周期。机器不断重复使用这个过程，来完成更大的目标，表现为机器和外界的持续互动过程，表现出机器智能。
以上机器的预测、决策和响应过程，并不需要新的算法。它们是现有算法，在关系网络的基础上，通过合理地组织起来就可以实现机器的预测、决策和响应过程。本发明重点是在揭示如何通过把现有算法和数据组织起来，建立实现通用机器智能（强人工智能）的方法、流程和步骤。而现有的统计算法、人工智能算法的说明则不在本发明申请范围内，这里也不再赘述。

Claims (14)

  1. 一种建立信息之间连接关系的方法,其特征包括:
    认为在输入时间上相邻的信息彼此存在连接关系。
  2. 根据权利要求1所述的建立信息之间连接关系方法,其特征包括:
    这种连接关系是通过记忆和遗忘机制来优化的。
  3. 根据权利要求1所述的建立信息之间连接关系方法,其特征包括:
    信息本身是采用多分辨率特征来表示,包括代表其特征组合方式的特征来表示;这种连接关系是建立的信息的多分辨率特征的基础上的,信息之间的连接关系在不同的分辨率上是可以有差异的。
  4. 根据权利要求1所述的建立信息之间连接关系方法,其特征包括:
    输入信息包括外部输入信息,内部监控信息,机器的动机和动机激活状态信息。
  5. 根据权利要求1所述的建立信息之间连接关系方法,其特征包括:
    机器采用一种信息存储方法,其特征在于机器可以通过信息存储的方式来表达输入时间上相邻信息之间存在连接关系。
  6. 一种机器预测、决策和执行方法,其特征包括:
    机器在外部或者内部信息输入后,机器在关系网络中,采用联想激活的方法激活相关的信息;机器通过统计每种被激活的动机发生的概率和对应的记忆值,来预测类似事件可能给自己的动机带来的影响;机器根据统计得到的事件可能给自己动机带来的影响,按照趋利避害的方式来选择和推动事件发展的路径,这些路径上的序列目标就是机器的行为决策。
  7. 根据权利要求6所述的方法,其特征包括:
    机器把计划输出的响应,作为自己的虚拟信息输入,继续采用联想激活、信息重建和预测、决策和执行的方式,来评估自己的响应可能得到的反馈。
  8. 根据权利要求6所述的方法,其特征包括:
    机器把预期得到的外界反馈，作为自己的虚拟信息输入，继续采用联想激活、信息重建和预测、决策和执行的方式，来评估外界对自己响应的反馈可能给自己带来的“利”和“害”。
  9. 根据权利要求6所述的方法,其特征包括:
    机器可能通过降低激活阈值或者增加预测和评估的迭代次数来扩大搜索范围,从而从更大的搜索范围中寻找最优的响应路径。
  10. 根据权利要求6所述的方法,其特征包括:
    机器在推动事件发展的路径的方法是一个动态的过程,在执行过程中,机器在新信息下不断按照同样的预测方法,把新信息纳入预测、决策和执行过程,并更新行为决策。
  11. 根据权利要求6所述的方法,其特征包括:
    机器在执行序列目标的过程中,按照同样的机器预测、决策和执行算法,把序列目标中的单个目标分解成更多的底层目标,通过在执行中逐层分解,一直分解到机器可以直接执行的底层驱动命令为止;机器在系列的底层驱动命令下的行为构成了机器对输入信息的响应。
  12. 根据权利要求6所述的方法,其特征包括:
    机器在做出预测时,不仅仅预测自己的行为可能给自己带来的动机状态影响,还会预测非我可能的行为以及这些行为可能给自己带来的动机状态影响;机器在预测非我可能的行为时,依据的假设是(1)过去关于非我在类似状态下的行为记忆和(2)假设非我也是按照趋利避害的方式做出决策。
  13. 一种可以自主学习、决策和执行的装置,其特征在于,包括:
    传感器系统，包括模拟人类感官的通用传感器组，和用于机器特定用途的传感器组，以及用于机器监控自身运行状态的传感器组；
    预置动机系统，包括采用符号来代表一类动机，并使用某种方式表示动机的状态，机器在做决策时的出发点是驱动某些动机处于某些状态；
    关系网络系统，包括对输入信息的多分辨率简化，并认为相邻输入信息之间存在关系，并在记忆中采用特定方式来表达这种关系；多分辨率信息之间的关系采用记忆和遗忘机制来优化；关系网络包含信息本身，还包含这些信息所激发的动机和动机状态；
    联想激活系统，包括对输入信息的单次或者多次赋予初始值的系统，这些系统采用预置程序的方式来实现，并根据学习结果，按照趋利避害的方式来调整这些预置程序中的参数；也包括按照“临近激活”原则、“相似激活”原则和“强记忆激活”原则的联想激活方式；联想激活系统在存储输入信息时，给输入信息赋予的记忆值既是单次或者多次被赋予的初始激活值的正相关统计函数，也是记忆中由该信息激活的动机状态的激活值的正相关函数；
    机器决策系统，包括使用输入信息，通过联想激活方式激活相关记忆；机器通过联想激活方式限定了自己对记忆的搜索范围和对自身决策的评估范围，并根据这些动机和动机状态可能发生的概率，来统计可能给自己带来的“利”和“害”的大小和概率；包括区分自我和非我，并基于自己的经验和趋利避害的原则来预测自我和非我可能的响应，并进一步迭代评估这些响应可能给自己带来的“利”和“害”的大小和概率；机器通过这样的迭代评估来确定事件可能的发展方向和事件发展方向对自身的影响；机器按照趋利避害的方式，采取响应来提高使得事件发展方向趋向自身有利的方向发展的概率，来降低使得事件发展方向趋向对自身有害的方向发展的概率；机器在执行响应中，需要随时更新机器内外信息，并采用一样的决策方法来修正自己的响应，目标依然是提高使得事件发展方向趋向对自身有利的方向发展的概率，降低使得事件发展方向趋向对自身有害的方向发展的概率；
    机器执行***类似于机器决策***,差异在于机器的执行***是对机器决策***产生的目标进行具体化;具体化的过程也是机器的一种决策,这种决策是把具体的目标通过逐层细分到机器的底层驱动命令层面,直到机器可以直接执行的指令为止;机器在逐层细分时,采用的也是机器的决策***;
    机器的预置响应***,包括以下全部或者部分功能***:机器的情绪和动机状态之间的关联***、情绪和情绪表达之间的关联***,机器预置的发音和动作***、预置的条件反射***等;这些***主要采用预置程序和可以通过学习获得的经验来调整的参数构成。
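The memory-value rule in the associative activation system of claim 13 (growth positively correlated with both the assigned initial activation and the activated motivation states, pruned by a forgetting mechanism) can be sketched as follows. The additive update and the exponential decay rate are illustrative assumptions, not the application's formula.

```python
import math

class MemoryTrace:
    """Toy memory value combining repeated activations with gradual forgetting."""

    def __init__(self):
        self.value = 0.0

    def activate(self, initial_activation, motivation_activation):
        # The memory value grows with both the initial activation value
        # assigned to the input and the activation of the motivation
        # states it triggers (both positively correlated).
        self.value += initial_activation + motivation_activation

    def forget(self, elapsed, rate=0.1):
        # Forgetting mechanism: unreinforced memories decay over time.
        self.value *= math.exp(-rate * elapsed)

trace = MemoryTrace()
trace.activate(initial_activation=1.0, motivation_activation=0.5)
trace.forget(elapsed=5)                                            # decays while unused
trace.activate(initial_activation=1.0, motivation_activation=0.5)  # reinforced again
print(round(trace.value, 2))  # 2.41
```

Frequently reactivated, motivation-relevant information thus keeps a high memory value, while information that is neither repeated nor motivationally significant fades, which is how the relation network is optimized.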
  14. A method of establishing general artificial intelligence, characterized by comprising:
    the machine maps input data to local shared features; the machine maps local shared features to concepts; the machine establishes, through the relation network, relations between its motivational needs and external information, and uses these relations for prediction, decision-making and response.
PCT/CN2020/000154 2020-07-20 2020-07-20 A method for establishing strong artificial intelligence WO2022016299A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/000154 WO2022016299A1 (zh) 2020-07-20 2020-07-20 A method for establishing strong artificial intelligence

Publications (1)

Publication Number Publication Date
WO2022016299A1 (zh) 2022-01-27

Family

ID=79729524

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/000154 WO2022016299A1 (zh) 2020-07-20 2020-07-20 A method for establishing strong artificial intelligence

Country Status (1)

Country Link
WO (1) WO2022016299A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107924384A (zh) * 2015-03-11 2018-04-17 阿雅斯迪公司 Systems and methods for predicting outcomes using a prediction learning model
US20190206026A1 (en) * 2018-01-02 2019-07-04 Google Llc Frame-Recurrent Video Super-Resolution
US20190384303A1 (en) * 2018-06-19 2019-12-19 Nvidia Corporation Behavior-guided path planning in autonomous machine applications
CN110599789A (zh) * 2019-09-17 2019-12-20 北京心中有数科技有限公司 Road weather prediction method and apparatus, electronic device, and storage medium
CN110632931A (zh) * 2019-10-09 2019-12-31 哈尔滨工程大学 Deep-reinforcement-learning-based collision avoidance planning method for mobile robots in dynamic environments

Similar Documents

Publication Publication Date Title
CN110851760B (zh) Human-computer interaction system integrating visual question answering in a Web3D environment
US20180314942A1 (en) Scalable framework for autonomous artificial intelligence characters
WO2021226731A1 (zh) A method of achieving general machine intelligence by imitating human memory
CN111553467B (zh) A method of implementing general artificial intelligence
Gorniak et al. Situated language understanding as filtering perceived affordances
CN112115246A (zh) Dialogue-based content recommendation method and apparatus, computer device, and storage medium
WO2021223042A1 (zh) A method of implementing machine intelligence similar to human intelligence
Cuayáhuitl A data-efficient deep learning approach for deployable multimodal social robots
US11715291B2 (en) Establishment of general-purpose artificial intelligence system
CN111046157B (zh) Balanced-distribution-based method and system for generating general English human-machine dialogue
CN112215346B (zh) A method of implementing a human-like general artificial intelligence machine
WO2018195307A1 (en) Scalable framework for autonomous artificial intelligence characters
WO2022016299A1 (zh) A method for establishing strong artificial intelligence
CN114492465B (zh) Dialogue generation model training method and apparatus, dialogue generation method, and electronic device
CN113962353A (zh) A method for establishing strong artificial intelligence
Rajesh et al. Development of Powered Chatbots for Natural Language Interaction in Metaverse using Deep Learning with Optimization Techniques
CN115204186A (zh) Systems and methods with neural representations of event-centric commonsense knowledge for response selection
Tawiah Machine Learning and Cognitive Robotics: Opportunities and Challenges
WO2022109759A1 (zh) A method of implementing human-like general artificial intelligence
CN112016664A (zh) A method of implementing a human-like general artificial intelligence machine
Taylor et al. Toward a hybrid cultural cognitive architecture
Yue A world-self model towards understanding intelligence
Sabry Symbolic Artificial Intelligence: Fundamentals and Applications
CN114127748A (zh) Memory in embodied agents
Egges et al. Imparting individuality to virtual humans

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20946421

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20946421

Country of ref document: EP

Kind code of ref document: A1