WO2020130687A1 - System and method for automated execution of user-specified commands - Google Patents

System and method for automated execution of user-specified commands

Info

Publication number
WO2020130687A1
Authority
WO
WIPO (PCT)
Prior art keywords
environment
agent
command
map
predictor
Prior art date
Application number
PCT/KR2019/018132
Other languages
French (fr)
Inventor
Artem Mikhailovich GRACHEV
Alexey Yurievich NEVIDOMSKIY
Alexander Vadimovich PODOLSKIY
Dmitry Alexandrovich LIPIN
Ilya Sergeevich POLENOV
Irina Igorevna PIONTKOVSKAIA
Maxim Stanislavovich KODRYAN
Original Assignee
Samsung Electronics Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Samsung Electronics Co., Ltd.
Publication of WO2020130687A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/30 Monitoring
    • G06F 11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F 11/3013 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is an embedded system, i.e. a combination of hardware and software dedicated to perform a certain function in mobile devices, printers, automotive or aircraft systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N 3/008 Artificial life, i.e. computing arrangements simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology

Definitions

  • the disclosure relates to automated execution by an electronic device of commands specified by user in natural language.
  • A typical scenario of user interaction with an electronic device application is that the user issues informal commands by voice, gestures, or facial expressions.
  • the application recognizes the user commands and executes them.
  • the application is pretrained on servers, and during operation almost no online algorithms are used to learn directly from feedback, i.e. to learn online when, upon trying to execute a user command, the application receives feedback from the user. To date, there are no effective algorithms for online learning of electronic device applications.
  • US 20130254139 A1 discloses systems and methods for building a universal intelligent assistant with learning capabilities.
  • the disclosed systems and methods relate to building an intelligent assistant that can take in human requests/commands in simple text form and in natural language format and perform tasks specified by the user.
  • the user should teach the assistant through the provided user interface and/or by referring to some knowledge that the assistant already knows.
  • the taught assistant may generate more generic knowledge based on what it learns, and can apply the more generic knowledge to serve requests that it has never seen and never directly learned, and can revise and improve the knowledge according to execution result/feedback.
  • the document does not describe methods for training the assistant during its operation, i.e. online.
  • US 20140095931 A1 discloses a method and system for automating the process of testing electronic devices.
  • the invention provides a device test automation framework for automating the process of testing embedded systems.
  • the device test automation framework (DTAF) allows the user to test embedded device software using test scripts, which can capture various interfaces of the device under test.
  • a Graphical User Interface (GUI tool) is created based on the device under test configuration and user input. The GUI tool shows various interfaces of the device under test.
  • Device test automation framework hardware enables communication between the test tool and the device under test.
  • DTAF dramatically improves the productivity, effectiveness, efficiency and coverage of embedded software testing. The drawback is that it does not use machine learning at all, i.e. it is not tolerant to changes in the user interface.
  • the first, straightforward approach is based on the user reading the manual on his own and testing his smartphone following the instructions.
  • although the human user is capable of self-learning and of making decisions in unusual situations, a lot of time is spent reading and reproducing the instructions; there is also the factor of human error, which takes additional time to notice and correct. Furthermore, a human cannot perceive and reproduce multiple instructions at the same time.
  • the second approach involves the use of automated tools that take instructions in a formal language format, i.e. commands come in the form of instructions from a predetermined set (e.g. a set of mobile phone commands, a set of robot motion commands); these instructions are straightforward, i.e. they do not allow multiple interpretations, and do not require direct user participation (the human only uses the result).
  • Automated tools can take and reproduce multiple instructions at the same time.
  • the use of automated tools is free of the human error factor.
  • conventional automated tools are unable to cope with unusual situations for which they were not previously trained.
  • the system automatically collects information online about structure of the environment in which commands are to be executed, e.g. a robot butler itself collects information about a room before doing work, while e.g. a smartphone testing application itself collects information about the smartphone before work.
  • such devices can execute commands that are specified by a human.
  • the disclosure system can generate e.g. a map of mobile phone application and use the map to execute human commands upon receiving them.
  • a system for automated execution of at least one command by an electronic device comprising: an agent configured to collect information about the environment where at least one command is to be executed and to build a map of said environment based on the information collected; a predictor configured to predict information that the agent is collecting, wherein the agent and the predictor are able to learn based on the information received by the agent upon the prediction; a database configured to receive the map of said environment from the agent, store the map of said environment, update the map of said environment upon receiving a new map of said environment from the agent; a command processing unit configured to receive and process at least one command in accordance with the map of said environment, and generate a scenario script for execution of the at least one command; wherein the agent is further configured to perform the scenario script for execution of the at least one command, received from the command processing unit.
  • the agent is configured to calculate prediction error of the predictor. Furthermore, the agent learns to find states of said environment with maximum predictor error on the basis of the calculated prediction error. Furthermore, the predictor learns to minimize the prediction error on the basis of the calculated prediction error. Furthermore at least one command is a user command. Furthermore, user command comprises visual information, or sound information, or text information. Furthermore, the database is a neural network.
  • the agent updating the map of said environment by entering updating information into the map of said environment and sending the map of the said environment to the database;
  • the agent performs the scenario script for execution of the at least one command
  • steps (a) to (g) are repeated.
  • At least one command is a user command.
  • user command comprises visual information, or sound information, or text information.
  • an electronic device comprising a system for automated execution of at least one command by an electronic device, comprising: an agent configured to collect information about the environment where at least one command is to be executed and to build a map of said environment based on the information collected; a predictor configured to predict information that the agent is collecting, wherein the agent and the predictor are able to learn based on the information received by the agent upon the prediction; a database configured to receive the map of said environment from the agent, store the map of said environment, update the map of said environment upon receiving a new map of said environment from the agent; a command processing unit configured to receive and process at least one command in accordance with the map of said environment, and generate a scenario script for execution of the at least one command; wherein the agent is further configured to perform the scenario script for execution of the at least one command, received from the command processing unit.
  • This disclosure provides that the system automatically collects information online about the structure of the environment in which commands are to be executed. Using the collected information, such devices can execute commands that are specified by a human.
  • Fig. 1 is a schematic diagram of two work modes of a system for automated execution by an electronic device of commands specified in natural language.
  • Fig. 2 is a schematic diagram of a first work mode of a system for automated execution by an electronic device of commands specified in natural language.
  • Fig. 3 is a schematic diagram of a second work mode of a system for automated execution by an electronic device of commands specified in natural language.
  • Fig. 4 illustrates how the target environment map changes, using the example of applying the disclosure system to testing a mobile phone.
  • Fig. 5 is a schematic diagram of a second work mode of the system in testing a mobile device.
  • Fig. 6 is a schematic diagram of a second work mode of the system in operation of a robot cleaner.
  • Fig. 7 is a schematic diagram of a second work mode of the system when used as an electronic assistant.
  • the disclosure enables using natural human language, as well as the language of gestures and facial expressions, to specify commands.
  • a user can simply specify commands in ordinary language and get the result in the form of command execution.
  • the disclosure enables training the electronic device that executes commands at the very instant of receiving the command; the electronic device can also handle unusual situations that were not covered during the pretraining period, owing to updating of the target environment map and the use of methods based on reinforcement learning (trial-and-error methods).
  • natural language is meant human speech, human gestures and human facial expressions.
  • the disclosure system receives data from a command source, which may be visual, sound or textual, and from a space (target environment) in which the system should execute commands.
  • target environment will refer to the environment in which user commands are executed. It can be e.g. a smartphone application that needs to be tested or with which other actions are to be performed, or a room in which a robot must clean, etc.
  • the disclosure system contained in an electronic device instructs the electronic device to explore the target environment for which commands will be issued, and builds a map (structure) of features of the target environment. This is attained through preliminary machine learning of the units of the system disposed in the electronic device, or with the assistance of the user, who can enter data through a user interface for generating a map or structure of features of the target environment.
  • units of the system are pretrained through machine learning to automatically explore target environment and generate a map or structure of the target environment.
  • the electronic device containing the disclosure system is able to explore the target environment, detect its features and generate a target environment map using this system.
  • the disclosure system can solve the task of exploring the target environment and generating a map of the target environment through interaction with a user interface, i.e. the target environment map can be modified at any instant as needed by the user.
  • the disclosure system instructs the electronic device to execute commands issued by the user in natural language, using the target environment map.
  • the electronic device will interact with the target environment according to the target environment map, which can be modified by the system at any time in response to user interests, e.g. through a user interface, or automatically by periodically revising the target environment, e.g. at specific intervals that can be set by the user.
  • the system can find semantic relations between map elements and is able to follow imprecise human commands in the form of natural language. That is, commands that require a large number of identical actions can be automated.
  • the disclosure approach does not require special software designed for each individual task to be performed by an electronic device.
  • Fig. 1 is a schematic diagram of two work modes of a system for automated execution by an electronic device of commands specified by the user in natural language.
  • the disclosure system can operate in two key modes.
  • in a first mode, the system generates and updates a map of the target environment; in a second mode, the system recognizes and executes user commands in natural language in accordance with the target environment map generated in the first mode, or by updating the target environment map just upon receiving the user command.
  • the system is exploring the target environment in the most effective way and builds a map of the target environment;
  • the system is recognizing and executing user command based on the target environment map.
  • the system uses user commands and the target environment map as input information.
  • the system can also update the target environment map from user feedback.
  • the disclosure system can also execute user commands from scratch without previously generated target environment map.
  • the system can build the target environment map online, i.e. simultaneously with receiving commands from the user.
  • Fig. 2 is a schematic diagram of a first work mode of the system for automated execution by an electronic device of commands specified in natural language.
  • the first work mode comprises machine learning to automatically construct a target environment map without user supervision.
  • the first work mode involves operation of a first unit containing an agent, a second unit containing a predictor, and a third unit containing a database.
  • Database may be a neural network, or an entity where data can be stored and updated. Database can store data related to the target environment.
  • the system for automated execution of user commands by an electronic device receives, through the first unit (the agent), information from the target environment, which may be visual information, text information or sound information, depending on the type of environment in which user commands are to be executed.
  • Agent is a unit that can be a machine learning algorithm, e.g. a neural network, used to decide which action is to be executed in current state of the target environment.
  • State of the target environment is a visible (observable) predetermined part or the entire target environment at given instant.
  • state of target environment is e.g. screen of the mobile phone and all information on its settings.
  • state of target environment is e.g. picture of the house at current time, respectively, and so on.
  • Predictor is a second unit that can be a machine learning algorithm, e.g. a neural network, used to predict the next state of target environment provided that current state and a planned action are known. Prediction error of the next state by the predictor is used by the agent to decide how effective the chosen action is. Large error means that this state is less explored than the state in which a small error is obtained.
  • the agent sees only one state of the target environment.
  • the agent receives states of the target environment from its sensors and, based on them, generates the following action in order to pass to the next state of the target environment.
  • the agent's task is to explore the map of the target environment. Accordingly, the agent learns to perform actions leading to the least known states (parts) of the target environment. Level of knowledge is determined just by the predictor.
  • the predictor predicts the result of this action, i.e. to what state of the target environment the agent will pass, or how the state that will be observed (will be visible) at the next instant will look.
  • the agent sees the real state of the target environment. Accordingly, the agent can calculate the error between the reality and the prediction; thereby both the agent and the predictor are trained.
  • the predictor predicts the next state of the target environment always upon receiving, from the agent, information about the intention to perform an action. At instants when the agent observes a state dramatically different from the one predicted by the predictor, the agent updates the target environment map. Thus, the agent and the predictor always learn, and they learn as a pair. The predictor learns to predict the next state of the target environment. The agent learns to go to the least predictable states of the target environment. This happens all the time - both at the stage of building the map and at the stage of interaction with the user. If the map has already been built, it will be updated with less intensity, since the predictions will more or less match reality.
  • Agent explores the state of target environment, chooses an action (e.g. where to send the robot vacuum cleaner for further exploration of the target environment). Before performing the action, the agent sends the state of target environment and the chosen action to the predictor.
  • Based on the current state of the target environment and the action chosen by the agent, the predictor predicts the next state of the target environment (i.e. what the agent will see after performing the action), and then sends the prediction to the agent.
  • the agent performs the action chosen in (A) (e.g. causes the robot vacuum cleaner device to turn, etc.), and acquires real state of the target environment.
  • (D) The agent compares the received real state with the state received from the predictor, and then calculates the predictor error.
  • the agent generally receives information about the target environment from sensors that are defined directly by the target environment.
  • the predictor takes the current state of the target environment and the action that the agent is going to perform as input.
  • Each environment has its own set of actions.
  • the set of actions may comprise: press a button, swipe a screen, and so on.
  • the set of actions comprises move left, right, straight, back, in the current state.
  • the agent receives the real state and sends it to the predictor, and the predictor compares the real state with the already predicted one. Thereby, the agent determines the prediction error of the predictor.
  • When taking information from the target environment, the agent is in a fixed state. Having noticed that changes have occurred in the target environment, the agent switches to exploring the part of the target environment where the changes have occurred (i.e. that state of the target environment) in order to register the changes.
  • the change that occurred in the target environment is determined by processing the input signal corresponding to the environment by the agent.
  • the predictor predicts what will be detected by the agent, i.e. what information will come from the agent at the next instant. Then the agent receives real information. Predicted information and real information from the agent enter the database. The predictor does not interact directly with the database. The agent writes the observed states to the database in any form, which may depend on the implementation of the database or another structure used to store the map.
  • An incorrect prediction by the predictor and the correct information from the agent are used to train the agent and the predictor.
  • Based on the observed states of the target environment, the agent generates an updated map of the target environment.
  • the target environment map can be updated both in the mode of collecting information about the environment and in the mode of executing a user command.
  • Fig. 3 is a schematic diagram of a second work mode of the system for automated execution by an electronic device of commands specified in natural language.
  • the system recognizes and executes the user command(s) issued by the user in natural language.
  • the command processing unit generates a language model suitable for recognizing the command. Then, the command processing unit generates a scenario script comprising a set of commands to be executed directly on the electronic device used by the system for the respective target environment. The scenario script is generated on the basis of the user commands and the target environment map in the command processing unit.
  • the scenario script is transferred from the command processing unit to the agent that performs the scenario script.
  • the scenario is performed using the most advanced and effective methods for scenario execution. For each state of the target environment the most appropriate commands are determined, which will match the instructions received. Primarily, the system tries to execute these particular instructions, and if the system cannot execute the instructions, it searches for the semantically nearest ones based on the generated map.
  • the target environment map is constantly updated in the database, and the agent is responsible for updating the target environment map; this process can be carried out in parallel with performing the scenario i.e. online, just when the electronic device is interacting with the user.
  • Based on the user command and the target environment map, the command processing unit generates a scenario script to execute the user command.
  • the scenario script (next action) is transferred to the agent, and the agent sends the scenario script to the predictor.
  • the agent performs the scenario script.
  • if, at the instant the scenario script is performed by the agent, the real state of the target environment does not match the map received by the command processing unit from the database and used to generate the scenario script:
  • the agent determines that the scenario script cannot be executed
  • the agent receives real state and the state predicted by the predictor, compares them, calculates error and rebuilds the target environment map;
  • based on the new target environment map and the same user command, the command processing unit generates a new scenario script
  • the new scenario script is transferred to the agent, and the agent performs it.
  • Fig. 4 illustrates how the target environment map changes, using the example of applying the disclosure system to mobile phone testing.
  • the left side of Fig. 4 shows a target environment map stored in the system at the instant of issuance of the command by the user. Initially the system starts using the target environment map stored in the database; upon issuance of the command by the user the second module is responsible for direct execution of the command by the electronic device. Since no "Font size" option has been found in the mobile phone when using the target environment map, the mode of exploring the target environment and updating the target environment map is automatically enabled. That is, the first work mode of the system, described above, is activated. The target environment map is updated in the database. The right side of Fig. 4 shows the updated map; then the system switches to the second work mode to execute the command execution scenario.
  • the system may include a survey mode, upon activation of which the target environment map can be updated periodically.
  • Fig. 5 shows an example of a second work mode of the system in testing a mobile device.
  • Input to the command processing unit is a test script i.e. a description of error reproduction process, or a scenario on the phone, which is specified in ordinary human language, and a target environment map received from the database.
  • target environment is the mobile phone application that needs to be tested.
  • a scenario script, in this case a list of commands to be run on the mobile phone in order to execute each command of the test script, is generated in the command processing unit based on the application map and the test script.
  • the resulting list of commands is sent from the command processing device to the agent.
  • the agent instructs the smartphone application to reproduce the commands one after another or all at the same time, depending on the type of the command list received. Therefore, the smartphone is tested without user supervision.
  • Fig. 6 is a schematic diagram of a second work mode of the system in operation of a robot cleaner.
  • the database periodically updates the target environment map.
  • the user issues visual and/or voice commands to the command processing unit. If voice commands are received from the user, a voice command processing device is used to process them, while a visual command processing device is used to process visual commands; both devices are provided in the command processing unit.
  • Updated target environment map is also input to the command processing unit.
  • the target environment is the room in which the robot butler is to execute user commands.
  • Having processed the user command and the target environment map, the command processing unit generates a scenario script; the data is transmitted to the agent, which outputs an appropriate set of commands to the robot butler.
  • the robot butler moves to the proper place and performs the required action e.g. cleaning.
  • if the next command does not lead to the expected result, or if the predictor cannot predict the expected result, this means that the target environment map has changed. This fact is detected by the agent, which interacts with the predictor. If, during the execution of commands by the robot, it is detected that the target environment map has changed, the map is updated just before being transferred to the command processing unit.
  • suppose the user wishes the robot to bring an item. If a conventional robot cannot find the item in the place stored in its database, it will not be able to complete the task, because such a robot cannot learn and adapt to unknown situations.
  • the robot butler using the disclosure system for automated execution by an electronic device of commands specified by the user in natural language will be able to adapt to changes in the locations of items online, i.e. upon receiving a command from the user.
  • the robot butler will be able to perform the following:
  • Robot explores surrounding (target) environment (e.g. house), updates map in the database, trains neural network: builds associations (kitchen - for eating, bedroom - for sleep, etc.), learns similarities and relations of items, and so on.
  • both the agent and the predictor comprise software modules that are machine learning algorithms, accordingly, each of them can be a neural network or another self-supervised machine learning algorithm.
  • Robot is trained and is ready to execute commands.
  • The command processing unit, e.g. containing a neural network, converts the user command and the house map into a scenario script.
  • Robot follows the created scenario. If something goes wrong at any step, robot will be able to adapt owing to the associations generated at the learning stage and available in the neural network, as well as owing to the fact that the house map can be updated just when the robot is performing the task.
  • Robot must periodically update its knowledge owing to the availability of the agent and the predictor.
  • Fig. 7 is a schematic diagram of another example of a second work mode of the system when used as an electronic assistant.
  • target environment will be a set of user's preferences and interests.
  • Information is collected by the agent by observing the user's preferences, e.g. while the user is surfing the Internet: statistics on the websites visited, the videos watched, the music listened to, etc. Then, after sorting the collected information using the statistics reflecting the user's preferences, the agent builds a map of the user's interests.
  • the map of user's interests is periodically updated in the database.
  • User's visual and/or voice commands and the map of user's interests are input into the command processing unit.
  • user commands may concern e.g. the choice of news on one topic or music tracks, films, or the choice of goods preferred by the user in online shop, or the choice of hotel, etc.
  • Having received commands from the user and the map of interests, the command processing device builds a scenario script and sends it to the agent.
  • the agent issues to the electronic device commands corresponding to the scenario script, e.g. for the choice of news, music tracks or goods, or any other recommendations that match the user's request.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Robotics (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The disclosure relates to automated execution by an electronic device of commands specified by a user in natural language. With an electronic device comprising the disclosure system, the user can simply specify commands in ordinary language and obtain execution of the commands as the result. The system comprises an agent, a predictor, a database and a command processing unit. According to the method, the agent explores the current state of the environment and chooses an action from a plurality of actions. Based on said current state of the target environment and the action chosen by the agent, the predictor predicts the next state of the environment; the agent executes the chosen action and, upon receiving information about the real state of said environment, compares the real state of said environment with the prediction, calculates the predictor error, and based on the calculated error learns to find states with maximum predictor error; based on the calculated error, the predictor learns to minimize the error; the database stores the map of said environment received from the agent; the command processing unit processes at least one command in accordance with the map of said environment received from the database, generates a scenario script for execution of the at least one command, and sends the scenario script to the agent for execution. The disclosure system can be used in electronic devices.

Description

SYSTEM AND METHOD FOR AUTOMATED EXECUTION OF USER-SPECIFIED COMMANDS
The disclosure relates to automated execution by an electronic device of commands specified by user in natural language.
At present, there are a great number of applications where user commands are specified in natural language, in particular with words, gestures, facial expressions. Such applications can be used in various smart devices, e.g. smartphones, home robot assistants, etc.
A typical scenario of user interaction with an electronic device application is that the user issues informal commands by voice, gestures, or facial expressions. The application recognizes the user commands and executes them. The application is pretrained on servers, and during operation almost no online algorithms are used to learn directly from feedback, i.e. to learn online when, upon trying to execute a user command, the application receives feedback from the user. To date, there are no effective algorithms for online learning of electronic device applications.
A common feature of conventional solutions is that they can only be pretrained, and such applications can be used in electronic devices only after this separate training.
US 20130254139 A1 (publ. 26.09.2013) discloses systems and methods for building a universal intelligent assistant with learning capabilities. The disclosed systems and methods relate to building an intelligent assistant that can take in human requests/commands in simple text form and in natural language format and perform tasks specified by the user. Before use, the user should teach the assistant through the provided user interface and/or by referring to some knowledge that the assistant already knows. The taught assistant may generate more generic knowledge based on what it learns, can apply the more generic knowledge to serve requests that it has never seen and never directly learned, and can revise and improve the knowledge according to execution results/feedback. The document does not describe methods for training the assistant during its operation, i.e. online.
US 20140095931 A1 (publ. 03.04.2014) discloses a method and system for automating the process of testing electronic devices. The invention provides a device test automation framework for automating the process of testing embedded systems. The device test automation framework (DTAF) allows the user to test embedded device software using test scripts, which can capture various interfaces of the device under test. A Graphical User Interface (GUI) tool is created based on the device under test configuration and user input. The GUI tool shows various interfaces of the device under test. Device test automation framework hardware enables communication between the test tool and the device under test. DTAF dramatically improves the productivity, effectiveness, efficiency and coverage of embedded software testing. The drawback is that it does not use machine learning at all, i.e. it is not tolerant to changes in the user interface.
Currently, there are two conventional approaches to testing smartphones.
The first, straightforward approach is based on the user reading the manual on his own and testing his smartphone following the instructions. Although the human user is capable of self-learning and of making decisions in unusual situations, a lot of time is spent reading and reproducing the instructions; there is also the factor of human error, which takes additional time to notice and correct. Furthermore, a human cannot perceive and reproduce multiple instructions at the same time.
The second approach involves the use of automated tools that take instructions in a formal language format, i.e. commands come in the form of instructions from a predetermined set (e.g. a set of mobile phone commands, a set of robot motion commands); these instructions are straightforward, i.e. they do not allow multiple interpretations, and do not require direct user participation (the human only uses the result). Automated tools can take and reproduce multiple instructions at the same time. Moreover, the use of automated tools is free of the human error factor. However, conventional automated tools are unable to cope with unusual situations for which they were not previously trained.
There is provided an intelligent system for execution of commands by an electronic device, having two key features that distinguish it from conventional applications:
the system automatically collects information online about structure of the environment in which commands are to be executed, e.g. a robot butler itself collects information about a room before doing work, while e.g. a smartphone testing application itself collects information about the smartphone before work.
Using the collected information, such devices can execute commands that are specified by a human.
The disclosure system can generate e.g. a map of mobile phone application and use the map to execute human commands upon receiving them.
There is provided a system for automated execution of at least one command by an electronic device, comprising: an agent configured to collect information about the environment where at least one command is to be executed and to build a map of said environment based on the information collected; a predictor configured to predict information that the agent is collecting, wherein the agent and the predictor are able to learn based on the information received by the agent upon the prediction; a database configured to receive the map of said environment from the agent, store the map of said environment, update the map of said environment upon receiving a new map of said environment from the agent; a command processing unit configured to receive and process at least one command in accordance with the map of said environment, and generate a scenario script for execution of the at least one command; wherein the agent is further configured to perform the scenario script for execution of the at least one command, received from the command processing unit. Furthermore, the agent is configured to calculate prediction error of the predictor. Furthermore, the agent learns to find states of said environment with maximum predictor error on the basis of the calculated prediction error. Furthermore, the predictor learns to minimize the prediction error on the basis of the calculated prediction error. Furthermore at least one command is a user command. Furthermore, user command comprises visual information, or sound information, or text information. Furthermore, the database is a neural network.
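For illustration only, the four units named in this claim could be organized roughly as follows. The class and method names below are hypothetical and not taken from the patent, which does not prescribe any particular implementation.

```python
# Illustrative skeleton of the four units; all names are assumptions.
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class EnvironmentMap:
    """Map of the target environment: observed states and transitions between them."""
    states: Dict[str, Any] = field(default_factory=dict)
    transitions: List[Tuple[str, str, str]] = field(default_factory=list)  # (state, action, next_state)


class Predictor:
    """Predicts the next state given the current state and a planned action."""

    def predict(self, state: Any, action: Any) -> Any:
        raise NotImplementedError

    def learn(self, state: Any, action: Any, real_next_state: Any) -> None:
        """Update the predictor so that future predictions have smaller error."""
        raise NotImplementedError


class Agent:
    """Collects information about the environment, builds the map, performs scenario scripts."""

    def choose_action(self, state: Any) -> Any:
        raise NotImplementedError

    def learn(self, state: Any, action: Any, reward: float) -> None:
        """Reinforce actions that lead to poorly predicted (little explored) states."""
        raise NotImplementedError


class Database:
    """Receives, stores and updates the environment map."""

    def __init__(self) -> None:
        self.map = EnvironmentMap()

    def update(self, new_map: EnvironmentMap) -> None:
        self.map = new_map


class CommandProcessingUnit:
    """Turns a user command plus the environment map into a scenario script."""

    def generate_scenario(self, command: str, env_map: EnvironmentMap) -> List[Any]:
        raise NotImplementedError
```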
There is also provided a method of operating a system for automated execution of commands by an electronic device, comprising the steps of:
(a) an agent:
exploring current state of the environment where at least one command is to be executed;
choosing an action from a plurality of actions, and
before executing the action, sending the current state of the environment and the chosen action to the predictor;
(b) a predictor:
based on said current state of the target environment and the action chosen by the agent, predicting the next state of said environment;
sending the prediction to the agent;
(c) the agent:
executing the chosen action, and upon receiving information about real state of said environment:
comparing the real state of said environment with the prediction received from the predictor;
calculating predictor error;
based on the calculated error, learning to find states of said environment with maximum predictor error;
(d) based on the calculated error, the predictor learning to minimize the error;
(e) after each action, the agent updating the map of said environment by entering updating information into the map of said environment and sending the map of the said environment to the database;
(f) the database storing the map of said environment received from the agent;
(g) a command processing unit:
processing at least one command in accordance with the map of said environment received from the database;
generating a scenario script for execution of the at least one command based on the processing of at least one command and the map of said environment;
transferring the scenario script for execution of the at least one command to the agent for execution;
wherein
in case the map of said environment, stored in the database, is coincident with the current state of the environment where the at least one command is to be executed, the agent performs the scenario script for execution of the at least one command;
in case the map of said environment, stored in the database, is not coincident with the current state of the environment where the at least one command is to be executed, steps (a) to (g) are repeated.
Furthermore, at least one command is a user command. Furthermore, user command comprises visual information, or sound information, or text information.
There is also provided an electronic device comprising a system for automated execution of at least one command by an electronic device, comprising: an agent configured to collect information about the environment where at least one command is to be executed and to build a map of said environment based on the information collected; a predictor configured to predict information that the agent is collecting, wherein the agent and the predictor are able to learn based on the information received by the agent upon the prediction; a database configured to receive the map of said environment from the agent, store the map of said environment, update the map of said environment upon receiving a new map of said environment from the agent; a command processing unit configured to receive and process at least one command in accordance with the map of said environment, and generate a scenario script for execution of the at least one command; wherein the agent is further configured to perform the scenario script for execution of the at least one command, received from the command processing unit.
This disclosure provides that the system automatically collects information online about the structure of the environment in which commands are to be executed. Using the collected information, such devices can execute commands that are specified by a human.
The above and other features and advantages of the disclosure will be better understood from the following description with reference to the drawings, in which:
Fig. 1 is a schematic diagram of two work modes of a system for automated execution by an electronic device of commands specified in natural language.
Fig. 2 is a schematic diagram of a first work mode of a system for automated execution by an electronic device of commands specified in natural language.
Fig. 3 is a schematic diagram of a second work mode of a system for automated execution by an electronic device of commands specified in natural language.
Fig. 4 illustrates how the target environment map changes, using the example of applying the disclosure system to testing a mobile phone.
Fig. 5 is a schematic diagram of a second work mode of the system in testing a mobile device.
Fig. 6 is a schematic diagram of a second work mode of the system in operation of a robot cleaner.
Fig. 7 is a schematic diagram of a second work mode of the system when used as an electronic assistant.
The disclosure enables using natural human language, as well as the language of gestures and facial expressions, to specify commands. Using an electronic device containing the disclosure system, a user can simply specify commands in ordinary language and get the result in the form of command execution. The disclosure enables training the electronic device that executes commands at the very instant of receiving the command; the electronic device can also handle unusual situations that were not covered during the pretraining period, owing to updating of the target environment map and the use of methods based on reinforcement learning (trial-and-error methods).
There is provided a method and a system for automated execution by an electronic device of commands specified by the user in natural language.
By the term "natural language", as used herein, is meant human speech, human gestures and human facial expressions.
The disclosure system receives data from a command source, which may be visual, sound or textual, and from a space (target environment) in which the system should execute commands.
The term "target environment", as used herein, will refer to the environment in which user commands are executed. It can be e.g. a smartphone application that needs to be tested or with which other actions are to be performed, or a room in which a robot must clean, etc.
The disclosure system contained in an electronic device instructs the electronic device to explore the target environment for which commands will be issued, and builds a map (structure) of features of the target environment. This is attained through preliminary machine learning of the units of the system disposed in the electronic device, or with the assistance of the user, who can enter data through a user interface for generating a map or structure of features of the target environment.
In other words, the units of the system are pretrained through machine learning to automatically explore the target environment and generate a map or structure of the target environment. Furthermore, the electronic device containing the disclosure system is able to explore the target environment, detect its features and generate a target environment map using this system. In addition, the disclosure system can solve the task of exploring the target environment and generating a map of the target environment through interaction with a user interface, i.e. the target environment map can be modified at any instant as needed by the user. The disclosure system instructs the electronic device to execute commands issued by the user in natural language, using the target environment map. Moreover, the electronic device will interact with the target environment according to the target environment map, which can be modified by the system at any time in response to user interests, e.g. through a user interface, or automatically by periodically revising the target environment, e.g. at specific intervals that can be set by the user.
The system can find semantic relations between map elements and is able to follow imprecise human commands in the form of natural language. That is, commands that require a large number of identical actions can be automated.
The disclosure approach does not require special software designed for each individual task to be performed by an electronic device.
Fig. 1 is a schematic diagram of two work modes of a system for automated execution by an electronic device of commands specified by the user in natural language.
The disclosure system can operate in two key modes. In a first mode, the system generates and updates a map of target environment; in a second mode, the system recognizes and executes user commands in natural language in accordance with the target environment map generated in the first mode, or by updating the target environment map just upon receiving the user command.
Therefore:
- in the first mode, the system is exploring the target environment in the most effective way and builds a map of the target environment;
- in the second mode, the system is recognizing and executing user command based on the target environment map.
In the second mode, the system uses user commands and the target environment map as input information. In this mode, the system can also update the target environment map from user feedback.
The disclosure system can also execute user commands from scratch without previously generated target environment map. In this case, the system can build the target environment map online, i.e. simultaneously with receiving commands from the user.
Fig. 2 is a schematic diagram of a first work mode of the system for automated execution by an electronic device of commands specified in natural language. The first work mode comprises machine learning to automatically construct a target environment map without user supervision.
The first work mode involves operation of a first unit containing an agent, a second unit containing a predictor, and a third unit containing a database.
The database may be a neural network or another entity in which data can be stored and updated. The database stores data related to the target environment.
When operating in the first mode, the system for automated execution of user commands by an electronic device receives, through the first unit (the agent), information from the target environment, which may be visual, text or sound information, depending on the type of environment in which user commands are to be executed.
The agent is a unit that can be a machine learning algorithm, e.g. a neural network, used to decide which action is to be executed in the current state of the target environment.
The state of the target environment is a visible (observable) predetermined part of, or the entire, target environment at a given instant. For a mobile phone, the state of the target environment is e.g. the screen of the mobile phone and all the information on its settings. For a vacuum cleaner, the state of the target environment is e.g. a picture of the house at the current time, and so on.
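Purely for illustration, states for the two example environments above might be represented as follows; the field names are assumptions, not taken from the patent.

```python
# Hypothetical state representations for the two example environments.
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class PhoneState:
    screen_name: str             # which screen is currently shown, e.g. "settings/display"
    visible_elements: List[str]  # buttons and menu items visible on that screen
    settings: Dict[str, str]     # current values of the phone settings


@dataclass
class RoomState:
    occupancy_grid: List[List[int]]  # "picture of the house": 1 = obstacle, 0 = free
    robot_position: Tuple[int, int]  # current cell of the robot cleaner
```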
The predictor is a second unit that can be a machine learning algorithm, e.g. a neural network, used to predict the next state of the target environment provided that the current state and a planned action are known. The predictor's error in predicting the next state is used by the agent to decide how effective the chosen action is. A large error means that this state is less explored than a state in which a small error is obtained.
In learning mode and during operation, at each instant the agent sees only one state of the target environment. The agent receives states of the target environment from its sensors and, based on them, generates the next action in order to pass to the next state of the target environment. The agent's task is to explore the map of the target environment. Accordingly, the agent learns to perform actions leading to the least known states (parts) of the target environment. The level of knowledge is determined just by the predictor. Each time before the agent performs an action, the predictor predicts the result of this action, i.e. to what state of the target environment the agent will pass, or how the state that will be observed (will be visible) at the next instant will look. After performing the action, the agent sees the real state of the target environment. Accordingly, the agent can calculate the error between the reality and the prediction; thereby both the agent and the predictor are trained.
The predictor predicts the next state of the target environment always upon receiving, from the agent, information about the intention to perform an action. At instants when the agent observes a state dramatically different from the one predicted by the predictor, the agent updates the target environment map. Thus, the agent and the predictor always learn, and they learn as a pair. The predictor learns to predict the next state of the target environment. The agent learns to go to the least predictable states of the target environment. This happens all the time - both at the stage of building the map and at the stage of interaction with the user. If the map has already been built, it will be updated with less intensity, since the predictions will more or less match reality.
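One way to express this paired learning in code is to use the mean-squared prediction error both as the predictor's loss and as the agent's exploration reward. This concrete formulation, and the assumption that states are numeric feature vectors, are illustrative rather than quoted from the patent.

```python
import numpy as np


def prediction_error(predicted_state, real_state) -> float:
    """Mean-squared error between the predicted and the observed next state."""
    predicted = np.asarray(predicted_state, dtype=float)
    real = np.asarray(real_state, dtype=float)
    return float(np.mean((predicted - real) ** 2))


def agent_reward(predicted_state, real_state) -> float:
    """The agent is rewarded for reaching poorly predicted, i.e. little explored, states."""
    return prediction_error(predicted_state, real_state)
```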
Now work of the agent and the predictor in the first work mode of the system will be described in more detail at the stage of building a map of target environment.
Mode 1. Building a map of target environment
(A) Agent explores the state of target environment, chooses an action (e.g. where to send the robot vacuum cleaner for further exploration of the target environment). Before performing the action, the agent sends the state of target environment and the chosen action to the predictor.
(B) Based on the current state of target environment and the action chosen by the agent, the predictor predicts the next state of target environment (i.e. what the agent will see after performing the action), and then sends the prediction to the agent.
(C) The agent performs the action chosen in (A) (e.g. causes the robot vacuum cleaner device to turn, etc.), and acquires real state of the target environment.
(D) The agent compares the received real state with the state received from the predictor, and then calculates the predictor error.
(E) Based on the calculated error, the agent learns to find states of the target environment with maximum predictor error. Based on the calculated error, the predictor learns to minimize the prediction error.
(F) All states of the target environment detected by the agent are recorded in the map of the target environment one after another, in the sequence in which the agent chooses them; this is an important point, because the agent tries to choose the most relevant, least known states of the target environment by searching for states with maximum error.
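A compact sketch of steps (A) to (F), reusing the illustrative interfaces and the prediction_error helper from the earlier sketches; env.observe() and env.step() are assumed environment hooks, and the step count is arbitrary.

```python
# Sketch of exploration steps (A)-(F); interfaces come from the earlier sketches.
def build_environment_map(env, agent, predictor, database, num_steps=1000):
    env_map = EnvironmentMap()
    state = env.observe()
    for _ in range(num_steps):
        action = agent.choose_action(state)            # (A) choose an action
        predicted = predictor.predict(state, action)   # (B) predict the next state
        real = env.step(action)                        # (C) perform it, observe reality
        error = prediction_error(predicted, real)      # (D) calculate predictor error
        predictor.learn(state, action, real)           # (E) predictor minimizes the error ...
        agent.learn(state, action, reward=error)       #     ... agent seeks high-error states
        env_map.states[str(real)] = real               # (F) record states in the order visited
        env_map.transitions.append((str(state), str(action), str(real)))
        state = real
    database.update(env_map)
    return env_map
```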
The agent generally receives information about the target environment from sensors that are defined directly by the target environment.
The predictor takes as input the current state of the target environment and the action that the agent is going to perform. Each environment has its own set of actions. For example, for the task of testing a phone, the set of actions may comprise: press a button, swipe a screen, and so on. For a robot, the set of actions comprises moving left, right, straight or back from the current state. From this data received from the agent, the predictor tries to predict the next state of the target environment. After performing the action, the agent receives the real state and sends it to the predictor, and the predictor compares the real state with the already predicted one. Thereby, the agent determines the prediction error of the predictor.
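The two example action sets mentioned above might be encoded, for instance, as enumerations; this is purely illustrative, and each target environment defines its own set.

```python
from enum import Enum


class PhoneAction(Enum):
    PRESS_BUTTON = "press_button"
    SWIPE_SCREEN = "swipe_screen"


class RobotAction(Enum):
    MOVE_LEFT = "move_left"
    MOVE_RIGHT = "move_right"
    MOVE_STRAIGHT = "move_straight"
    MOVE_BACK = "move_back"
```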
When taking information from the target environment, the agent is in a fixed state. Having noticed that changes have occurred in the target environment, the agent switches to exploring the part of the target environment where the changes have occurred (i.e. that state of the target environment) in order to register the changes. The change that occurred in the target environment is determined by the agent processing the input signal corresponding to the environment.
The predictor predicts what will be detected by the agent, i.e. what information will come from the agent at the next instant. Then the agent receives real information. Predicted information and real information from the agent enter the database. The predictor does not interact directly with the database. The agent writes the observed states to the database in any form, which may depend on the implementation of the database or another structure used to store the map.
The incorrect prediction of the predictor and the correct information from the agent are used to train the agent and the predictor.
Based on the observed states of the target environment, the agent generates an updated map of the target environment. The target environment map can be updated both in the mode of collecting information about the environment and in the mode of executing a user command.
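As one possible illustration of how observed states might be recorded, the following hypothetical Python sketch stores map entries in the order the agent chose them and tags each entry with the mode in which it was recorded. The structure, the field names and the EnvironmentMap class are assumptions made for illustration, not the storage format of the disclosure.

```python
from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class MapEntry:
    state: Any   # an observation, in whatever form the chosen database implementation uses
    step: int    # position in the exploration sequence
    mode: str    # "explore" (first work mode) or "execute" (second work mode)

@dataclass
class EnvironmentMap:
    """Toy map store: keeps observed states in the order the agent chose them."""
    entries: List[MapEntry] = field(default_factory=list)

    def record(self, state, mode):
        self.entries.append(MapEntry(state, step=len(self.entries), mode=mode))

env_map = EnvironmentMap()
env_map.record("settings screen", mode="explore")    # recorded while building the map
env_map.record("font size dialog", mode="execute")   # recorded while serving a user command
```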
Fig. 3 is a schematic diagram of a second work mode of the system for automated execution by an electronic device of commands specified in natural language.
In the second work mode, the system recognizes and executes the user command(s) issued by the user in natural language.
In the second work mode of the system, input data in the form of user commands and the target environment map from the database are entered into a command processing unit.
The command processing unit generates a language model suitable for recognizing the command. Then, the command processing unit generates a scenario script comprising a set of commands to be executed directly on the electronic device used by the system for the respective target environment. The scenario script is generated on the basis of the user commands and the target environment map in the command processing unit.
The scenario script is transferred from the command processing unit to the agent, which performs it. For each state of the target environment, the commands that best match the received instructions are determined. The system first tries to execute these particular instructions; if it cannot execute them, it searches the generated map for the semantically nearest ones.
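The fallback to the semantically nearest command can be sketched as follows. This assumes, purely for illustration, that commands known from the map are stored together with vector representations and a flag indicating whether they can be executed in the current state; the cosine-similarity search and the names (nearest_executable, map_entries) are hypothetical, not part of the disclosure.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u)) or 1.0
    nv = math.sqrt(sum(b * b for b in v)) or 1.0
    return dot / (nu * nv)

def nearest_executable(command_vec, map_entries):
    """map_entries: list of (name, vector, executable_now) tuples taken from the environment map."""
    candidates = [(name, vec) for name, vec, ok in map_entries if ok]
    if not candidates:
        return None
    # Pick the executable map command whose vector is closest to the requested one.
    return max(candidates, key=lambda entry: cosine(command_vec, entry[1]))[0]

map_entries = [
    ("open display settings", [0.9, 0.1, 0.0], True),
    ("open sound settings",   [0.1, 0.9, 0.0], True),
    ("open font size dialog", [0.8, 0.0, 0.2], False),   # not reachable in the current state
]
print(nearest_executable([0.85, 0.05, 0.1], map_entries))  # -> "open display settings"
```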
The target environment map is constantly updated in the database, and the agent is responsible for updating it; this process can be carried out in parallel with performing the scenario, i.e. online, while the electronic device is interacting with the user.
Now a work mode of the system, in which user commands are executed, will be described in more detail.
Mode 2. Execution of user commands
(A) User command(s) and the target environment map from the database are input into the command processing unit.
(B) Based on the user command and the target environment map, the command processing unit generates a scenario script to execute the user command.
(C) The scenario script (next action) is transferred to the agent, and the agent sends the scenario script to the predictor.
(D) The agent and the predictor observe the current state of the target environment.
If the target environment map received by the command processing unit from the database and used for generating the scenario script matches the real state of the target environment at the instant the agent performs the scenario script, the agent performs the scenario script.
However, if, at the instant the agent performs the scenario script, the real state of the target environment does not match the map received by the command processing unit from the database and used to generate the scenario script, then:
the agent determines that the scenario script cannot be executed;
the first work mode of the system is enabled; as a result, the agent receives the real state and the state predicted by the predictor, compares them, calculates the error and rebuilds the target environment map;
the agent sends the rebuilt target environment map to the command processing unit and to the database;
based on the new target environment map and the same user command, the command processing unit generates a new scenario script;
the new scenario script is transferred to the agent, and the agent performs it. A simplified sketch of this retry flow is given after this list.
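The following is a minimal, hypothetical Python sketch of the second work mode with the fallback to map rebuilding. Here the "map" is deliberately reduced to a dictionary from user commands to action lists, and all names (make_script, execute_user_command, explore, run_action) are illustrative assumptions rather than the actual interfaces of the disclosure.

```python
class ScriptCannotRun(Exception):
    """Raised when the stored map does not match the real state of the environment."""

def make_script(command, env_map):
    # Toy command processing unit: the map directly stores a command -> action-list entry.
    if command not in env_map:
        raise ScriptCannotRun(command)
    return env_map[command]

def execute_user_command(command, database, run_action, explore, max_retries=3):
    """Second-work-mode sketch: run the scenario script; on a map mismatch, fall back to
    the first work mode, rebuild the map in the database, and retry the same command."""
    for _ in range(max_retries):
        env_map = dict(database["map"])
        try:
            script = make_script(command, env_map)      # generate the scenario script (step B)
            return [run_action(a) for a in script]      # the agent performs the script
        except ScriptCannotRun:
            database["map"] = explore()                 # first work mode: rebuild the map
    raise ScriptCannotRun(command)

def explore():
    # Stand-in for the first work mode: pretend exploration discovered how to open the settings.
    return {"open settings": ["tap:menu", "tap:settings"]}

def run_action(action):
    return f"executed {action}"

database = {"map": {}}                                   # stale map: the command is unknown yet
print(execute_user_command("open settings", database, run_action, explore))
```

In the disclosure the map is a richer structure built by the agent and the scenario script is produced by the command processing unit, but the retry logic (execute if the map matches reality, otherwise rebuild the map and regenerate the script) is the same.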
Fig. 4 illustrates how the target environment map changes, using the example of applying the disclosed system to mobile phone testing.
Suppose a user instructs the system to have the electronic device automatically execute commands to find the "Font size" option in the mobile phone menu.
The left side of Fig. 4 shows the target environment map stored in the system at the instant the user issues the command. Initially the system uses the target environment map stored in the database; upon issuance of the command by the user, the second work mode is responsible for direct execution of the command by the electronic device. Since no "Font size" option has been found in the mobile phone when using this target environment map, the mode of exploring the target environment and updating the target environment map is enabled automatically, i.e. the first work mode of the system described above is activated. The target environment map is updated in the database. The right side of Fig. 4 shows the updated map; the system then switches back to the second work mode to execute the command execution scenario.
The system may include a survey mode, upon activation of which the target environment map can be updated periodically.
Fig. 5 shows an example of the second work mode of the system in testing a mobile device. Input to the command processing unit is a test script, i.e. a description of an error reproduction process or a scenario on the phone, specified in ordinary human language, and a target environment map received from the database. In this particular case, the target environment is the mobile phone application that needs to be tested. A scenario script, in this case a list of commands to be run on the mobile phone in order to execute each of the commands of the test script, is generated in the command processing unit based on the application map and the test script. The resulting list of commands is sent from the command processing unit to the agent. The agent instructs the smartphone application to reproduce the commands one after the other or all at the same time, depending on the type of the command list received. Therefore, the smartphone is tested without user supervision.
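One possible shape of such a scenario script is sketched below, purely for illustration; the DeviceCommand and ScenarioScript structures, the command kinds and the UI target identifiers are hypothetical and are not prescribed by the disclosure.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DeviceCommand:
    kind: str          # e.g. "tap", "swipe", "type"
    target: str        # UI element or area, as it might be recorded in the application map
    payload: str = ""  # text to type, swipe direction, etc.

@dataclass
class ScenarioScript:
    source_step: str                              # the natural-language test step it came from
    commands: List[DeviceCommand] = field(default_factory=list)

# One possible script for the test step "Open Settings and change the font size":
script = ScenarioScript(
    source_step="Open Settings and change the font size",
    commands=[
        DeviceCommand("tap", "app:Settings"),
        DeviceCommand("swipe", "screen", "up"),
        DeviceCommand("tap", "menu:Display"),
        DeviceCommand("tap", "menu:Font size"),
    ],
)
print(f"{len(script.commands)} device commands generated for: {script.source_step!r}")
```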
Fig. 6 is a schematic diagram of the second work mode of the system in operation of a robot cleaner. The target environment map is periodically updated in the database. The user issues visual and/or voice commands to the command processing unit. If voice commands are received from the user, a voice command processing device is used to process them, while a visual command processing device is used to process visual commands; both devices are provided in the command processing unit. The updated target environment map is also input to the command processing unit. In this case, the target environment is the room in which the robot butler is to execute user commands. Having processed the user command and the target environment map, the command processing unit generates a scenario script; the data is transmitted to the agent, which outputs an appropriate set of commands to the robot butler. The robot butler moves to the proper place and performs the required action, e.g. cleaning.
If the next command does not lead to the expected result, or if the predictor cannot predict the expected result, this means that the target environment map has changed. This fact is detected by the agent, which interacts with the predictor. If it is detected during the execution of commands by the robot that the target environment map has changed, the map is updated just before being transferred to the command processing unit.
Conventional robot butlers do not use machine learning; therefore, they can only execute a programmed set of commands and cannot respond to unusual situations.
For example, suppose the user wishes the robot to bring an item. If the robot cannot find the item in the place stored in its database, it will not be able to complete the task, because it cannot learn and adapt to unknown situations.
The robot butler, using the disclosed system for automated execution by an electronic device of commands specified by the user in natural language, will be able to adapt to changes in the locations of items online, i.e. upon receiving a command from the user.
In a final product using the disclosed system for automated execution by an electronic device of commands specified by the user in natural language, the robot butler will be able to perform the following:
Stage 1. House map exploration:
The robot explores the surrounding (target) environment (e.g. a house), updates the map in the database, and trains a neural network: it builds associations (kitchen for eating, bedroom for sleep, etc.), learns similarities and relations between items, and so on; a toy sketch of such an association memory follows. It should be noted that both the agent and the predictor comprise software modules that are machine learning algorithms; accordingly, each of them can be a neural network or another self-supervised machine learning algorithm.
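The following minimal Python sketch illustrates the idea of learned associations. In place of the neural network described above, a simple co-occurrence counter is used here to keep the example self-contained; the AssociationMemory class and its methods are illustrative assumptions, not part of the disclosure.

```python
from collections import Counter, defaultdict

class AssociationMemory:
    """Toy association store: counts which items or activities were observed in which room."""
    def __init__(self):
        self.counts = defaultdict(Counter)

    def observe(self, room, item):
        self.counts[room][item] += 1

    def most_likely_room(self, item):
        # The room where the item was seen most often, i.e. the learned association.
        rooms = [r for r in self.counts if self.counts[r][item] > 0]
        return max(rooms, key=lambda r: self.counts[r][item]) if rooms else None

memory = AssociationMemory()
for room, item in [("kitchen", "plate"), ("kitchen", "eating"),
                   ("bedroom", "sleep"), ("kitchen", "plate")]:
    memory.observe(room, item)
print(memory.most_likely_room("plate"))   # -> kitchen
```

A neural network would generalize beyond exact co-occurrences (e.g. relating "plate" to "cup"), but the role of the learned associations in command execution is the same: suggesting where to look when the stored location fails.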
Stage 2. Command execution
The robot is trained and ready to execute commands.
The command processing unit, e.g. containing a neural network, translates user commands into a sequence of actions (a scenario) and sends the scenario to the agent. The robot follows the created scenario. If something goes wrong at any step, the robot will be able to adapt owing to the associations generated at the learning stage and available in the neural network, as well as owing to the fact that the house map can be updated just when the robot is performing the task.
The robot must periodically update its knowledge, which is possible owing to the availability of the agent and the predictor.
Fig. 7 is a schematic diagram of another example of the second work mode of the system when used as an electronic assistant. In this case, the target environment is the set of the user's preferences and interests. Information is collected by the agent while watching the user's activity, e.g. while the user is surfing the Internet, i.e. statistics on the websites he visits, the videos he watches, the music he listens to, etc. Then, having sorted the collected information using the statistics reflecting the user's preferences, the agent builds a map of the user's interests.
The map of the user's interests is periodically updated in the database.
The user's visual and/or voice commands and the map of the user's interests are input into the command processing unit. In this case, the user commands may concern, e.g., the choice of news on a particular topic, of music tracks or films, of goods preferred by the user in an online shop, of a hotel, etc.
Having received the commands from the user and the map of interests, the command processing unit builds a scenario script and sends it to the agent. The agent issues to the electronic device the commands corresponding to the scenario script, on the choice of, e.g., news, music tracks or goods, or any other recommendations that match the user's request.
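As a rough illustration only, the map of interests can be thought of as weighted categories and the recommendation step as ranking candidates against those weights. The build_interest_map and recommend functions below, and the category labels, are hypothetical simplifications and do not reflect the actual representation used by the disclosure.

```python
from collections import Counter

def build_interest_map(observations):
    """observations: (category, weight) pairs gathered while watching the user's activity,
    e.g. ("music:jazz", 1) for every jazz track the user listens to."""
    interests = Counter()
    for category, weight in observations:
        interests[category] += weight
    return interests

def recommend(candidates, interests, top_k=3):
    """Rank candidate items, each tagged with a category, by the user's interest map."""
    ranked = sorted(candidates, key=lambda c: interests[c["category"]], reverse=True)
    return ranked[:top_k]

interests = build_interest_map([("music:jazz", 1), ("music:jazz", 1), ("news:tech", 1)])
candidates = [
    {"title": "Rock hits",    "category": "music:rock"},
    {"title": "Jazz evening", "category": "music:jazz"},
    {"title": "Tech digest",  "category": "news:tech"},
]
print([c["title"] for c in recommend(candidates, interests, top_k=2)])  # jazz first, then tech
```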
Only some examples of possible applications of the invention are given herein. The disclosed system for automated execution by an electronic device of commands specified by the user in natural language can find application in almost all spheres of human life where electronic devices are used.
Although the invention has been described in conjunction with some illustrative embodiments, it is to be understood that the invention is not limited to these specific embodiments. On the contrary, it is evident that the invention embraces all alternatives, modifications and equivalents that can fall within the spirit and scope of the appended claims.
Furthermore, the invention retains all equivalents of the claimed invention, even if the claims are amended during the examination process.

Claims (11)

  1. A system for automated execution of at least one command by an electronic device, comprising:
    an agent configured to collect information about the environment where at least one command is to be executed and to build a map of said environment based on the information collected;
    a predictor configured to predict information that the agent is collecting, wherein the agent and the predictor are able to learn based on the information received by the agent upon the prediction;
    a database configured to receive the map of said environment from the agent, store the map of said environment, update the map of said environment upon receiving a new map of said environment from the agent;
    a command processing unit configured to receive and process at least one command in accordance with the map of said environment, and generate a scenario script for execution of the at least one command;
    wherein the agent is further configured to perform the scenario script for execution of the at least one command, received from the command processing unit.
  2. A system according to claim 1, wherein the agent is configured to calculate prediction error of the predictor.
  3. A system according to claim 2, wherein the agent learns to find states of said environment with maximum predictor error on the basis of the calculated prediction error.
  4. A system according to claim 2, wherein the predictor learns to minimize the prediction error on the basis of the calculated prediction error.
  5. A system according to any one of claims 1 to 4, wherein at least one command is a user command.
  6. A system according to claim 5, wherein user command comprises visual information, or sound information, or text information.
  7. A system according to any one of claims 1 to 6, wherein the database is a neural network.
  8. A method of operating a system for automated execution of commands by an electronic device, comprising the steps of:
    (a) an agent:
    exploring current state of the environment where at least one command is to be executed;
    choosing an action from a plurality of actions, and
    before executing the action, sending the current state of the environment and the chosen action to the predictor;
    (b) a predictor:
    based on said current state of the target environment and the action chosen by the agent, predicting the next state of said environment;
    sending the prediction to the agent;
    (c) the agent:
    executing the chosen action, and upon receiving information about real state of said environment:
    comparing the real state of said environment with the prediction received from the predictor;
    calculating predictor error;
    based on the calculated error, learning to find states of said environment with maximum predictor error;
    (d) based on the calculated error, the predictor learning to minimize the error;
    (e) after each action, the agent updating the map of said environment by entering updating information into the map of said environment and sending the map of the said environment to the database;
    (f) the database storing the map of said environment received from the agent;
    (g) a command processing unit:
    processing at least one command in accordance with the map of said environment received from the database;
    generating a scenario script for execution of the at least one command based on the processing of at least one command and the map of said environment;
    transferring the scenario script for execution of the at least one command to the agent for execution;
    wherein
    in case the map of said environment, stored in the database, is coincident with the current state of the environment where the at least one command is to be executed, the agent performs the scenario script for execution of the at least one command;
    in case the map of said environment, stored in the database, is not coincident with the current state of the environment where the at least one command is to be executed, steps (a) to (g) are repeated.
  9. A method according to claim 8, wherein at least one command is a user command.
  10. A method according to claim 9, wherein user command comprises visual information, or sound information, or text information.
  11. An electronic device comprising a system according to any one of claims 1 to 7.
PCT/KR2019/018132 2018-12-19 2019-12-19 System and method for automated execution of user-specified commands WO2020130687A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
RU2018145203 2018-12-19
RU2018145203A RU2701090C1 (en) 2018-12-19 2018-12-19 System and method for automatic execution of user-defined commands

Publications (1)

Publication Number Publication Date
WO2020130687A1 true WO2020130687A1 (en) 2020-06-25

Family

ID=68063281

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2019/018132 WO2020130687A1 (en) 2018-12-19 2019-12-19 System and method for automated execution of user-specified commands

Country Status (2)

Country Link
RU (1) RU2701090C1 (en)
WO (1) WO2020130687A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20240012738A1 (en) * 2022-07-11 2024-01-11 Dell Products L.P. Defect recreation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170293740A1 (en) * 2002-10-01 2017-10-12 World Award Academy Wearable personal digital device for facilitating mobile device payments and personal use
KR20170119152A (en) * 2016-04-18 2017-10-26 한양대학교 산학협력단 Ensemble of Jointly Trained Deep Neural Network-based Acoustic Models for Reverberant Speech Recognition and Method for Recognizing Speech using the same
US20180165581A1 (en) * 2016-12-14 2018-06-14 Samsung Electronics Co., Ltd. Electronic apparatus, method of providing guide and non-transitory computer readable recording medium
US20180234325A1 (en) * 2017-02-13 2018-08-16 Bank Of America Corporation Data Processing System with Machine Learning Engine to Provide Enterprise Monitoring Functions
US20180343139A1 (en) * 2017-05-23 2018-11-29 BrainofT Inc. Multi-modal interactive home-automation system

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9317404B1 (en) * 2011-05-08 2016-04-19 Panaya Ltd. Generating test scenario templates from test runs collected from different organizations
RU2542937C2 (en) * 2011-09-30 2015-02-27 Эпл Инк. Using context information to facilitate command processing in virtual assistant
US8418000B1 (en) * 2012-03-13 2013-04-09 True Metrics LLC System and methods for automated testing of functionally complex systems
US20130254139A1 (en) * 2012-03-21 2013-09-26 Xiaoguang Lei Systems and methods for building a universal intelligent assistant with learning capabilities
US8984349B2 (en) * 2012-09-28 2015-03-17 Hcl Technologies Limited Method and system for automating the process of testing a device
RU2631975C2 (en) * 2014-08-29 2017-09-29 Общество С Ограниченной Ответственностью "Яндекс" Method and system for user input command processing
WO2018084808A1 (en) * 2016-11-04 2018-05-11 Singapore University Of Technology And Design Computer-implemented method and data processing system for testing device security

Also Published As

Publication number Publication date
RU2701090C1 (en) 2019-09-24


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19901233

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19901233

Country of ref document: EP

Kind code of ref document: A1