CN117437916A - Navigation system and method for inspection robot - Google Patents

Navigation system and method for inspection robot

Info

Publication number: CN117437916A
Application number: CN202311331200.0A
Authority: CN (China)
Prior art keywords: inspection robot, navigation, voice, navigation voice, semantic
Priority date: 2023-10-16
Filing date: 2023-10-16
Publication date: 2024-01-23
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Inventors: 程军强, 曹德政, 杨参, 史伟, 朱连杰, 岑永超, 徐晓莉, 李国峰
Current Assignee: Eurasia Hi Tech Digital Technology Co ltd (the listed assignees may be inaccurate)
Original Assignee: Eurasia Hi Tech Digital Technology Co ltd
Application filed by Eurasia Hi Tech Digital Technology Co ltd
Priority to CN202311331200.0A
Publication of CN117437916A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1815 Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • G10L15/26 Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Manipulator (AREA)

Abstract

A navigation system and method for an inspection robot are disclosed. The system collects, in real time, the navigation voice signal a user issues to the inspection robot and introduces a signal processing and analysis algorithm at the back end to analyze that signal and determine the type of the robot navigation instruction, so that the inspection robot is controlled to execute the corresponding action. The performance and intelligence of the inspection robot navigation system are thereby improved, enhancing the user experience and practical applicability.

Description

Navigation system and method for inspection robot
Technical Field
The present application relates to the field of intelligent navigation, and more particularly, to a navigation system and method for a patrol robot.
Background
An inspection robot is an intelligent robot capable of autonomously inspecting and monitoring the state of equipment. It is typically used in fields such as industry, warehousing, and medical care to perform inspection tasks, for example checking the running state of equipment and diagnosing equipment failures.
In the inspection tasks of the inspection robot, the navigation system is the key component that ensures the robot completes its task: it is responsible for navigating the robot from one position to another. However, conventional inspection robot navigation systems typically navigate based on preset paths or landmarks, and such rigid planning approaches are sensitive to environmental changes. If the environment changes, for example a new obstacle appears or a path is blocked, the robot may be unable to adapt to the new situation and navigate successfully. Moreover, conventional navigation systems typically use sensors (e.g., lidar, cameras) to perceive the environment and make navigation decisions based on the sensing results. Because the sensing range and accuracy of such sensors are limited, the true state of the environment may not be fully captured, which affects the accuracy and safety of navigation. Therefore, in some cases, user operation is required for navigation control of the inspection robot.
However, in conventional user-controlled robot inspection systems, operation is usually performed through buttons, a remote controller, or a touch screen; such interaction is relatively cumbersome and neither intuitive nor natural. Users may need training before they can use the navigation system, which limits the ease of use and popularity of the robot. Voice-based interaction also has problems: semantic understanding in traditional navigation systems is generally based on simple keyword or rule matching and cannot accurately understand complex semantics. This leaves the robot with a limited understanding of the user's instructions, possibly leading to misunderstanding or erroneous navigation behavior.
Accordingly, an optimized inspection robot navigation system is desired.
Disclosure of Invention
The present application has been made to solve the above technical problems. Embodiments of the application provide a navigation system and method for an inspection robot that collect the user's inspection robot navigation voice signal in real time, introduce a signal processing and analysis algorithm at the back end to analyze that signal, determine the type of the robot navigation instruction from the analysis, and control the inspection robot to execute the corresponding action. The performance and intelligence of the inspection robot navigation system are thereby improved, enhancing the user experience and practical applicability.
According to one aspect of the present application, there is provided a patrol robot navigation system, comprising:
the navigation voice signal acquisition module is used for acquiring navigation voice signals of the inspection robot input by a user;
the voice recognition module is used for carrying out voice recognition on the navigation voice signal of the inspection robot so as to obtain a navigation voice text of the inspection robot;
the navigation voice semantic understanding module is used for carrying out semantic understanding on the navigation voice text of the inspection robot so as to obtain navigation voice semantic coding characteristics of the inspection robot;
and the navigation instruction type detection module is used for determining a navigation instruction type label based on the navigation voice semantic coding characteristics of the inspection robot.
According to another aspect of the present application, there is provided a navigation method of a patrol robot, including:
acquiring a navigation voice signal of the inspection robot input by a user;
performing voice recognition on the inspection robot navigation voice signal to obtain an inspection robot navigation voice text;
carrying out semantic understanding on the navigation voice text of the inspection robot to obtain navigation voice semantic coding features of the inspection robot;
and determining a navigation instruction type label based on the navigation voice semantic coding characteristics of the inspection robot.
Compared with the prior art, the inspection robot navigation system and method provided by the application collect the user's inspection robot navigation voice signal in real time and introduce a signal processing and analysis algorithm at the back end to analyze that signal, so that the type of the robot navigation instruction is determined and the inspection robot is controlled to execute the corresponding action. The performance and intelligence of the inspection robot navigation system are thereby improved, enhancing the user experience and practical applicability.
Drawings
The foregoing and other objects, features, and advantages of the present application will become more apparent from the following more particular description of embodiments of the present application, as illustrated in the accompanying drawings. The accompanying drawings are included to provide a further understanding of embodiments of the application and are incorporated in and constitute a part of this specification; they illustrate the application and do not constitute a limitation of it. In the drawings, like reference numerals generally refer to like parts or steps.
FIG. 1 is a block diagram of a patrol robot navigation system according to an embodiment of the present application;
FIG. 2 is a system architecture diagram of a patrol robot navigation system according to an embodiment of the present application;
FIG. 3 is a block diagram of a navigation voice semantic understanding module in a patrol robot navigation system according to an embodiment of the present application;
FIG. 4 is a block diagram of a navigation instruction type detection module in a patrol robot navigation system according to an embodiment of the present application;
fig. 5 is a flowchart of a navigation method of the inspection robot according to an embodiment of the present application.
Detailed Description
Hereinafter, example embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application and not all of the embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described herein.
As used in this application and in the claims, the singular forms "a," "an," and "the" do not denote the singular only and may include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; they do not constitute an exclusive list, and a method or apparatus may include other steps or elements.
Although the present application makes various references to certain modules in a system according to embodiments of the present application, any number of different modules may be used and run on a user terminal and/or server. The modules are merely illustrative, and different aspects of the systems and methods may use different modules.
Flowcharts are used in this application to describe the operations performed by systems according to embodiments of the present application. It should be understood that the preceding or following operations are not necessarily performed precisely in order. Rather, the various steps may be processed in reverse order or simultaneously, as desired. Other operations may also be added to or removed from these processes.
In conventional user-controlled robot inspection systems, operation is usually performed through buttons, a remote controller, a touch screen, or the like; such interaction is relatively cumbersome and neither intuitive nor natural. Users may need training before they can use the navigation system, which limits the ease of use and popularity of the robot. Voice-based interaction also has problems: semantic understanding in traditional navigation systems is generally based on simple keyword or rule matching and cannot accurately understand complex semantics. This leaves the robot with a limited understanding of the user's instructions, possibly leading to misunderstanding or erroneous navigation behavior. Accordingly, an optimized inspection robot navigation system is desired.
In the technical scheme of the application, a navigation system of a patrol robot is provided. Fig. 1 is a block diagram of a patrol robot navigation system according to an embodiment of the present application. Fig. 2 is a system architecture diagram of a navigation system of a patrol robot according to an embodiment of the present application. As shown in fig. 1 and 2, a patrol robot navigation system 300 according to an embodiment of the present application includes: a navigation voice signal acquisition module 310, configured to acquire a navigation voice signal of the inspection robot input by a user; the voice recognition module 320 is configured to perform voice recognition on the inspection robot navigation voice signal to obtain an inspection robot navigation voice text; the navigation voice semantic understanding module 330 is configured to perform semantic understanding on the navigation voice text of the inspection robot to obtain navigation voice semantic coding features of the inspection robot; the navigation instruction type detection module 340 is configured to determine a navigation instruction type tag based on the navigation voice semantic coding feature of the inspection robot.
In particular, the navigation voice signal collection module 310 is configured to obtain the inspection robot navigation voice signal input by the user. The inspection robot navigation voice signal is typically a spoken instruction issued by the user and is used to guide the robot to navigate and execute tasks in a specific environment.
Accordingly, in one possible implementation, the inspection robot navigation voice signal input by the user may be obtained, for example, by: preparing a suitable voice input device, such as a microphone or voice recognition device, so that the user can interact with the inspection robot by voice; activating the speech recognition system to ensure it is ready to receive and process the user's voice input, which may involve loading and configuring speech recognition models, setting recognition parameters, and so on; receiving the user's voice input through the voice input device, for example by listening on a microphone or retrieving the audio stream of the voice recognition device; transmitting the received voice input to the speech recognition system, which converts the voice signal into a text representation, that is, into the corresponding navigation instruction text; extracting the navigation instruction from the recognition result, which may involve text processing and keyword extraction techniques to identify keywords or phrases related to navigation; and analyzing and understanding the extracted navigation instruction, which may include semantic parsing and intent recognition to determine the user's navigation intent and specific requirements. A minimal capture sketch follows.
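A minimal capture sketch in Python, assuming the third-party SpeechRecognition package (with PyAudio for microphone access); the function name and timeout parameters are illustrative assumptions, not part of the published application:

```python
import speech_recognition as sr  # third-party: SpeechRecognition (assumed available)

def capture_navigation_command() -> sr.AudioData:
    """Listen on the default microphone and return one spoken utterance."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Calibrate against ambient noise so short commands are not clipped.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        print("Speak a navigation command...")
        # Wait up to 5 s for speech to start; cap a single command at 10 s.
        return recognizer.listen(source, timeout=5, phrase_time_limit=10)
```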
In particular, the voice recognition module 320 is configured to perform voice recognition on the inspection robot navigation voice signal to obtain an inspection robot navigation voice text. It should be understood that, since the inspection robot voice signal is represented as a continuous acoustic waveform in the time domain, it cannot be directly understood and analyzed by a computer system; it therefore needs to be converted into a corresponding text representation to facilitate subsequent semantic understanding, instruction parsing, and navigation operations. Specifically, in the technical scheme of the application, voice recognition is performed on the inspection robot navigation voice signal to obtain the inspection robot navigation voice text. That is, the goal of speech recognition is an equivalent conversion between speech and text, converting the information contained in the speech into a corresponding text representation.
Notably, speech recognition is a technique that converts an input speech signal into a textual representation so as to achieve speech-based human-machine interaction and understanding. It has important application value in many fields, such as voice assistants, voice control, and automatic subtitle generation.
Accordingly, in one possible implementation, voice recognition may be performed on the inspection robot navigation voice signal to obtain the inspection robot navigation voice text, for example, by: ensuring that a voice input device, such as a microphone or voice recognition device, is available in the system; initializing the speech recognition system with an appropriate speech recognition library or API in preparation for receiving and processing voice input; receiving the user's voice input through the voice input device, whether the user speaks directly or a recording of the user's voice is played back; transmitting the received voice input to the speech recognition system, which attempts to convert the speech to text; acquiring the recognition result, typically a string containing the recognized text; cleaning and processing the recognition result, for example removing extraneous noise or correcting likely errors; and extracting the text related to inspection robot navigation from the cleaned and processed result. A sketch of this recognition step follows.
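Continuing the capture sketch above, the recognition step might look as follows; the choice of SpeechRecognition's bundled Google Web Speech engine and the Chinese language code "zh-CN" are both assumptions:

```python
import re
import speech_recognition as sr

def recognize_navigation_text(recognizer: sr.Recognizer, audio: sr.AudioData) -> str:
    """Convert captured audio to cleaned navigation text, or '' on failure."""
    try:
        # One of several engines the library wraps; engine and language are assumptions.
        text = recognizer.recognize_google(audio, language="zh-CN")
    except (sr.UnknownValueError, sr.RequestError):
        return ""  # unintelligible speech or engine unreachable
    # Light cleanup before semantic processing: strip punctuation and extra spaces.
    return re.sub(r"[^\w\s]", "", text).strip()
```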
In particular, the navigation voice semantic understanding module 330 is configured to perform semantic understanding on the inspection robot navigation voice text to obtain the inspection robot navigation voice semantic coding feature. In particular, in one specific example of the present application, as shown in fig. 3, the navigation voice semantic understanding module 330 includes: the voice text expression perfecting unit 331 is used for making the navigation voice text of the inspection robot pass through a voice instruction expression perfecting device based on an AIGC model to obtain a compensated navigation voice text of the inspection robot; the voice text semantic coding unit 332 is configured to perform semantic coding on the compensated inspection robot navigation voice text to obtain an inspection robot navigation voice semantic coding feature vector as the inspection robot navigation voice semantic coding feature.
Specifically, the voice text expression perfecting unit 331 is configured to pass the inspection robot navigation voice text through a voice instruction expression perfector based on an AIGC model to obtain a compensated inspection robot navigation voice text. Because each user has different language habits and modes of expression, the voice text may contain ambiguities, repetitions, or unclear expressions, and errors may also arise during voice recognition. Therefore, in the technical scheme of the application, the inspection robot navigation voice text is further passed through the AIGC-model-based voice instruction expression perfector to obtain the compensated inspection robot navigation voice text. It should be appreciated that the AIGC model is an artificial-intelligence-based speech error correction technique that can analyze and understand the input voice text and make corrections and refinements based on context and semantic information. By inputting the inspection robot navigation voice text into the AIGC model, possible recognition errors, ambiguities, or inaccurate expressions in the text can be corrected to obtain a more accurate and clear voice instruction. In addition, the error correction and refinement performed by the AIGC model reduce misunderstanding or erroneous operation caused by recognition errors or misstatements. This helps improve the user experience, enabling the user to interact and navigate with the inspection robot more easily and naturally without repeatedly correcting or repeating instructions.
Accordingly, in one possible implementation, the inspection robot navigation voice text may be passed through the AIGC-model-based voice instruction expression perfector to obtain the compensated inspection robot navigation voice text, for example, by: acquiring or training an AIGC-based voice instruction expression perfector model; providing the inspection robot navigation voice text as input to the AIGC model; processing the input navigation voice text with the AIGC model, which analyzes the grammatical, semantic, and contextual information in the text and attempts to correct possible errors or inaccuracies; acquiring the compensated navigation voice text from the AIGC model, the text having been improved and corrected to raise the accuracy and clarity of the voice instruction; verifying and correcting the compensated navigation voice text, which may include manual review and proofreading to ensure that the compensated text matches the intended navigation instruction and semantics; and finally obtaining the compensated inspection robot navigation voice text, processed and verified by the AIGC model to improve the accuracy and expressiveness of the navigation instruction. A hedged sketch of this step follows.
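The application does not publish the AIGC model itself; as a hedged sketch under that caveat, a generic pretrained text-to-text model could stand in for the expression perfector. The model name below is a placeholder, not a real checkpoint:

```python
from transformers import pipeline  # Hugging Face transformers (assumed available)

# Placeholder checkpoint: substitute any instruction-following text2text model.
corrector = pipeline("text2text-generation", model="your-org/navigation-text-corrector")

def compensate_navigation_text(raw_text: str) -> str:
    """Correct recognition errors and ambiguous wording in a navigation command."""
    prompt = f"Rewrite this inspection-robot navigation command clearly: {raw_text}"
    outputs = corrector(prompt, max_new_tokens=64)
    return outputs[0]["generated_text"]
```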
Specifically, the voice text semantic coding unit 332 is configured to perform semantic coding on the compensated inspection robot navigation voice text to obtain an inspection robot navigation voice semantic coding feature vector as the inspection robot navigation voice semantic coding feature. After voice recognition and expression-perfecting processing have been performed on the navigation voice signal input by the user, and in order to perform semantic understanding of that signal, in one specific example of the application the compensated inspection robot navigation voice text is first subjected to word segmentation and then encoded by a context encoder comprising a word embedding layer, so that global context-based semantic association features of all the words in the compensated text are extracted, yielding the inspection robot navigation voice semantic coding feature vector. More specifically: word segmentation is performed on the compensated inspection robot navigation voice text to convert it into a word sequence composed of a plurality of words; the embedding layer of the context encoder maps each word in the word sequence to a word embedding vector, giving a sequence of word embedding vectors; the converter of the context encoder performs global context semantic coding, based on the converter (transformer) concept, on the sequence of word embedding vectors to obtain a plurality of global context semantic feature vectors; and the global context semantic feature vectors are cascaded to obtain the inspection robot navigation voice semantic coding feature vector. A minimal sketch of this encoder follows.
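A minimal PyTorch sketch of such an encoder, assuming token ids arrive from an upstream word segmenter; the vocabulary size, dimensions, and pooling-by-concatenation are illustrative assumptions:

```python
import torch
import torch.nn as nn

class NavigationTextEncoder(nn.Module):
    """Word embedding layer + transformer ("converter") context encoder."""

    def __init__(self, vocab_size: int = 10000, embed_dim: int = 128, num_layers: int = 2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)  # word embedding layer
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) indices of the segmented words
        embedded = self.embedding(token_ids)   # (batch, seq_len, embed_dim)
        contextual = self.encoder(embedded)    # global-context feature per word
        # Cascade (concatenate) the per-word features into one coding vector.
        return contextual.flatten(start_dim=1)  # (batch, seq_len * embed_dim)

# Example: feature = NavigationTextEncoder()(torch.tensor([[5, 42, 7, 300]]))
```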
Notably, the context encoder including the word embedding layer is a neural network model structure for converting text sequences into semantic representations. It is mainly composed of two components: the word embedding layer and the context encoder. The word embedding layer maps each word in the text sequence to a high-dimensional vector representation called a word embedding. These word embedding vectors capture the semantic information and context of words and may be obtained from pre-trained word vector models (e.g., Word2Vec, GloVe) or randomly initialized. The context encoder is a recurrent neural network (RNN) or variant model for modeling the context information in a text sequence. Context encoders incorporating word embedding layers are widely used in natural language processing tasks such as text classification, emotion analysis, and machine translation. They convert an original discrete text sequence into a continuous semantic representation, providing richer feature expression and context-understanding capability.
It should be noted that, in other specific examples of the present application, the compensation inspection robot navigation voice text may be semantically encoded in other manners to obtain an inspection robot navigation voice semantic encoding feature vector as the inspection robot navigation voice semantic encoding feature, for example: preprocessing the navigation voice text of the inspection robot, including noise removal, punctuation mark and special character processing, case-to-case conversion and the like; dividing the processed text into words or phrases, and performing lexical analysis. This may be achieved by word segmentation techniques (e.g., space segmentation, rule-based segmentation, statistical segmentation, etc.); and analyzing the text after word segmentation to capture the grammatical relation and the syntactic structure among the words. This may be accomplished using natural language processing tools or parsing algorithms; during analysis, entities in the text, such as places, names of people, dates, etc., are identified and marked. This may be accomplished using named entity recognition algorithms or pre-trained entity recognition models; semantic parsing is performed on the text subjected to lexical analysis, grammar analysis and entity recognition to extract semantic information and intention of the text. This can be accomplished using natural language processing techniques (e.g., semantic role labeling, semantic dependency analysis) or pre-trained semantic parsing models; and generating semantic coding feature vectors of navigation voice of the inspection robot according to the result of semantic analysis. This vector may be a dense vector of fixed dimensions, where each dimension represents a different semantic feature or semantic category.
It should be noted that, in other specific examples of the present application, the text of the navigation voice of the inspection robot may be semantically understood in other manners to obtain the semantic coding feature of the navigation voice of the inspection robot, for example: preprocessing the navigation voice text of the inspection robot, including removing punctuation marks, converting into lower case letters and the like. This helps unify text formats, reducing noise and interference; the text is decomposed into words or phrases and lexically analyzed. This may use Natural Language Processing (NLP) tools or techniques, such as word segmenters, to decompose text into meaningful words; the parsed text is parsed to understand the relationships between words and sentence structure. This can be implemented using a syntax analyzer or a dependency analyzer in NLP technology; named entities in the text, such as places, times, names of people, etc., are identified. This may be achieved using entity identifiers in NLP technology; the analyzed text is converted into a semantic representation to further understand the meaning of the text. This can be achieved using a semantic parser or semantic role annotators in NLP technology; from the semantic representation, the meaning and intent of the text is understood. This may involve the application of domain knowledge and matching text to predefined semantic templates or rules; and generating navigation voice semantic coding features of the inspection robot according to the semantic understanding result. This may be a vector or feature representation representing semantic information of the navigation speech.
In particular, the navigation instruction type detection module 340 is configured to determine a navigation instruction type tag based on the navigation voice semantic coding feature of the inspection robot. In particular, in one specific example of the present application, as shown in fig. 4, the navigation instruction type detection module 340 includes: the feature distribution optimizing unit 341 is configured to perform hilbert orthogonal spatial domain representation decoupling on the navigation voice semantic coding feature vector of the inspection robot to obtain an optimized navigation voice semantic coding feature vector of the inspection robot; the navigation instruction judging unit 342 is configured to pass the optimized inspection robot navigation voice semantic coding feature vector through a classifier to obtain a classification result, where the classification result is used to represent a navigation instruction type label.
Specifically, the feature distribution optimizing unit 341 is configured to perform Hilbert orthogonal spatial domain representation decoupling on the inspection robot navigation voice semantic coding feature vector to obtain an optimized inspection robot navigation voice semantic coding feature vector. In particular, in the technical scheme of the application, when the compensated inspection robot navigation voice text is obtained through the AIGC-model-based voice instruction expression perfector, the text segments generated by the AIGC model differ in their underlying source-text semantic distribution from the source voice text segments of the navigation voice text. When context-related semantic feature coding is then performed, the resulting inspection robot navigation voice semantic coding feature vector likewise spans both the generated segments and the source segments, so diversified local feature expressions exist among its local feature distributions. When this feature vector is passed through the classifier, its generalization as a whole within the classification domain, that is, the accuracy of the classification result, is affected. Based on this, when classifying the inspection robot navigation voice semantic coding feature vector, the applicant preferably performs Hilbert orthogonal spatial domain representation decoupling on it, denoting the vector as V, expressed by the following optimization formula:
(The optimization formula appears as an image in the published document and is not reproduced here.) In it, V is the inspection robot navigation voice semantic coding feature vector, V̄ is the global feature mean of that vector, ‖V‖₂ is its two-norm, L is its length, I is a unit vector, ⊖ denotes vector subtraction, Cov1D(·) denotes one-dimensional convolution processing, and V′ is the optimized inspection robot navigation voice semantic coding feature vector. Here, the Hilbert orthogonal spatial domain representation decoupling emphasizes the essential domain-specific information within the diversified feature representation of V by decoupling it, in the orthogonal spatial domain, from the domain-invariant part of the whole-domain representation of V, based on the vector's self-space metric and the Hilbert-space metric under its self-product representation. This improves the domain-adaptive generalization of V within the classification domain and thereby the accuracy of its classification result. In this way, the type of the robot navigation instruction can be determined from the navigation voice input by the user so as to control the inspection robot to execute the corresponding action, improving the performance and intelligence of the inspection robot navigation system and enhancing the user experience and practical applicability.
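Because the published formula survives only as an image, the sketch below composes the operations the text does define (global mean, two-norm, vector subtraction, one-dimensional convolution) in one plausible order. It is an assumption for illustration, not the patent's exact formula:

```python
import torch
import torch.nn.functional as F

def hilbert_orthogonal_decouple(v: torch.Tensor, kernel: torch.Tensor) -> torch.Tensor:
    """v: (L,) feature vector V; kernel: (1, 1, k) one-dimensional convolution weights.
    Plausible composition only -- the exact published formula is not reproduced here.
    Use an odd kernel size k so the output keeps the length L."""
    centered = v - v.mean()          # V minus global feature mean times unit vector I
    scaled = centered / v.norm(p=2)  # rescale by the two-norm ||V||_2
    out = F.conv1d(scaled.view(1, 1, -1), kernel, padding=kernel.shape[-1] // 2)
    return out.view(-1)              # optimized feature vector V'
```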
Specifically, the navigation instruction judging unit 342 is configured to pass the optimized inspection robot navigation voice semantic coding feature vector through a classifier to obtain a classification result, where the classification result is used to represent a navigation instruction type label. In the technical scheme of the application, the labels of the classifier are navigation instruction type labels, so that once the classification result is obtained, the type of the robot navigation instruction, such as moving forward, backward, left, or right, pausing, and the like, can be determined and the inspection robot controlled to execute the corresponding action, improving the performance and intelligence of the inspection robot navigation system. More specifically, a plurality of fully connected layers of the classifier perform full-connection coding on the optimized inspection robot navigation voice semantic coding feature vector to obtain a coding classification feature vector, and the coding classification feature vector is passed through a Softmax classification function of the classifier to obtain the classification result. A minimal sketch follows.
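A minimal sketch of such a classifier head; the 512-dimensional input and the label set are illustrative assumptions:

```python
import torch
import torch.nn as nn

INSTRUCTION_LABELS = ["forward", "backward", "left", "right", "pause"]  # assumed set

classifier = nn.Sequential(          # the "plurality of fully connected layers"
    nn.Linear(512, 128),
    nn.ReLU(),
    nn.Linear(128, len(INSTRUCTION_LABELS)),
)

def classify_instruction(feature_vec: torch.Tensor) -> str:
    logits = classifier(feature_vec)        # full-connection coding
    probs = torch.softmax(logits, dim=-1)   # Softmax classification function
    return INSTRUCTION_LABELS[int(probs.argmax())]
```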
A Classifier (Classifier) refers to a machine learning model or algorithm that is used to classify input data into different categories or labels. The classifier is part of supervised learning, which performs classification tasks by learning mappings from input data to output categories.
The fully connected layer (Fully Connected Layer) is one type of layer commonly found in neural networks. In the fully connected layer, each neuron is connected to all neurons of the upper layer, and each connection has a weight. This means that each neuron in the fully connected layer receives inputs from all neurons in the upper layer, and weights these inputs together, and then passes the result to the next layer.
The Softmax classification function is a commonly used activation function for multi-classification problems. It converts each element of the input vector into a probability value between 0 and 1, and the sum of these probability values equals 1. The Softmax function is commonly used at the output layer of a neural network, and is particularly suited for multi-classification problems, because it can map the network output into probability distributions for individual classes. During the training process, the output of the Softmax function may be used to calculate the loss function and update the network parameters through a back propagation algorithm. Notably, the output of the Softmax function does not change the relative magnitude relationship between elements, but rather normalizes them. Thus, the Softmax function does not change the characteristics of the input vector, but simply converts it into a probability distribution form.
It should be noted that, in other specific examples of the present application, the navigation instruction type label may also be determined in other ways based on the inspection robot navigation voice semantic coding features, for example: collecting voice sample data covering the different navigation instruction types the inspection robot may encounter, such as forward, backward, left turn, right turn, and stop; extracting semantic coding features from the collected samples using voice signal processing techniques, such as spectral analysis or mel-frequency cepstral coefficient (MFCC) extraction, to convert the speech signals into feature vectors usable for classification; labeling each sample with its navigation instruction type, for example using numeric or text labels such as "1" for forward and "2" for backward; dividing the labeled samples into a training set for training the navigation instruction classification model and a test set for evaluating it; training a classification model, such as a support vector machine (SVM), decision tree, or deep neural network (DNN), on the training samples and their labels, which includes feature vector input, label matching, and model parameter optimization; evaluating the trained model on the test set using indicators such as accuracy, recall, and F1 score to measure its performance on navigation instruction classification; and using the trained model to predict new inspection robot navigation voice semantic coding features and determine the corresponding navigation instruction type label by feeding the feature vector to the model and reading its output. A sketch of such a pipeline follows.
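A sketch of this alternative pipeline, assuming librosa for MFCC extraction and scikit-learn's SVM; the file paths and label codes are placeholders:

```python
import librosa
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def mfcc_feature(wav_path: str) -> np.ndarray:
    """Load one voice sample and average its MFCCs into a fixed-length vector."""
    y, sr = librosa.load(wav_path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)

# Placeholder data: wav_paths is a list of sample files; labels e.g. 1=forward, 2=backward.
# X = np.stack([mfcc_feature(p) for p in wav_paths])
# X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2, random_state=0)
# model = SVC(kernel="rbf").fit(X_tr, y_tr)
# print("accuracy:", accuracy_score(y_te, model.predict(X_te)))
```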
As described above, the inspection robot navigation system 300 according to the embodiment of the present application may be implemented in various wireless terminals, such as a server having an inspection robot navigation algorithm, and the like. In one possible implementation, the inspection robot navigation system 300 according to embodiments of the present application may be integrated into the wireless terminal as one software module and/or hardware module. For example, the inspection robot navigation system 300 may be a software module in the operating system of the wireless terminal, or may be an application developed for the wireless terminal; of course, the inspection robot navigation system 300 can also be one of a plurality of hardware modules of the wireless terminal.
Alternatively, in another example, the inspection robot navigation system 300 and the wireless terminal may be separate devices, and the inspection robot navigation system 300 may be connected to the wireless terminal through a wired and/or wireless network and transmit interactive information in an agreed data format.
Further, a navigation method of the inspection robot is also provided.
Fig. 5 is a flowchart of a navigation method of the inspection robot according to an embodiment of the present application. As shown in fig. 5, the navigation method of the inspection robot according to the embodiment of the application includes the steps of: s1, acquiring a navigation voice signal of the inspection robot input by a user; s2, carrying out voice recognition on the navigation voice signal of the inspection robot to obtain a navigation voice text of the inspection robot; s3, carrying out semantic understanding on the navigation voice text of the inspection robot to obtain navigation voice semantic coding features of the inspection robot; s4, determining a navigation instruction type tag based on the navigation voice semantic coding features of the inspection robot.
In summary, the navigation method of the inspection robot according to the embodiment of the application has been illustrated. By collecting the user's inspection robot navigation voice signal in real time and introducing a signal processing and analysis algorithm at the back end to analyze that signal, the type of the robot navigation instruction is determined and the inspection robot is controlled to execute the corresponding action, improving the performance and intelligence of the inspection robot navigation system and enhancing the user experience and practical applicability.
The foregoing description of the embodiments of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the improvement of technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (8)

1. A navigation system for a patrol robot, comprising:
the navigation voice signal acquisition module is used for acquiring navigation voice signals of the inspection robot input by a user;
the voice recognition module is used for carrying out voice recognition on the navigation voice signal of the inspection robot so as to obtain a navigation voice text of the inspection robot;
the navigation voice semantic understanding module is used for carrying out semantic understanding on the navigation voice text of the inspection robot so as to obtain navigation voice semantic coding characteristics of the inspection robot;
and the navigation instruction type detection module is used for determining a navigation instruction type label based on the navigation voice semantic coding characteristics of the inspection robot.
2. The inspection robot navigation system of claim 1, wherein the navigation voice semantic understanding module comprises:
the voice text expression perfecting unit is used for enabling the navigation voice text of the inspection robot to pass through a voice instruction expression perfecting device based on an AIGC model so as to obtain the navigation voice text of the compensation inspection robot;
the voice text semantic coding unit is used for carrying out semantic coding on the compensation inspection robot navigation voice text to obtain an inspection robot navigation voice semantic coding feature vector serving as the inspection robot navigation voice semantic coding feature.
3. The inspection robot navigation system of claim 2, wherein the phonetic text semantic coding unit is configured to: after word segmentation processing is performed on the compensated inspection robot navigation voice text, obtain the inspection robot navigation voice semantic coding feature vector through a context encoder comprising a word embedding layer.
4. The inspection robot navigation system of claim 3, wherein the phonetic text semantic coding unit comprises:
the word segmentation subunit is used for carrying out word segmentation processing on the navigation voice text of the compensation inspection robot so as to convert the navigation voice text of the compensation inspection robot into a word sequence consisting of a plurality of words;
a word embedding subunit, configured to map each word in the word sequence into a word embedding vector by using an embedding layer of the context encoder that includes the word embedding layer, so as to obtain a sequence of word embedding vectors;
a context coding subunit, configured to perform global context semantic coding on the sequence of word embedding vectors using the converter of the context encoder including the word embedding layer, where the global context semantic coding is based on a converter thought, so as to obtain a plurality of global context semantic feature vectors; and
and the cascading subunit is used for cascading the plurality of global context semantic feature vectors to obtain the navigation voice semantic coding feature vector of the inspection robot.
5. The inspection robot navigation system of claim 4, wherein the navigation instruction type detection module comprises:
the feature distribution optimizing unit is used for performing Hilbert orthogonal space domain representation decoupling on the navigation voice semantic coding feature vector of the inspection robot so as to obtain an optimized navigation voice semantic coding feature vector of the inspection robot;
the navigation instruction judging unit is used for enabling the navigation voice semantic coding feature vector of the optimized inspection robot to pass through the classifier to obtain a classification result, and the classification result is used for representing a navigation instruction type label.
6. The inspection robot navigation system of claim 5, wherein the feature distribution optimization unit is configured to: perform Hilbert orthogonal spatial domain representation decoupling on the inspection robot navigation voice semantic coding feature vector using the following optimization formula to obtain the optimized inspection robot navigation voice semantic coding feature vector;
wherein the optimization formula (rendered as an image in the published claim and not reproduced here) defines:
V as the inspection robot navigation voice semantic coding feature vector, V̄ as the global feature mean of the inspection robot navigation voice semantic coding feature vector, ‖V‖₂ as its two-norm, L as its length, I as a unit vector, ⊖ as vector subtraction, Cov1D(·) as one-dimensional convolution processing, and V′ as the optimized inspection robot navigation voice semantic coding feature vector.
7. The inspection robot navigation system according to claim 6, wherein the navigation instruction judging unit includes:
the full-connection coding subunit is used for carrying out full-connection coding on the navigation voice semantic coding feature vector of the optimized inspection robot by using a plurality of full-connection layers of the classifier so as to obtain a coding classification feature vector; and
and the classification result generation subunit is used for passing the coding classification feature vector through a Softmax classification function of the classifier to obtain the classification result.
8. The navigation method of the inspection robot is characterized by comprising the following steps of:
acquiring a navigation voice signal of the inspection robot input by a user;
performing voice recognition on the inspection robot navigation voice signal to obtain an inspection robot navigation voice text;
carrying out semantic understanding on the navigation voice text of the inspection robot to obtain navigation voice semantic coding features of the inspection robot;
and determining a navigation instruction type label based on the navigation voice semantic coding characteristics of the inspection robot.
Application CN202311331200.0A, priority date 2023-10-16, filing date 2023-10-16: Navigation system and method for inspection robot. Status: Pending. Publication: CN117437916A (en).

Priority Applications (1)

Application Number: CN202311331200.0A | Priority Date: 2023-10-16 | Filing Date: 2023-10-16 | Title: Navigation system and method for inspection robot (CN117437916A, en)

Applications Claiming Priority (1)

Application Number: CN202311331200.0A | Priority Date: 2023-10-16 | Filing Date: 2023-10-16 | Title: Navigation system and method for inspection robot (CN117437916A, en)

Publications (1)

Publication Number: CN117437916A | Publication Date: 2024-01-23

Family

ID=89557531

Family Applications (1)

Application Number: CN202311331200.0A | Status: Pending | Publication: CN117437916A (en) | Title: Navigation system and method for inspection robot

Country Status (1)

Country: CN | Publication: CN117437916A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication Number: CN118092207A * | Priority Date: 2024-04-19 | Publication Date: 2024-05-28 | Assignee: 粒子智慧(杭州)电梯有限公司 | Title: Whole-house linkage control system and method based on Internet of things


Similar Documents

Publication Publication Date Title
US20210127003A1 (en) Interactive voice-control method and apparatus, device and medium
CN107562816B (en) Method and device for automatically identifying user intention
CN111353029B (en) Semantic matching-based multi-turn spoken language understanding method
CN109920415A (en) Nan-machine interrogation's method, apparatus, equipment and storage medium based on speech recognition
JP2005084681A (en) Method and system for semantic language modeling and reliability measurement
CN113223509B (en) Fuzzy statement identification method and system applied to multi-person mixed scene
CN113705238B (en) Method and system for analyzing aspect level emotion based on BERT and aspect feature positioning model
CN117437916A (en) Navigation system and method for inspection robot
CN113742733A (en) Reading understanding vulnerability event trigger word extraction and vulnerability type identification method and device
CN112101044A (en) Intention identification method and device and electronic equipment
CN114153942B (en) Event time sequence relation extraction method based on dynamic attention mechanism
CN112599129B (en) Speech recognition method, apparatus, device and storage medium
CN116978408B (en) Depression detection method and system based on voice pre-training model
KR101295642B1 (en) Apparatus and method for classifying sentence pattern for sentence of speech recognition result
CN117809655A (en) Audio processing method, device, equipment and storage medium
CN102141812A (en) Robot
Iori et al. The direction of technical change in AI and the trajectory effects of government funding
Condron et al. Non-Verbal Vocalisation and Laughter Detection Using Sequence-to-Sequence Models and Multi-Label Training.
CN112434133B (en) Intention classification method and device, intelligent terminal and storage medium
CN113593523B (en) Speech detection method and device based on artificial intelligence and electronic equipment
CN116010563A (en) Multi-round dialogue data analysis method, electronic equipment and storage medium
CN114462418A (en) Event detection method, system, intelligent terminal and computer readable storage medium
CN114239555A (en) Training method of keyword extraction model and related device
CN113887239A (en) Statement analysis method and device based on artificial intelligence, terminal equipment and medium
US6816831B1 (en) Language learning apparatus and method therefor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination