CN112132095A - Dangerous state identification method and device, electronic equipment and storage medium

Dangerous state identification method and device, electronic equipment and storage medium

Info

Publication number
CN112132095A
Authority
CN
China
Prior art keywords
target user
information
lip
preset
dangerous state
Prior art date
Legal status
Granted
Application number
CN202011062013.3A
Other languages
Chinese (zh)
Other versions
CN112132095B (en)
Inventor
谭皓 (Tan Hao)
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority claimed from CN202011062013.3A
Publication of CN112132095A
Application granted
Publication of CN112132095B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/242Dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services
    • G06Q50/265Personal security, identity or safety
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition

Abstract

The application discloses a dangerous state identification method and device, an electronic device, and a storage medium, and relates to the technical field of safety. The method comprises: obtaining a current face image of a target user; identifying lip features of the target user based on the face image; performing lip language recognition based on the lip features to obtain a recognition result; matching the recognition result with preset lip language information to obtain a matching result, wherein the preset lip language information is set in advance according to lip language information representing a dangerous state; and determining whether the target user is in a dangerous state based on the matching result. In this manner, whether a user is in a dangerous state can be determined from the user's lip movements.

Description

Dangerous state identification method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of security technologies, and in particular, to a method and an apparatus for identifying a dangerous state, an electronic device, and a storage medium.
Background
With the rapid improvement of living standards and technology, people place ever higher demands on personal safety, and more and more safety protection technologies have entered daily life. In the related art, a user can actively report a dangerous state in order to obtain help, but the convenience and safety of this approach still need to be improved.
Disclosure of Invention
In view of the above, the present application provides a method, an apparatus, an electronic device and a storage medium for identifying a dangerous state, so as to improve the above problem.
In a first aspect, an embodiment of the present application provides a method for identifying a dangerous state, where the method includes: acquiring a current face image of a target user; identifying lip features of the target user based on the face image; performing lip language recognition based on the lip features to obtain a recognition result; matching the identification result with preset lip language information to obtain a matching result, wherein the preset lip language information is preset according to the lip language information representing the dangerous state; and determining whether the target user is in a dangerous state or not based on the matching result.
In a second aspect, an embodiment of the present application provides an apparatus for identifying a dangerous state, where the apparatus includes: the device comprises an image acquisition module, a lip recognition module, a lip matching module and a state determination module. The image acquisition module is used for acquiring a current face image of a target user; the lip recognition module is used for recognizing lip characteristics of the target user based on the face image; the lip language identification module is used for carrying out lip language identification based on the lip characteristics to obtain an identification result; the lip matching module is used for matching the identification result with preset lip information to obtain a matching result, wherein the preset lip information is preset according to the lip information representing the dangerous state; and the state determining module is used for determining whether the target user is in a dangerous state or not based on the matching result.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; a memory; one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the method for identifying a hazardous condition provided by the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where a program code is stored in the computer-readable storage medium, and the program code may be invoked by a processor to execute the method for identifying a dangerous state provided in the first aspect.
Compared with the prior art, in the scheme provided by the application, the current face image of the target user is acquired, lip features of the target user are identified based on the face image, lip language recognition is performed on the lip features to obtain a recognition result, the recognition result is matched with preset lip language information to obtain a matching result, wherein the preset lip language information is set in advance according to lip language information representing a dangerous state, and whether the target user is in a dangerous state is determined based on the matching result. In this way, lip language recognition of the target user makes it possible to judge from the recognition result whether the user is in a dangerous state, so that the user can trigger dangerous-state identification merely through lip movement. This makes it convenient for the user to report the dangerous state he or she is in, and because lip movement is covert, security is also improved.
These and other aspects of the present application will be more readily apparent from the following description of the embodiments.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The following drawings show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart illustrating a method for identifying a dangerous state according to an embodiment of the present application.
Fig. 2 is a flowchart illustrating a method for identifying a dangerous state according to another embodiment of the present application.
Fig. 3 is a flow chart illustrating sub-steps of step S240 shown in fig. 2 in one embodiment.
Fig. 4 is a flowchart illustrating a method for identifying a dangerous state according to another embodiment of the present application.
Fig. 5 is a flow chart illustrating sub-steps of step S460 shown in fig. 4 in one embodiment.
Fig. 6 is a block diagram of a dangerous state identification apparatus according to an embodiment of the present application.
Fig. 7 is a block diagram of an electronic device for executing a method for identifying a dangerous state according to an embodiment of the present application.
Fig. 8 is a storage unit for storing or carrying program code implementing a dangerous state identification method according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
In practical application, network communication develops day by day and people's dependence on and use of the network keeps growing. Services such as food delivery, ride-hailing (hitch rides) and making friends online have emerged, so encounters between strangers are increasingly common, and such encounters may expose people to danger. In daily news, one often hears of people being hurt while taking a hitch ride, receiving a delivery or meeting an online friend; when in danger, people may choose to call the police or send help messages to family or friends through social software.
Through long-term research, the inventor found that in the above ways of handling a dangerous situation, reporting the dangerous state requires the user to make a voice input or a manual operation, which is very inconvenient. Moreover, danger often comes premeditated; when criminals are sufficiently prepared, openly calling the police or asking for help may irritate them and put the victim in an even more dangerous state.
In order to solve the above problems, the inventor provides a method and an apparatus for identifying a dangerous state, and an electronic device, which can determine whether a user is in a dangerous state according to an identification result by performing lip language identification on a target user, so as to determine whether to send alarm information according to the state of the user. This is described in detail below.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating a method for identifying a dangerous state according to an embodiment of the present disclosure. The method for identifying a dangerous state provided by the embodiment of the present application will be described in detail below with reference to fig. 1. The method for identifying the dangerous state can comprise the following steps:
step S110: and acquiring the current face image of the target user.
In the embodiment of the application, whether the target user is in a dangerous state or not can be identified by identifying the lip action of the target user. When the electronic equipment identifies the lip movement of the target user, the current face image of the target user can be acquired, so that the lip movement of the target user can be identified through the face image of the target user. The target user can refer to any user and can also refer to a user of the electronic equipment; the current face image of the target user refers to a face image obtained by acquiring the face image of the target user at the current moment.
In some embodiments, the electronic device may acquire a current face image of the target user at preset time intervals and perform the subsequent steps to determine whether the target user is in a dangerous state. The preset interval may be configured in advance, for example according to a user operation, and its specific length is not limited; it may be, for example, 5 minutes or 10 minutes.
In other embodiments, the electronic device may also perform the acquiring of the current facial image of the target user when detecting an instruction for triggering the identification of the dangerous state. It is understood that the user may instruct the electronic device to identify whether it is in a dangerous state by actively inputting a corresponding instruction to the electronic device, i.e., to execute the process of the dangerous state identification method.
In this embodiment, the instruction for triggering identification of the dangerous state may include one or more of: a preset gesture input by the target user and detected by the electronic device, an operation on a designated key detected by the electronic device, a key combination operation, and a designated expression of the target user. The preset gesture, designated key, key combination operation and designated expression can all be set in advance, and their specific content is not limited.
In some embodiments, the electronic device may include an image capture device, such as a color camera, a depth camera, and the like. When the electronic equipment needs to identify whether a target user is in a dangerous state, the face image of the target user can be acquired through the image acquisition device. As an embodiment, the electronic device may acquire current multi-frame face images of the target user, for example, continuously acquire multi-frame face images, that is, acquire a video including a face of the target user; as another embodiment, the electronic device may also acquire a current frame of facial image of the target user.
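By way of illustration only, the following Python sketch (assuming OpenCV is available; the function name and frame count are not part of this disclosure) shows how such a device might capture a short burst of face frames:

```python
import cv2

def capture_face_frames(num_frames=30, camera_index=0):
    """Capture a short burst of frames from the default camera.

    A hypothetical helper; the embodiment does not prescribe OpenCV or
    any particular frame count.
    """
    cap = cv2.VideoCapture(camera_index)
    frames = []
    try:
        while len(frames) < num_frames:
            ok, frame = cap.read()
            if not ok:
                break
            frames.append(frame)
    finally:
        cap.release()
    return frames  # list of BGR images, e.g. a ~1 s clip at 30 f/s
```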
Step S120: and identifying lip features of the target user based on the face image.
In the embodiment of the application, the electronic device can identify the lip features of the target user based on the acquired face image, so as to recognize the lip motions of the target user, recognize the target user's lip language, and subsequently judge whether the target user is in a dangerous state.
In some embodiments, a lip image of the lip region may be extracted from the face image, and lip features may then be extracted from the lip image to obtain the lip features of the target user. The lip region may be segmented from the face image by threshold segmentation to obtain the lip image; alternatively, the face image may be input into a pre-trained region extraction model for the lip region to obtain the lip image output by the model, and the specific way of obtaining the lip image is not limited. When extracting lip features from the lip image, the lip image may be input into a pre-trained feature extraction model to obtain the lip features output by that model; lip features may also be extracted from the lip image by lip feature point extraction methods such as singular value decomposition, the discrete cosine transform or the discrete wavelet transform. The region extraction model may be a deep neural network, which is not limited here.
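By way of illustration only, the following sketch shows one possible way of cropping the lip region, assuming the dlib face detector and the common 68-point shape predictor; the embodiment itself does not prescribe dlib, and the margin and output size below are arbitrary:

```python
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
# The path to the standard 68-point model is an assumption of this sketch.
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def extract_lip_image(face_image, margin=10, size=(64, 64)):
    """Crop the lip region using landmarks 48-67 of the 68-point model."""
    gray = cv2.cvtColor(face_image, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    if not faces:
        return None
    shape = predictor(gray, faces[0])
    pts = np.array([(shape.part(i).x, shape.part(i).y) for i in range(48, 68)],
                   dtype=np.int32)
    x, y, w, h = cv2.boundingRect(pts)
    crop = face_image[max(y - margin, 0):y + h + margin,
                      max(x - margin, 0):x + w + margin]
    return cv2.resize(crop, size)  # fixed-size lip image for feature extraction
```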
In this embodiment, when the face images of the target user acquired by the electronic device are consecutive multi-frame face images, i.e. a video containing the target user's face, the lip image in each frame can be extracted, adjusted to the same size, and the extracted lip image samples spliced and stored in time order to generate a lip feature data set. Videos containing the target user's face need to be post-processed so that their frame rates are equal, for example all 30 f/s.
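As a minimal sketch of such splicing, reusing the extract_lip_image helper from the previous sketch and assuming an arbitrary fixed sequence length:

```python
import numpy as np

def build_lip_sequence(frames, target_len=75, size=(64, 64)):
    """Stack per-frame lip crops into a (T, H, W, C) array, padding or
    truncating to a fixed length so that clips recorded at the same frame
    rate line up. target_len=75 is an illustrative choice."""
    crops = [extract_lip_image(f, size=size) for f in frames]
    crops = [c for c in crops if c is not None]
    if not crops:
        return None
    seq = np.stack(crops).astype(np.float32) / 255.0
    if len(seq) >= target_len:
        return seq[:target_len]
    pad = np.zeros((target_len - len(seq),) + seq.shape[1:], dtype=seq.dtype)
    return np.concatenate([seq, pad], axis=0)
```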
In the above embodiment, when extracting the lip features, the lip image may be obtained first, and then the corresponding lip features may be obtained from the lip image, based on which, interference of other irrelevant face information may be eliminated in the process of performing lip language recognition, and the recognition accuracy may be improved.
Step S130: and performing lip language recognition based on the lip characteristics to obtain a recognition result.
In the embodiment of the application, the electronic device can identify the identified lip feature so as to acquire the lip information corresponding to the lip feature, wherein the lip information is a final identification result, and then whether the target user is in a dangerous state can be judged based on the identification result.
In some embodiments, lip language identification may be understood as obtaining the content expressed by the target user in lip language by analyzing the lip motion characteristics of the target user, and the analysis process may include two steps, i.e., pinyin sequence identification and chinese character sequence identification, where the pinyin sequence identification is to map consecutive lip feature images into pinyin sentences and the chinese character sequence identification is to translate the pinyin sequences into corresponding chinese characters sentences. As an implementation manner, the pinyin sequence recognition may be performed by a convolutional neural network model, a pinyin sequence recognition network framework, and the like, and further, the chinese character sequence recognition may be performed based on an Encoder-Decoder model and a chinese character sequence recognition network, so as to obtain the lip language information corresponding to the lip feature.
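The following schematic sketch only illustrates the two-stage flow described above; PinyinNet and PinyinToHanzi are hypothetical placeholders for the pinyin recognition network and the Encoder-Decoder Chinese character translator, which the embodiment does not name concretely:

```python
def recognise_lip_language(lip_sequence, pinyin_net, pinyin_to_hanzi):
    """Two-stage lip language recognition: lip frames -> pinyin -> Chinese text.

    `pinyin_net` and `pinyin_to_hanzi` are hypothetical pretrained models;
    any concrete architecture is outside the scope of this sketch.
    """
    pinyin_tokens = pinyin_net.predict(lip_sequence)        # e.g. ["jiu", "ming"]
    hanzi_sentence = pinyin_to_hanzi.translate(pinyin_tokens)  # e.g. "save me"
    return hanzi_sentence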
Step S140: and matching the identification result with preset lip language information to obtain a matching result, wherein the preset lip language information is preset according to the lip language information representing the dangerous state.
In the embodiment of the application, the preset lip language information is preset according to the lip language information representing the dangerous state, so that after the electronic device obtains the recognition result of the lip language of the target user, the recognition result can be matched with the preset lip language information to determine whether the target user is in the dangerous state.
In some embodiments, the preset lip language information may be lip language information recognized by the electronic device according to a lip image input by the user in advance. Exemplarily, the electronic device may display a setting interface of the lip language information, prompt the user to perform lip language input, acquire a face image of the user after detecting that the user starts the operation of the lip language input, and finally perform lip language recognition according to the acquired face image of the user, thereby obtaining the lip language information set by the user, that is, as the preset lip language information.
Step S150: and determining whether the target user is in a dangerous state or not based on the matching result.
In the embodiment of the application, after the electronic device obtains the matching result between the recognition result and the preset lip language information, whether the target user is in a dangerous state or not can be determined according to the matching result. When the recognition result is matched with the preset lip language information, the target user can be determined to be in a dangerous state; when the recognition result does not match the preset lip language information, it can be determined that the target user is not in a dangerous state.
In this embodiment, after the lip information of the target user is obtained by the lip recognition method, the content of the lip information may be matched with the content of preset lip information, where the content of the preset lip information may be the lip information characterized as being in a dangerous state, and whether the content of the lip information is matched with the content of the preset lip information is determined, so as to determine whether the target user is in a dangerous state based on the matching result.
Referring to fig. 2, fig. 2 is a schematic flowchart illustrating a method for identifying a dangerous state according to another embodiment of the present application. As will be explained in detail with respect to the flow shown in fig. 2, the identification of the dangerous state may specifically include the following steps:
step S210: and acquiring the current face image of the target user.
Step S220: and identifying lip features of the target user based on the face image.
Step S230: and performing lip language recognition based on the lip characteristics to obtain a recognition result.
In the embodiment of the present application, steps S210 to S230 may refer to the contents of steps S110 to S130 in the foregoing embodiment, and are not described herein again.
Step S240: and matching the identification result with preset lip language information to obtain a matching result, wherein the preset lip language information is preset according to the lip language information representing the dangerous state.
In some embodiments, referring to fig. 3, step S240 may include:
step S241: and acquiring the similarity between the recognition result and the preset lip language information.
In the embodiment of the application, the similarity between the recognition result and the preset lip language information can be understood as the similarity between the text information corresponding to the lip characteristics of the target user and the text information representing the dangerous state.
In one possible example, a dictionary library for representing a dangerous state is established, the dictionary library comprises all words in a training corpus, such as "save me", "alarm", "rescue" or "help me", and the words in the dictionary library all can indicate that a user is in a dangerous state, that is, a recognition result for a lip feature of a target user is a word in the dictionary library, and it can be determined that the recognition result for the lip feature of the user matches preset lip language information. In some cases, the text information that the target user may express is not completely consistent with the words in the dictionary library, and then the similarity between the text information that the target user expresses in lip language and the words in the dictionary library may be determined through some algorithms, and whether the recognition result for the lip feature of the user matches the preset lip language information may be determined through the similarity.
Step S242: and if the similarity is greater than a preset similarity threshold, determining that the recognition result is matched with the preset lip language information.
The preset similarity threshold may be a fixed value set in advance, for example 0.4, and whether the recognition result for the target user matches the preset lip language information is determined by comparing the obtained similarity with this threshold. In practical application, the scheme is mainly used when a user is in a dangerous state. If the preset similarity threshold is set too high, then when the user is tense the words expressed in lip language may be intermittent, such as "quickly … rescue …", so the similarity between this text and the preset lip language information may be low, for example 0.5; in that case the calculated similarity may be smaller than the threshold, the recognition result would be judged not to match the preset lip language information, and whether the target user is in a dangerous state could be misjudged. Therefore, in the embodiment of the present application, the preset similarity threshold is generally set relatively low, for example 0.3 or 0.4, so that even if the text expressed by the target user is incomplete or discontinuous due to factors such as tension, the matching result between the recognition result and the preset lip language information will not be misjudged.
For example, if the recognition result for the lip feature of the target user is "fast rescue me, fast rescue me", the similarity between the text information "fast rescue me, fast rescue me" and the words in the dictionary library representing the dangerous state may be calculated, for example, 0.7, and the preset similarity threshold may be 0.4, and correspondingly, it may be determined that the similarity is greater than the preset similarity threshold, and further, it may be determined that the recognition result for the target user matches the preset lip language information.
Step S243: and if the similarity is smaller than or equal to a preset similarity threshold, determining that the recognition result is not matched with the lip language information.
For example, if the recognition result of the lip feature of the target user is "wrong", the similarity between the word information "wrong" and the words in the dictionary library representing the dangerous state may be calculated, for example, 0.1, and the preset similarity threshold may be 0.4, and correspondingly, it may be determined that the similarity is smaller than the preset similarity threshold, and further, it may be determined that the recognition result of the target user does not match the preset lip language information.
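As a minimal illustration of steps S241 to S243, the sketch below uses difflib as the similarity measure; this is only one possible measure (presumably not the one that produced the example values above), and the dictionary entries simply render the examples from the description in English:

```python
import difflib

# Words characterising a dangerous state, following the examples above.
DANGER_DICTIONARY = ["save me", "alarm", "rescue", "help me"]

def danger_similarity(recognised_text):
    """Return the highest similarity between the recognised lip-language text
    and any dictionary entry; the embodiment does not fix the algorithm."""
    return max(
        difflib.SequenceMatcher(None, recognised_text, word).ratio()
        for word in DANGER_DICTIONARY
    )

def matches_preset_lip_info(recognised_text, threshold=0.4):
    """Step S242/S243: match only if similarity exceeds the preset threshold."""
    return danger_similarity(recognised_text) > threshold
```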
In the embodiment of the application, whether the recognition result for the target user matches the preset lip language information can also be judged by performing semantic analysis and emotion classification on the text expressed by the target user. Correspondingly, based on a large text corpus, each word can be mapped by a neural network model into a vector of a certain dimension, typically between tens and hundreds of dimensions; each vector represents a word, and the semantic and syntactic similarity of words is judged from the similarity between their vectors. The text expressed by the target user in lip language is segmented into words; nouns, adjectives, adverbs, conjunctions and the like are extracted; the segmented words are compared in turn with the word vectors in the corpus for similarity; and the emotion of the text is then analysed to judge whether it corresponds to a negative emotion, where negative emotions may include tension, anxiety, distress, fear and the like.
On this basis, emotion classification of the text expressed by the target user can be performed by word segmentation and stop-word removal, followed by string matching against a constructed emotion dictionary to mine positive and negative information. The emotion dictionary may include several parts, such as a positive word dictionary, a negative word dictionary, a negation word dictionary and a degree adverb dictionary, and each dictionary generally contains two parts, the words and their weights. Text matching with the emotion dictionary means traversing the words of the segmented sentence: if a word hits the corresponding dictionary, its weight is applied accordingly, with positive words adding their weight, negative words subtracting it, a negation word reversing the sign, and a degree adverb multiplying the weight of the word it modifies; the emotion classification of the text is then judged from the final output value. It can be understood that when the target user is in a dangerous state, the target user's emotion is generally tension, fear, unease and the like, so the corresponding weight is negative. When the semantics of the text expressed by the target user indicate a dangerous state, the emotion classification of the text is also checked; if the weight calculated for the text is negative, the text can be judged to express a negative emotion. The recognition result for the target user's lip features is thus analysed by combining semantic analysis and emotion classification, and when the semantics express that the target user is in danger and the corresponding emotion classification is negative, it can further be judged that the target user is in a dangerous state.
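A minimal sketch of such dictionary-based emotion scoring is given below; the tiny dictionaries and weights are illustrative stand-ins for a real emotion lexicon, and word segmentation (e.g. with a segmenter such as jieba for Chinese text) is assumed to have been done already:

```python
# Illustrative dictionaries; real emotion lexicons are far larger.
POSITIVE = {"happy": 1.0, "safe": 1.0}
NEGATIVE = {"afraid": -1.0, "rescue": -0.8, "tense": -0.8}
NEGATION = {"not", "no"}
DEGREE = {"very": 1.5, "extremely": 2.0}

def sentiment_score(tokens):
    """Dictionary-based scoring: positive words add, negative words subtract,
    a preceding negation word flips the sign, and a degree adverb scales the
    next sentiment word. A negative total is read as negative emotion."""
    score, scale, sign = 0.0, 1.0, 1.0
    for tok in tokens:
        if tok in NEGATION:
            sign = -sign
        elif tok in DEGREE:
            scale *= DEGREE[tok]
        elif tok in POSITIVE or tok in NEGATIVE:
            weight = POSITIVE.get(tok, 0.0) + NEGATIVE.get(tok, 0.0)
            score += sign * scale * weight
            scale, sign = 1.0, 1.0  # modifiers apply only to the next hit
    return score
```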
Step S250: and when the matching result represents that the recognition result is matched with the preset lip language information, obtaining expression information corresponding to the face image.
In this embodiment, when the recognition result for the target user matches the preset lip language information, expression information corresponding to the face image of the target user may also be obtained.
It can be understood that when the recognition result for the target user matches the preset lip language information, the probability that the user is in a dangerous state can be considered relatively high. In some circumstances, however, such as a child mistakenly triggering dangerous-state identification while using the phone, or someone triggering it intentionally as a prank, a judgment based only on the result of lip language recognition could be wrong. Therefore, whether the target user is in a dangerous state can be further judged through facial expression recognition, so as to improve the accuracy of the judgment.
Step S260: and judging whether the expression information meets a preset expression condition, wherein the preset expression condition is preset according to the expression information representing the dangerous state.
Based on the above, the obtained expression information can be judged, and whether the expression information meets the preset expression condition representing the dangerous state or not is judged.
In some embodiments, the facial contour and the feature points and emotional states of the five sense organs can be manually marked on the images in the training set, the convolutional neural network is used, the position coordinates of the feature points of the images in the training set are used as input, the emotional states corresponding to the images in the training set are used as output, and the convolutional neural network is trained to obtain the expression recognition model for expression recognition. When the expression recognition model is used for expression recognition, the positions and shapes (such as facial contour and five sense organs) of the face and main components thereof can be determined based on the face image of the target user acquired through the camera, the feature points of the face are extracted, the relations (such as distance, angle and the like) among the feature points are input into the expression recognition model as feature vectors, the emotion state output by the expression recognition model is acquired and used as recognized expression information, and therefore expression recognition is achieved.
Specifically, the relationships between the feature points are input into the expression recognition model as a feature vector to obtain the emotional state output by the model. It can be understood that 68 feature points are marked on the target user's face and their coordinates are calculated by the predictor. The degree to which the target user raises or furrows the eyebrows can be analysed through the 10 feature points on the two eyebrows, where raised eyebrows may correspond to happiness or surprise, and furrowed eyebrows to confusion, worry or fear. The degree to which the eyes are open can likewise be obtained by analysing the feature points around the eyes. Further, the degree to which the mouth is open can be analysed through the feature points around the mouth, where an open mouth may correspond to happiness, surprise or fear, and a closed mouth to a normal state or anger. In other words, the 68 facial feature points of the target user are analysed to realise expression recognition of the target user.
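By way of illustration, the sketch below derives a few such geometric cues from a dlib 68-point shape object (as in the earlier lip-region sketch); the point indices follow the common 68-point convention, while the thresholds and the mapping to a fear-like expression are assumptions of this sketch rather than values from the embodiment:

```python
import numpy as np

def landmark_features(shape):
    """Compute simple geometric cues (eyebrow raise, eye and mouth openness)
    from a dlib 68-point shape object, normalised by face height."""
    pts = np.array([(shape.part(i).x, shape.part(i).y) for i in range(68)],
                   dtype=np.float32)
    face_h = np.linalg.norm(pts[8] - pts[27]) + 1e-6      # chin to nose bridge
    brow_raise = np.mean(pts[36:42, 1]) - np.mean(pts[17:22, 1])  # eye y - brow y
    eye_open = (np.linalg.norm(pts[37] - pts[41]) +
                np.linalg.norm(pts[38] - pts[40]))
    mouth_open = np.linalg.norm(pts[62] - pts[66])
    return {
        "brow_raise": brow_raise / face_h,
        "eye_open": eye_open / face_h,
        "mouth_open": mouth_open / face_h,
    }

def looks_frightened(features):
    # Wide-open eyes and mouth plus raised brows are treated as a fear cue;
    # the thresholds below are illustrative only.
    return (features["eye_open"] > 0.12 and
            features["mouth_open"] > 0.08 and
            features["brow_raise"] > 0.18)
```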
Step S270: and if the expression information meets the preset expression condition, determining that the target user is in a dangerous state.
It can be understood that on the premise that the recognition result for the target user is matched with the preset lip language information, if the expression information of the target user is judged to be negative emotions such as fear, fear or unease through the above mode, that is, the preset expression condition is met, it can be determined that the target user is in a dangerous state.
Based on this, after it is determined that the target user is in a dangerous state, alarm information may be transmitted through the steps shown in fig. 4.
Step S280: and if the expression information does not meet the preset expression condition, determining that the target user is not in a dangerous state.
Optionally, on the premise that the recognition result for the target user matches the preset lip language information, if the target user's expression information is judged to be a positive emotion such as happiness, the preset expression condition is not satisfied; this may be a prank or the user deliberately testing the function. When it is determined that the target user is not in a dangerous state, the electronic device correspondingly does not obtain the target user's real-time location information, nor generate alarm information from it.
Step S290: and when the matching result represents that the identification result is not matched with the preset lip language information, determining that the target user is not in a dangerous state.
It can be understood that when the lip language information obtained through lip language recognition does not match the preset lip language information, the target user is not in a dangerous state, and correspondingly the electronic device does not obtain the target user's real-time location information or generate alarm information from it.
In this embodiment, the lip features of the target user are recognised through lip language recognition, the recognition result is analysed, and whether the target user is in a dangerous state is determined by combining this analysis with the judgment of the target user's expression information, making it convenient for the user to report a dangerous state.
Referring to fig. 4, fig. 4 is a flowchart illustrating a method for identifying a dangerous state according to another embodiment of the present application. As will be explained in detail with respect to the flow shown in fig. 4, the identification of the dangerous state may specifically include the following steps:
step S410: and acquiring the current face image of the target user.
Step S420: and identifying lip features of the target user based on the face image.
Step S430: and performing lip language recognition based on the lip characteristics to obtain a recognition result.
Step S440: and matching the identification result with preset lip language information to obtain a matching result, wherein the preset lip language information is preset according to the lip language information representing the dangerous state.
Step S450: and determining whether the target user is in a dangerous state or not based on the matching result.
In the embodiment of the present application, steps S410 to S450 may refer to the contents of steps S110 to S150 in the foregoing embodiment, which are not described herein again.
Step S460: and when the target user is determined to be in a dangerous state, generating alarm information, wherein the alarm information is used for indicating that the target user is in the dangerous state.
In some embodiments, the alert information may be generated by the steps described in fig. 5, i.e., step S460 may include the steps shown in fig. 5.
Step S461: and acquiring the position information of the position of the target user, wherein the position information comprises real-time positioning information and/or background environment information.
In practical application, when it is determined that the target user is in a dangerous state, real-time positioning information of the target user's location can be obtained through the positioning system carried by the target user's mobile phone. The positioning system may be a satellite positioning system such as GPS (Global Positioning System), BDS (BeiDou Navigation Satellite System) or GLONASS (Global Navigation Satellite System), or a network positioning system. Network positioning can be done in two ways. One is small-range positioning through Wi-Fi (wireless fidelity), based on the position of the Wi-Fi router; its accuracy can be high but it is unreliable, because the position of every router on earth cannot be recorded, which occasionally leads to positioning to other places or even other provinces, and in addition the target user may not be connected to any Wi-Fi router while in a dangerous state, so Wi-Fi positioning is seldom relied on in practice. The other is base-station positioning, which is reliable but has large errors, since it depends on the density of base stations: in developed urban areas the accuracy can currently reach tens of metres to within a hundred metres, whereas in remote areas where base stations are far apart the error can sometimes exceed several kilometres. The advantage of network positioning is speed: as long as the target user's mobile phone is connected to the network, positioning is almost instantaneous, so the real-time position of a target user in a dangerous state can be located in the shortest time. The advantage of satellite positioning is accuracy: it is not limited by the network and works even in deserts or at sea, so it can precisely locate the real-time position of a target user in a dangerous state; however, because it depends on satellites in outer space, its first fix is slow, and a cold start on current mobile phones can take more than 10 or even 20 seconds. On this basis, network positioning and satellite positioning can be used jointly, so that the real-time positioning information of the target user in a dangerous state can be determined accurately and quickly, guaranteeing both the accuracy and the timeliness of positioning.
In one embodiment, only the real-time location information of the target user in the dangerous state may be obtained, and the real-time location information may be used as the location information of the location of the target user, where the location information may represent an accurate geographic location of the target user, and the location information may be a longitude and a latitude of the location of the target user.
Meanwhile, the background environment information of the specific position of the target user can be acquired through the camera of the mobile phone of the target user. Exemplarily, a picture or a video containing a target user is acquired through a camera of a mobile phone, and a specific background environment where the target user is located is determined through algorithm analysis, for example, when the target user is on a taxi, the specific background environment where the target user is located can be analyzed to be in the taxi according to the acquired picture or video containing the target user; when the target user is in the house, the specific background of the target user can be analyzed to be indoor according to the collected picture or video containing the target user; when the target user is outdoors, the specific background where the target user is located can be analyzed to be outdoors according to the collected picture or video containing the target user.
In another embodiment, the background environment information of the target user acquired by the image acquisition device may be used as the location information of the location where the target user is located, that is, the location information represents a specific scene where the user is located, and the location information may be a scene such as an indoor scene, an outdoor scene, or an in-vehicle scene.
In another embodiment, it may be understood that the combination of the acquired real-time location information of the target user and the specific background environment information of the location where the target user is located may help to determine the location information of the target user more accurately, that is, the acquired location information of the location where the target user is located may include the real-time location information and the specific background environment information.
Optionally, there may be multiple ways to determine the real-time location of the target user. In one embodiment, when it is determined that the target user is in a dangerous state, the real-time positioning information of the target user's location at the current moment may be obtained, that is, the real-time positioning information is obtained only once. The obtained information may include the longitude and latitude of the target user's location and the time, for example: (118.5, 31.5, 200927.0915), where 118.5 is the longitude of the target user's location, 31.5 is the latitude, and 200927.0915 represents 09:15 a.m. on 27 September 2020.
In another embodiment, the longitude and latitude of the target user's location may be obtained in a loop, i.e. once every preset time period, for example once per second through the positioning system. In this way the real-time positioning information of the target user is obtained continuously; compared with obtaining it only once, this ensures that the positioning information is still obtained when the target user moves, so that whoever receives it can observe the target user's position in real time. For example, when the target user encounters danger while riding in a vehicle, his or her position keeps changing; the user receiving the information can display the target user's real-time positioning information through the Application Programming Interface (API) of a map application and thus observe the target user's real-time position more intuitively.
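A minimal sketch of such looped acquisition is given below; get_location stands for a hypothetical platform positioning call, since the embodiment only requires that a satellite and/or network positioning source be available:

```python
import time
from datetime import datetime

def track_location(get_location, interval_s=1.0):
    """Yield (longitude, latitude, 'YYMMDD.HHMM') samples once per interval.

    `get_location` is a hypothetical platform call returning (longitude,
    latitude). The timestamp format follows the example in the description,
    e.g. (118.5, 31.5, '200927.0915').
    """
    while True:
        lon, lat = get_location()
        yield (lon, lat, datetime.now().strftime("%y%m%d.%H%M"))
        time.sleep(interval_s)
```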
Step S462: and generating alarm information carrying the position information based on the position information.
Optionally, the acquired real-time positioning information and/or background environment information of the target user are integrated to generate alarm information for the target user, where the alarm information may indicate that the user is in a dangerous state and includes real-time location information of the target user.
Step S470: and sending the alarm information to a designated device.
The designated device may be the mobile phone of a family member and/or friend of the target user, an electronic device used by a police station to receive alarm calls, a server of an alarm platform, and the like, which are not limited in the embodiment of the present application.
In practical application, after the target user's mobile phone generates the corresponding alarm information, it can automatically send the alarm information to the designated device without any operation by the target user. Correspondingly, after the personnel managing the designated device receive the alarm information, they can go to rescue the target user according to the location information it carries. For example, if the target user's family or friends receive the distress message, they can drive to the corresponding position to search for and rescue the target user according to the location information carried in the message, or report it to the police and search for the target user with the police's assistance; if the police receive the distress message, they can directly rescue the target user according to the location information it carries.
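A minimal sketch of assembling and dispatching such alarm information is given below; the message fields, the send_fn transport and the recipient list are assumptions, as the embodiment does not fix a particular message format or channel:

```python
import json

def build_alarm(location, scene=None):
    """Pack the dangerous-state alarm described above. `scene` is the optional
    background-environment label (e.g. 'in-vehicle', 'indoor', 'outdoor')."""
    lon, lat, stamp = location
    return json.dumps({
        "type": "dangerous_state_alarm",
        "longitude": lon,
        "latitude": lat,
        "time": stamp,
        "scene": scene,
    })

def send_alarm(message, send_fn, recipients):
    """Dispatch the alarm without any action from the target user.

    `send_fn` and `recipients` are placeholders for whatever channel the
    device uses (SMS, an emergency-platform API, a relative's phone, ...)."""
    for recipient in recipients:
        send_fn(recipient, message)
```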
In this embodiment, the lip features of the target user are recognised through lip language recognition, the recognition result is analysed, whether the target user is in a dangerous state is judged from the analysis, the real-time position of a target user in a dangerous state is obtained through positioning, and alarm information about the target user is generated based on that position, so that rescuers can rescue the target user according to the alarm information. This puts lip language recognition to practical use in daily life, reduces the risk of a target user being harmed when in danger, improves the success rate of alarms and calls for help, and helps the target user send out effective distress information in time, thereby protecting the user's personal and property safety.
Referring to fig. 6, a block diagram of a dangerous state identification apparatus 600 according to an embodiment of the present disclosure is shown. The apparatus 600 may include: an image acquisition module 610, a lip recognition module 620, a lip language recognition module 630, a lip language matching module 640, and a state determination module 650.
The image obtaining module 610 is configured to obtain a current face image of the target user.
The lip recognition module 620 is configured to recognize lip features of the target user based on the face image.
The lip language recognition module 630 is configured to perform lip language recognition based on the lip features to obtain a recognition result.
The lip language matching module 640 is configured to match the recognition result with preset lip language information to obtain a matching result, where the preset lip language information is set in advance according to lip language information representing a dangerous state.
The status determination module 650 is configured to determine whether the target user is in a dangerous status based on the matching result.
Optionally, the state determination module 650 may include: a first determining unit and a second determining unit. The first determining unit is used for determining that the target user is in a dangerous state when the matching result represents that the recognition result is matched with the preset lip language information; and the second determining unit is used for determining that the target user is not in a dangerous state when the matching result represents that the identification result is not matched with the lip language information.
Optionally, the lip language matching module 640 may be further configured to obtain the similarity between the recognition result and the preset lip language information, determine that the recognition result matches the preset lip language information if the similarity is greater than a preset similarity threshold, and determine that the recognition result does not match the preset lip language information if the similarity is less than or equal to the preset similarity threshold.
Optionally, the state determination module 650 may be specifically configured to: and when the matching result represents that the recognition result is matched with the preset lip language information, obtaining expression information corresponding to the face image, judging whether the expression information meets a preset expression condition, wherein the preset expression condition is preset according to the expression information representing a dangerous state, and if the expression information meets the preset expression condition, determining that the target user is in the dangerous state.
Optionally, the state determining module 650 may be further configured to determine that the target user is not in a dangerous state if the expression information does not satisfy the preset expression condition.
The apparatus 600 for recognizing a dangerous state may further include: the alarm information generating module and the information sending module. The alarm information generation module is used for generating alarm information when the target user is determined to be in a dangerous state after the target user is determined to be in the dangerous state based on the matching result, wherein the alarm information is used for indicating that the target user is in the dangerous state; the alarm information generating module may be specifically configured to acquire position information of a position where the target user is located, where the position information includes real-time positioning information and/or background environment information, and generate alarm information carrying the position information based on the position information; and the information sending module is used for sending the alarm information to the specified equipment.
The image acquisition module 610 may be specifically configured to: and when an instruction for triggering the identification of the dangerous state is detected, executing the acquisition of the current face image of the target user.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, the coupling between the modules may be electrical, mechanical, or of another type.
In addition, the functional modules in the embodiments of the present application may be integrated into one processing module, each module may exist physically alone, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware, or may be implemented in the form of a software functional module.
To sum up, in the solution provided by the embodiments of the present application, a current face image of a target user is obtained; lip features of the target user are identified based on the face image; lip language recognition is performed based on the lip features to obtain a recognition result; the recognition result is matched with preset lip language information to obtain a matching result, where the preset lip language information is preset according to lip language information representing a dangerous state; and whether the target user is in a dangerous state is determined based on the matching result. In this way, by performing lip language recognition on the target user, whether the user is in a dangerous state can be judged from the lip recognition result, so that the user can trigger dangerous-state identification merely through lip movements. This makes it convenient for the user to report the dangerous state he or she is in, and, because lip movements are covert, the user's safety is also improved.
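Putting the pieces together, the overall flow summarized above might be orchestrated roughly as below; the lip feature extraction, lip reading, and expression recognition steps are left as placeholders, and the helper functions reuse the earlier sketches (match_recognition_result, determine_dangerous_state, PositionInfo, generate_alarm_information, send_alarm_information), all of which are illustrative assumptions rather than the claimed implementation.

    def extract_lip_features(face_image):
        """Placeholder: locate the lip region / lip landmarks in the current face image."""
        raise NotImplementedError  # e.g. a face landmark detector would be used here


    def recognize_lip_language(lip_features) -> str:
        """Placeholder: run a lip reading model over the lip features and return text."""
        raise NotImplementedError


    def recognize_expression(face_image) -> str:
        """Placeholder: classify the facial expression in the face image."""
        raise NotImplementedError


    def identify_dangerous_state(face_image, user_id: str, designated_device: str) -> bool:
        """End-to-end flow: image -> lip features -> lip reading -> matching -> state decision -> alarm."""
        lip_features = extract_lip_features(face_image)
        recognition_result = recognize_lip_language(lip_features)
        lip_matched = match_recognition_result(recognition_result)   # similarity matching sketch above
        expression_label = recognize_expression(face_image)
        in_danger = determine_dangerous_state(lip_matched, expression_label)
        if in_danger:
            alarm = generate_alarm_information(user_id, PositionInfo())  # position left empty here
            send_alarm_information(alarm, designated_device)
        return in_danger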
Referring to fig. 7, a block diagram of an electronic device 700 according to an embodiment of the present application is shown; the method for identifying a dangerous state provided by the embodiments of the present application may be executed by the electronic device 700.
The electronic device 700 in the embodiments of the present application may include one or more of the following components: a processor 701, a memory 702, and one or more application programs, where the one or more application programs may be stored in the memory 702 and configured to be executed by the one or more processors 701, and the one or more application programs are configured to perform the method described in the foregoing method embodiments.
The processor 701 may include one or more processing cores. The processor 701 connects various parts of the entire electronic device 700 using various interfaces and lines, and performs the various functions of the electronic device 700 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 702 and invoking data stored in the memory 702. Optionally, the processor 701 may be implemented in hardware using at least one of a Digital Signal Processor (DSP), a Field-Programmable Gate Array (FPGA), and a Programmable Logic Array (PLA). The processor 701 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is responsible for rendering and drawing display content; and the modem is used to handle wireless communication. It can be understood that the modem may also not be integrated into the processor 701 and may instead be implemented by a separate communication chip.
The memory 702 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 702 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 702 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, and an image playing function), instructions for implementing the foregoing method embodiments, and the like. The data storage area may also store data created by the electronic device 700 during use (such as the various correspondences described above), and so on.
In the several embodiments provided in the present application, the coupling, direct coupling, or communication connection between the modules shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between the devices or modules may be in electrical, mechanical, or other forms.
Referring to fig. 8, a block diagram of a computer-readable storage medium according to an embodiment of the present application is shown. The computer-readable storage medium 800 stores program code that can be called by a processor to execute the method described in the foregoing method embodiments.
The computer-readable storage medium 800 may be an electronic memory such as a flash memory, an EEPROM (Electrically Erasable Programmable Read-Only Memory), an EPROM, a hard disk, or a ROM. Optionally, the computer-readable storage medium 800 includes a non-transitory computer-readable storage medium. The computer-readable storage medium 800 has storage space for program code 810 that performs any of the method steps described above. The program code may be read from or written to one or more computer program products. The program code 810 may, for example, be compressed in a suitable form.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application and not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (11)

1. A method for identifying a hazardous condition, the method comprising:
acquiring a current face image of a target user;
identifying lip features of the target user based on the face image;
performing lip language recognition based on the lip features to obtain a recognition result;
matching the identification result with preset lip language information to obtain a matching result, wherein the preset lip language information is preset according to the lip language information representing the dangerous state;
and determining whether the target user is in a dangerous state or not based on the matching result.
2. The method of claim 1, wherein the determining whether the target user is in a dangerous state based on the matching result comprises:
when the matching result represents that the recognition result is matched with the preset lip language information, determining that the target user is in a dangerous state;
and when the matching result represents that the identification result is not matched with the preset lip language information, determining that the target user is not in a dangerous state.
3. The method according to claim 2, wherein the matching the recognition result with preset lip language information to obtain a matching result comprises:
acquiring the similarity between the recognition result and the preset lip language information;
if the similarity is larger than a preset similarity threshold, determining that the recognition result is matched with the preset lip language information;
and if the similarity is smaller than or equal to the preset similarity threshold, determining that the recognition result is not matched with the preset lip language information.
4. The method according to claim 2, wherein the determining that the target user is in a dangerous state when the matching result represents that the recognition result matches the preset lip language information comprises:
when the matching result represents that the recognition result is matched with the preset lip language information, obtaining expression information corresponding to the face image;
judging whether the expression information meets a preset expression condition or not, wherein the preset expression condition is preset according to the expression information representing the dangerous state;
and if the expression information meets the preset expression condition, determining that the target user is in a dangerous state.
5. The method of claim 4, wherein the determining whether the target user is in a dangerous state based on the matching result further comprises:
and if the expression information does not meet the preset expression condition, determining that the target user is not in a dangerous state.
6. The method of claim 1, wherein after the determining whether the target user is in a dangerous state based on the matching result, the method further comprises:
when the target user is determined to be in a dangerous state, generating alarm information, wherein the alarm information is used for indicating that the target user is in the dangerous state;
and sending the alarm information to a designated device.
7. The method of claim 6, wherein the generating alert information comprises:
acquiring position information of the position of the target user, wherein the position information comprises real-time positioning information and/or background environment information;
and generating alarm information carrying the position information based on the position information.
8. The method according to any one of claims 1 to 7, wherein the obtaining of the current face image of the target user comprises:
and when an instruction for triggering the identification of the dangerous state is detected, executing the acquisition of the current face image of the target user.
9. An apparatus for identifying a hazardous condition, the apparatus comprising:
the image acquisition module is used for acquiring a current face image of a target user;
the lip recognition module is used for recognizing lip characteristics of the target user based on the face image;
the lip language identification module is used for carrying out lip language identification based on the lip characteristics to obtain an identification result;
the lip matching module is used for matching the identification result with preset lip information to obtain a matching result, wherein the preset lip information is preset according to the lip information representing the dangerous state;
and the state determining module is used for determining whether the target user is in a dangerous state or not based on the matching result.
10. An electronic device, comprising:
one or more processors;
a memory;
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the method of any of claims 1-8.
11. A computer-readable storage medium, characterized in that a program code is stored in the computer-readable storage medium, which program code can be called by a processor to perform the method according to any of claims 1-8.
CN202011062013.3A 2020-09-30 2020-09-30 Dangerous state identification method and device, electronic equipment and storage medium Active CN112132095B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011062013.3A CN112132095B (en) 2020-09-30 2020-09-30 Dangerous state identification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011062013.3A CN112132095B (en) 2020-09-30 2020-09-30 Dangerous state identification method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112132095A true CN112132095A (en) 2020-12-25
CN112132095B CN112132095B (en) 2024-02-09

Family

ID=73844975

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011062013.3A Active CN112132095B (en) 2020-09-30 2020-09-30 Dangerous state identification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112132095B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110026779A1 (en) * 2008-12-24 2011-02-03 David Matsumoto Systems and methods for analyzing facial expressions, identifying intent and transforming images through review of facial expressions
US20120308971A1 (en) * 2011-05-31 2012-12-06 Hyun Soon Shin Emotion recognition-based bodyguard system, emotion recognition device, image and sensor control apparatus, personal protection management apparatus, and control methods thereof
CN110113319A (en) * 2019-04-16 2019-08-09 深圳壹账通智能科技有限公司 Identity identifying method, device, computer equipment and storage medium
CN110427809A (en) * 2019-06-21 2019-11-08 平安科技(深圳)有限公司 Lip reading recognition methods, device, electronic equipment and medium based on deep learning
CN110377761A (en) * 2019-07-12 2019-10-25 深圳传音控股股份有限公司 A kind of method and device enhancing video tastes
CN111045639A (en) * 2019-12-11 2020-04-21 深圳追一科技有限公司 Voice input method, device, electronic equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ZHANG Xiaobing; GONG Haigang; YANG Fan; DAI Xili: "End-to-end sentence-level Chinese lip reading recognition", Journal of Software (软件学报), no. 06 *
XIAO Qingyang; ZHANG Jin; ZUO Chuang; FAN Juanting; LIANG Biwei; DI Shuolin: "A mouth-shape sequence recognition method based on semantic constraints", Computer Applications and Software (计算机应用与软件), no. 09 *
MA Ning; TIAN Guodong; ZHOU Xi: "A lip reading recognition method based on long short-term memory", Journal of University of Chinese Academy of Sciences (中国科学院大学学报), no. 01 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112926407A (en) * 2021-02-02 2021-06-08 华南师范大学 Distress signal detection method, device and system based on campus deception
WO2023006033A1 (en) * 2021-07-29 2023-02-02 华为技术有限公司 Speech interaction method, electronic device, and medium
CN114612977A (en) * 2022-03-10 2022-06-10 苏州维科苏源新能源科技有限公司 Big data based acquisition and analysis method

Also Published As

Publication number Publication date
CN112132095B (en) 2024-02-09

Similar Documents

Publication Publication Date Title
CN112132095B (en) Dangerous state identification method and device, electronic equipment and storage medium
CN109145680B (en) Method, device and equipment for acquiring obstacle information and computer storage medium
US20210133468A1 (en) Action Recognition Method, Electronic Device, and Storage Medium
CN109471919B (en) Zero pronoun resolution method and device
Khaled et al. In-door assistant mobile application using cnn and tensorflow
CN111539212A (en) Text information processing method and device, storage medium and electronic equipment
CN113792207A (en) Cross-modal retrieval method based on multi-level feature representation alignment
CN112150457A (en) Video detection method, device and computer readable storage medium
CN110929176A (en) Information recommendation method and device and electronic equipment
CN113850109A (en) Video image alarm method based on attention mechanism and natural language processing
CN112052911A (en) Method and device for identifying riot and terrorist content in image, electronic equipment and storage medium
CN115761839A (en) Training method of human face living body detection model, human face living body detection method and device
CN107622769B (en) Number modification method and device, storage medium and electronic equipment
CN106791010B (en) Information processing method and device and mobile terminal
CN115966061B (en) Disaster early warning processing method, system and device based on 5G message
CN112381091A (en) Video content identification method and device, electronic equipment and storage medium
CN112036307A (en) Image processing method and device, electronic equipment and storage medium
CN108241678A (en) The method for digging and device of interest point data
KR102395410B1 (en) System and method for providing sign language avatar using non-marker
CN115018215A (en) Population residence prediction method, system and medium based on multi-modal cognitive map
CN113392639B (en) Title generation method, device and server based on artificial intelligence
CN111723783A (en) Content identification method and related device
CN117332039B (en) Text detection method, device, equipment and storage medium
CN113518201B (en) Video processing method, device and equipment
CN116702094B (en) Group application preference feature representation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant