CN112115791A - Image recognition method and device, electronic equipment and computer-readable storage medium - Google Patents

Image recognition method and device, electronic equipment and computer-readable storage medium

Info

Publication number
CN112115791A
CN112115791A CN202010833673.0A
Authority
CN
China
Prior art keywords
image
target
recognition
dynamic
category
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010833673.0A
Other languages
Chinese (zh)
Inventor
林航东
张法朝
唐剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Didi Infinity Technology and Development Co Ltd filed Critical Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN202010833673.0A
Publication of CN112115791A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G06V40/113 Recognition of static hand signs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Multimedia (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Biology (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the invention disclose an image recognition method, an image recognition apparatus, an electronic device, and a computer-readable storage medium.

Description

Image recognition method and device, electronic equipment and computer-readable storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image recognition method, an image recognition apparatus, an electronic device, and a computer-readable storage medium.
Background
With the rapid development of artificial intelligence technology, human-computer interaction has shifted from typing to modes that conform to users' natural habits, such as expressions, speech, gestures, and postures, greatly improving the convenience of interaction. Existing gesture recognition for human-computer interaction is usually performed by neural network models based on template matching or key-point recognition, which involve a large amount of computation and are slow.
Disclosure of Invention
In view of this, embodiments of the present invention disclose an image recognition method, an image recognition apparatus, an electronic device, and a computer-readable storage medium, so as to reduce the amount of computation and power consumption for image recognition and improve the image recognition efficiency.
In a first aspect, an embodiment of the present invention provides an image recognition method, where the method includes:
acquiring an image to be processed;
determining the state type of a target part corresponding to the image to be processed;
in response to a category sequence of a plurality of continuously determined target part state categories meeting a first predetermined condition, starting dynamic image recognition;
acquiring a motion track of a target part;
and inputting the motion trail of the target part into a dynamic recognition model for processing, and determining the corresponding target action type.
Optionally, the acquiring the motion trajectory of the target portion includes:
acquiring a multi-frame moving image of the target part in the moving process;
carrying out target detection on a plurality of frames of moving images, and determining a target area corresponding to each moving image;
and determining the motion track according to the central point of each target area.
Optionally, the method further includes:
and ending the dynamic image recognition operation in response to the continuously determined category sequence of the plurality of target part state categories meeting a second predetermined condition.
Optionally, the first predetermined condition is that the category sequence is the same as a preset dynamic recognition start sequence, and the second predetermined condition is that the category sequence is the same as a preset dynamic recognition end sequence.
Optionally, determining the state type of the target region corresponding to the image to be processed includes:
carrying out target detection on the image to be processed, and determining a target area of the image to be processed;
and inputting the target area of the image to be processed into a static recognition model for processing so as to determine the state type of the target part.
Optionally, the method further includes:
and displaying the motion trail on a corresponding display page.
Optionally, the method further includes:
determining a corresponding first operation instruction according to the target action category;
and executing corresponding operation according to the first operation instruction.
Optionally, the method further includes:
determining a corresponding second operation instruction according to the state type of the target part;
and executing corresponding operation according to the second operation instruction.
Optionally, the dynamic recognition model is trained by the following steps:
acquiring the motion trail of each target action category;
carrying out category marking on each obtained motion track to obtain first sample data;
and training according to the first sample data to obtain the dynamic recognition model.
Optionally, the static recognition model is trained by the following steps:
acquiring a static image comprising the state category of each target part;
performing category marking on each static image to obtain second sample data;
and training according to the second sample data to obtain the static recognition model.
In a second aspect, an embodiment of the present invention provides an image recognition apparatus, including:
an image acquisition unit configured to acquire an image to be processed;
a first class determination unit configured to determine a target part state class corresponding to the image to be processed;
a dynamic starting unit configured to start dynamic image recognition in response to a category sequence of a plurality of target site state categories determined consecutively satisfying a first predetermined condition;
a motion trajectory acquisition unit configured to acquire a motion trajectory of a target portion;
and the second type determination unit is configured to input the motion trail of the target part into a dynamic recognition model for processing, and determine a corresponding target action type.
Optionally, the motion trajectory acquiring unit includes:
an image acquisition subunit configured to acquire a multi-frame moving image of the target portion during movement;
a first target area determination subunit configured to perform target detection on a plurality of frames of the moving images, and determine a target area corresponding to each of the moving images;
a motion trajectory acquisition subunit configured to determine the motion trajectory from a center point of each of the target regions.
Optionally, the apparatus further comprises:
and a dynamic ending unit configured to end the dynamic image recognition operation in response to a category sequence of the plurality of target site state categories determined consecutively satisfying a second predetermined condition.
Optionally, the first predetermined condition is that the category sequence is the same as a preset dynamic recognition start sequence, and the second predetermined condition is that the category sequence is the same as a preset dynamic recognition end sequence.
Optionally, the first category determining unit includes:
the second target area determining subunit is configured to perform target detection on the image to be processed and determine a target area of the image to be processed;
a first category determining subunit, configured to input the target area of the image to be processed into a static recognition model for processing, so as to determine the target part state category.
Optionally, the apparatus further comprises:
and the display control unit is configured to display the motion trail on a corresponding display page.
Optionally, the apparatus further comprises:
a first instruction determining unit configured to determine a corresponding first operation instruction according to the target action category;
a first execution unit configured to execute a corresponding operation according to the first operation instruction.
Optionally, the apparatus further comprises:
the second instruction determining unit is configured to determine a corresponding second operation instruction according to the target part state type;
and the second execution unit is configured to execute corresponding operation according to the second operation instruction.
Optionally, the apparatus further includes a dynamic recognition model training unit, where the dynamic recognition model training unit includes:
a first acquisition subunit configured to acquire a motion trajectory of each target action category;
the first sample data acquisition subunit is configured to label the types of the acquired motion tracks to acquire first sample data;
a first training subunit configured to train to obtain the dynamic recognition model according to the first sample data.
Optionally, the apparatus further includes a static recognition model training unit, where the static recognition model training unit includes:
a second acquisition subunit configured to acquire a still image including a state category of each target portion;
the second sample data acquisition subunit is configured to perform category labeling on each static image to acquire second sample data;
a second training subunit configured to train to obtain the static recognition model according to the second sample data.
In a third aspect, embodiments of the present invention provide an electronic device, which includes a memory and a processor, wherein the memory is used for storing one or more computer program instructions, and the one or more computer program instructions are executed by the processor to implement the method described above.
Optionally, the electronic device further comprises a capturing device configured to capture an image or a video.
In a fourth aspect, embodiments of the present invention provide a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement a method as described above.
According to the embodiment of the invention, the target part state category corresponding to the image to be processed is determined; when the category sequence of a plurality of continuously determined target part state categories meets the first predetermined condition, dynamic image recognition is started; the motion trajectory of the target part is then acquired and input into the dynamic recognition model for processing to determine the corresponding target action category. In this way, the computation and power consumption of image recognition can be reduced, and image recognition efficiency improved.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent from the following description of the embodiments of the present invention with reference to the accompanying drawings, in which:
FIG. 1 is a flow chart of an image recognition method of an embodiment of the present invention;
FIG. 2 is a flow chart of a method of training a static recognition model according to an embodiment of the present invention;
FIG. 3 is a flow chart of a motion trajectory determination method of an embodiment of the present invention;
FIG. 4 is a schematic diagram of a motion trajectory determination process of an embodiment of the present invention;
FIG. 5 is a schematic diagram of a training method for a dynamic recognition model according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a static and dynamic image recognition process according to an embodiment of the invention;
FIG. 7 is a schematic diagram of an image recognition apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The present invention will be described below based on examples, but the present invention is not limited to only these examples. In the following detailed description of the present invention, certain specific details are set forth. It will be apparent to one skilled in the art that the present invention may be practiced without these specific details. Well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the present invention.
Further, those of ordinary skill in the art will appreciate that the drawings provided herein are for illustrative purposes and are not necessarily drawn to scale.
Unless the context clearly requires otherwise, throughout the description, the words "comprise", "comprising", and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is, what is meant is "including, but not limited to".
In the description of the present invention, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified.
In current human-computer interaction, interaction modes that conform to users' natural habits, such as expressions, speech, gestures, and postures, are commonly adopted. For example, in a vehicle-mounted environment, to ensure driving safety, a driver controls a vehicle-mounted device or terminal device through human-computer interaction using speech, gestures, postures, and the like. The embodiments of the present invention are mainly described by taking human-computer interaction through gesture recognition as an example; it should be understood that the embodiments are not limited to gesture recognition, and other image-recognition-based human-computer interaction modes can also adopt the image recognition method of the embodiments of the present invention.
Existing static and dynamic gesture recognition based on neural networks usually adopts key-point recognition. Key-point recognition places high demands on the detail features of each key point of the gesture and requires clear gesture images as input; meanwhile, to capture the semantics of gesture details, the neural network usually needs a deeper model to learn, and labeling data for different application scenarios requires considerable manpower and material resources. Moreover, embedded platforms have weak computing power and limited memory, so current static and dynamic gesture recognition networks put great pressure on them. Therefore, the embodiments of the present invention provide an image recognition method that reduces the computation and power consumption of image recognition and improves image recognition efficiency.
Fig. 1 is a flowchart of an image recognition method according to an embodiment of the present invention. As shown in fig. 1, the image recognition method according to the embodiment of the present invention includes the following steps:
step S110, an image to be processed is acquired. Optionally, the image acquisition device acquires an image including the motion of the target portion, or receives an image of the motion of the target portion, such as a gesture image, a head motion image, or an image of other portion motions.
Step S120, the state category of the target portion corresponding to the image to be processed is determined. The target portion state category is the current motion state category of the target portion, such as a gesture category.
In an optional implementation manner, step S120 may specifically include: performing target detection on the image to be processed to determine a target area of the image to be processed, and inputting the target area into the static recognition model for processing to determine the corresponding target part state category. Optionally, a target detection model is used to process the image to be processed and obtain its target area. Optionally, the target detection model may adopt a Faster R-CNN model, an SSD model, or a YOLO model, which is not limited in this embodiment.
Taking a gesture as an example, an image to be processed that includes the gesture is input into the target detection model to obtain the gesture area, and the gesture area is then input into the static recognition model to obtain the corresponding gesture category. Optionally, the gesture categories may include scissors, stone, cloth, OK, number, and thumbs-up (like) gestures.
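As an illustration of this two-stage pipeline, the following is a minimal Python sketch. The names detect_hand (standing in for a trained detector such as SSD or YOLO) and static_model (the static classifier) are assumptions for illustration and do not come from the patent.

    import numpy as np

    GESTURE_CLASSES = ["scissors", "stone", "cloth", "OK", "number", "like"]

    def recognize_static_gesture(image, detect_hand, static_model):
        """Detect the hand region, then classify its state category."""
        x1, y1, x2, y2 = detect_hand(image)        # target detection step
        region = image[y1:y2, x1:x2]               # crop the gesture area
        scores = static_model(region)              # static recognition model output
        return GESTURE_CLASSES[int(np.argmax(scores))]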
FIG. 2 is a flowchart of a training method of a static recognition model according to an embodiment of the present invention. In an alternative implementation manner, as shown in fig. 2, the static recognition model of the present embodiment is obtained by training through the following steps:
step S210, a still image including the state type of each target portion is acquired. In an alternative implementation, taking a gesture as an example, assume that the gesture category includes A, B, C, D, E, F, G seven categories. For each type of gesture, acquiring a corresponding training data set, for example, for gesture class a, acquiring a plurality of images including gesture a to determine the training data set corresponding to gesture a.
Step S220, performing category labeling on each static image to obtain second sample data. In an alternative implementation, each still image is labeled using one-hot encoding. For example, the gestures A, B, C, D, E, F, and G are coded as follows: gesture A: 1000000, gesture B: 0100000, gesture C: 0010000, gesture D: 0001000, gesture E: 0000100, gesture F: 0000010, gesture G: 0000001. It should be understood that this embodiment does not limit the labeling manner of the training data set, and other image labeling manners can also be applied.
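A minimal sketch of this one-hot labeling scheme, with the category order assumed for illustration:

    GESTURES = ["A", "B", "C", "D", "E", "F", "G"]

    def one_hot(gesture):
        # one position per category; exactly one bit set
        vec = [0] * len(GESTURES)
        vec[GESTURES.index(gesture)] = 1
        return vec

    assert one_hot("A") == [1, 0, 0, 0, 0, 0, 0]  # gesture A: 1000000
    assert one_hot("G") == [0, 0, 0, 0, 0, 0, 1]  # gesture G: 0000001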
Step S230, the static recognition model is obtained by training on the second sample data. Optionally, the second sample data is input into the static recognition model for processing, and the parameters of the static recognition model are adjusted according to the corresponding loss function so that the output of the model stays as consistent as possible with the corresponding class label (i.e., the loss function is minimized).
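The patent does not prescribe a framework, architecture, or hyperparameters; the following is a minimal PyTorch-style training-loop sketch under those assumptions, where the class index of each label is the position of the 1 in its one-hot code:

    import torch
    import torch.nn as nn

    def train_static_model(model, loader, epochs=10):
        # loader yields (cropped_region_batch, class_index_batch) pairs
        criterion = nn.CrossEntropyLoss()            # compares output with class label
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(epochs):
            for regions, labels in loader:
                optimizer.zero_grad()
                loss = criterion(model(regions), labels)
                loss.backward()                      # adjust parameters to reduce loss
                optimizer.step()
        return model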
The static recognition model of the embodiment of the invention only requires category labels for the images in the training data set; it does not require key-point labeling of the target part (such as a gesture) in each image. This greatly reduces the complexity of the static recognition model and the manpower and material resources required for image labeling, reduces the computation needed to process the image to be processed, and improves the efficiency of image recognition.
Step S130, in response to the category sequence of a plurality of continuously determined target part state categories meeting a first predetermined condition, dynamic image recognition is started. Optionally, dynamic image recognition is started when the continuously determined category sequence is the same as a preset dynamic recognition start sequence. Taking gestures as an example, the gesture sequence ABA may be preset as the dynamic recognition start sequence; that is, during static image recognition, if gesture A, gesture B, and gesture A are recognized in succession, dynamic image recognition is started. It should be understood that this embodiment does not limit the length of the dynamic recognition start sequence or the gesture categories in the sequence.
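A sketch of this start condition: keep the most recent static recognition results in a sliding window and compare against the preset start sequence (the ABA example above; names are illustrative assumptions).

    from collections import deque

    START_SEQUENCE = ("A", "B", "A")  # preset dynamic recognition start sequence

    recent = deque(maxlen=len(START_SEQUENCE))

    def update_and_check(category):
        """Record one static result; True means the sequence condition is met."""
        recent.append(category)
        return tuple(recent) == START_SEQUENCE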
Step S140, a motion trajectory of the target portion is obtained. In the present embodiment, after the dynamic image recognition is started, a motion trajectory of a target portion, for example, a motion trajectory of a hand, is acquired.
Fig. 3 is a flowchart of a motion trajectory determination method according to an embodiment of the present invention. In an alternative implementation manner, as shown in fig. 3, step S140 may specifically include:
in step S141, a multi-frame moving image of the target portion during movement is acquired. Optionally, the multi-frame moving image is acquired in real time during the moving process of the target portion, or the multi-frame moving image during the moving process of the target portion is received.
In step S142, object detection is performed on the multi-frame moving images, and an object area corresponding to each moving image is determined. Optionally, the target detection model is used to process a plurality of frames of moving images, and a target area of each frame of moving image is obtained. Optionally, the target detection model may adopt a Faster R-CNN model, an SSD model, or a YOLO model, which is not limited in this embodiment.
And step S143, determining a motion track according to the central point of each target area. That is, in the present embodiment, the trajectory of the center of each target region is determined as the motion trajectory of the target portion. In an alternative implementation, the motion trail is displayed on a display page of the user terminal, so that the user can determine whether the motion trail is drawn accurately.
Fig. 4 is a schematic diagram of a motion trajectory determination process of an embodiment of the present invention. As shown in fig. 4, after dynamic image recognition is started, multiple frames of moving images X of the hand during motion are acquired, target detection is performed on each frame X, the hand region Y of each frame is determined, the center point c of each hand region Y is determined, and the motion trajectory 41 of the hand is determined from the change in position of the center points c. In an optional implementation manner, while the hand moves, its motion trajectory is displayed on a display page of the user terminal so that the user can determine whether the trajectory is drawn accurately.
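Steps S141 to S143 can be sketched as follows, reusing the hypothetical detect_hand detector from above; the center of each frame's target region becomes one trajectory point:

    def motion_trajectory(frames, detect_hand):
        trajectory = []
        for frame in frames:
            x1, y1, x2, y2 = detect_hand(frame)         # target area of this frame
            cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0   # center point c
            trajectory.append((cx, cy))
        return trajectory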
And S150, inputting the motion trail of the target part into the dynamic recognition model for processing, and determining the corresponding target action type.
FIG. 5 is a diagram illustrating a training method for a dynamic recognition model according to an embodiment of the present invention. In an alternative implementation, as shown in fig. 5, the dynamic recognition model is trained by:
in step S310, the motion trajectory of each target motion category is acquired. Optionally, taking the gesture motion as an example, a plurality of standard gesture motions are set, for example, a "V" or a "V" pattern is drawn by using a hand, a plurality of videos of each type of standard gesture motion are acquired, and a motion trajectory set of each standard gesture motion is acquired through the steps of steps S141 to S143.
And step S320, performing category marking on each acquired motion track to acquire first sample data. For example, the set of motion trajectories for each type of standard gesture motion is labeled to obtain first sample data.
Step S330, the dynamic recognition model is obtained by training on the first sample data. Optionally, the first sample data is input into the dynamic recognition model for processing, and the parameters of the dynamic recognition model are adjusted according to the corresponding loss function so that the output of the model stays as consistent as possible with the corresponding class label (i.e., the loss function is minimized).
The dynamic recognition model of the embodiment of the invention only requires category labels for the motion trajectories in the training data set; it does not require key-point labeling of the target part (such as a gesture) in every frame of the video. This greatly reduces the complexity of the dynamic recognition model and the manpower and material resources required for video labeling, reduces the computation needed to recognize the motion trajectory of the target part, and improves the efficiency of image recognition.
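The patent does not fix a trajectory representation for the dynamic recognition model. One common choice, assumed here purely for illustration, is to resample each trajectory to a fixed number of points and normalize away translation and scale before classification:

    import numpy as np

    def preprocess_trajectory(points, num_points=32):
        pts = np.asarray(points, dtype=np.float32)        # (N, 2) center points
        t_src = np.linspace(0.0, 1.0, len(pts))
        t_dst = np.linspace(0.0, 1.0, num_points)
        resampled = np.stack(
            [np.interp(t_dst, t_src, pts[:, i]) for i in (0, 1)], axis=1)
        resampled -= resampled.mean(axis=0)               # remove translation
        scale = float(np.abs(resampled).max()) or 1.0
        return resampled / scale                          # remove scale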
In an alternative implementation manner, the present embodiment further sets a dynamic recognition end sequence to indicate that the dynamic image recognition operation ends, and switches to the static recognition operation. Optionally, in response to the continuously determined category sequence of the plurality of target part state categories satisfying the second predetermined condition, the moving image recognition operation is ended. Optionally, the dynamic image recognition operation is ended in response to that the continuously determined category sequence of the plurality of target part state categories is the same as a preset dynamic recognition ending sequence. It should be understood that the dynamic recognition start sequence and the dynamic recognition end sequence may be the same or different. Taking a gesture as an example, a gesture sequence ABA may be preset as a dynamic recognition ending sequence, that is, in the image recognition process, if a gesture a, a gesture B, and a gesture a are continuously recognized, the dynamic image recognition operation is ended.
According to the embodiment of the invention, the target part state category corresponding to the image to be processed is determined; when the category sequence of a plurality of continuously determined target part state categories meets the first predetermined condition, dynamic image recognition is started; the motion trajectory of the target part is then acquired and input into the dynamic recognition model for processing to determine the corresponding target action category. In this way, the computation and power consumption of image recognition can be reduced, and image recognition efficiency improved.
In an optional implementation manner, each category of target action may correspond to a different first operation instruction; thus, the corresponding first operation instruction may be determined according to the target action category acquired in step S150, and the corresponding operation executed according to that instruction. For example, if the instruction corresponding to the "V" motion is an unlock instruction, the corresponding device is controlled to unlock when the "V" motion is recognized during dynamic image recognition.
In an alternative implementation manner, each category of target portion state may correspond to a different second operation instruction; thus, the corresponding second operation instruction is determined according to the target portion state category acquired in step S120, and the corresponding operation is executed according to that instruction. For example, if the instruction corresponding to the "scissors" gesture is a music playing instruction, the player of the corresponding device is controlled to play music when the "scissors" gesture is recognized during static image recognition. Optionally, in this embodiment, the target portion state categories in the dynamic recognition start sequence and the dynamic recognition end sequence are not assigned corresponding second operation instructions, so as to avoid causing the corresponding device to perform a misoperation when the dynamic image recognition operation is started. For example, assuming the dynamic recognition start sequence is the gesture sequence ABA, no second operation instruction is set for gesture A or gesture B.
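The mapping from recognized categories to operation instructions can be sketched as simple lookup tables. The instruction names and the device.execute method are assumptions for illustration; as required above, the gestures in the start/end sequence (A and B here) are deliberately given no second operation instruction.

    STATIC_INSTRUCTIONS = {"scissors": "play_music"}   # second operation instructions
    DYNAMIC_INSTRUCTIONS = {"V": "unlock"}             # first operation instructions

    def execute_static(category, device):
        instruction = STATIC_INSTRUCTIONS.get(category)  # None for A and B
        if instruction is not None:
            device.execute(instruction)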
Fig. 6 is a schematic diagram of a static and dynamic image recognition process according to an embodiment of the present invention. This embodiment is described taking the hand as the target part. As shown in fig. 6, an image to be processed X1 is acquired, target detection is performed on X1 to determine the gesture area Y1, and the gesture area Y1 is input into the static recognition model 61 for static image recognition. The gesture in X1 is determined to be the "number 1" gesture, the second operation instruction corresponding to the "number 1" gesture is acquired, and the corresponding device 63 is controlled to execute the corresponding operation according to that instruction.
As shown in fig. 6, target detection is performed on the subsequently acquired images to be processed X2, X3, and X4 to determine the corresponding gesture areas Y2, Y3, and Y4, which are respectively input into the static recognition model 61 for static image recognition. The gestures in X2, X3, and X4 are determined to be a "cloth" gesture, a "stone" gesture, and a "cloth" gesture, which match the preset dynamic recognition start sequence, so dynamic image recognition is started. After dynamic image recognition is started, continuous multi-frame moving images X5 are acquired and target detection is performed on them to determine the target area of each frame; the motion trajectory 64 is determined from the center points of these target areas and input into the dynamic recognition model 62 for dynamic recognition. The hand motion corresponding to the moving images X5 is determined to be motion K, the first operation instruction corresponding to motion K is acquired, and the corresponding device 63 is controlled to execute the corresponding operation according to that instruction. In an alternative implementation manner, during dynamic image recognition, the target areas of the most recently acquired frames are input into the static recognition model 61 to determine whether the dynamic recognition end sequence appears; when the end sequence is recognized, the dynamic image recognition operation is ended and the process switches back to static image recognition.
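Putting the pieces together, the fig. 6 flow can be sketched as a loop that stays in static mode until the start sequence appears, collects trajectory points in dynamic mode, and returns to static mode when the end sequence appears (here the same ABA sequence is reused for both, which the text permits). All helpers are the assumed sketches above, not the patent's own code.

    def recognition_loop(frames, detect_hand, static_model, dynamic_model, device):
        mode, trajectory = "static", []
        for frame in frames:
            category = recognize_static_gesture(frame, detect_hand, static_model)
            if mode == "static":
                execute_static(category, device)         # second operation instruction
                if update_and_check(category):           # first predetermined condition
                    mode, trajectory = "dynamic", []
            else:
                x1, y1, x2, y2 = detect_hand(frame)
                trajectory.append(((x1 + x2) / 2.0, (y1 + y2) / 2.0))
                if update_and_check(category):           # second predetermined condition
                    action = dynamic_model(preprocess_trajectory(trajectory))
                    instruction = DYNAMIC_INSTRUCTIONS.get(action)
                    if instruction is not None:
                        device.execute(instruction)      # first operation instruction
                    mode = "static"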
It is easy to understand that, in this embodiment, the static image recognition operation and the dynamic image recognition operation may be performed by the device 63 or by another device, and the image to be processed and the continuous multi-frame moving images may be acquired by an image acquisition device of the device 63 or by another image acquisition device, which is not limited in this embodiment.
According to the embodiment of the invention, the target part state category corresponding to the image to be processed is determined; when the category sequence of a plurality of continuously determined target part state categories meets the first predetermined condition, dynamic image recognition is started; the motion trajectory of the target part is then acquired and input into the dynamic recognition model for processing to determine the corresponding target action category. In this way, the computation and power consumption of image recognition can be reduced, and image recognition efficiency improved.
Fig. 7 is a schematic diagram of an image recognition apparatus according to an embodiment of the present invention. As shown in fig. 7, the image recognition apparatus 7 of the present embodiment includes an image acquisition unit 71, a first category determination unit 72, a dynamic starting unit 73, a motion trajectory acquisition unit 74, and a second category determination unit 75.
The image acquisition unit 71 is configured to acquire an image to be processed. The first category determination unit 72 is configured to determine the target part state category corresponding to the image to be processed. The dynamic starting unit 73 is configured to start dynamic image recognition in response to the category sequence of a plurality of continuously determined target part state categories satisfying a first predetermined condition. The motion trajectory acquisition unit 74 is configured to acquire the motion trajectory of the target part. The second category determination unit 75 is configured to input the motion trajectory of the target part into a dynamic recognition model for processing and determine the corresponding target action category.
In an alternative implementation, the first category determining unit 72 comprises a second target area determining subunit 721 and a first category determining subunit 722. The second target area determining subunit 721 is configured to perform target detection on the image to be processed, and determine a target area of the image to be processed. The first class determination subunit 722 is configured to input the target region of the image to be processed into a static recognition model for processing, so as to determine the target part state class.
In an alternative implementation, the motion trajectory acquisition unit 74 includes an image acquisition subunit 741, a first target region determination subunit 742, and a motion trajectory acquisition subunit 743. The image acquisition subunit 741 is configured to acquire a multi-frame moving image of the target portion during movement. The first target region determining subunit 742 is configured to perform target detection on a plurality of frames of the moving images, and determine a target region corresponding to each of the moving images. The motion trajectory acquiring subunit 743 is configured to determine the motion trajectory from the central point of each of the target regions.
In an alternative implementation, the image recognition device 7 further comprises a dynamic ending unit 76. The dynamic ending unit 76 is configured to end the dynamic image recognition operation in response to the category sequence of the plurality of target site state categories determined consecutively satisfying the second predetermined condition. Optionally, the first predetermined condition is that the category sequence is the same as a preset dynamic recognition start sequence, and the second predetermined condition is that the category sequence is the same as a preset dynamic recognition end sequence.
In an alternative implementation, the image recognition device 7 further comprises a display control unit 77. The display control unit 77 is configured to display the motion trajectory on the corresponding display page.
In an alternative implementation, the image recognition apparatus 7 further includes a first instruction determining unit 78 and a first executing unit 79. The first instruction determination unit 78 is configured to determine a corresponding first operation instruction according to the target action category. The first execution unit 79 is configured to execute the corresponding operation according to the first operation instruction.
In an alternative implementation, the image recognition device 7 further comprises a second instruction determining unit 7A and a second execution unit 7B. The second instruction determining unit 7A is configured to determine a corresponding second operation instruction according to the target site state category. The second execution unit 7B is configured to execute the corresponding operation according to the second operation instruction.
In an alternative implementation, the image recognition apparatus 7 further includes a dynamic recognition model training unit 7C, which includes a first acquisition subunit 7C1, a first sample data acquisition subunit 7C2, and a first training subunit 7C3. The first acquisition subunit 7C1 is configured to acquire the motion trajectory of each target action category. The first sample data acquisition subunit 7C2 is configured to perform category labeling on each acquired motion trajectory to obtain first sample data. The first training subunit 7C3 is configured to train the dynamic recognition model from the first sample data.
In an alternative implementation, the image recognition apparatus 7 further comprises a static recognition model training unit 7D, which includes a second acquisition subunit 7D1, a second sample data acquisition subunit 7D2, and a second training subunit 7D3. The second acquisition subunit 7D1 is configured to acquire a still image including each target site state category. The second sample data acquisition subunit 7D2 is configured to perform category labeling on each still image to obtain second sample data. The second training subunit 7D3 is configured to train the static recognition model from the second sample data.
According to the embodiment of the invention, the target part state category corresponding to the image to be processed is determined; when the category sequence of a plurality of continuously determined target part state categories meets the first predetermined condition, dynamic image recognition is started; the motion trajectory of the target part is then acquired and input into the dynamic recognition model for processing to determine the corresponding target action category. In this way, the computation and power consumption of image recognition can be reduced, and image recognition efficiency improved.
Fig. 8 is a schematic diagram of an electronic device of an embodiment of the invention. As shown in fig. 8, the electronic device is a general-purpose data processing apparatus comprising a general-purpose computer hardware structure that includes at least a processor 81 and a memory 82, connected by a bus 83. The memory 82 is adapted to store instructions or programs executable by the processor 81. The processor 81 may be a stand-alone microprocessor or a collection of one or more microprocessors. Thus, the processor 81 implements the processing of data and the control of other devices by executing the instructions stored in the memory 82, thereby performing the method flows of the embodiments of the present invention as described above. Optionally, the electronic device 8 further comprises a display controller 84, input/output (I/O) devices 85, an input/output (I/O) controller 86, and an acquisition device 87, where the acquisition device 87 is used for acquiring images or video. The bus 83 connects these components together, including the display controller 84 with its display device and the input/output (I/O) devices 85. The input/output (I/O) devices 85 may be a mouse, keyboard, modem, network interface, touch input device, motion sensing input device, printer, or other devices known in the art. Typically, the input/output devices 85 are coupled to the system through the input/output (I/O) controller 86.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, apparatus (device) or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may employ a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations of methods, apparatus (devices) and computer program products according to embodiments of the application. It will be understood that each flow in the flow diagrams can be implemented by computer program instructions.
These computer program instructions may be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows.
These computer program instructions may also be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows.
Another embodiment of the invention is directed to a non-transitory storage medium storing a computer-readable program for causing a computer to perform some or all of the above-described method embodiments.
That is, as can be understood by those skilled in the art, all or part of the steps of the methods in the embodiments described above may be implemented by a program instructing related hardware. The program is stored in a storage medium and includes several instructions that enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (14)

1. An image recognition method, characterized in that the method comprises:
acquiring an image to be processed;
determining the state type of a target part corresponding to the image to be processed;
in response to a category sequence of a plurality of continuously determined target part state categories meeting a first predetermined condition, starting dynamic image recognition;
acquiring a motion track of a target part;
and inputting the motion trail of the target part into a dynamic recognition model for processing, and determining the corresponding target action type.
2. The method of claim 1, wherein acquiring the motion trajectory of the target part comprises:
acquiring a multi-frame moving image of the target part in the moving process;
carrying out target detection on a plurality of frames of moving images, and determining a target area corresponding to each moving image;
and determining the motion track according to the central point of each target area.
3. The method of claim 1, further comprising:
and ending the dynamic image recognition operation in response to the continuously determined category sequence of the plurality of target part state categories meeting a second predetermined condition.
4. The method according to claim 1, wherein the first predetermined condition is that the category sequence is the same as a preset dynamic recognition start sequence, and the second predetermined condition is that the category sequence is the same as a preset dynamic recognition end sequence.
5. The method according to claim 1, wherein determining the target site state category corresponding to the image to be processed comprises:
carrying out target detection on the image to be processed, and determining a target area of the image to be processed;
and inputting the target area of the image to be processed into a static recognition model for processing so as to determine the state type of the target part.
6. The method according to claim 1 or 2, characterized in that the method further comprises:
and displaying the motion trail on a corresponding display page.
7. The method of claim 1, further comprising:
determining a corresponding first operation instruction according to the target action category;
and executing corresponding operation according to the first operation instruction.
8. The method of claim 5, further comprising:
determining a corresponding second operation instruction according to the state type of the target part;
and executing corresponding operation according to the second operation instruction.
9. The method of claim 1, wherein the dynamic recognition model is trained by:
acquiring the motion trail of each target action category;
carrying out category marking on each obtained motion track to obtain first sample data;
and training according to the first sample data to obtain the dynamic recognition model.
10. The method of claim 5, wherein the static recognition model is trained by:
acquiring a static image comprising the state category of each target part;
performing category marking on each static image to obtain second sample data;
and training according to the second sample data to obtain the static recognition model.
11. An image recognition apparatus, characterized in that the apparatus comprises:
an image acquisition unit configured to acquire an image to be processed;
a first class determination unit configured to determine a target part state class corresponding to the image to be processed;
a dynamic starting unit configured to start dynamic image recognition in response to a category sequence of a plurality of target site state categories determined consecutively satisfying a first predetermined condition;
a motion trajectory acquisition unit configured to acquire a motion trajectory of a target portion;
and the second type determination unit is configured to input the motion trail of the target part into a dynamic recognition model for processing, and determine a corresponding target action type.
12. An electronic device, comprising a memory and a processor, wherein the memory is configured to store one or more computer program instructions, wherein the one or more computer program instructions are executed by the processor to implement the method of any one of claims 1-10.
13. The electronic device of claim 12, further comprising a capture device configured to capture an image or video.
14. A computer-readable storage medium on which computer program instructions are stored, which computer program instructions, when executed by a processor, are to implement a method according to any one of claims 1-10.
CN202010833673.0A 2020-08-18 2020-08-18 Image recognition method and device, electronic equipment and computer-readable storage medium Pending CN112115791A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010833673.0A CN112115791A (en) 2020-08-18 2020-08-18 Image recognition method and device, electronic equipment and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010833673.0A CN112115791A (en) 2020-08-18 2020-08-18 Image recognition method and device, electronic equipment and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN112115791A (en) 2020-12-22

Family

ID=73804162

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010833673.0A Pending CN112115791A (en) 2020-08-18 2020-08-18 Image recognition method and device, electronic equipment and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN112115791A (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103336967A (en) * 2013-05-27 2013-10-02 东软集团股份有限公司 Hand motion trail detection method and apparatus
CN103390168A (en) * 2013-07-18 2013-11-13 重庆邮电大学 Intelligent wheelchair dynamic gesture recognition method based on Kinect depth information
CN104246668A (en) * 2012-01-10 2014-12-24 马克西姆综合产品公司 Method and apparatus for activating electronic devices with gestures
CN105787471A (en) * 2016-03-25 2016-07-20 南京邮电大学 Gesture identification method applied to control of mobile service robot for elder and disabled
CN106557672A (en) * 2015-09-29 2017-04-05 北京锤子数码科技有限公司 The solution lock control method of head mounted display and device
CN107085469A (en) * 2017-04-21 2017-08-22 深圳市茁壮网络股份有限公司 A kind of recognition methods of gesture and device
CN107563286A (en) * 2017-07-28 2018-01-09 南京邮电大学 A kind of dynamic gesture identification method based on Kinect depth information
CN108460313A (en) * 2017-02-17 2018-08-28 鸿富锦精密工业(深圳)有限公司 A kind of gesture identifying device and human-computer interaction system
CN108595003A (en) * 2018-04-23 2018-09-28 Oppo广东移动通信有限公司 Function control method and relevant device
CN108921101A (en) * 2018-07-04 2018-11-30 百度在线网络技术(北京)有限公司 Processing method, equipment and readable storage medium storing program for executing based on gesture identification control instruction
CN109784421A (en) * 2019-01-30 2019-05-21 北京朗镜科技有限责任公司 A kind of construction method and device of identification model

Similar Documents

Publication Publication Date Title
CN108197589B (en) Semantic understanding method, apparatus, equipment and the storage medium of dynamic human body posture
JP4050055B2 (en) Handwritten character batch conversion apparatus, handwritten character batch conversion method, and program
US8644556B2 (en) Image processing apparatus and method and program
JP5147933B2 (en) Man-machine interface device system and method
US20140295393A1 (en) Interactive rehabilitation method and system for movement of upper and lower extremities
US20090153468A1 (en) Virtual Interface System
KR101718837B1 (en) A method, a device, and an electronic equipment for controlling an Application Program
JP7149202B2 (en) Behavior analysis device and behavior analysis method
CN102456135A (en) Imaging processing apparatus, method and program
US20030214524A1 (en) Control apparatus and method by gesture recognition and recording medium therefor
WO2016084336A1 (en) Iterative training device, iterative training method, and storage medium
US20150193001A1 (en) Input device, apparatus, input method, and recording medium
KR101916675B1 (en) Gesture recognition method and system for user interaction
CN117197878B (en) Character facial expression capturing method and system based on machine learning
WO2016140628A1 (en) Sketch misrecognition correction based on eye gaze monitoring
CN103413137B (en) Based on the interaction gesture movement locus dividing method of more rules
RU2552192C2 (en) Method and system for man-machine interaction based on gestures and machine readable carrier to this end
CN110662587A (en) Game program, information processing device, information processing system, and game processing method
CN112115791A (en) Image recognition method and device, electronic equipment and computer-readable storage medium
KR101525011B1 (en) tangible virtual reality display control device based on NUI, and method thereof
CN113269008A (en) Pedestrian trajectory prediction method and device, electronic equipment and storage medium
JP2002015282A (en) Device and program for handwritten character recognition and computer-readable recording medium with recorded handwritten character recognizing program
CN104516566A (en) Handwriting input method and device
CN115705754A (en) Method and device for recognizing picture book
CN115966016B (en) Jump state identification method, system, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20201222)