WO2017041366A1 - 图像识别方法和装置 (Image recognition method and apparatus) - Google Patents

图像识别方法和装置 (Image recognition method and apparatus)

Info

Publication number
WO2017041366A1
Authority
WO
WIPO (PCT)
Prior art keywords
identified
image
information
recognition
target object
Prior art date
Application number
PCT/CN2015/096132
Other languages
English (en)
French (fr)
Inventor
龚龙
张彦福
顾嘉唯
Original Assignee
北京百度网讯科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京百度网讯科技有限公司
Priority to US15/535,006 (granted as US10796685B2)
Publication of WO2017041366A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/211 Selection of the most significant subset of features
    • G06F 18/2111 Selection of the most significant subset of features by using evolutionary computational techniques, e.g. genetic algorithms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F 18/24133 Distances to prototypes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V 10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V 10/451 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V 10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/20 Scenes; Scene-specific elements in augmented reality scenes
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 13/00 Speech synthesis; Text to speech systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 13/00 Speech synthesis; Text to speech systems
    • G10L 13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L 13/027 Concept to speech synthesisers; Generation of natural phrases from machine-based concepts

Definitions

  • the present application relates to the field of computers, and in particular to the field of image recognition, and more particularly to an image recognition method and apparatus.
  • Image recognition establishes a recognition model by analyzing the features of a massive number of images; the recognition model is then used to identify images.
  • In existing approaches, the recognition model can only be adjusted according to the recognition results output by the machine. When the output recognition results contain many errors, the adjustment of the recognition model may deviate, resulting in a decrease in recognition accuracy.
  • the present application provides an image recognition method and apparatus for solving the technical problems existing in the above background art.
  • The present application provides an image recognition method, the method comprising: acquiring an image to be identified that includes an object to be identified; sending the image to be identified to a server, and receiving identification information of the target object corresponding to the object to be identified, obtained by the server recognizing the image, together with a confidence parameter, where the confidence parameter represents the probability that the object to be identified is the target object; when the confidence parameter is greater than a confidence threshold, using the identification information of the target object as the recognition result; and when the confidence parameter is less than the confidence threshold, acquiring annotation information associated with the image to be identified from a third-party platform and using the annotation information as the recognition result.
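The client-side decision above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation; the function and parameter names (`choose_recognition_result`, `fetch_annotation`) are hypothetical.

```python
def choose_recognition_result(server_id_info, confidence, threshold,
                              fetch_annotation):
    """Return the final recognition result for an image.

    server_id_info:   identification info returned by the server.
    confidence:       probability that the object is the target object.
    fetch_annotation: callable that queries the third-party platform.
    """
    if confidence > threshold:
        # High confidence: trust the server's machine-learning result.
        return server_id_info
    # Low confidence: fall back to third-party annotation information.
    return fetch_annotation()

# Example: high-confidence server result is used directly.
result = choose_recognition_result("table, chair", 0.92, 0.8,
                                   lambda: "annotated: round table, 3 chairs")
```

The fallback callable is only invoked when needed, so the (slower, human-backed) third-party platform is consulted only for low-confidence images.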
  • The present application further provides an image recognition method, comprising: receiving an image to be identified that is sent by a client and includes an object to be identified; identifying the image to obtain identification information of the target object corresponding to the object to be identified and a confidence parameter, where the confidence parameter represents the probability that the object to be identified is the target object; and sending the identification information of the target object and the confidence parameter to the client.
  • The present application provides an image recognition apparatus, the apparatus comprising: an acquisition unit configured to acquire an image to be identified that includes an object to be identified; an interaction unit configured to send the image to be identified to a server and receive identification information of the target object corresponding to the object to be identified, obtained by the server recognizing the image, together with a confidence parameter, the confidence parameter characterizing the probability that the object to be identified is the target object; and a determining unit configured to use the identification information of the target object as the recognition result when the confidence parameter is greater than a confidence threshold, and, when the confidence parameter is less than the confidence threshold, to acquire annotation information associated with the image from a third-party platform and use the annotation information as the recognition result.
  • The present application provides an image recognition apparatus, comprising: a receiving unit configured to receive an image to be identified that is sent by a client and includes an object to be identified; an identification unit configured to identify the image and obtain identification information of the target object corresponding to the object to be identified and a confidence parameter, the confidence parameter characterizing the probability that the object to be identified is the target object; and a sending unit configured to send the identification information of the target object and the confidence parameter to the client.
  • The image recognition method and apparatus acquire an image to be recognized that includes an object to be identified; send the image to a server; and receive identification information of the target object corresponding to the object to be identified, obtained by the server recognizing the image, together with a confidence parameter. When the confidence parameter is greater than a confidence threshold, the identification information of the target object is used as the recognition result; when the confidence parameter is less than the confidence threshold, annotation information associated with the image is acquired from a third-party platform and used as the recognition result. This combines automatic server-side identification with third-party annotation information, improving recognition accuracy, and the third-party annotation information can further be used to train the recognition model corresponding to the machine learning recognition method used by the server, improving the training effect and thereby further improving the recognition accuracy.
  • FIG. 1 is an exemplary system architecture diagram to which the present application can be applied;
  • FIG. 2 shows a flow chart of one embodiment of an image recognition method according to the present application;
  • FIG. 3 shows a flow chart of another embodiment of an image recognition method according to the present application.
  • FIG. 4 is a block diagram showing the structure of an embodiment of an image recognition apparatus according to the present application.
  • FIG. 5 is a block diagram showing another embodiment of an image recognition apparatus according to the present application.
  • FIG. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server of an embodiment of the present application.
  • FIG. 1 illustrates an exemplary system architecture 100 in which an embodiment of an image recognition method or image recognition device of the present application may be applied.
  • The system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105.
  • The network 104 serves as a medium providing communication links between the terminal devices 101, 102, 103 and the server 105.
  • The network 104 may include various types of connections, such as wired or wireless transmission links, or fiber optic cables, to name a few.
  • the user can interact with the server 105 over the network 104 using the terminal devices 101, 102, 103 to receive or transmit messages and the like.
  • Various communication applications, such as image recognition applications and instant messaging tools, may be installed on the terminal devices 101, 102, and 103.
  • The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting network communication, including but not limited to smartphones, tablets, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.
  • the server 105 may be a server that provides various services, such as a background server that provides support for image recognition applications on the terminal devices 101, 102, 103.
  • The background server may process the received image to be recognized, such as analyzing it, and feed back the processing result (the target object) to the terminal device.
  • The end that sends the image to be identified may be referred to as the client; the client does not refer to a specific type of terminal and may be any of the terminal devices 101, 102, 103 or the server 105.
  • The numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. Depending on implementation needs, there may be any number of terminal devices, networks, and servers.
  • FIG. 2 illustrates a flow 200 of one embodiment of an image recognition method in accordance with the present application.
  • The image recognition method provided by this embodiment of the present application is generally performed by the terminal devices 101, 102, and 103; accordingly, the image recognition apparatus is generally disposed in the terminal devices 101, 102, and 103. The method includes the following steps:
  • Step 201 Acquire an image to be identified that includes an object to be identified.
  • the image to be recognized may be collected by the camera, and the camera may be disposed on the terminal device.
  • the terminal device may include, but is not limited to, a mobile terminal and a wearable device (eg, smart glasses). Taking the camera on the smart glasses as an example, when the user wears the smart glasses, the camera can be used to capture the image within the viewing angle range of the camera as the image to be recognized.
  • the camera can be turned on for image acquisition in response to the input image acquisition instruction.
  • For example, voice information input by the user may be received through a microphone and parsed to obtain an image acquisition instruction, which triggers the camera to perform image acquisition.
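A minimal sketch of such voice-triggered acquisition, assuming the speech has already been transcribed to text; the trigger phrases and function name here are illustrative assumptions, not from the patent.

```python
# Hypothetical trigger phrases for the image acquisition instruction.
TRIGGER_PHRASES = ("take a picture", "what is in front of me", "identify this")

def parse_voice_command(transcript):
    """Return True if the transcribed voice input should trigger
    image acquisition by the camera."""
    text = transcript.strip().lower()
    return any(phrase in text for phrase in TRIGGER_PHRASES)
```

In a real system the transcript would come from a speech recognition engine, and a True result would switch on the camera.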
  • The image to be identified includes the object to be identified. For example, when the user enters a conference site, the camera on the wearable device worn by the user may collect an image associated with the conference site scene; the image may include objects to be identified such as tables or chairs.
  • the object to be identified includes at least one of the following: an object object, a scene object, and a color object.
  • Step 202 Send the to-be-identified image to the server, and receive the identification information of the target object corresponding to the object to be identified obtained by identifying the image to be recognized returned by the server, and a confidence parameter.
  • the confidence parameter characterizes the probability that the object to be identified is the target object.
  • the image to be identified may be sent to the server to identify the object to be identified in the image, and then the target object corresponding to the object to be identified obtained by identifying the image to be recognized by the server and the confidence parameter may be received.
  • Optionally, the server may identify the image to be recognized using a machine learning recognition mode.
  • the confidence parameter may be used to characterize the probability that the object to be identified is the target object when the image to be recognized is identified, that is, the similarity between the object to be identified and the sample data of the target object. The higher the value of the confidence parameter, the greater the probability that the object to be identified is the target object.
  • Step 203: When the confidence parameter is greater than the confidence threshold, use the identification information of the target object as the recognition result; when the confidence parameter is less than the confidence threshold, acquire the annotation information associated with the image to be identified from the third-party platform and use the annotation information as the recognition result.
  • Optionally, the annotation information includes information published by a registered user of the third-party platform that contains the identification information of the target object corresponding to the object to be identified.
  • Based on the confidence parameter, the recognition result of the image to be recognized may then be determined.
  • the identification information of the target object may be used as the recognition result.
  • When the confidence parameter is less than the confidence threshold, the annotation information associated with the image to be identified may be acquired from a third-party platform. For example, suppose the image to be recognized includes a circular table and three chairs. When the server recognizes the image in a machine identification manner, the objects to be identified are matched against sample data of the target objects, such as the circular table object and the chair object. If the confidence parameter is greater than the confidence threshold, the identification information of the target objects, that is, the table and the chairs, is used as the recognition result. Otherwise, the image to be identified may be sent to the third-party platform, the returned annotation information received, and that annotation information used as the recognition result.
  • Optionally, the annotation information of the object to be identified may be acquired in the following manner: the image to be identified is sent to a third-party platform associated with the server, where the third-party platform provides a question-and-answer service. In the question-and-answer service, a user's question is published in the form of a task, and registered users of the third-party platform post answers to the question on the platform.
  • The question-and-answer service on the third-party platform can thus be used to generate a task for identifying the image to be identified, and the task is then sent to registered users of the third-party platform.
  • the information input area may be provided while presenting the image to be recognized to the registered user.
  • The registered user can determine which target objects are included in the image to be identified and then fill in the information input area with the names and numbers of the target objects to generate the annotation information. For example, if the image in the identification task accepted by the registered user includes a circular table and three chairs, the registered user can fill in the information input area in the following format: circle, table, one, chair, three.
  • the annotation information can then be generated based on the information filled in by the registered user.
  • The annotation information includes "circular table" and "chair", which are the identification information of the target objects corresponding to the objects to be identified, and may include "one" and "three", which indicate the numbers of the target objects.
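The comma-separated entry above could be parsed into structured annotation data along these lines. This is a hedged sketch: the name-followed-by-count format and the number-word table are assumptions based on the example, not a format specified by the patent.

```python
# Illustrative word-to-number table; real annotation formats would need
# a richer parser.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}

def parse_annotation(entry):
    """Parse an entry like 'table, one, chair, three' into a mapping
    from object name to count, e.g. {'table': 1, 'chair': 3}."""
    tokens = [t.strip().lower() for t in entry.split(",")]
    counts = {}
    # Pair each object name with the count word that follows it.
    for name, number in zip(tokens[::2], tokens[1::2]):
        counts[name] = NUMBER_WORDS.get(number, 0)
    return counts
```

Structured counts like these are easier to convert into both the spoken recognition result and a training label than the raw free-text entry.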
  • the method further includes: converting the recognition result into voice information, and playing the voice information.
  • The recognition result can be converted into voice information, and the voice information can then be played to the user.
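Before playback, the recognition result must be rendered as a sentence for a text-to-speech engine. A minimal sketch of that formatting step follows; the function name and English phrasing are illustrative assumptions, and the resulting string would then be handed to whatever speech synthesis facility the device provides.

```python
def recognition_result_to_speech_text(counts):
    """Format a recognition result such as {'table': 1, 'chair': 3}
    as a sentence suitable for text-to-speech playback."""
    parts = [f"{n} {name}{'s' if n > 1 else ''}"
             for name, n in counts.items()]
    return "The image contains " + " and ".join(parts) + "."

sentence = recognition_result_to_speech_text({"table": 1, "chair": 3})
```

Keeping the formatting separate from the synthesis step makes it easy to localize the sentence or change the phrasing without touching the audio pipeline.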
  • Optionally, the method further includes: when the confidence parameter is less than the confidence threshold, sending the annotation information to the server as a training sample for training the recognition model corresponding to the machine learning identification mode used by the server.
  • An application scenario of this embodiment may be as follows: a user (for example, a blind user) uses the camera on a worn wearable device to collect an image to be recognized associated with the current scene (for example, an image of a conference site containing objects to be identified such as tables and chairs).
  • The image to be identified may then be sent to the server for identification, and the identification information of the target object corresponding to the object to be identified and the confidence parameter returned by the server may be received.
  • When the confidence parameter is greater than the confidence threshold (for example, when the table or chair object is correctly recognized), the identification information of the target object corresponding to the object to be identified may be used as the recognition result.
  • When the confidence parameter is less than the confidence threshold, the image to be identified may be sent to the third-party platform, where a registered user determines the target object corresponding to the object to be identified (for example, the registered user determines that the image includes target objects such as a table or a chair). The annotation information returned by the third-party platform, which includes the identification information of the target object corresponding to the object to be identified, is then used as the recognition result. After the recognition result is determined, it can be converted into voice information for playback. This allows the user to understand the current scene (for example, which objects it contains) more accurately based on the acquired image.
  • The annotation information may also be sent to the server as sample data to train the recognition model corresponding to the machine learning recognition mode used by the server, so as to improve the training effect of the recognition model and further improve the recognition accuracy in subsequent image recognition.
  • FIG. 3 illustrates a flow 300 of one embodiment of an image recognition method in accordance with the present application.
  • The image recognition method provided by this embodiment of the present application is generally performed by the server 105; accordingly, the image recognition apparatus is generally disposed in the server 105. The method includes the following steps:
  • Step 301 Receive an image to be identified that is sent by the client and that includes the object to be identified.
  • the image to be identified includes the object to be identified.
  • For example, an image can be acquired using the camera on smart glasses, and the collected image may include objects to be identified such as a table and chairs.
  • Step 302 Identify the image to be identified, and obtain identification information of the target object corresponding to the object to be identified and a confidence parameter.
  • the confidence parameter characterizes the probability that the object to be identified is the target object.
  • An optional way to identify the object to be identified is machine learning. Machine learning methods may include, but are not limited to, Auto Encoders, Sparse Coding, and Deep Belief Networks; such methods may also be called deep learning.
  • identifying the image to be recognized includes: identifying the image to be recognized by using a convolutional neural network model.
  • The recognition model corresponding to the machine learning recognition method may first be established, and the image to be recognized is then identified by the recognition model.
  • The principle of recognizing the image with the recognition model corresponding to the machine learning mode can be summarized as follows: when identifying the image with a recognition model (for example, a convolutional neural network model), the object to be identified is represented by some of its features (for example, scale-invariant feature transform feature points), from which an input vector is generated. After the recognition model processes the image, an output vector representing the target object corresponding to the object to be identified is obtained. The recognition model represents the mapping from the input vector to the output vector, and the image can be identified based on this mapping. In this process, the object to be identified (for example, a table) is matched against sample data of the target object (for example, sample data of the table object), yielding the confidence parameter that characterizes the probability that the object to be identified is the target object.
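The mapping from input vector to output vector, and the derivation of a confidence parameter, can be illustrated with a toy classifier. This stands in for the convolutional neural network and is not the patent's actual model; in practice the per-class scores would come from the network's final layer rather than being supplied directly.

```python
import math

def softmax(scores):
    """Convert raw per-class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def recognize(scores, class_names):
    """Return (identification info, confidence parameter) for the
    most probable target object."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Example: scores for three candidate target objects.
label, confidence = recognize([2.0, 0.5, 0.1], ["table", "chair", "lamp"])
```

The top softmax probability plays the role of the confidence parameter: the client compares it against the confidence threshold to decide between the server's result and third-party annotation.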
  • Optionally, the method further includes: receiving a training recognition result sent by the client, where the training recognition result includes the annotation information associated with the image to be identified acquired from the third-party platform, the annotation information including the identification information of the target object corresponding to the object to be identified published by a registered user of the third-party platform; and training the recognition model corresponding to the machine learning mode using the training recognition result.
  • The training recognition result sent by the client may be the annotation information, associated with the image to be recognized, that the client obtained from the third-party platform when the confidence parameter produced by the machine learning identification was less than the confidence threshold.
  • The annotation information includes the identification information of the target object corresponding to the object to be identified published by a registered user of the third-party platform. For example, if the image to be recognized includes a circular table and three chairs, and the confidence parameter obtained by identifying the image in a machine learning manner is less than the confidence threshold, the circular table and the chairs cannot be accurately recognized.
  • the client can be triggered to send the image to be identified to a third-party platform (for example, a third-party platform that provides a question and answer service) to obtain annotation information for the image.
  • the annotation information may be information of the identification information of the target object corresponding to the object to be identified that is published by the registered user on the third-party platform.
  • the annotation information includes "round table, one, chair, three".
  • the recognition model can be trained using the annotation information.
  • The features of the image to be identified (such as scale-invariant feature transform feature points) can be used as the input vector of the convolutional neural network, and the annotation information as its ideal output vector. The convolutional neural network is then trained on this input-output vector pair, so that the correct recognition result, that is, the annotation information obtained by registered users of the third-party platform manually recognizing the image, is used to train the recognition model. This improves the training effect of the recognition model and the recognition accuracy when subsequently recognizing images.
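The training step above, which uses annotation information as the ideal output, can be illustrated with a deliberately simplified stand-in for the convolutional neural network: a one-layer linear model nudged toward the human-provided target by gradient descent. All names and values here are illustrative assumptions.

```python
def train_step(weights, features, target, lr=0.1):
    """One gradient-descent step minimizing the squared error between
    the model output (a dot product) and the annotation-derived target."""
    prediction = sum(w * x for w, x in zip(weights, features))
    error = prediction - target
    # Move each weight against the error gradient.
    return [w - lr * error * x for w, x in zip(weights, features)]

# Repeatedly fit a (feature vector, annotation target) pair.
w = [0.0, 0.0]
for _ in range(50):
    w = train_step(w, [1.0, 2.0], 1.0)
prediction = sum(wi * xi for wi, xi in zip(w, [1.0, 2.0]))
```

A real implementation would backpropagate through the convolutional layers, but the core idea is the same: the third-party annotation supplies the target that the model's output is pulled toward.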
  • Sample data corresponding to the type of the object to be identified may also be set in advance according to that type, and the recognition model trained using the sample data. For example, images of common application scenes and their annotation information may be acquired in advance as training data.
  • Step 303 Send the identification information of the target object and the confidence parameter to the client.
  • the identification information of the target object corresponding to the object to be identified in the image to be identified and the obtained confidence parameter may be sent to the client.
  • As shown in FIG. 4, the apparatus 400 includes an acquisition unit 401, an interaction unit 402, and a determination unit 403.
  • the obtaining unit 401 is configured to acquire an image to be identified that includes the object to be identified;
  • The interaction unit 402 is configured to send the image to be identified to the server and to receive the identification information of the target object corresponding to the object to be identified, obtained by the server recognizing the image, together with the confidence parameter, where the confidence parameter represents the probability that the object to be identified is the target object;
  • The determining unit 403 is configured to use the identification information of the target object as the recognition result when the confidence parameter is greater than the confidence threshold, and, when the confidence parameter is less than the confidence threshold, to acquire the annotation information associated with the image to be identified from the third-party platform and use the annotation information as the recognition result.
  • the annotation information includes information that is published by a registered user of the third-party platform and includes identification information of the target object corresponding to the object to be identified.
  • Optionally, the apparatus 400 further includes: a playback unit (not shown) configured to convert the recognition result into voice information and play the voice information.
  • the apparatus 400 further includes: an annotation information sending unit (not shown) configured to send the label information to the server when the confidence parameter is less than the confidence threshold, as The training samples are used for training the recognition model corresponding to the machine learning recognition method used by the server.
  • the object to be identified includes at least one of the following: a physical object, a scene object, and a color object.
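The threshold logic of the determining unit 403 can be sketched in a few lines of Python. The threshold value, the `ServerReply` shape, and the `fetch_annotation` callable are illustrative assumptions of this sketch, not part of the patent.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.8  # assumed value; the patent does not fix a number


@dataclass
class ServerReply:
    label: str         # identification information of the target object
    confidence: float  # probability that the object is the target object


def resolve_recognition(reply, fetch_annotation):
    """Return the final recognition result for one image.

    `fetch_annotation` stands in for the third-party-platform query and
    returns the human-written annotation associated with the image.
    """
    if reply.confidence > CONFIDENCE_THRESHOLD:
        # Server is confident enough: keep the automatic result.
        return reply.label
    # Otherwise fall back to the third-party annotation information.
    return fetch_annotation()
```

Usage: a high-confidence reply keeps the server label, a low-confidence reply falls back to the annotation, e.g. `resolve_recognition(ServerReply("table", 0.41), lambda: "round table")` yields the annotation string.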
  • the apparatus 500 includes a receiving unit 501, an identifying unit 502, and a transmitting unit 503.
  • the receiving unit 501 is configured to receive an image to be identified that is sent by the client and that includes the object to be identified;
  • the identifying unit 502 is configured to identify the image to be identified, and obtain the identification information of the target object and the confidence parameter corresponding to the object to be identified.
  • the confidence parameter characterizes the probability that the object to be identified is the target object;
  • the sending unit 503 is configured to send the identification information of the target object and the confidence parameter to the client.
  • the identification unit 502 includes a neural network sub-unit (not shown) configured to recognize the image to be identified using a convolutional neural network model.
  • the apparatus 500 further includes: a recognition result receiving unit (not shown) configured to receive the training recognition result sent by the client, where the training recognition result includes the annotation information associated with the image to be identified acquired from the third-party platform, the annotation information including information, published by registered users of the third-party platform, that contains the identification information of the target object corresponding to the object to be identified; and a training unit (not shown) configured to train the recognition model corresponding to the machine learning method using the training recognition result.
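The server-side receive–recognize–send flow implemented by units 501–503 can be sketched as follows. The dictionary reply format and the stubbed scoring model are assumptions of this sketch; the patent itself uses a convolutional neural network as the model.

```python
def handle_recognition_request(image_bytes, model):
    """Server-side flow: recognize the image, return label and confidence.

    `model` is any callable mapping image bytes to per-class scores;
    it stands in for the recognition model of the patent.
    """
    scores = model(image_bytes)
    label = max(scores, key=scores.get)            # best-matching target object
    confidence = scores[label] / sum(scores.values())  # crude normalisation
    return {"identification": label, "confidence": confidence}


# A stub model with fixed scores, purely for illustration.
stub_model = lambda img: {"table": 8.0, "chair": 2.0}
reply = handle_recognition_request(b"...image bytes...", stub_model)
```

The normalisation here is only a placeholder for whatever probability the real model emits; the point is the shape of the reply sent back to the client.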
  • Referring to FIG. 6, a block diagram of a computer system 600 suitable for implementing a terminal device or server of an embodiment of the present application is shown.
  • the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in read-only memory (ROM) 602 or a program loaded from the storage portion 608 into random access memory (RAM) 603.
  • in the RAM 603, various programs and data required for the operation of the system 600 are also stored.
  • the CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input/output (I/O) interface 605 is also coupled to bus 604.
  • the following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including, for example, a cathode ray tube (CRT), a liquid crystal display (LCD), and a speaker; a storage portion 608 including a hard disk and the like; and a communication portion 609 including a network interface card such as a LAN card or a modem. The communication portion 609 performs communication processing via a network such as the Internet.
  • a drive 610 is also connected to the I/O interface 605 as needed.
  • a removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory or the like, is mounted on the drive 610 as needed so that a computer program read therefrom is installed into the storage portion 608 as needed.
  • an embodiment of the present disclosure includes a computer program product comprising a computer program tangibly embodied on a machine readable medium, the computer program comprising program code for executing the method illustrated in the flowchart.
  • the computer program can be downloaded and installed from the network via communication portion 609, and/or installed from removable media 611.
  • each block of the flowchart or block diagrams can represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that shown in the drawings. For example, two successively represented blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts can be implemented in a dedicated hardware-based system that performs the specified function or operation. Or it can be implemented by a combination of dedicated hardware and computer instructions.
  • the unit or module involved in the embodiment of the present application may be implemented by software or by hardware.
  • the described unit or module may also be disposed in a processor; for example, a processor may be described as including an acquisition unit, a receiving unit, and a processing unit.
  • the name of these units does not constitute a limitation on the unit itself in some cases.
  • the obtaining unit may also be described as "a unit for acquiring an image to be recognized that contains an object to be identified".
  • the present application further provides a non-volatile computer storage medium, which may be the non-volatile computer storage medium included in the apparatus described in the foregoing embodiments; it may also be a non-volatile computer storage medium that exists alone and is not assembled into the terminal.
  • the non-volatile computer storage medium stores one or more programs which, when executed by a device, cause the device to: acquire an image to be recognized that includes an object to be identified; send the image to the server, and receive, from the server, the identification information of the target object corresponding to the object to be identified, obtained by recognizing the image, together with a confidence parameter, wherein the confidence parameter represents the probability that the object to be identified is the target object; when the confidence parameter is greater than a confidence threshold, use the identification information of the target object as the recognition result; when the confidence parameter is less than the confidence threshold, acquire the annotation information associated with the image to be identified from a third-party platform, and use the annotation information as the recognition result.
  • the non-volatile computer storage medium stores one or more programs, when the one or more programs are executed by a device, causing the device to: receive an image to be recognized that is sent by a client and includes an object to be identified; Identifying the image for identification, obtaining identification information of the target object corresponding to the object to be identified, and a confidence parameter, the confidence parameter characterizing the probability that the object to be identified is the target object; and transmitting the identification information of the target object and the confidence parameter to the client.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Physiology (AREA)
  • Image Analysis (AREA)

Abstract

The present application discloses an image recognition method and apparatus. A specific embodiment of the method comprises: acquiring an image to be recognized that contains an object to be recognized; sending the image to a server, and receiving, from the server, the identification information of the target object corresponding to the object to be recognized, obtained by recognizing the image, together with a confidence parameter; when the confidence parameter is greater than a confidence threshold, using the identification information of the target object as the recognition result; when the confidence parameter is less than the confidence threshold, acquiring annotation information associated with the image from a third-party platform and using the annotation information as the recognition result. This combines the server's automatic recognition with third-party annotation information, improving recognition accuracy, and uses the third-party annotation information to train the recognition model corresponding to the machine learning recognition method used by the server, improving the training effect of the recognition model.

Description

Image recognition method and apparatus
Cross-reference to related applications
This application claims priority to Chinese Patent Application No. 201510567452.2, filed on September 8, 2015, the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the field of computers, specifically to the field of image recognition, and in particular to an image recognition method and apparatus.
Background
In daily life, users sometimes need to recognize images they have captured. In known techniques, image recognition is performed by analyzing the features of massive numbers of images to build a recognition model, which is then used to recognize images. However, when images are recognized in this way, on the one hand, the recognition process consumes considerable resources and is therefore unsuitable for individual users; on the other hand, the recognition model can only be adjusted according to the recognition results output by the machine, so when the output results contain many errors, the adjustment of the model is biased, which in turn lowers recognition accuracy.
Summary
The present application provides an image recognition method and apparatus to solve the technical problems described in the Background section.
In a first aspect, the present application provides an image recognition method, comprising: acquiring an image to be recognized that contains an object to be recognized; sending the image to a server, and receiving, from the server, the identification information of the target object corresponding to the object to be recognized, obtained by recognizing the image, together with a confidence parameter, the confidence parameter representing the probability that the object to be recognized is the target object; when the confidence parameter is greater than a confidence threshold, using the identification information of the target object as the recognition result; when the confidence parameter is less than the confidence threshold, acquiring annotation information associated with the image from a third-party platform and using the annotation information as the recognition result.
In a second aspect, the present application provides an image recognition method, comprising: receiving an image to be recognized, containing an object to be recognized, sent by a client; recognizing the image to obtain the identification information of the target object corresponding to the object to be recognized and a confidence parameter, the confidence parameter representing the probability that the object to be recognized is the target object; and sending the identification information of the target object and the confidence parameter to the client.
In a third aspect, the present application provides an image recognition apparatus, comprising: an acquisition unit configured to acquire an image to be recognized that contains an object to be recognized; an interaction unit configured to send the image to a server and to receive, from the server, the identification information of the target object corresponding to the object to be recognized, obtained by recognizing the image, together with a confidence parameter, the confidence parameter representing the probability that the object to be recognized is the target object; and a determination unit configured to use the identification information of the target object as the recognition result when the confidence parameter is greater than a confidence threshold, and, when the confidence parameter is less than the confidence threshold, to acquire annotation information associated with the image from a third-party platform and use the annotation information as the recognition result.
In a fourth aspect, the present application provides an image recognition apparatus, comprising: a receiving unit configured to receive an image to be recognized, containing an object to be recognized, sent by a client; a recognition unit configured to recognize the image to obtain the identification information of the target object corresponding to the object to be recognized and a confidence parameter, the confidence parameter representing the probability that the object to be recognized is the target object; and a sending unit configured to send the identification information of the target object and the confidence parameter to the client.
The image recognition method and apparatus provided by the present application acquire an image to be recognized that contains an object to be recognized; send the image to a server and receive, from the server, the identification information of the target object corresponding to the object to be recognized, obtained by recognizing the image, together with a confidence parameter; use the identification information of the target object as the recognition result when the confidence parameter is greater than a confidence threshold; and, when the confidence parameter is less than the confidence threshold, acquire annotation information associated with the image from a third-party platform and use the annotation information as the recognition result. This combines the server's automatic recognition with third-party annotation information, improving recognition accuracy, and uses the third-party annotation information to train the recognition model corresponding to the machine learning recognition method used by the server, improving the training effect and thereby further improving recognition accuracy.
Brief description of the drawings
Other features, objects, and advantages of the present application will become more apparent upon reading the detailed description of non-limiting embodiments made with reference to the following drawings:
FIG. 1 is an exemplary system architecture diagram to which the present application can be applied;
FIG. 2 shows a flowchart of an embodiment of the image recognition method according to the present application;
FIG. 3 shows a flowchart of another embodiment of the image recognition method according to the present application;
FIG. 4 shows a schematic structural diagram of an embodiment of the image recognition apparatus according to the present application;
FIG. 5 shows a schematic structural diagram of another embodiment of the image recognition apparatus according to the present application;
FIG. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device or server of embodiments of the present application.
Detailed description
The present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the relevant invention, not to limit it. It should also be noted that, for ease of description, only the parts related to the relevant invention are shown in the drawings.
It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with one another. The present application will be described in detail below with reference to the drawings and in conjunction with the embodiments.
FIG. 1 shows an exemplary system architecture 100 to which embodiments of the image recognition method or image recognition apparatus of the present application can be applied.
As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 serves as a medium providing transmission links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless transmission links, or fiber-optic cables.
Users may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104, for example to receive or send messages. Various communication applications, such as image recognition applications and instant messaging tools, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support network communication, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, and desktop computers.
The server 105 may be a server providing various services, for example a back-end server supporting the image recognition applications on the terminal devices 101, 102, 103. The back-end server may analyze and otherwise process received data such as the image to be recognized and feed the processing result (the target object) back to the terminal devices.
It should be noted that, in the embodiments of the present application, the end that acquires the image to be recognized may be called the client. The client does not refer specifically to a particular type of terminal; it may be any of the terminal devices 101, 102, 103 or the server 105.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers according to implementation needs.
Referring to FIG. 2, a flow 200 of an embodiment of the image recognition method according to the present application is shown. It should be noted that the image recognition method provided by this embodiment is generally executed by the terminal devices 101, 102, 103, and accordingly the image recognition apparatus is generally disposed in the terminal devices 101, 102, 103. The method comprises the following steps:
Step 201: acquire an image to be recognized that contains an object to be recognized.
In this embodiment, the image to be recognized may be captured by a camera. The camera may be disposed on a terminal device, which may include but is not limited to a mobile terminal or a wearable device (for example, smart glasses). Taking a camera disposed on smart glasses as an example, when the user wears the smart glasses, the camera may capture an image within its field of view as the image to be recognized. In this embodiment, the camera may be turned on for image capture in response to an input image-capture instruction. For example, voice information input by the user may be received through a microphone and parsed to obtain an image-capture instruction that triggers the camera. In this embodiment, the image to be recognized contains an object to be recognized. For example, when the user enters a meeting venue, the camera on the wearable device they are wearing may capture an image associated with the meeting-venue scene, and the captured image may then contain objects to be recognized such as the tables and chairs at the venue.
In some optional implementations of this embodiment, the object to be recognized includes at least one of the following: a physical object, a scene object, and a color object.
Step 202: send the image to be recognized to the server, and receive the identification information of the target object corresponding to the object to be recognized, obtained by the server through recognizing the image, together with a confidence parameter.
In this embodiment, the confidence parameter represents the probability that the object to be recognized is the target object. After the image is acquired, it may be sent to the server so that the object in the image can be recognized; the target object corresponding to the object to be recognized and the confidence parameter, obtained after the server recognizes the image, may then be received. One optional way for the server to recognize the image is a machine learning recognition method. In this embodiment, the confidence parameter may represent the probability that the object to be recognized is the target object, that is, the similarity between the object to be recognized and the sample data of the target object. The higher the value of the confidence parameter, the greater the probability that the object to be recognized is the target object.
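The confidence parameter is described as a probability. One common way for a classifier to turn raw matching scores into such a probability is a softmax; this is an assumption of the sketch below, not something the patent mandates, and the score values are invented for illustration.

```python
import math


def softmax(scores):
    """Convert raw per-class matching scores into a probability distribution."""
    m = max(scores.values())  # subtract the max for numerical stability
    exp = {k: math.exp(v - m) for k, v in scores.items()}
    total = sum(exp.values())
    return {k: v / total for k, v in exp.items()}


# Hypothetical raw scores from matching the object against sample data.
scores = {"table": 4.1, "chair": 1.3, "door": 0.2}
probs = softmax(scores)

# The best class becomes the target object; its probability is the
# confidence parameter returned to the client.
target, confidence = max(probs.items(), key=lambda kv: kv[1])
```

The higher the score margin over the other classes, the closer the confidence parameter is to 1, matching the description that a higher value means a higher probability that the object is the target object.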
Step 203: when the confidence parameter is greater than the confidence threshold, use the identification information of the target object as the recognition result; when the confidence parameter is less than the confidence threshold, acquire annotation information associated with the image from a third-party platform and use the annotation information as the recognition result.
In this embodiment, one optional form of annotation information includes information, published by registered users of the third-party platform, that contains the identification information of the target object corresponding to the object to be recognized. In this embodiment, after the confidence parameter returned by the server is obtained, the recognition result for the image can be further determined. When the confidence parameter is greater than the confidence threshold, the identification information of the target object may be used as the recognition result. When the confidence parameter is less than the confidence threshold, annotation information associated with the image may be acquired from the third-party platform. Take an image containing one round table and three chairs as an example: when the server recognizes the image by machine recognition, for example by matching the objects to be recognized against the sample data of the target objects (a round-table object and a chair object), then if the confidence parameter is greater than the confidence threshold, the identification information of the target table and chair objects, i.e. "table" and "chairs", may be used as the recognition result. If the confidence parameter is less than the confidence threshold, the image may be sent to the server, the annotation information returned by the server received, and the annotation information used as the recognition result.
In this embodiment, the annotation information for the object to be recognized may be obtained as follows. The image may be sent to a third-party platform associated with the server, which may provide a question-and-answer service. The question-and-answer service may issue questions submitted by users as tasks, and registered users of the third-party platform publish answers to those questions on the platform. After the image is sent to the third-party platform, the question-and-answer service may be used to generate a task of recognizing the image, and the task is then issued to the platform's registered users. When a registered user accepts the task, the image may be displayed to the user together with an information input area. The registered user may judge which target objects the image contains and then fill in information such as the names and numbers of the target objects in the input area, thereby generating the annotation information. For example, if the image in the accepted task contains one round table and three chairs, the registered user may fill in the input area in the following format: round, table, one, chairs, three. Annotation information may then be generated based on the information filled in by the registered user. The annotation information contains the identification information of the target objects corresponding to the objects to be recognized, i.e. "round table, chairs", and may also contain information representing the number of target objects, i.e. "one, three".
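The form-filling flow above can be sketched as follows; the `(name, count)` field layout and the output string format are hypothetical, chosen only to illustrate how a registered user's entries become annotation information.

```python
def build_annotation(entries):
    """Turn the (object name, count) pairs a registered user filled in
    on the third-party platform into one annotation string."""
    return "; ".join(f"{count} x {name}" for name, count in entries)


# A registered user recorded one round table and three chairs.
annotation = build_annotation([("round table", 1), ("chair", 3)])
```

The resulting string carries both the identification information of the target objects and their counts, which is all the patent requires of the annotation information.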
In some optional implementations of this embodiment, the method further includes: converting the recognition result into voice information and playing the voice information. In this implementation, after the final recognition result is obtained, it may be converted into voice information, which is then announced to the user.
In some optional implementations of this embodiment, the method further includes: when the confidence parameter is less than the confidence threshold, sending the annotation information to the server to serve as a training sample for training the recognition model corresponding to the machine learning recognition method used by the server.
An application scenario of this embodiment may be as follows: a user (for example, a blind user) uses the camera on a wearable device they are wearing to capture an image associated with their current scene (for example, a meeting venue), the image containing objects to be recognized such as the tables and chairs at the venue. The image may then be sent to the server for recognition, and the identification information of the target object corresponding to the object to be recognized and the confidence parameter returned by the server are received. When the confidence parameter is greater than the confidence threshold (for example, the table and chair objects are correctly recognized), the identification information of the target object may be used as the recognition result. When the confidence parameter is less than the confidence threshold, the image may be sent to the third-party platform, where registered users determine the target objects corresponding to the objects to be recognized (for example, a registered user determines that the image contains target objects such as tables and chairs); annotation information containing the identification information of those target objects, returned by the third-party platform, may then be received and used as the recognition result. After the recognition result is determined, it may be converted into voice information and played, so that the user can rather accurately understand the current scene (for example, which objects it contains) based on the captured image. Further, when the confidence parameter is less than the confidence threshold, the annotation data may also be sent to the server as sample data for training the recognition model corresponding to the machine learning recognition method used by the server, improving the training effect of the model so that recognition accuracy is further improved in subsequent image recognition.
Referring to FIG. 3, a flow 300 of an embodiment of the image recognition method according to the present application is shown. It should be noted that the image recognition method provided by this embodiment is generally executed by the server 105, and accordingly the image recognition apparatus is generally disposed in the server 105. The method comprises the following steps:
Step 301: receive an image to be recognized, containing an object to be recognized, sent by a client.
In this embodiment, the image contains an object to be recognized. For example, when the user enters a meeting venue, the camera on smart glasses may capture an image, and the captured image may contain objects to be recognized such as tables and chairs.
Step 302: recognize the image to obtain the identification information of the target object corresponding to the object to be recognized and a confidence parameter.
In this embodiment, the confidence parameter represents the probability that the object to be recognized is the target object. One optional way to recognize the object is machine learning, which may include but is not limited to: auto encoders, sparse coding, and deep belief networks. These machine learning methods may also be called deep learning.
In some optional implementations of this embodiment, recognizing the image includes: recognizing the image using a convolutional neural network model.
In this embodiment, a recognition model corresponding to the machine learning recognition method used to recognize the image may first be built, and the image is then recognized using the recognition model. The principle of recognizing the image with such a model is outlined as follows: when the image is recognized using a recognition model (for example, a convolutional neural network model), the object to be recognized in the image may be represented by certain features (for example, scale-invariant feature transform feature points) to generate an input vector; after the model processes the image, an output vector representing the target object corresponding to the object to be recognized can be obtained. The recognition model may be used to indicate the mapping from the input vector to the output vector, and the image may then be recognized based on this mapping.
In this embodiment, when the image is recognized using the recognition model, certain features (for example, scale-invariant feature transform feature points) may be used to represent the object to be recognized in the image, and the features of the object to be recognized (for example, a table object) in the image may be matched against the target object (for example, sample data of the table object) to obtain the confidence parameter representing the probability that the object to be recognized is the target object.
In some optional implementations of this embodiment, the method further includes: receiving a training recognition result sent by the client, the training recognition result including annotation information associated with the image acquired from a third-party platform, the annotation information including information, published by registered users of the third-party platform, that contains the identification information of the target object corresponding to the object to be recognized; and training the recognition model corresponding to the machine learning method using the training recognition result.
In this embodiment, the training recognition result sent by the client may be the annotation information associated with the image that the client acquires from the third-party platform when the confidence parameter obtained by recognizing the image with the machine learning recognition method is less than the confidence threshold. The annotation information includes information, published by registered users of the third-party platform, containing the identification information of the target object corresponding to the object to be recognized. Take an image containing one round table and three chairs as an example: when the image is recognized by the machine learning recognition method and the confidence parameter of the recognition result is less than the confidence threshold, i.e. the round-table and chair objects have not been recognized with sufficient confidence, the image may be sent to a third-party platform associated with the server (for example, a third-party platform providing a question-and-answer service) to obtain annotation information for the image. The annotation information may be information, published by registered users of the platform, containing the identification information of the target objects corresponding to the objects to be recognized, for example "round table, one, chairs, three".
In this embodiment, the annotation information may be used to train the recognition model. Taking a convolutional neural network as the recognition model as an example, the features of the image (for example, scale-invariant feature transform feature points) may be used as the input vector of the network and the annotation information as its ideal output vector, and the network is trained on pairs of input and output vectors. The correct recognition result, i.e. the annotation information obtained after registered users of the third-party platform manually recognize the image, can thus be used to train the recognition model, improving its training effect and, in turn, the recognition accuracy of subsequent image recognition.
In this embodiment, sample data corresponding to the type of the object to be recognized may be preset according to that type, and the recognition model is then trained with the sample data. For example, images of some common application scenarios and their annotation information may be obtained in advance as training data.
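To illustrate how a human-supplied annotation can drive one supervised model update, here is a minimal sketch using a perceptron update rule as a stand-in for the patent's convolutional neural network training; the feature vectors, binary labels, and learning rate are all assumptions of the sketch.

```python
def perceptron_update(weights, features, label, lr=0.1):
    """One supervised update step.

    `label` (+1 or -1) plays the role of the ideal output supplied by
    the third-party annotation; `features` plays the role of the input
    vector extracted from the image.
    """
    score = sum(w * x for w, x in zip(weights, features))
    pred = 1 if score >= 0 else -1
    if pred != label:
        # Move the weights toward the annotated (correct) label.
        weights = [w + lr * label * x for w, x in zip(weights, features)]
    return weights
```

When the model's prediction disagrees with the annotation, the weights shift toward the annotated label; when it agrees, the weights are left unchanged — the same correct-the-model role the annotation information plays in the patent's training step.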
Step 303: send the identification information of the target object and the confidence parameter to the client.
In this embodiment, after the image has been recognized, the identification information of the target object corresponding to the object to be recognized in the image and the obtained confidence parameter may be sent to the client.
Referring to FIG. 4, a schematic structural diagram of an embodiment of the image recognition apparatus according to the present application is shown. The apparatus 400 includes an acquisition unit 401, an interaction unit 402, and a determination unit 403. The acquisition unit 401 is configured to acquire an image to be recognized that contains an object to be recognized; the interaction unit 402 is configured to send the image to the server and to receive the identification information of the target object corresponding to the object to be recognized, obtained by the server through recognizing the image, together with a confidence parameter, the confidence parameter representing the probability that the object to be recognized is the target object; the determination unit 403 is configured to use the identification information of the target object as the recognition result when the confidence parameter is greater than a confidence threshold, and, when the confidence parameter is less than the confidence threshold, to acquire annotation information associated with the image from a third-party platform and use the annotation information as the recognition result.
In some optional implementations of this embodiment, the annotation information includes information, published by registered users of the third-party platform, that contains the identification information of the target object corresponding to the object to be recognized.
In some optional implementations of this embodiment, the apparatus 400 further includes: a playback unit (not shown) configured to convert the recognition result into voice information and play the voice information.
In some optional implementations of this embodiment, the apparatus 400 further includes: an annotation information sending unit (not shown) configured to send the annotation information to the server when the confidence parameter is less than the confidence threshold, to serve as a training sample for training the recognition model corresponding to the machine learning recognition method used by the server.
In some optional implementations of this embodiment, the object to be recognized includes at least one of the following: a physical object, a scene object, and a color object.
Referring to FIG. 5, a schematic structural diagram of another embodiment of the image recognition apparatus according to the present application is shown. The apparatus 500 includes a receiving unit 501, a recognition unit 502, and a sending unit 503. The receiving unit 501 is configured to receive an image to be recognized, containing an object to be recognized, sent by a client; the recognition unit 502 is configured to recognize the image to obtain the identification information of the target object corresponding to the object to be recognized and a confidence parameter, the confidence parameter representing the probability that the object to be recognized is the target object; the sending unit 503 is configured to send the identification information of the target object and the confidence parameter to the client.
In some optional implementations of this embodiment, the recognition unit 502 includes: a neural network subunit (not shown) configured to recognize the image using a convolutional neural network model.
In some optional implementations of this embodiment, the apparatus 500 further includes: a recognition result receiving unit (not shown) configured to receive a training recognition result sent by the client, the training recognition result including annotation information associated with the image acquired from a third-party platform, the annotation information including information, published by registered users of the third-party platform, that contains the identification information of the target object corresponding to the object to be recognized; and a training unit (not shown) configured to train the recognition model corresponding to the machine learning method using the training recognition result.
Referring now to FIG. 6, a schematic structural diagram of a computer system 600 suitable for implementing a terminal device or server of embodiments of the present application is shown.
As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, ROM 602, and RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including, for example, a cathode ray tube (CRT), a liquid crystal display (LCD), and a speaker; a storage portion 608 including a hard disk and the like; and a communication portion 609 including a network interface card such as a LAN card or a modem. The communication portion 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage portion 608 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 609 and/or installed from the removable medium 611.
The flowcharts and block diagrams in the figures illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units or modules involved in the embodiments of the present application may be implemented by software or by hardware. The described units or modules may also be disposed in a processor; for example, a processor may be described as including an acquisition unit, a receiving unit, and a processing unit. The names of these units do not, in some cases, limit the units themselves; for example, the acquisition unit may also be described as "a unit for acquiring an image to be recognized that contains an object to be recognized".
As another aspect, the present application further provides a non-volatile computer storage medium, which may be the non-volatile computer storage medium contained in the apparatus of the above embodiments, or a non-volatile computer storage medium that exists separately and is not assembled into a terminal. The non-volatile computer storage medium stores one or more programs which, when executed by a device, cause the device to: acquire an image to be recognized that contains an object to be recognized; send the image to a server, and receive, from the server, the identification information of the target object corresponding to the object to be recognized, obtained by recognizing the image, together with a confidence parameter, the confidence parameter representing the probability that the object to be recognized is the target object; when the confidence parameter is greater than a confidence threshold, use the identification information of the target object as the recognition result; when the confidence parameter is less than the confidence threshold, acquire annotation information associated with the image from a third-party platform and use the annotation information as the recognition result. The non-volatile computer storage medium stores one or more programs which, when executed by a device, cause the device to: receive an image to be recognized, containing an object to be recognized, sent by a client; recognize the image to obtain the identification information of the target object corresponding to the object to be recognized and a confidence parameter, the confidence parameter representing the probability that the object to be recognized is the target object; and send the identification information of the target object and the confidence parameter to the client.
The above description is only a preferred embodiment of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features; it should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example technical solutions formed by replacing the above features with technical features of similar functions disclosed in (but not limited to) the present application.

Claims (18)

  1. An image recognition method, characterized in that the method comprises:
    acquiring an image to be recognized that contains an object to be recognized;
    sending the image to be recognized to a server, and receiving, from the server, the identification information of the target object corresponding to the object to be recognized, obtained by recognizing the image, together with a confidence parameter, the confidence parameter representing the probability that the object to be recognized is the target object;
    when the confidence parameter is greater than a confidence threshold, using the identification information of the target object as the recognition result; when the confidence parameter is less than the confidence threshold, acquiring annotation information associated with the image to be recognized from a third-party platform and using the annotation information as the recognition result.
  2. The method according to claim 1, characterized in that the annotation information comprises information, published by registered users of the third-party platform, that contains the identification information of the target object corresponding to the object to be recognized.
  3. The method according to claim 2, characterized in that the method further comprises: converting the recognition result into voice information, and playing the voice information.
  4. The method according to claim 1, characterized in that the method further comprises: when the confidence parameter is less than the confidence threshold, sending the annotation information to the server to serve as a training sample for training the recognition model corresponding to the machine learning recognition method used by the server.
  5. The method according to any one of claims 1-4, characterized in that the object to be recognized comprises at least one of the following: a physical object, a scene object, and a color object.
  6. An image recognition method, characterized in that the method comprises:
    receiving an image to be recognized, containing an object to be recognized, sent by a client;
    recognizing the image to be recognized to obtain the identification information of the target object corresponding to the object to be recognized and a confidence parameter, the confidence parameter representing the probability that the object to be recognized is the target object;
    sending the identification information of the target object and the confidence parameter to the client.
  7. The method according to claim 6, characterized in that recognizing the image to be recognized comprises: recognizing the image using a convolutional neural network model.
  8. The method according to claim 7, characterized in that the method further comprises:
    receiving a training recognition result sent by the client, the training recognition result including annotation information associated with the image to be recognized acquired from a third-party platform, the annotation information including information, published by registered users of the third-party platform, that contains the identification information of the target object corresponding to the object to be recognized;
    training, using the training recognition result, the recognition model corresponding to the machine learning recognition method used to recognize images.
  9. An image recognition apparatus, characterized in that the apparatus comprises:
    an acquisition unit configured to acquire an image to be recognized that contains an object to be recognized;
    an interaction unit configured to send the image to be recognized to a server and to receive, from the server, the identification information of the target object corresponding to the object to be recognized, obtained by recognizing the image, together with a confidence parameter, the confidence parameter representing the probability that the object to be recognized is the target object;
    a determination unit configured to use the identification information of the target object as the recognition result when the confidence parameter is greater than a confidence threshold, and, when the confidence parameter is less than the confidence threshold, to acquire annotation information associated with the image to be recognized from a third-party platform and use the annotation information as the recognition result.
  10. The apparatus according to claim 9, characterized in that the annotation information comprises information, published by registered users of the third-party platform, that contains the identification information of the target object corresponding to the object to be recognized.
  11. The apparatus according to claim 9, characterized in that the apparatus further comprises:
    a playback unit configured to convert the recognition result into voice information, and to play the voice information.
  12. The apparatus according to claim 9, characterized in that the apparatus further comprises:
    an annotation information sending unit configured to send the annotation information to the server when the confidence parameter is less than the confidence threshold, to serve as a training sample for training the recognition model corresponding to the machine learning recognition method used by the server.
  13. The apparatus according to any one of claims 9-12, characterized in that the object to be recognized comprises at least one of the following: a physical object, a scene object, and a color object.
  14. An image recognition apparatus, characterized in that the apparatus comprises:
    a receiving unit configured to receive an image to be recognized, containing an object to be recognized, sent by a client;
    a recognition unit configured to recognize the image to be recognized to obtain the identification information of the target object corresponding to the object to be recognized and a confidence parameter, the confidence parameter representing the probability that the object to be recognized is the target object;
    a sending unit configured to send the identification information of the target object and the confidence parameter to the client.
  15. The apparatus according to claim 14, characterized in that the recognition unit comprises:
    a neural network subunit configured to recognize the image to be recognized using a convolutional neural network model.
  16. The apparatus according to claim 15, characterized in that the apparatus further comprises:
    a recognition result receiving unit configured to receive a training recognition result sent by the client, the training recognition result including annotation information associated with the image to be recognized acquired from a third-party platform, the annotation information including information, published by registered users of the third-party platform, that contains the identification information of the target object corresponding to the object to be recognized;
    a training unit configured to train, using the training recognition result, the recognition model corresponding to the machine learning recognition method used to recognize images.
  17. A device, comprising:
    a processor; and
    a memory,
    the memory storing computer-readable instructions executable by the processor, the processor performing the method according to any one of claims 1 to 8 when the computer-readable instructions are executed.
  18. A non-volatile computer storage medium storing computer-readable instructions executable by a processor, the processor performing the method according to any one of claims 1 to 8 when the computer-readable instructions are executed by the processor.
PCT/CN2015/096132 2015-09-08 2015-12-01 图像识别方法和装置 WO2017041366A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/535,006 US10796685B2 (en) 2015-09-08 2015-12-01 Method and device for image recognition

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510567452.2A CN105095919A (zh) 2015-09-08 2015-09-08 图像识别方法和装置
CN201510567452.2 2015-09-08

Publications (1)

Publication Number Publication Date
WO2017041366A1 true WO2017041366A1 (zh) 2017-03-16

Family

ID=54576304

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/096132 WO2017041366A1 (zh) 2015-09-08 2015-12-01 图像识别方法和装置

Country Status (3)

Country Link
US (1) US10796685B2 (zh)
CN (1) CN105095919A (zh)
WO (1) WO2017041366A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108664997A (zh) * 2018-04-20 2018-10-16 西南交通大学 基于级联Faster R-CNN的高铁接触网等电位线不良状态检测方法
CN111429512A (zh) * 2020-04-22 2020-07-17 北京小马慧行科技有限公司 图像处理方法和装置、存储介质及处理器
CN111783775A (zh) * 2020-06-30 2020-10-16 京东数字科技控股有限公司 一种图像采集方法、装置、设备和计算机可读存储介质
CN112949667A (zh) * 2019-12-09 2021-06-11 北京金山云网络技术有限公司 图像识别方法、***、电子设备及存储介质
CN113128247A (zh) * 2021-05-17 2021-07-16 阳光电源股份有限公司 一种图像定位标识验证方法及服务器

Families Citing this family (61)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105095919A (zh) * 2015-09-08 2015-11-25 北京百度网讯科技有限公司 图像识别方法和装置
CA2956780A1 (en) * 2016-02-01 2017-08-01 Mitchell International, Inc. Methods for improving automated damage appraisal and devices thereof
CN107291737B (zh) * 2016-04-01 2019-05-14 腾讯科技(深圳)有限公司 敏感图像识别方法及装置
CN105821538B (zh) * 2016-04-20 2018-07-17 广州视源电子科技股份有限公司 细纱断裂的检测方法和***
CN106022249A (zh) * 2016-05-16 2016-10-12 乐视控股(北京)有限公司 动态对象识别方法、装置及***
CN107625527B (zh) * 2016-07-19 2021-04-20 杭州海康威视数字技术股份有限公司 一种测谎方法及装置
CN107786867A (zh) 2016-08-26 2018-03-09 原相科技股份有限公司 基于深度学习架构的图像辨识方法及***
US10726573B2 (en) 2016-08-26 2020-07-28 Pixart Imaging Inc. Object detection method and system based on machine learning
CN106682590B (zh) * 2016-12-07 2023-08-22 浙江宇视科技有限公司 一种监控业务的处理方法以及服务器
US20200090057A1 (en) * 2016-12-07 2020-03-19 Cloudminds (Shenzhen) Robotics Systems Co., Ltd. Human-computer hybrid decision method and apparatus
CN106874845B (zh) * 2016-12-30 2021-03-26 东软集团股份有限公司 图像识别的方法和装置
US10275687B2 (en) * 2017-02-16 2019-04-30 International Business Machines Corporation Image recognition with filtering of image classification output distribution
CN107145816A (zh) * 2017-02-24 2017-09-08 北京悉见科技有限公司 对象识别跟踪方法及装置
CN107276974B (zh) * 2017-03-10 2020-11-03 创新先进技术有限公司 一种信息处理方法及装置
CN108573268A (zh) * 2017-03-10 2018-09-25 北京旷视科技有限公司 图像识别方法和装置、图像处理方法和装置及存储介质
WO2018170663A1 (zh) * 2017-03-20 2018-09-27 深圳前海达闼云端智能科技有限公司 图像标注方法、装置及电子设备
CN107908641B (zh) * 2017-09-27 2021-03-19 百度在线网络技术(北京)有限公司 一种获取图片标注数据的方法和***
CN107832662B (zh) * 2017-09-27 2022-05-27 百度在线网络技术(北京)有限公司 一种获取图片标注数据的方法和***
CN107758761B (zh) * 2017-09-28 2019-10-22 珠海格力电器股份有限公司 净水设备及其控制方法、装置、存储介质和处理器
US11222627B1 (en) * 2017-11-22 2022-01-11 Educational Testing Service Exploring ASR-free end-to-end modeling to improve spoken language understanding in a cloud-based dialog system
CN108172213B (zh) * 2017-12-26 2022-09-30 北京百度网讯科技有限公司 娇喘音频识别方法、装置、设备及计算机可读介质
CN108171274B (zh) * 2018-01-17 2019-08-09 百度在线网络技术(北京)有限公司 用于识别动物的方法和装置
CN108389316B (zh) * 2018-03-02 2021-07-13 北京京东尚科信息技术有限公司 自动售货方法、装置和计算机可读存储介质
CN110245668B (zh) * 2018-03-09 2023-06-27 腾讯科技(深圳)有限公司 基于图像识别的终端信息获取方法、获取装置及存储介质
CN108664897A (zh) * 2018-04-18 2018-10-16 平安科技(深圳)有限公司 票据识别方法、装置及存储介质
US20210157331A1 (en) * 2018-04-19 2021-05-27 Positec Power Tools (Suzhou) Co., Ltd Self-moving device, server, and automatic working system thereof
CN108734718B (zh) * 2018-05-16 2021-04-06 北京市商汤科技开发有限公司 用于图像分割的处理方法、装置、存储介质及设备
CN108897786B (zh) * 2018-06-08 2021-06-08 Oppo广东移动通信有限公司 应用程序的推荐方法、装置、存储介质及移动终端
CN109035558B (zh) * 2018-06-12 2020-08-25 武汉市哈哈便利科技有限公司 一种用于无人售货柜的商品识别算法在线学习***
CN109035579A (zh) * 2018-06-29 2018-12-18 深圳和而泰数据资源与云技术有限公司 一种商品识别方法、无人售货机及计算机可读存储介质
CN110750667A (zh) * 2018-07-05 2020-02-04 第四范式(北京)技术有限公司 辅助标注方法、装置、设备及存储介质
CN109255325A (zh) * 2018-09-05 2019-01-22 百度在线网络技术(北京)有限公司 用于可穿戴设备的图像识别方法和装置
CN109409423A (zh) * 2018-10-15 2019-03-01 珠海格力电器股份有限公司 一种图像识别方法、装置、终端及可读存储介质
CN111089388A (zh) * 2018-10-18 2020-05-01 珠海格力电器股份有限公司 控制空调的方法及***、空调器、家用电器
CN109522947B (zh) * 2018-10-31 2022-03-25 联想(北京)有限公司 识别方法和设备
CN109409325B (zh) * 2018-11-09 2022-05-31 联想(北京)有限公司 一种识别方法和电子设备
US20220004777A1 (en) * 2018-11-15 2022-01-06 Sony Group Corporation Information processing apparatus, information processing system, information processing method, and program
CN109583499B (zh) * 2018-11-30 2021-04-16 河海大学常州校区 一种基于无监督sdae网络的输电线路背景目标分类***
CN109783674A (zh) * 2018-12-13 2019-05-21 平安普惠企业管理有限公司 图片识别方法、装置、***、计算机设备及存储介质
CN111444746B (zh) * 2019-01-16 2024-01-30 北京亮亮视野科技有限公司 一种基于神经网络模型的信息标注方法
CN109886338A (zh) * 2019-02-25 2019-06-14 苏州清研精准汽车科技有限公司 一种智能汽车测试图像标注方法、装置、***及存储介质
CN111611828A (zh) * 2019-02-26 2020-09-01 北京嘀嘀无限科技发展有限公司 一种非正常图像识别方法、装置、电子设备及存储介质
CN109981755A (zh) * 2019-03-12 2019-07-05 深圳灵图慧视科技有限公司 图像识别方法、装置和电子设备
CN109817201B (zh) * 2019-03-29 2021-03-26 北京金山安全软件有限公司 一种语言学习方法、装置、电子设备及可读存储介质
CN110309735A (zh) * 2019-06-14 2019-10-08 平安科技(深圳)有限公司 异常侦测方法、装置、服务器及存储介质
CN111414946B (zh) * 2020-03-12 2022-09-23 腾讯科技(深圳)有限公司 基于人工智能的医疗影像的噪声数据识别方法和相关装置
CN111611871B (zh) * 2020-04-26 2023-11-28 深圳奇迹智慧网络有限公司 图像识别方法、装置、计算机设备和计算机可读存储介质
US11295167B2 (en) * 2020-04-27 2022-04-05 Toshiba Global Commerce Solutions Holdings Corporation Automated image curation for machine learning deployments
CN112584213A (zh) * 2020-12-11 2021-03-30 海信视像科技股份有限公司 一种显示设备和图像识别结果的展示方法
CN112288883B (zh) * 2020-10-30 2023-04-18 北京市商汤科技开发有限公司 作业指导信息的提示方法、装置、电子设备及存储介质
CN112507605A (zh) * 2020-11-04 2021-03-16 清华大学 基于AnoGAN的配电网异常检测方法
CN112613553B (zh) * 2020-12-18 2022-03-08 中电金信软件有限公司 图片样本集生成方法、装置、计算机设备和存储介质
CN112597895B (zh) * 2020-12-22 2024-04-26 阿波罗智联(北京)科技有限公司 基于偏移量检测的置信度确定方法、路侧设备及云控平台
CN112580745A (zh) * 2020-12-29 2021-03-30 北京五八信息技术有限公司 图像识别方法、装置、电子设备和计算机可读介质
US11869319B2 (en) * 2020-12-31 2024-01-09 Datalogic Usa, Inc. Fixed retail scanner with annotated video and related methods
CN113239804B (zh) * 2021-05-13 2023-06-02 杭州睿胜软件有限公司 图像识别方法、可读存储介质及图像识别***
CN113344055B (zh) * 2021-05-28 2023-08-22 北京百度网讯科技有限公司 图像识别方法、装置、电子设备和介质
CN113378836A (zh) * 2021-06-28 2021-09-10 北京百度网讯科技有限公司 图像识别方法、装置、设备、介质及程序产品
US11960569B2 (en) * 2021-06-29 2024-04-16 7-Eleven, Inc. System and method for refining an item identification model based on feedback
CN114363206B (zh) * 2021-12-28 2024-07-02 奇安信科技集团股份有限公司 终端资产识别方法、装置、计算设备及计算机存储介质
CN114998665B (zh) * 2022-08-04 2022-11-01 创新奇智(广州)科技有限公司 一种图像类别识别方法、装置、电子设备及存储介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103026368A (zh) * 2010-07-30 2013-04-03 高通股份有限公司 使用增量特征提取的对象辨识
CN104281833A (zh) * 2013-07-08 2015-01-14 深圳市腾讯计算机***有限公司 色情图像识别方法和装置
CN105095919A (zh) * 2015-09-08 2015-11-25 北京百度网讯科技有限公司 图像识别方法和装置

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5891409B2 (ja) * 2012-01-12 2016-03-23 パナソニックIpマネジメント株式会社 特徴抽出装置、特徴抽出方法、および特徴抽出プログラム
CN103064981A (zh) * 2013-01-18 2013-04-24 浪潮电子信息产业股份有限公司 一种基于云计算的图片搜索方法
US10242036B2 (en) * 2013-08-14 2019-03-26 Ricoh Co., Ltd. Hybrid detection recognition system
CN103942049B (zh) * 2014-04-14 2018-09-07 百度在线网络技术(北京)有限公司 增强现实的实现方法、客户端装置和服务器
US9773209B1 (en) * 2014-07-01 2017-09-26 Google Inc. Determining supervised training data including features pertaining to a class/type of physical location and time location was visited
CN104679863B (zh) * 2015-02-28 2018-05-04 武汉烽火众智数字技术有限责任公司 一种基于深度学习的以图搜图方法和***

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103026368A (zh) * 2010-07-30 2013-04-03 高通股份有限公司 使用增量特征提取的对象辨识
CN104281833A (zh) * 2013-07-08 2015-01-14 深圳市腾讯计算机***有限公司 色情图像识别方法和装置
CN105095919A (zh) * 2015-09-08 2015-11-25 北京百度网讯科技有限公司 图像识别方法和装置

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108664997A (zh) * 2018-04-20 2018-10-16 西南交通大学 基于级联Faster R-CNN的高铁接触网等电位线不良状态检测方法
CN112949667A (zh) * 2019-12-09 2021-06-11 北京金山云网络技术有限公司 图像识别方法、***、电子设备及存储介质
CN111429512A (zh) * 2020-04-22 2020-07-17 北京小马慧行科技有限公司 图像处理方法和装置、存储介质及处理器
CN111429512B (zh) * 2020-04-22 2023-08-25 北京小马慧行科技有限公司 图像处理方法和装置、存储介质及处理器
CN111783775A (zh) * 2020-06-30 2020-10-16 京东数字科技控股有限公司 一种图像采集方法、装置、设备和计算机可读存储介质
CN113128247A (zh) * 2021-05-17 2021-07-16 阳光电源股份有限公司 一种图像定位标识验证方法及服务器
CN113128247B (zh) * 2021-05-17 2024-04-12 阳光电源股份有限公司 一种图像定位标识验证方法及服务器

Also Published As

Publication number Publication date
US10796685B2 (en) 2020-10-06
US20180204562A1 (en) 2018-07-19
CN105095919A (zh) 2015-11-25

Similar Documents

Publication Publication Date Title
WO2017041366A1 (zh) 图像识别方法和装置
CN109993150B (zh) 用于识别年龄的方法和装置
US10311877B2 (en) Performing tasks and returning audio and visual answers based on voice command
WO2019242222A1 (zh) 用于生成信息的方法和装置
US11670015B2 (en) Method and apparatus for generating video
CN109919244B (zh) 用于生成场景识别模型的方法和装置
CN111800671B (zh) 用于对齐段落和视频的方法和装置
CN108509611B (zh) 用于推送信息的方法和装置
CN109214501B (zh) 用于识别信息的方法和装置
US20210092462A1 (en) Thin-cloud system for live streaming content
US11750898B2 (en) Method for generating target video, apparatus, server, and medium
US10841115B2 (en) Systems and methods for identifying participants in multimedia data streams
CN110046571B (zh) 用于识别年龄的方法和装置
WO2023071578A1 (zh) 一种文本对齐语音的方法、装置、设备及介质
WO2020078050A1 (zh) 评论信息处理方法和装置、服务器、终端及可读介质
WO2020034981A1 (zh) 编码信息的生成方法和识别方法
CN110019906B (zh) 用于显示信息的方法和装置
CN110008926B (zh) 用于识别年龄的方法和装置
CN109816023B (zh) 用于生成图片标签模型的方法和装置
CN109995543B (zh) 用于添加群成员的方法和设备
CN112309387A (zh) 用于处理信息的方法和装置
WO2020221114A1 (zh) 用于显示信息的方法和设备
CN111797273B (zh) 用于调整参数的方法和装置
CN113312928A (zh) 文本翻译方法、装置、电子设备和存储介质
CN113762056A (zh) 演唱视频识别方法、装置、设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15903469

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 15535006

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15903469

Country of ref document: EP

Kind code of ref document: A1