WO2019233266A1 - Image processing method, computer readable storage medium and electronic device - Google Patents

Image processing method, computer readable storage medium and electronic device

Info

Publication number
WO2019233266A1
WO2019233266A1 (PCT/CN2019/087590)
Authority
WO
WIPO (PCT)
Prior art keywords
target
image
foreground
processed
area
Prior art date
Application number
PCT/CN2019/087590
Other languages
French (fr)
Chinese (zh)
Inventor
陈岩 (Chen Yan)
Original Assignee
Oppo广东移动通信有限公司 (Guangdong OPPO Mobile Telecommunications Corp., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Publication of WO2019233266A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G06F18/23: Clustering techniques
    • G06F18/232: Non-hierarchical techniques
    • G06F18/2321: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/24: Classification techniques

Definitions

  • the present application relates to the field of computer technology, and in particular, to an image processing method, a computer-readable storage medium, and an electronic device.
  • Smart devices can capture images through the camera, or they can acquire images through transmission with other smart devices.
  • Images captured in different scenes have different color characteristics, and different foreground objects have different performance characteristics.
  • An image processing method, a computer-readable storage medium, and an electronic device are provided.
  • An image processing method includes: acquiring an image to be processed; performing target detection on the image to be processed to obtain a foreground target in the image to be processed; identifying the foreground target if the target area occupied by the foreground target in the image to be processed is greater than an area threshold; and generating an image classification label according to a recognition result of the foreground target.
  • An image processing device includes:
  • An image acquisition module configured to acquire an image to be processed;
  • a target detection module configured to perform target detection on the image to be processed and obtain a foreground target in the image to be processed;
  • a target recognition module configured to identify the foreground target if the target area occupied by the foreground target in the image to be processed is greater than an area threshold; and
  • an image classification module configured to generate an image classification label according to a recognition result of the foreground target.
  • A computer-readable storage medium stores a computer program thereon. When the computer program is executed by a processor, the following operations are implemented: acquiring an image to be processed; performing target detection on the image to be processed to obtain a foreground target; identifying the foreground target if the target area occupied by the foreground target is greater than an area threshold; and generating an image classification label according to a recognition result of the foreground target.
  • An electronic device includes a memory and a processor. The memory stores computer-readable instructions that, when executed by the processor, cause the processor to perform the following operations: acquiring an image to be processed; performing target detection on the image to be processed to obtain a foreground target; identifying the foreground target if the target area occupied by the foreground target is greater than an area threshold; and generating an image classification label according to a recognition result of the foreground target.
  • The image processing method, the computer-readable storage medium, and the electronic device can acquire an image to be processed and perform target detection on it to obtain a foreground target.
  • If the target area occupied by the foreground target is greater than the area threshold, the foreground target is identified, and an image classification label is generated based on the recognition result of the foreground target.
  • When the area occupied by the foreground target is relatively large, the foreground target can be identified more accurately, so the image classification label generated from its recognition result classifies the image more accurately.
  • FIG. 1 is an application environment diagram of an image processing method in an embodiment.
  • FIG. 2 is a flowchart of an image processing method according to an embodiment.
  • FIG. 3 is a flowchart of an image processing method in another embodiment.
  • FIG. 4 (a) is a schematic diagram of an image where the target area is less than the area threshold in one embodiment.
  • FIG. 4 (b) is an image diagram of a target area larger than an area threshold in an embodiment.
  • FIG. 5 is a schematic diagram of a model for identifying a foreground and a background of an image in an embodiment.
  • FIG. 6 is a schematic diagram of a model for identifying an image foreground and background in another embodiment.
  • FIG. 7 is a schematic diagram of generating an image classification label in one embodiment.
  • FIG. 8 is a flowchart of an image processing method according to another embodiment.
  • FIG. 9 is a schematic structural diagram of an image processing apparatus according to an embodiment.
  • FIG. 10 is a schematic diagram of an image processing circuit in an embodiment.
  • The terms “first”, “second”, and the like used in this application can describe various elements herein, but these elements are not limited by these terms; the terms are only used to distinguish one element from another.
  • the first client may be referred to as the second client, and similarly, the second client may be referred to as the first client. Both the first client and the second client are clients, but they are not the same client.
  • FIG. 1 is an application environment diagram of an image processing method in an embodiment.
  • the application environment includes a terminal 102 and a server 104.
  • the image to be processed may be transmitted between the terminal 102 and the server 104, and the image to be processed may be classified and processed.
  • the terminal 102 may store several images to be processed, and then send the images to be processed to the server 104.
  • a classification algorithm for classifying images is stored in the server 104, and target detection may be performed on the received to-be-processed image to determine a foreground target included in the to-be-processed image.
  • the terminal 102 may perform classification processing on the image to be processed according to the obtained image classification label.
  • the terminal 102 is an electronic device located at the outermost periphery of a computer network and is mainly used for inputting user information and outputting processing results.
  • the terminal 102 may be a personal computer, a mobile terminal, a personal digital assistant, or a wearable electronic device.
  • the server 104 is a device for responding to a service request while providing a computing service, and may be, for example, one or more computers. In other embodiments provided in this application, the foregoing application environment may further include only the terminal 102 or the server 104, which is not limited herein.
  • FIG. 2 is a flowchart of an image processing method according to an embodiment. As shown in FIG. 2, the image processing method includes operations 202 to 208:
  • Operation 202: Acquire an image to be processed. The image to be processed may be acquired through a camera of an electronic device, acquired from another electronic device, or downloaded through a network, which is not limited herein.
  • a camera may be installed on the electronic device, and when the electronic device detects a shooting instruction, it controls the camera through the shooting instruction to collect images to be processed. After obtaining the images, the electronic device can process the images immediately or store the images in a folder in a unified manner. After the images stored in the folder reach a certain number, the stored images are processed in a unified manner.
  • the electronic device may store the acquired images in an album, and when the number of images stored in the album is greater than a certain number, processing of the images in the album is triggered.
  • Operation 204: Perform target detection on the image to be processed to obtain a foreground target in the image to be processed.
  • One or more objects are generally included in a scene where an image is captured.
  • When shooting outdoor scenes, the image generally includes objects such as pedestrians, blue sky, beaches, and buildings.
  • When shooting indoor scenes, the image generally includes objects such as furniture, appliances, and office supplies.
  • The foreground target refers to the more prominent main target in the image, which is the object the user is more concerned about; the area of the image other than the foreground target is the background area.
  • the image to be processed is a two-dimensional pixel matrix composed of several pixels
  • the electronic device can detect the foreground target in the image to be processed. It is detected that the foreground target contains some or all of the pixels in the image to be processed, and then the specific position of the foreground target in the image to be processed is marked. Specifically, after detecting the foreground target, the electronic device may mark the foreground target in the image to be processed through a rectangular frame, so that the user can directly see the specific position of the detected foreground target from the image to be processed.
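The marking step described above can be sketched as a minimal helper that writes a rectangular frame into the two-dimensional pixel matrix the text describes. This is an illustrative sketch only: the `draw_rect` name, its arguments, and the marker value 255 are assumptions, not part of the patent.

```python
def draw_rect(image, top, left, bottom, right, value=255):
    """Mark a detected foreground target by drawing a rectangular
    frame (outline only) in place on a 2-D pixel matrix, so the
    target's position is directly visible in the image."""
    for x in range(left, right + 1):
        image[top][x] = value      # top edge
        image[bottom][x] = value   # bottom edge
    for y in range(top, bottom + 1):
        image[y][left] = value     # left edge
        image[y][right] = value    # right edge
    return image
```

Only the outline is written, so pixels inside the frame keep their original values, matching the idea of marking rather than masking the target.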
  • Operation 206: If the target area occupied by the foreground target in the image to be processed is greater than an area threshold, identify the foreground target.
  • a target identifier may be established for the detected foreground target to uniquely identify a foreground target.
  • the electronic device may establish a correspondence between an image identifier, a target identifier, and a target position.
  • the image identifier is used to uniquely identify an image to be processed
  • the target position is used to indicate a specific position of the foreground target in the image to be processed.
  • the detected foreground target is composed of some or all pixels in the image to be processed.
  • the number of pixels contained in the area where the foreground target is located can be counted, and the target area occupied by the foreground target can be calculated based on the counted number of pixels.
  • the target area may be directly expressed by the number of pixels included in the foreground target, or may be expressed by a ratio of the number of pixels included in the foreground target to the number of pixels included in the image to be processed. The larger the number of pixels contained in the foreground target, the larger the corresponding target area.
  • The electronic device obtains the target area of the foreground target after detecting it. If the target area is greater than the area threshold, the foreground target is considered relatively large and the corresponding background area relatively small; when the background area is too small, recognition of the background is inaccurate, so image classification can be performed according to the foreground target. For example, when the foreground object occupies more than 1/2 of the area of the image to be processed, an image classification label is generated according to the recognition result of the foreground object.
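The area test above reduces to a pixel count over a foreground mask. The sketch below is an assumption-laden illustration: the 0/1 mask representation and the default threshold of 1/2 mirror the example in the text, but the patent fixes neither.

```python
def target_area_ratio(mask):
    """mask: 2-D matrix with 1 for pixels belonging to the foreground
    target and 0 otherwise. Returns foreground pixels as a fraction
    of all pixels (the 'ratio' form of the target area)."""
    total = sum(len(row) for row in mask)
    foreground = sum(sum(row) for row in mask)
    return foreground / total

def classify_by_foreground(mask, area_threshold=0.5):
    """True when the target area exceeds the threshold, i.e. the
    image classification label should follow the foreground
    recognition result rather than the background."""
    return target_area_ratio(mask) > area_threshold
```

The same ratio also serves the opposite branch: when it is at or below the threshold, recognition falls back to the background area.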
  • An electronic device sets classification types for foreground objects in advance, and then recognizes which preset classification type a detected foreground object belongs to by using a preset classification algorithm. For example, the electronic device can classify the foreground target as a person, a puppy, a kitten, food, or another type, and then identify which of these types a detected foreground target belongs to.
  • The preset classification algorithm may be, for example, RCNN (Region-based Convolutional Neural Network), SSD (Single Shot MultiBox Detector), or YOLO (You Only Look Once).
  • Operation 208: Generate an image classification label according to the recognition result of the foreground target.
  • the foreground type of the foreground object can be obtained, and then the image to be processed can be labeled according to the foreground type.
  • the image classification label can be used to mark the type of the image to be processed.
  • The electronic device can mark the image to be processed with the image classification label, and then classify the image accordingly.
  • the classification label can also be used to find the image to be processed. For example, the electronic device may store the images corresponding to the same image classification label in an album, so that the user can sort and find the corresponding images.
  • the image to be processed can be classified and processed according to the image classification label. For example, when the foreground target is detected as a person, the portrait area in the image can be subjected to beauty treatment; when the foreground target is detected as a plant, the saturation and contrast of the plant can be improved.
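The label-driven processing above (beauty treatment for persons, higher saturation and contrast for plants) is naturally expressed as a dispatch table. In this sketch, `beautify` and `enhance_plant` are hypothetical placeholders standing in for real image filters; the label strings are likewise illustrative.

```python
def beautify(image):
    # placeholder: a real implementation would smooth the portrait area
    return {"op": "beauty", "image": image}

def enhance_plant(image):
    # placeholder: a real implementation would raise saturation and contrast
    return {"op": "saturation+contrast", "image": image}

# label -> handler table; labels beyond these two pass through unchanged
HANDLERS = {"person": beautify, "plant": enhance_plant}

def post_process(image, label):
    """Apply the enhancement associated with the image classification
    label, mirroring the person/plant examples in the text."""
    handler = HANDLERS.get(label, lambda img: {"op": "none", "image": img})
    return handler(image)
```

Keeping the mapping in a table makes it easy to add further label-specific treatments without touching the dispatch logic.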
  • The image processing method provided in the foregoing embodiment may acquire an image to be processed and perform target detection on it to obtain a foreground target.
  • If the target area occupied by the foreground target is greater than the area threshold, the foreground target is identified, and an image classification label is generated based on the recognition result of the foreground target.
  • When the area occupied by the foreground target is relatively large, the foreground target can be identified more accurately, so the image classification label generated from its recognition result classifies the image more accurately.
  • FIG. 3 is a flowchart of an image processing method in another embodiment. As shown in FIG. 3, the image processing method includes operations 302 to 316:
  • Operation 302: Acquire an image set including at least one target image, and calculate the similarity between any two target images.
  • When images to be processed are identified and image classification labels are generated, the processing can apply to a single image or to a batch of images. For example, an image may be recognized immediately after it is captured, generating an image classification label right away; alternatively, captured images may be stored on the electronic device and recognized in a unified batch once they exceed a certain number.
  • the image set includes one or more target images, and the target images may be images stored in the electronic device.
  • the images stored by the electronic device may be obtained in different ways, for example, they may be taken by a user through a camera, they may be downloaded on the network, or they may be sent by a friend.
  • the electronic device recognizes the target image in the image collection and generates an image classification label.
  • Generating the image set may specifically include: acquiring at least one target image from a preset file path, and generating the image set according to the acquired target image.
  • the preset file path is used to store images that can be used to identify image classification labels. For example, the preset file path can store only images captured by the user through a camera.
  • an image that needs to be identified may be acquired according to an image generation time when a specified trigger condition is satisfied.
  • When a specified trigger condition is met, an image collection is generated from the target images stored in the electronic device whose storage duration exceeds a duration threshold; the storage duration is the time interval from when the target image was acquired by the electronic device to the current time. For example, if the image was captured by a camera, the duration is counted from the moment the camera generated the image; if the image was downloaded via the network, it is counted from the moment the image was received.
  • the electronic device can trigger an image recognition process every time a specified time is reached. Or when the number of images included in the image collection exceeds a certain number, the image recognition processing is triggered, which is not limited herein.
  • Operation 304: Classify the target images according to the similarity, where the similarity between any two target images in the same class is greater than a similarity threshold.
  • images with a high degree of similarity often have similar recognition results. For example, when continuous shooting is performed by an electronic device, since the interval between successively captured images is relatively short, the captured images are similar, so that the recognition results of the images are relatively close. After generating the image set, the similarity between any two target images in the image set can be calculated. The target images with higher similarity can be identified only once to avoid the consumption of electronic device resources caused by repeated identification.
  • After calculating the similarities between target images, the target images can be classified accordingly, with images of higher similarity grouped into the same class.
  • the similarity between the same type of images is relatively high, and the recognition results are relatively close, so that the same type of images can be uniformly processed for recognition.
  • Specifically, the similarity between any two images in the image set is calculated, and the target images are clustered based on the similarity. Assuming the similarity range is [0, 1], two images with a similarity greater than 0.9 can be classified into the same category.
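The clustering step above can be sketched as a greedy single pass. This is one possible realization under stated assumptions: the patent does not fix a clustering algorithm or a similarity measure, so `sim` is an assumed callable returning a value in [0, 1], and the 0.9 default echoes the example.

```python
def cluster_by_similarity(images, sim, threshold=0.9):
    """Greedy grouping: each image joins the first cluster whose every
    member it resembles above `threshold`, otherwise it starts a new
    cluster. This guarantees that any two images in one cluster have
    similarity greater than the threshold, as required above."""
    clusters = []
    for img in images:
        for cluster in clusters:
            if all(sim(img, member) > threshold for member in cluster):
                cluster.append(img)
                break
        else:  # no existing cluster matched
            clusters.append([img])
    return clusters
```

Requiring similarity to every member (not just one) keeps the stated invariant that any two images within a class exceed the threshold.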
  • Operation 306: Obtain a target image from each class of target images as the image to be processed.
  • After classifying the target images, one target image can be obtained from each class as the image to be processed for recognition.
  • The image classification label generated from the recognition result of the image to be processed can then be used as the image classification label of the corresponding target images in that class.
  • a target image may be randomly obtained from each type of target image as an image to be processed, and an image to be processed may also be determined by calculating a similarity difference value.
  • Specifically, an image subset can be generated from each class of target images; the target images in the subset are traversed, and the similarities between each target image and the other target images in the subset are accumulated to obtain a total similarity. The image to be processed is then determined from the subset according to the total similarity: the larger a target image's total similarity, the more similar it is to the other target images, so the target image with the largest total similarity can be used as the image to be processed.
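The representative-selection rule above can be written directly. As before, `sim` is an assumed similarity callable; the function name is illustrative.

```python
def pick_representative(subset, sim):
    """From one class of similar target images, return the image whose
    accumulated similarity to all other images in the subset is
    largest; that image is used as the image to be processed."""
    def total_similarity(img):
        return sum(sim(img, other) for other in subset if other is not img)
    return max(subset, key=total_similarity)
```

The image maximizing the accumulated similarity is, in this sense, the most central member of its class, so its recognition result generalizes best to the others.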
  • Operation 308: Perform target detection on the image to be processed to obtain a foreground target in the image to be processed.
  • one or more foreground objects may exist in the image to be processed.
  • When there is one foreground object, the area it occupies in the image to be processed is used as the target area; when there are two or more foreground targets, the total area occupied by all foreground targets included in the image to be processed is taken as the target area.
  • If the target area is larger than the area threshold, it is considered that the area occupied by the foreground target is larger and the area occupied by the background area is smaller.
  • If the target area is greater than the area threshold, the foreground target is identified; if the target area is less than or equal to the area threshold, the background area other than the foreground target in the image to be processed is identified, and an image classification label is generated based on the recognition result of the background area.
  • the electronic device detects a background area in the image to be processed, and detects which background type the background area belongs to after detecting the background area.
  • the electronic device can set the background type of the background area in advance, and then identify which preset background type the background area specifically belongs to through a preset algorithm.
  • the background area can be divided into scenes such as beach, snow, night, blue sky, indoor, etc.
  • the background type corresponding to the background area can be obtained.
  • An image classification label is generated according to the obtained background type.
  • FIG. 4(a) is a schematic diagram of an image whose target area is less than the area threshold in an embodiment. As shown in FIG. 4(a), the image includes a background area 402 and a foreground target 404. The area occupied by the background area 402 is larger than that occupied by the foreground target 404, so recognition of the background area 402 is more accurate, and an image classification label may be generated according to the recognition result of the background area 402.
  • FIG. 4(b) is a schematic diagram of an image whose target area is greater than the area threshold in an embodiment. As shown in FIG. 4(b), the image includes a background area 406 and a foreground target 408. The area occupied by the foreground target 408 is larger than that of the background area 406, so recognition of the foreground target 408 is more accurate, and an image classification label may be generated according to the recognition result of the foreground target 408.
  • the background region can be identified by the classification model
  • the foreground target can be identified by the detection model.
  • the electronic device trains the classification model and the detection model, and outputs a corresponding loss function, respectively.
  • the loss function is a function that can evaluate the confidence of the classification results.
  • the confidence function corresponding to each preset category can be output through the loss function. The higher the confidence level, the greater the probability that the image belongs to the category. In this way, the background type and foreground type corresponding to the image are determined by the confidence level.
  • the background of the image is defined in advance as types of beach, night scene, fireworks, indoor, etc.
  • the electronic device can train the classification model in advance, and the trained classification model can output a loss function.
  • the background region can be detected by the classification model, and the type of the background region can be identified.
  • the confidence function corresponding to each preset background type can be calculated through the loss function, and the background classification result corresponding to the background region is determined through the confidence degree.
  • For example, if the calculated confidences of the four types beach, night view, fireworks, and indoor are 0.01, 0.06, 0.89, and 0.04 respectively, the background region of the image to be processed is determined to be the background type with the highest confidence, namely fireworks.
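The selection rule in the example above is an argmax over per-type confidences. A minimal sketch, assuming the confidences have already been computed by the model's loss/confidence functions:

```python
def classify_background(confidences):
    """confidences: mapping from each preset background type to the
    confidence computed via the loss function; the type with the
    highest confidence is the classification result."""
    return max(confidences, key=confidences.get)
```

The same argmax applies unchanged to the foreground branch, with foreground types as keys.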
  • FIG. 5 is a schematic diagram of a model for identifying a foreground and a background of an image in an embodiment.
  • the electronic device can train the classification model. Before training the model, the image is labeled with a category label, and the classification model is trained through the image and the corresponding category label. After training the classification model, a first loss function can be obtained.
  • a background region in an image can be detected by a classification model, and a first confidence level corresponding to each preset background type can be calculated by using the obtained first loss function. According to the obtained first confidence level, a background classification result corresponding to the background region can be determined.
  • the electronic device can train the detection model.
  • the foreground targets included in the image are marked with a rectangular frame, and the category corresponding to each foreground target is marked.
  • the detection model is trained through images. After the detection model is trained, a second loss function can be obtained.
  • the foreground objects in the image can be detected by the detection model, and the positions of each foreground object can be output.
  • a second confidence function corresponding to each preset foreground type can be calculated through the second loss function. According to the obtained second confidence level, the foreground classification result corresponding to the foreground target can be determined.
  • the above classification model and detection model can be two independent algorithm models
  • the classification model can be a Mobilenet algorithm model
  • the detection model can be an SSD algorithm model, which is not limited here.
  • FIG. 6 is a schematic diagram of a model for identifying an image foreground and background in another embodiment.
  • the recognition model is a neural network model.
  • the input layer of the neural network receives training images with image category labels, performs feature extraction through a basic network (such as a CNN network), and outputs the extracted image features.
  • the feature layer is used to perform category detection on the background training target to obtain a first loss function
  • the foreground training target is subjected to category detection to obtain a second loss function
  • the foreground training target is subjected to position detection based on the foreground area to obtain a position loss.
  • the neural network may be a convolutional neural network.
  • Convolutional neural networks include a data input layer, a convolutional calculation layer, an activation layer, a pooling layer, and a fully connected layer.
  • the data input layer is used to pre-process the original image data.
  • the pre-processing may include de-averaging, normalization, dimensionality reduction, and whitening processes.
  • De-averaging refers to centering all dimensions of the input data to 0 in order to pull the center of the sample back to the origin of the coordinate system.
  • Normalization is normalizing the amplitude to the same range.
  • Whitening refers to normalizing the amplitude on each characteristic axis of the data.
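The de-averaging and normalization steps just described can be sketched on a single 1-D feature vector; the whitening step is omitted here since it needs the data's covariance structure. These helpers are illustrative, not the patent's implementation.

```python
def demean(data):
    """De-averaging: center every dimension at 0, pulling the sample's
    center back to the origin of the coordinate system."""
    mean = sum(data) / len(data)
    return [x - mean for x in data]

def normalize(data):
    """Normalization: rescale amplitudes into the common range [0, 1]."""
    lo, hi = min(data), max(data)
    if hi == lo:            # constant input: nothing to rescale
        return [0.0 for _ in data]
    return [(x - lo) / (hi - lo) for x in data]
```

In practice both operations are applied per channel across the whole training set, using statistics estimated from that set rather than from a single sample.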
  • the convolution calculation layer is used for local correlation and window sliding.
  • In the convolution calculation layer, the weights of each filter connected to the data window are fixed.
  • Each filter focuses on an image feature, such as vertical edges, horizontal edges, colors, textures, etc., and these filters are combined to obtain the entire image.
  • a filter is a weight matrix.
  • a weight matrix can be used to convolve with data in different windows.
  • the activation layer is used to non-linearly map the output of the convolution layer.
  • The activation function used by the activation layer may be ReLU (Rectified Linear Unit).
  • the pooling layer can be sandwiched between consecutive convolutional layers to compress the amount of data and parameters and reduce overfitting.
  • the pooling layer can use the maximum method or average method to reduce the dimensionality of the data.
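The maximum and average pooling methods mentioned above can be illustrated over non-overlapping 2x2 windows. The window size and the `pool2x2` helper are assumptions for the sketch; the patent does not specify pooling parameters.

```python
def pool2x2(feature_map, mode="max"):
    """Downsample a 2-D feature map over non-overlapping 2x2 windows
    using either the maximum method or the average method."""
    pooled = []
    for i in range(0, len(feature_map) - 1, 2):
        row = []
        for j in range(0, len(feature_map[0]) - 1, 2):
            window = [feature_map[i][j], feature_map[i][j + 1],
                      feature_map[i + 1][j], feature_map[i + 1][j + 1]]
            row.append(max(window) if mode == "max" else sum(window) / 4)
        pooled.append(row)
    return pooled
```

Each output cell summarizes a 2x2 region, which is how pooling compresses the data and parameter volume between convolutional layers.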
  • The fully connected layer is located at the tail of the convolutional neural network, where all neurons between the two layers are connected by weights.
  • A part of the convolutional neural network is cascaded to a first confidence output node, a part of the convolutional layers is cascaded to a second confidence output node, and a part of the convolutional layers is cascaded to a position output node.
  • According to the first confidence output node, the background type of the image can be detected; according to the second confidence output node, the type of the foreground object can be detected; and according to the position output node, the position corresponding to the foreground object can be detected.
  • the classification model and the detection model may be stored in an electronic device in advance, and when an image to be processed is acquired, the image to be processed is identified through the classification model and the detection model.
  • the classification model and the detection model generally occupy the storage space of the electronic device, and when a large number of images are processed, the storage capacity requirements of the electronic device are also relatively high.
  • the image can be processed through the classification model and detection model stored locally on the terminal, or the image to be processed can be sent to the server for processing through the classification model and detection model stored on the server.
  • the server can send the trained classification model and detection model to the terminal after training the classification model and detection model, and the terminal does not need to train the above model.
  • The classification model and detection model stored in the terminal can be compressed models; a compressed model occupies fewer resources, but its recognition accuracy is correspondingly lower.
  • the terminal can decide whether to perform the recognition processing locally on the terminal or the recognition processing on the server according to the number of images to be processed. After the terminal obtains the image to be processed, it counts the number of images of the image to be processed. If the number of images exceeds the preset upload number, the terminal uploads the image to be processed to the server and processes the image to be processed on the server. After processing by the server, the processing result is sent to the terminal.
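The routing decision above (count the pending images, compare against a preset upload number) is a one-line policy. The sketch below is illustrative; the actual threshold value and the function name are assumptions, since the text leaves the preset upload number to device configuration.

```python
def choose_processing_site(num_images, upload_threshold):
    """Route recognition work: when the number of images to be
    processed exceeds the preset upload number, upload them to the
    server; otherwise run the (compressed) local models on the
    terminal."""
    return "server" if num_images > upload_threshold else "terminal"
```

This trades the lower accuracy of the compressed on-device models against the transfer and latency cost of server-side processing.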
  • FIG. 7 is a schematic diagram of generating an image classification label in one embodiment.
  • When the background region of the image is identified, the obtainable image classification labels include landscape, beach, snow, blue sky, green space, night scene, darkness, backlight, sunrise/sunset, indoor, fireworks, spotlight, etc.
  • When the foreground object of the image is recognized, the obtainable image classification labels include portrait, baby, cat, dog, food, etc.
  • When the area occupied by the foreground object is larger than 1/2 of the image, an image classification label is generated according to the recognition result of the foreground object; when the area occupied by the background region is larger than 1/2 of the image, an image classification label is generated according to the recognition result of the background region.
  • Operation 314: Classify the foreground types identified for each foreground object, and generate a corresponding image classification label according to each foreground type.
  • When generating an image classification label according to the foreground classification result: if the foreground classification result indicates that the image to be processed contains only one foreground type, the image classification label may be generated directly from that foreground type; if the image contains foreground objects of two or more foreground types, multi-level image classification labels can be generated, that is, the obtained foreground types are classified and a corresponding image classification label is generated for each foreground type.
  • an upper limit value for the number of generated tags may be set.
  • if the number of foreground types does not exceed this upper limit, a classification label may be generated for each foreground type; if the number of foreground types in the current scene exceeds this upper limit, classification labels are generated only for some of the foreground types.
  • the method further includes: counting the number of image classification tags; and, if the number of tags exceeds the upper limit, obtaining target image classification tags from the image classification tags. The electronic device can then mark the image according to the target image classification tags.
  • the image may contain three foreground objects, and the corresponding foreground types are "human”, “dog”, and “cat”, respectively.
  • corresponding image classification labels are generated, namely "target-person", "target-dog", and "target-cat".
  • the number of generated image classification labels is three. Assuming that the upper limit is 2, the number of tags exceeds the upper limit, so target image classification tags are determined from the above image classification tags, namely "target-person" and "target-dog".
  • the total area of the foreground target corresponding to each image classification tag may be calculated, and the target image classification tag may be obtained from the image classification tags according to the total area.
  • the image classification label corresponding to the largest total area can be obtained as the target image classification label, or the image classification labels can be sorted according to the total area, and the target image classification label can be obtained from the sorted image classification labels.
  • for example, if the image contains only the foreground type "people", the image classification label "Pic-people" can be generated directly based on that foreground type. If the image contains target A, target B, and target C, with corresponding foreground types "person", "cat", and "person", then the total area S1 of targets A and C (corresponding to "person") in the image and the total area S2 of target B (corresponding to "cat") in the image can be calculated. If S1 > S2, an image classification label is generated according to the foreground type "person"; if S1 < S2, an image classification label is generated according to the foreground type "cat".
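The area comparison above (S1 versus S2) generalizes to summing the area per foreground type and keeping the type with the largest total. A minimal sketch, assuming each detected target is given as a (foreground type, area) pair; the data layout is illustrative, not from the patent:

```python
from collections import defaultdict

def dominant_type_by_area(targets):
    """targets: list of (foreground_type, area) pairs for detected objects.
    Sums the area per foreground type and returns the type with the
    largest total, mirroring the S1/S2 comparison in the example."""
    totals = defaultdict(float)
    for ftype, area in targets:
        totals[ftype] += area
    return max(totals, key=totals.get)
```

With targets A, B, C of types "person", "cat", "person" and areas 30, 50, 25, the "person" total (55) exceeds the "cat" total (50), so the label is generated from "person".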
  • the number of foreground targets corresponding to each image classification tag may also be counted, and a target image classification tag may be obtained from the image classification tags according to the number of targets.
  • the image classification label corresponding to the largest number of targets can be obtained as the target image classification label, or the image classification labels can be sorted according to the number of targets, and the target image classification label can be obtained from the sorted image classification labels.
  • the image to be processed contains target A, target B, target C, target D, target E, and target F, with corresponding foreground types "person", "dog", "person", "person", "cat", and "dog".
  • the foreground types corresponding to the image to be processed include "person", "dog", and "cat".
  • the image classification tags generated according to these foreground types are "target_person", "target_dog", and "target_cat".
  • the corresponding numbers of foreground targets are 3, 2, and 1, respectively.
  • the image classification tags can be sorted by number of targets, and the first two, "target_person" and "target_dog", taken as the target image classification tags.
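The count-and-sort selection in the example above can be sketched with a frequency counter. Names are illustrative; the "target_" prefix follows the tags used in the example:

```python
from collections import Counter

def top_k_labels(foreground_types, k):
    """Count foreground targets per type and keep the k types with the
    most targets, e.g. 3 persons, 2 dogs, 1 cat with k = 2 keeps
    "target_person" and "target_dog"."""
    counts = Counter(foreground_types)
    return ["target_" + t for t, _ in counts.most_common(k)]
```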
  • the operation of identifying the foreground target further includes:
  • Operation 802: obtain depth data of each detected foreground target, where the depth data represents the distance between the foreground target and the image acquisition device.
  • the depth data indicates the distance between the foreground target and the image acquisition device. It can be considered that the closer a foreground target is to the image acquisition device, the more attention it receives from the user.
  • depth data can be obtained by means such as structured light or dual-camera ranging, but is not limited thereto.
  • an electronic device can obtain depth data corresponding to each pixel point in an image to be processed, that is, all pixel points included in the foreground target have corresponding depth data.
  • the depth data corresponding to the foreground target may be the depth data corresponding to any pixel in the foreground target, or the average value of the depth data corresponding to all the pixels included in the foreground target, which is not limited here.
  • Operation 804: identify the foreground targets whose depth data is less than a depth threshold.
  • the foreground targets that need to be identified can be filtered by depth data. Closer foreground targets can be considered the targets users are more concerned about. Specifically, when a target's depth data is less than the depth threshold, it is considered a foreground target the user is more concerned about, and only foreground targets whose depth data is less than the depth threshold may be identified.
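The depth-based filtering above can be sketched as follows, taking a target's depth as the mean of its pixel depths (one of the two options named in the text; the other option, any single pixel's depth, would work similarly). The data layout and names are illustrative:

```python
def targets_to_identify(target_depths, depth_threshold):
    """target_depths maps a foreground identifier to the per-pixel depth
    values of that target. A target's depth is the mean of its pixel
    depths; only targets closer than the threshold are kept for
    recognition."""
    selected = []
    for target_id, pixel_depths in target_depths.items():
        mean_depth = sum(pixel_depths) / len(pixel_depths)
        if mean_depth < depth_threshold:
            selected.append(target_id)
    return selected
```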
  • the operation of identifying the foreground target may further include: acquiring the detected target sharpness of each foreground target, and identifying the foreground target whose target sharpness is greater than the sharpness threshold.
  • multiple foreground targets may be detected from the image to be processed.
  • each foreground target can be identified separately to obtain its foreground type, or one or more of them can be selected for identification to obtain the foreground recognition result.
  • after the electronic device detects the foreground targets in the image to be processed, it can calculate the target sharpness corresponding to each foreground target.
  • the target sharpness can reflect the sharpness of textures such as the edge details of the foreground target, and can reflect the importance of each foreground object to a certain extent. Therefore, the foreground target for recognition can be obtained according to the target sharpness. For example, when shooting, the user will focus on the object of interest and blur the other objects. When identifying foreground objects, only foreground objects with higher definition can be identified, and foreground objects with lower definition are not identified.
  • a foreground target can include several pixels, and its sharpness can be calculated from the gray-level differences between those pixels. Generally, the higher the sharpness, the greater the gray-level differences between pixels; the lower the sharpness, the smaller the differences.
  • specifically, the target sharpness may be calculated according to algorithms such as the Brenner gradient method, the Tenengrad gradient method, the Laplacian gradient method, the variance method, and the energy gradient method, but is not limited thereto.
  • after the electronic device detects the foreground targets, it can assign a foreground identifier to each foreground target to distinguish them, and then establish a correspondence between foreground identifiers and foreground coordinates. Each foreground target can be marked by its foreground identifier, and its position in the image to be processed can be located by its foreground coordinates. The electronic device can extract a foreground target through its foreground coordinates and identify the extracted target.
  • the sharpness threshold may be a preset fixed value or a dynamically changing value, which is not limited herein. For example, it may be a fixed value stored in the electronic device in advance, or a value input by a user and dynamically adjusted as required, or a value calculated according to the acquired target sharpness.
  • the foreground targets can also be identified according to both the depth data and the target sharpness. Specifically, the target sharpness of each detected foreground target is obtained; the depth data corresponding to the foreground targets whose target sharpness is greater than the sharpness threshold is obtained; and the foreground targets whose depth data is less than the depth threshold are identified.
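The two-stage selection just described (sharpness filter first, then depth filter) can be sketched as follows; the tuple layout of (identifier, sharpness, depth) is an illustrative assumption:

```python
def select_targets(targets, sharpness_threshold, depth_threshold):
    """Two-stage filter: keep targets whose sharpness exceeds the
    sharpness threshold, then among those keep the ones whose depth
    is below the depth threshold. Returns their identifiers."""
    sharp_enough = [t for t in targets if t[1] > sharpness_threshold]
    return [tid for tid, _, depth in sharp_enough if depth < depth_threshold]
```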
  • the image processing method provided in the foregoing embodiment may acquire an image to be processed, and perform target detection on the image to be processed to obtain a foreground target.
  • the foreground target is identified, and an image classification label is generated based on the recognition result of the foreground target.
  • the area occupied by the foreground target is relatively large, the foreground target can be identified more accurately. In this way, an image classification label is generated based on the recognition result of the foreground target, and the image can be classified more accurately.
  • although the operations in FIG. 2, FIG. 3, and FIG. 8 are displayed sequentially according to the arrows, they are not necessarily performed in the order the arrows indicate. Unless explicitly stated herein, there is no strict order for performing these operations, and they can be performed in other orders. Moreover, at least some of the operations in FIG. 2, FIG. 3, and FIG. 8 may include multiple sub-operations or phases. These sub-operations or phases are not necessarily executed at the same time, but may be performed at different times; their execution order is not necessarily sequential, and they may be performed in turn or alternately with at least part of the other operations, or of the sub-operations or phases of other operations.
  • FIG. 9 is a schematic structural diagram of an image processing apparatus according to an embodiment.
  • the image processing apparatus 900 includes an image acquisition module 902, a target detection module 904, a target recognition module 906, and an image classification module 908, wherein:
  • the image acquisition module 902 is configured to acquire an image to be processed.
  • a target detection module 904 is configured to perform target detection on the image to be processed, and obtain a foreground target in the image to be processed.
  • a target recognition module 906 is configured to identify the foreground target if the target area occupied by the foreground target in the image to be processed is greater than an area threshold.
  • An image classification module 908 is configured to generate an image classification label according to a recognition result of the foreground object.
  • the image processing apparatus may acquire an image to be processed, and perform target detection on the image to be processed to obtain a foreground target.
  • the foreground target is identified, and an image classification label is generated based on the recognition result of the foreground target.
  • the area occupied by the foreground target is relatively large, the foreground target can be identified more accurately. In this way, an image classification label is generated based on the recognition result of the foreground target, and the image can be classified more accurately.
  • the image acquisition module 902 is further configured to acquire an image set including at least one target image, and calculate the similarity between any two target images; classify the target image according to the similarity; The similarity between any two target images in the same type of target image is greater than the similarity threshold; one target image is obtained from each type of target image as the image to be processed.
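The similarity-based grouping performed by the image acquisition module can be sketched greedily: an image joins an existing group only if its similarity with every member exceeds the threshold, and one representative per group becomes an image to be processed. The greedy single-pass grouping and the choice of the first member as representative are assumptions, since the patent does not fix a clustering algorithm; `similarity` is a caller-supplied function:

```python
def deduplicate_by_similarity(images, similarity, threshold):
    """Group images whose pairwise similarity exceeds the threshold and
    return one representative image per group."""
    groups = []
    for img in images:
        for group in groups:
            if all(similarity(img, member) > threshold for member in group):
                group.append(img)
                break
        else:
            groups.append([img])  # no group matched: start a new one
    return [group[0] for group in groups]  # one image to process per group
```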
  • the target recognition module 906 is further configured to: if two or more foreground objects are detected from the image to be processed, use the total area of all foreground objects included in the image to be processed as the target area; and, if the target area is greater than an area threshold, identify the foreground targets.
  • the target recognition module 906 is further configured to obtain the target sharpness of each detected foreground target, and identify a foreground target whose target sharpness is greater than a sharpness threshold.
  • the target recognition module 906 is further configured to obtain the detected depth data of each foreground target, where the depth data represents the distance between the foreground target and the image acquisition device, and to identify the foreground targets whose depth data is less than a depth threshold.
  • the image classification module 908 is further configured to, if two or more foreground objects are detected from the image to be processed, classify the foreground types identified for the foreground objects and generate a corresponding image classification label for each foreground type.
  • the image classification module 908 is further configured to identify the background area other than the foreground targets in the image to be processed if the target area is less than or equal to an area threshold, and to generate an image classification label according to the recognition result of the background area.
  • each module in the above image processing apparatus is for illustration only. In other embodiments, the image processing apparatus may be divided into different modules as needed to complete all or part of the functions of the above image processing apparatus.
  • An embodiment of the present application further provides a computer-readable storage medium.
  • One or more non-volatile computer-readable storage media containing computer-executable instructions, where the computer-executable instructions, when executed by one or more processors, cause the processors to perform the image processing method provided by the foregoing embodiments.
  • An embodiment of the present application further provides an electronic device.
  • the above electronic device includes an image processing circuit.
  • the image processing circuit may be implemented by hardware and / or software components, and may include various processing units that define an ISP (Image Signal Processing) pipeline.
  • FIG. 10 is a schematic diagram of an image processing circuit in an embodiment. As shown in FIG. 10, for convenience of explanation, only aspects of the image processing technology related to the embodiments of the present application are shown.
  • the image processing circuit includes an ISP processor 1040 and a control logic 1050.
  • the image data captured by the imaging device 1010 is first processed by the ISP processor 1040, which analyzes the image data to capture image statistical information that can be used to determine one or more control parameters of the imaging device 1010.
  • the imaging device 1010 may include a camera having one or more lenses 1012 and an image sensor 1014.
  • the image sensor 1014 may include a color filter array (such as a Bayer filter).
  • the image sensor 1014 may obtain the light intensity and wavelength information captured by each of its imaging pixels, and provide a set of raw image data.
  • the sensor 1020 may provide image processing parameters (such as image stabilization parameters) for the acquired image to the ISP processor 1040 based on the sensor 1020 interface type.
  • the sensor 1020 interface may use a SMIA (Standard Mobile Imaging Architecture) interface, other serial or parallel camera interfaces, or a combination of the foregoing interfaces.
  • the image sensor 1014 may also send the original image data to the sensor 1020, and the sensor 1020 may provide the original image data to the ISP processor 1040 based on the interface type of the sensor 1020, or the sensor 1020 stores the original image data in the image memory 1030.
  • the ISP processor 1040 processes the original image data pixel by pixel in a variety of formats.
  • each image pixel may have a bit depth of 8, 10, 12, or 14 bits, and the ISP processor 1040 may perform one or more image processing operations on the original image data and collect statistical information about the image data.
  • the image processing operations may be performed with the same or different bit depth accuracy.
  • the ISP processor 1040 may also receive image data from the image memory 1030.
  • the sensor 1020 interface sends the original image data to the image memory 1030, and the original image data in the image memory 1030 is then provided to the ISP processor 1040 for processing.
  • the image memory 1030 may be a part of a memory device, a storage device, or a separate dedicated memory in an electronic device, and may include a DMA (Direct Memory Access) feature.
  • the ISP processor 1040 may perform one or more image processing operations, such as time-domain filtering.
  • the processed image data may be sent to the image memory 1030 for further processing before being displayed.
  • the ISP processor 1040 receives processed data from the image memory 1030, and performs image data processing on the processed data in the original domain and in the RGB and YCbCr color spaces.
  • the image data processed by the ISP processor 1040 may be output to a display 1070 for viewing by a user and / or further processed by a graphics engine or a GPU (Graphics Processing Unit).
  • the output of the ISP processor 1040 can also be sent to the image memory 1030, and the display 1070 can read image data from the image memory 1030.
  • the image memory 1030 may be configured to implement one or more frame buffers.
  • the output of the ISP processor 1040 may be sent to an encoder / decoder 1060 to encode / decode image data.
  • the encoded image data can be saved and decompressed before being displayed on the display 1070 device.
  • the encoder / decoder 1060 may be implemented by a CPU or a GPU or a coprocessor.
  • the statistical data determined by the ISP processor 1040 may be sent to the control logic unit 1050.
  • the statistical data may include image sensor 1014 statistical information such as auto exposure, auto white balance, auto focus, flicker detection, black level compensation, and lens 1012 shading correction.
  • the control logic 1050 may include a processor and/or a microcontroller that executes one or more routines (such as firmware). The one or more routines may determine control parameters of the imaging device 1010 and control parameters of the ISP processor 1040 based on the received statistical data.
  • control parameters of the imaging device 1010 may include sensor 1020 control parameters (such as gain and integration time for exposure control, and image stabilization parameters), camera flash control parameters, lens 1012 control parameters (such as focal distance for focusing or zooming), or a combination of these parameters.
  • ISP control parameters may include gain levels and color correction matrices for automatic white balance and color adjustment (eg, during RGB processing), and lens 1012 shading correction parameters.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM), which is used as external cache memory.
  • RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

An image processing method, comprising: obtaining an image to be processed; performing target detection on the image to be processed to obtain a foreground target in the image to be processed; if a target area occupied by the foreground target in the image to be processed is larger than an area threshold, recognizing the foreground target; and generating an image classification label according to the recognition result of the foreground target.

Description

Image Processing Method, Computer-Readable Storage Medium, and Electronic Device
Cross-Reference to Related Applications
This application claims priority to a Chinese patent application filed with the Chinese Patent Office on June 8, 2018, application number 201810587091.1, entitled "Image Processing Method, Apparatus, Computer-Readable Storage Medium, and Electronic Device", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer technology, and in particular, to an image processing method, a computer-readable storage medium, and an electronic device.
Background
Smart devices can capture images through a camera, or acquire images through transmission from other smart devices. Images may be shot in many scenes, such as beaches, snow scenes, and night scenes. A captured image may also contain many foreground targets, such as cars, people, and animals. Generally, images captured in different scenes have different color characteristics, and different foreground targets have different appearance characteristics.
Summary of the Invention
According to various embodiments of the present application, an image processing method, a computer-readable storage medium, and an electronic device are provided.
An image processing method, the method including:
obtaining an image to be processed;
performing target detection on the image to be processed to obtain a foreground target in the image to be processed;
if the target area occupied by the foreground target in the image to be processed is greater than an area threshold, identifying the foreground target; and
generating an image classification label according to a recognition result of the foreground target.
An image processing apparatus, the apparatus including:
an image acquisition module, configured to acquire an image to be processed;
a target detection module, configured to perform target detection on the image to be processed and obtain a foreground target in the image to be processed;
a target recognition module, configured to identify the foreground target if the target area occupied by the foreground target in the image to be processed is greater than an area threshold; and
an image classification module, configured to generate an image classification label according to a recognition result of the foreground target.
A computer-readable storage medium having a computer program stored thereon, where the computer program, when executed by a processor, implements the following operations:
obtaining an image to be processed;
performing target detection on the image to be processed to obtain a foreground target in the image to be processed;
if the target area occupied by the foreground target in the image to be processed is greater than an area threshold, identifying the foreground target; and
generating an image classification label according to a recognition result of the foreground target.
An electronic device, including a memory and a processor, the memory storing computer-readable instructions that, when executed by the processor, cause the processor to perform the following operations:
obtaining an image to be processed;
performing target detection on the image to be processed to obtain a foreground target in the image to be processed;
if the target area occupied by the foreground target in the image to be processed is greater than an area threshold, identifying the foreground target; and
generating an image classification label according to a recognition result of the foreground target.
With the above image processing method, computer-readable storage medium, and electronic device, an image to be processed can be acquired, and target detection performed on it to obtain a foreground target. When the target area occupied by the foreground target is greater than the area threshold, the foreground target is identified, and an image classification label is generated according to the recognition result of the foreground target. When the area occupied by the foreground target is relatively large, the foreground target can be identified more accurately, so generating the image classification label from the foreground recognition result allows the image to be classified more accurately.
Brief Description of the Drawings
In order to explain the technical solutions in the embodiments of the present application or the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are merely some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is an application environment diagram of an image processing method in an embodiment.
FIG. 2 is a flowchart of an image processing method in an embodiment.
FIG. 3 is a flowchart of an image processing method in another embodiment.
FIG. 4(a) is a schematic diagram of an image in which the target area is less than the area threshold in an embodiment.
FIG. 4(b) is a schematic diagram of an image in which the target area is greater than the area threshold in an embodiment.
FIG. 5 is a schematic diagram of a model for identifying the foreground and background of an image in an embodiment.
FIG. 6 is a schematic diagram of a model for identifying the foreground and background of an image in another embodiment.
FIG. 7 is a schematic diagram of generating an image classification label in an embodiment.
FIG. 8 is a flowchart of an image processing method in yet another embodiment.
FIG. 9 is a schematic structural diagram of an image processing apparatus in an embodiment.
FIG. 10 is a schematic diagram of an image processing circuit in an embodiment.
具体实施方式Detailed ways
为了使本申请的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本申请进行进一步详细说明。应当理解,此处所描述的具体实施例仅仅用以解释本申请,并不用于限定本申请。In order to make the purpose, technical solution, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the application, and are not used to limit the application.
可以理解,本申请所使用的术语“第一”、“第二”等可在本文中用于描述各种元件,但这些元件不受这些术语限制。这些术语仅用于将第一个元件与另一个元件区分。举例来说,在不脱离本申请的范围的情况下,可以将第一客户端称为第二客户端,且类似地,可将第二客户端称为第一客户端。第一客户端和第二客户端两者都是客户端,但其不是同一客户端。It can be understood that the terms “first”, “second”, and the like used in this application can be used herein to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish the first element from another element. For example, without departing from the scope of the present application, the first client may be referred to as the second client, and similarly, the second client may be referred to as the first client. Both the first client and the second client are clients, but they are not the same client.
图1为一个实施例中图像处理方法的应用环境图。如图1所示,该应用环境中包括终端102和服务器104。终端102和服务器104之间可以传输待处理图像,并对待处理图像进行分类处理。在一个实施例中,终端102可以存储若干张待处理图像,然后将待处理图像发送给服务器104。服务器104中存储了对图像进行分类的分类算法,可以对接收到的待处理图像进行目标检测,确定待处理图像中包含的前景目标。若前景目标在待处理图像中所占的目标面积大于面积阈值,则对前景目标进行识别;根据对前景目标的识别结果生成图像分类标签,并将得到的图像分类标签发送给终端102。终端102可以根据得到的图像分类标签对待处理图像进行分类处理。其中,终端102是处于计算机网络最外围,主要用于输入用户信息以及输出处理结果的电子设备,例如可以是个人电脑、移动终端、个人数字助理、可穿戴电子设备等。服务器104是用于响应服务请求,同时提供计算服务的设备,例如可以是一台或者多台计算机。在本申请提供的其他实施例中,上述应用环境中还可以只包括终端102或服务器104,在此不做限定。FIG. 1 is an application environment diagram of an image processing method in an embodiment. As shown in FIG. 1, the application environment includes a terminal 102 and a server 104. The image to be processed may be transmitted between the terminal 102 and the server 104, and the image to be processed may be classified and processed. In one embodiment, the terminal 102 may store several images to be processed, and then send the images to be processed to the server 104. A classification algorithm for classifying images is stored in the server 104, and target detection may be performed on the received to-be-processed image to determine a foreground target included in the to-be-processed image. If the target area occupied by the foreground target in the image to be processed is greater than the area threshold, the foreground target is identified; an image classification label is generated according to the recognition result of the foreground target, and the obtained image classification label is sent to the terminal 102. The terminal 102 may perform classification processing on the image to be processed according to the obtained image classification label. The terminal 102 is an electronic device located at the outermost periphery of a computer network and is mainly used for inputting user information and outputting processing results. For example, the terminal 102 may be a personal computer, a mobile terminal, a personal digital assistant, or a wearable electronic device. The server 104 is a device for responding to a service request while providing a computing service, and may be, for example, one or more computers. In other embodiments provided in this application, the foregoing application environment may further include only the terminal 102 or the server 104, which is not limited herein.
图2为一个实施例中图像处理方法的流程图。如图2所示,该图像处理方法包括操作202至操作208。其中:FIG. 2 is a flowchart of an image processing method according to an embodiment. As shown in FIG. 2, the image processing method includes operations 202 to 208. Wherein:
操作202,获取待处理图像。In operation 202, an image to be processed is obtained.
在一个实施例中,待处理图像可以是通过电子设备的摄像头获取的,也可以是从其他电子设备上获取的,还可以是通过网络下载的,在此不做限定。例如,电子设备上可以安装摄像头,电子设备在检测到拍摄指令时,通过拍摄指令控制摄像头来采集待处理图像。电子设备在获取到图像之后,可以立即对图像进行处理,也可以将图像统一存放在一个文件夹中,在该文件夹中存储的图像到达一定数量之后,再将存储的图像统一进行处理。电子设备可以将获取的图像存储到相册中,当相册中存储的图像大于一定数量时,就触发对相册中的图像进行处理。In one embodiment, the image to be processed may be acquired through a camera of an electronic device, or may be acquired from another electronic device, or may be downloaded through a network, which is not limited herein. For example, a camera may be installed on the electronic device, and when the electronic device detects a shooting instruction, it controls the camera through the shooting instruction to collect images to be processed. After obtaining the images, the electronic device can process the images immediately or store the images in a folder in a unified manner. After the images stored in the folder reach a certain number, the stored images are processed in a unified manner. The electronic device may store the acquired images in an album, and when the number of images stored in the album is greater than a certain number, processing of the images in the album is triggered.
操作204,对待处理图像进行目标检测,获取待处理图像中的前景目标。In operation 204, target detection is performed on the image to be processed to obtain a foreground target in the image to be processed.
具体地,拍摄图像的场景中一般都包含了一个或多个物体。例如,拍摄室外场景的时候,图像中一般会包含行人、蓝天、沙滩、建筑物等,拍摄室内场景的时候,图像中一般会包含家具家电、办公用品等物体。前景目标是指图像中比较突出的主体目标,是用户比较关注的物体。图像中除前景目标之外的区域为背景区域。Specifically, one or more objects are generally included in a scene where an image is captured. For example, when shooting outdoor scenes, the image generally includes pedestrians, blue sky, beaches, buildings, etc. When shooting indoor scenes, the image generally includes objects such as furniture, appliances, office supplies, and so on. The foreground target refers to the more prominent main target in the image, which is the object that the user is more concerned about. The area in the image other than the foreground target is the background area.
可理解的是,待处理图像是由若干个像素点构成的二维像素矩阵,电子设备可以对待处理图像中的前景目标进行检测。检测到的前景目标包含待处理图像中的部分或全部像素点,然后将前景目标在待处理图像中的具体位置进行标记。具体的,电子设备在检测到前景目标之后,可以通过矩形框将前景目标在待处理图像中进行标注,这样用户就可以直接从待处理图像中看到检测到的前景目标的具体位置。It can be understood that the image to be processed is a two-dimensional pixel matrix composed of several pixels, and the electronic device can detect the foreground target in the image to be processed. The detected foreground target is composed of some or all of the pixels in the image to be processed, and the specific position of the foreground target in the image to be processed is then marked. Specifically, after detecting the foreground target, the electronic device may mark the foreground target in the image to be processed with a rectangular frame, so that the user can directly see the specific position of the detected foreground target in the image to be processed.
操作206,若前景目标在待处理图像中所占的目标面积大于面积阈值,则对前景目标进行识别。In operation 206, if the target area occupied by the foreground target in the image to be processed is greater than the area threshold, the foreground target is identified.
在一个实施例中,检测出待处理图像中的前景目标之后,可对检测到的前景目标建立一个目标标识,用于唯一标示一个前景目标。电子设备可建立图像标识、目标标识、目标位置的对应关系,图像标识用于唯一标示一张待处理图像,目标位置用于表示前景目标在待处理图像中的具体位置。In one embodiment, after detecting a foreground target in an image to be processed, a target identifier may be established for the detected foreground target to uniquely identify a foreground target. The electronic device may establish a correspondence between an image identifier, a target identifier, and a target position. The image identifier is used to uniquely identify an image to be processed, and the target position is used to indicate a specific position of the foreground target in the image to be processed.
检测到的前景目标是由待处理图像中的部分或全部像素点构成的,可以统计前景目标所在区域中包含的像素点数量,根据统计得到的像素点数量计算该前景目标所占的目标面积。具体的,目标面积可以直接通过前景目标中包含的像素点数量进行表示,也可以用前景目标中包含的像素点数量与待处理图像中包含的像素点数量的比例进行表示。前景目标中包含的像素点数量越多,对应的目标面积越大。The detected foreground target is composed of some or all pixels in the image to be processed. The number of pixels contained in the area where the foreground target is located can be counted, and the target area occupied by the foreground target can be calculated based on the counted number of pixels. Specifically, the target area may be directly expressed by the number of pixels included in the foreground target, or may be expressed by a ratio of the number of pixels included in the foreground target to the number of pixels included in the image to be processed. The larger the number of pixels contained in the foreground target, the larger the corresponding target area.
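As an illustrative sketch of the pixel counting described above (assuming the detection step yields a binary foreground mask, which the text does not prescribe), the target area can be expressed as the ratio of foreground pixels to total pixels:

```python
def target_area_ratio(mask):
    """Fraction of image pixels covered by the foreground target.

    `mask` is a 2-D list of 0/1 values (1 = foreground pixel). This is an
    illustrative sketch, not the patented implementation.
    """
    total = sum(len(row) for row in mask)
    foreground = sum(sum(row) for row in mask)
    return foreground / total if total else 0.0

# Example: a 4x4 image whose foreground covers a 2x3 block of pixels.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
ratio = target_area_ratio(mask)  # 6 / 16 = 0.375
```

A larger ratio corresponds to a larger target area, matching the statement that more foreground pixels mean a larger target area.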
电子设备在检测到前景目标之后,获取前景目标的目标面积。若目标面积大于面积阈值,则认为前景目标过大,相应的背景区域就比较小。背景区域过小的时候,对背景的识别就不准确,这时就可以根据前景目标来进行图像分类。例如,当前景目标占待处理图像的1/2以上的面积时,根据对前景目标的识别结果生成图像分类标签。The electronic device obtains the target area of the foreground target after detecting the foreground target. If the target area is greater than the area threshold, the foreground target is considered too large and the corresponding background area is relatively small. When the background area is too small, the recognition of the background is not accurate. At this time, image classification can be performed according to the foreground target. For example, when the foreground object occupies more than 1/2 of the area of the image to be processed, an image classification label is generated according to the recognition result of the foreground object.
一般地,电子设备在识别前景目标之前,会预先设置前景目标的分类类型,然后通过预设的分类算法识别检测到的前景目标具体属于预设的哪一个分类类型。例如,电子设备可以将前景目标分为人、小狗、小猫、美食、其他等类型,然后就可以识别检测到的前景目标具体属于上述类型的哪一类。具体的,本申请中可以但不限于是通过RCNN(Regions with CNN Features)、SSD(Single Shot MultiBox Detector)、YOLO(You Only Look Once)等算法检测和识别前景目标的。Generally, before identifying a foreground target, an electronic device sets the classification types of foreground targets in advance, and then uses a preset classification algorithm to identify which preset classification type the detected foreground target belongs to. For example, the electronic device can classify foreground targets into types such as person, puppy, kitten, food, and others, and then identify which of these types a detected foreground target belongs to. Specifically, in this application, foreground targets may be detected and identified through, but not limited to, algorithms such as RCNN (Regions with CNN Features), SSD (Single Shot MultiBox Detector), and YOLO (You Only Look Once).
操作208,根据对前景目标的识别结果生成图像分类标签。Operation 208: Generate an image classification label according to the recognition result of the foreground target.
在本申请提供的实施例中,对待处理图像的前景目标进行识别之后,可以得到前景目标的前景类型,然后根据前景类型可以对待处理图像进行标记。图像分类标签可用于对待处理图像的类型进行标记,电子设备可以根据图像分类标签对待处理图像进行分类处理,还可以通过分类标签对待处理图像进行查找。例如,电子设备可以将对应同一图像分类标签的图像存放在一个相册中,这样用户可以分类查找对应的图像。In the embodiment provided by the present application, after the foreground target of the image to be processed is identified, the foreground type of the foreground target can be obtained, and the image to be processed can then be labeled according to the foreground type. The image classification label can be used to mark the type of the image to be processed. The electronic device can classify the images to be processed according to their image classification labels, and the classification labels can also be used to search for images to be processed. For example, the electronic device may store images corresponding to the same image classification label in one album, so that the user can find the corresponding images by category.
根据前景识别结果得到图像分类标签之后,可以根据图像分类标签对待处理图像进行分类处理。例如,检测到前景目标为人时,可以对图像中的人像区域进行美颜处理;检测到前景目标为植物时,可以提高植物的饱和度和对比度等。After the image classification label is obtained according to the foreground recognition result, the image to be processed can be classified and processed according to the image classification label. For example, when the foreground target is detected as a person, the portrait area in the image can be subjected to beauty treatment; when the foreground target is detected as a plant, the saturation and contrast of the plant can be improved.
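The per-label processing described above can be sketched as a simple dispatch. The handler names and the concrete adjustments below are hypothetical placeholders, since the text only names the kinds of processing (beautification for portraits, saturation and contrast boosts for plants):

```python
def apply_beauty(image):
    """Placeholder portrait beautification; a real pipeline would edit pixels."""
    image["beautified"] = True
    return image

def boost_plant(image):
    """Placeholder saturation/contrast boost for plant images."""
    image["saturation"] = image.get("saturation", 1.0) * 1.2
    image["contrast"] = image.get("contrast", 1.0) * 1.1
    return image

def process_by_label(label, image):
    """Dispatch post-classification processing by the image classification label."""
    handlers = {"portrait": apply_beauty, "plant": boost_plant}
    # Unknown labels pass the image through unchanged.
    return handlers.get(label, lambda img: img)(image)

processed = process_by_label("portrait", {})  # {"beautified": True}
```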
上述实施例提供的图像处理方法,可获取待处理图像,并对待处理图像进行目标检测,获取前景目标。在前景目标所占的目标面积大于面积阈值时,对前景目标进行识别,并根据对前景目标的识别结果生成图像分类标签。在前景目标所占面积比较大的时候,可以更准确地对前景目标进行识别,这样通过前景目标的识别结果来生成图像分类标签,可以对图像进行更准确地分类。The image processing method provided in the foregoing embodiment may acquire an image to be processed, and perform target detection on the image to be processed to obtain a foreground target. When the target area occupied by the foreground target is greater than the area threshold, the foreground target is identified, and an image classification label is generated based on the recognition result of the foreground target. When the area occupied by the foreground target is relatively large, the foreground target can be identified more accurately. In this way, an image classification label is generated based on the recognition result of the foreground target, and the image can be classified more accurately.
图3为另一个实施例中图像处理方法的流程图。如图3所示,该图像处理方法包括操作302至操作316。其中:FIG. 3 is a flowchart of an image processing method in another embodiment. As shown in FIG. 3, the image processing method includes operations 302 to 316. Wherein:
操作302,获取包含至少一张目标图像的图像集合,并计算任意两张目标图像之间的相似度。Operation 302: Acquire an image set including at least one target image, and calculate the similarity between any two target images.
可以理解的是,在对待处理图像进行识别生成图像分类标签的时候,可以是对单张待处理图像进行识别,也可以是对批量的待处理图像进行识别。例如,在拍摄图像的时候,采集到图像后就立即对图像进行识别,并生成图像分类标签。也可以将采集到的图像存放在电子设备中,当采集的图像超过一定数量之后,再统一进行识别处理。It can be understood that when the image to be processed is identified and an image classification label is generated, it can be a single image to be processed or a batch of images to be processed. For example, when capturing an image, the image is recognized immediately after the image is captured, and an image classification label is generated. It is also possible to store the captured images in an electronic device, and after the captured images exceed a certain number, the recognition processing is unified.
图像集合中包含一张或多张目标图像,目标图像可以是存储在电子设备中的图像。电子设备存储的图像可能是通过不同的方式获取的,例如可能是用户通过摄像头拍摄的,可能是在网络上下载的,也可能是好友发送的。电子设备对图像集合中的目标图像进行识别,生成图像分类标签。在生成图像集合的时候,可以不用获取电子设备中存储的所有图像,而只获取部分图像来进行处理。生成图像集合具体就可以包括:从预设的文件路径获取至少一张目标图像,并根据获取的目标图像生成图像集合。预设的文件路径用于存储需要识别图像分类标签的图像,例如预设的文件路径中可只存储用户通过摄像头拍摄的图像。The image set includes one or more target images, and the target images may be images stored in the electronic device. The images stored by the electronic device may be obtained in different ways; for example, they may be captured by the user through a camera, downloaded from the network, or sent by a friend. The electronic device recognizes the target images in the image set and generates image classification labels. When generating the image set, it is not necessary to acquire all the images stored in the electronic device; only some of the images may be acquired for processing. Generating the image set may specifically include: acquiring at least one target image from a preset file path, and generating the image set according to the acquired target images. The preset file path is used to store images whose image classification labels need to be identified. For example, the preset file path may store only images captured by the user through a camera.
在本申请提供的实施例中,为防止电子设备的资源被频繁消耗,可以在满足指定触发条件时,根据图像的生成时间来获取需要进行识别的图像。具体地,当满足指定触发条件时,根据电子设备中存储的存放时长超过时长阈值的目标图像生成图像集合,存放时长是指从电子设备获取到目标图像的时刻到当前时刻的时间间隔。例如,若图像是通过摄像头拍摄的,则从摄像头生成图像的时刻开始计时。若图像是通过网络下载的,则从接收到图像的时刻开始计时。电子设备可以在每到达指定时刻时,触发对图像的识别处理。或者在图像集合中包含的图像超过一定数量时,触发对图像的识别处理,在此不做限定。In the embodiment provided by the present application, in order to prevent resources of the electronic device from being frequently consumed, an image that needs to be identified may be acquired according to an image generation time when a specified trigger condition is satisfied. Specifically, when a specified trigger condition is met, an image collection is generated according to a target image stored in an electronic device whose storage duration exceeds a duration threshold, and the storage duration refers to a time interval from the time when the target image is acquired by the electronic device to the current time. For example, if the image was captured by a camera, the time is counted from the moment the image is generated by the camera. If the image is downloaded via the network, the time is counted from the moment the image is received. The electronic device can trigger an image recognition process every time a specified time is reached. Or when the number of images included in the image collection exceeds a certain number, the image recognition processing is triggered, which is not limited herein.
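A minimal sketch of the storage-duration trigger described above, assuming images are tracked as id-to-timestamp pairs and using a hypothetical 24-hour threshold (the text does not fix a concrete value):

```python
import time

DURATION_THRESHOLD = 24 * 3600  # hypothetical threshold: 24 hours, in seconds

def images_due_for_processing(acquired_at, now=None):
    """Return ids of images whose storage duration exceeds the threshold.

    `acquired_at` maps image id -> acquisition timestamp in seconds (camera
    capture time or network download time). Data shapes are assumptions.
    """
    now = time.time() if now is None else now
    return [img_id for img_id, t in acquired_at.items()
            if now - t > DURATION_THRESHOLD]

# Example with an explicit "current time" of t = 200000 seconds.
stored = {"a.jpg": 100000, "b.jpg": 199000}
image_set = images_due_for_processing(stored, now=200000)  # ["a.jpg"]
```

The same predicate could be evaluated when any other trigger condition fires, such as the image count exceeding a preset number.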
操作304,根据相似度将目标图像进行分类;其中,同一类目标图像中任意两张目标图像之间的相似度大于相似度阈值。In operation 304, the target images are classified according to the similarity; wherein the similarity between any two target images in the same type of target images is greater than the similarity threshold.
在对图像进行识别的时候,相似度比较高的图像往往识别结果也比较接近。例如,电子设备在连拍的时候,由于连续采集图像时间隔的时间比较短,所以采集的图像就比较相似,这样对图像的识别结果也是比较接近的。生成图像集合之后,可以计算图像集合中任意两张目标图像之间的相似度,相似度较高的目标图像仅做一次识别即可,避免重复多次识别造成电子设备资源消耗。When images are identified, images with a high degree of similarity often have similar recognition results. For example, when continuous shooting is performed by an electronic device, since the interval between successively captured images is relatively short, the captured images are similar, so that the recognition results of the images are relatively close. After generating the image set, the similarity between any two target images in the image set can be calculated. The target images with higher similarity can be identified only once to avoid the consumption of electronic device resources caused by repeated identification.
具体地,计算得到目标图像的相似度之后,可根据相似度将目标图像进行分类,将相似度较高的图像分到同一类。同一类图像之间的相似度都比较高,识别结果也比较接近,这样就可以将同一类图像统一进行识别处理。例如,计算图像集合中任意两张图像之间的相似度,根据相似度对目标图像进行聚类。假设相似度的取值范围为[0,1],则可以将两张相似度大于0.9的图像分到同一类。Specifically, after the similarity of the target images is calculated, the target images can be classified according to the similarity, and images with high similarity are classified into the same class. The similarity between images of the same class is relatively high, and their recognition results are also close, so that images of the same class can be recognized in a unified manner. For example, the similarity between any two images in the image set is calculated, and the target images are clustered according to the similarity. Assuming the range of the similarity is [0,1], two images with a similarity greater than 0.9 can be classified into the same class.
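The threshold-based grouping can be sketched with a union-find structure. Note this merges classes transitively, which is a common simplification of the stricter condition that every pair within a class exceeds the threshold; the 0.9 threshold follows the example above, and the similarity function is assumed to be given:

```python
def cluster_by_similarity(n, sim, threshold=0.9):
    """Group image indices 0..n-1 so that similar images share a class.

    `sim(i, j)` returns the similarity of images i and j in [0, 1].
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge every pair whose similarity exceeds the threshold.
    for i in range(n):
        for j in range(i + 1, n):
            if sim(i, j) > threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Example: images 0/1 are near-duplicates, as are 2/3.
pairs = {(0, 1): 0.95, (2, 3): 0.92}
sim = lambda i, j: pairs.get((min(i, j), max(i, j)), 0.1)
clusters = cluster_by_similarity(4, sim)  # [[0, 1], [2, 3]]
```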
操作306,分别从每一类目标图像中获取一张目标图像作为待处理图像。Operation 306: Obtain a target image from each type of target image as the image to be processed.
对目标图像进行分类之后,可以从每一类目标图像中获取一张目标图像作为待处理图像进行识别处理,根据对待处理图像的识别结果生成的图像分类标签,可以作为对应的目标图像的图像分类标签。在一个实施例中,可以从每一类目标图像中随机获取一张目标图像作为待处理图像,还可以通过计算相似度差值来确定待处理图像。After the target images are classified, one target image can be obtained from each class of target images as the image to be processed for recognition processing, and the image classification label generated according to the recognition result of the image to be processed can be used as the image classification label of the corresponding target images. In one embodiment, a target image may be randomly selected from each class of target images as the image to be processed, and the image to be processed may also be determined by calculating similarity difference values.
具体地,可以根据每一类目标图像生成图像子集合;遍历该图像子集合中的目标图像,将该目标图像与图像子集合中其他目标图像之间的相似度进行累加,得到相似度总和;根据该相似度总和从图像子集合中确定待处理图像。例如,计算得到图像子集合中每一张目标图像对应的相似度总和,相似度总和越大,说明该目标图像与其他目标图像的相似度越高,则可以将相似度总和最大的目标图像作为待处理图像。Specifically, an image subset may be generated from each class of target images; the target images in the image subset are traversed, and the similarities between each target image and the other target images in the image subset are accumulated to obtain a similarity sum; the image to be processed is determined from the image subset according to the similarity sum. For example, the similarity sum corresponding to each target image in the image subset is calculated. The larger the similarity sum, the higher the similarity between that target image and the other target images, so the target image with the largest similarity sum can be used as the image to be processed.
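The representative-selection rule (largest accumulated similarity to the rest of its class) can be sketched as follows; the pairwise similarity function is assumed to be given:

```python
def pick_representative(subset, sim):
    """Choose the image whose summed similarity to the others is largest.

    `subset` is a list of image ids from one class; `sim(i, j)` gives the
    pairwise similarity. This is a sketch of the selection rule above.
    """
    def similarity_sum(i):
        return sum(sim(i, j) for j in subset if j != i)
    return max(subset, key=similarity_sum)

# Example: image "b" is closest on average to the rest of its class.
scores = {("a", "b"): 0.95, ("a", "c"): 0.80, ("b", "c"): 0.93}
sim = lambda i, j: scores.get((min(i, j), max(i, j)), 0.0)
representative = pick_representative(["a", "b", "c"], sim)  # "b"
```

Here the sums are a: 1.75, b: 1.88, c: 1.73, so "b" is taken as the image to be processed for its class.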
操作308,对待处理图像进行目标检测,获取待处理图像中的前景目标。Operation 308: Perform target detection on the image to be processed to obtain a foreground target in the image to be processed.
操作310,若从待处理图像中检测到两个或两个以上的前景目标,则将待处理图像中包含的所有前景目标的总面积作为目标面积。In operation 310, if two or more foreground objects are detected from the image to be processed, the total area of all foreground objects included in the image to be processed is used as the target area.
可以理解的是,待处理图像中可以存在一个或多个前景目标,当仅存在一个前景目标时,将该前景目标在待处理图像中所占的面积作为目标面积;当存在两个或两个以上的前景目标时,则将待处理图像中包含的所有前景目标所占的总面积作为目标面积。It can be understood that one or more foreground targets may exist in the image to be processed. When there is only one foreground target, the area occupied by that foreground target in the image to be processed is used as the target area; when there are two or more foreground targets, the total area occupied by all the foreground targets included in the image to be processed is used as the target area.
操作312,若目标面积大于面积阈值,则对前景目标进行识别。In operation 312, if the target area is greater than the area threshold, the foreground target is identified.
当目标面积大于面积阈值时,认为前景目标所占的面积较大,背景区域所占的面积较小。当目标面积大于面积阈值时,对前景目标进行识别;若目标面积小于或等于面积阈值,则对待处理图像中除前景目标之外的背景区域进行识别;根据对背景区域的识别结果生成图像分类标签。When the target area is greater than the area threshold, it is considered that the area occupied by the foreground target is large and the area occupied by the background region is small. When the target area is greater than the area threshold, the foreground target is recognized; if the target area is less than or equal to the area threshold, the background region other than the foreground target in the image to be processed is recognized, and an image classification label is generated according to the recognition result of the background region.
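Operations 310 and 312 together amount to a branch on the total foreground area. In this sketch the threshold value and the two recognition callbacks are assumptions standing in for the detection and classification models:

```python
AREA_THRESHOLD = 0.5  # hypothetical: half of the image area

def classify_image(foreground_areas, recognize_foreground, recognize_background):
    """Route recognition by the total foreground area.

    `foreground_areas` holds per-target area ratios; the two callbacks are
    placeholders for foreground recognition and background recognition.
    """
    target_area = sum(foreground_areas)  # total area of all detected targets
    if target_area > AREA_THRESHOLD:
        return recognize_foreground()
    return recognize_background()

# A large total foreground area triggers foreground recognition;
# otherwise the background region is recognized instead.
label = classify_image([0.4, 0.3],
                       lambda: "foreground label",
                       lambda: "background label")  # "foreground label"
```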
电子设备对待处理图像中的背景区域进行检测,检测到背景区域之后识别背景区域具体属于哪一个背景类型。电子设备可以预先设置背景区域的背景类型,然后通过预设的算法识别背景区域具体属于哪一个预设的背景类型。例如,可以将背景区域分为海滩、雪景、夜景、蓝天、室内等场景,在对背景区域进行识别后,可以得到背景区域对应的背景类型。根据得到的背景类型生成图像分类标签。The electronic device detects a background area in the image to be processed, and detects which background type the background area belongs to after detecting the background area. The electronic device can set the background type of the background area in advance, and then identify which preset background type the background area specifically belongs to through a preset algorithm. For example, the background area can be divided into scenes such as beach, snow, night, blue sky, indoor, etc. After identifying the background area, the background type corresponding to the background area can be obtained. An image classification label is generated according to the obtained background type.
图4(a)为一个实施例中目标面积小于面积阈值的图像示意图,如图4(a),包括背景区域402和前景目标404,背景区域402在图像中所占的面积大于前景目标404在图像中所占的面积,此时对背景区域402的识别更加精确,则可以根据背景区域402进行识别得到的识别结果生成图像分类标签。图4(b)为一个实施例中目标面积大于面积阈值的图像示意图,如图4(b),包括背景区域406和前景目标408,前景目标408在图像中所占的面积大于背景区域406在图像中所占的面积,此时对前景目标408的识别更加精确,则可以根据前景目标408进行识别得到的识别结果生成图像分类标签。FIG. 4 (a) is a schematic diagram of an image where the target area is less than the area threshold in an embodiment. As shown in FIG. 4 (a), it includes a background area 402 and a foreground target 404. The area of the background area 402 in the image is larger than the foreground target 404. For the area occupied by the image, the recognition of the background area 402 is more accurate at this time, and an image classification label may be generated according to the recognition result obtained by the recognition of the background area 402. FIG. 4 (b) is a schematic diagram of an image where the target area is greater than the area threshold in one embodiment. As shown in FIG. 4 (b), it includes the background area 406 and the foreground target 408. The area occupied by the foreground target 408 in the image is larger than the background area 406. For the area occupied by the image, the recognition of the foreground object 408 is more accurate at this time, and an image classification label may be generated according to the recognition result obtained by the recognition of the foreground object 408.
具体地,可以通过分类模型识别背景区域,通过检测模型来识别前景目标。电子设备在通过分类模型和检测模型识别背景区域和前景目标之前,会对分类模型和检测模型进行训练,并分别输出一个对应的损失函数。损失函数为可评估分类结果的置信度的函数,识别背景区域和前景目标的时候,可通过损失函数分别输出每一个预设类别对应的置信度。置信度越高的类别,表示图像为该类别的概率越大,这样就通过置信度来判断图像对应的背景类型和前景类型。Specifically, the background region can be recognized by a classification model, and the foreground target can be recognized by a detection model. Before recognizing the background region and foreground target through the classification model and the detection model, the electronic device trains the classification model and the detection model, and each outputs a corresponding loss function. The loss function is a function that can evaluate the confidence of the classification result. When the background region and the foreground target are recognized, the confidence corresponding to each preset category can be output through the loss function. The higher the confidence of a category, the greater the probability that the image belongs to that category, so the background type and foreground type corresponding to the image are determined by the confidence.
例如,预先将图像的背景定义为海滩、夜景、烟火、室内等类型,电子设备可以预先将分类模型进行训练,训练后的分类模型可以输出一个损失函数。将待处理图像输入到训练好的分类模型中,就可以通过分类模型检测到背景区域,并识别背景区域的类型。具体地,通过损失函数可以计算每一个预设背景类型对应的置信度,通过置信度来确定背景区域对应的背景分类结果。比如计算得到的海滩、夜景、烟火、室内等四个类型对应的置信度分别为0.01、0.06、0.89、0.04,则可确定待处理图像的背景区域为置信度最高的背景类型。For example, the background of an image is defined in advance as types such as beach, night scene, fireworks, and indoor. The electronic device can train the classification model in advance, and the trained classification model can output a loss function. By inputting the image to be processed into the trained classification model, the background region can be detected by the classification model and its type recognized. Specifically, the confidence corresponding to each preset background type can be calculated through the loss function, and the background classification result corresponding to the background region is determined through the confidence. For example, if the calculated confidences corresponding to the four types beach, night scene, fireworks, and indoor are 0.01, 0.06, 0.89, and 0.04 respectively, it can be determined that the background region of the image to be processed belongs to the background type with the highest confidence.
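Selecting the background type with the highest confidence is an argmax over the per-type confidences; the values below are the worked example from the text:

```python
def pick_type(confidences):
    """Return the preset type with the highest confidence."""
    return max(confidences, key=confidences.get)

# The example from the text: fireworks carries the highest confidence.
conf = {"beach": 0.01, "night scene": 0.06, "fireworks": 0.89, "indoor": 0.04}
background_type = pick_type(conf)  # "fireworks"
```

The same selection applies unchanged to foreground types, since both models output one confidence per preset category.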
图5为一个实施例中对识别图像前景和背景的模型示意图。如图5所示,电子设备可对分类模型进行训练,在训练模型之前会将图像打上类别标签,并通过图像及对应的类别标签对分类模型进行训练。分类模型训练好之后,可以得到一个第一损失函数。在识别过程中,可通过分类模型检测图像中的背景区域,并通过得到的第一损失函数计算每个预设背景类型对应的第一置信度。根据得到的第一置信度可以确定背景区域对应的背景分类结果。电子设备可对检测模型进行训练,在训练模型之前会将图像中包含的前景目标用矩形框进行标记,并标记每个前景目标对应的类别。通过图像对检测模型进行训练。检测模型训练好之后,可以得到一个第二损失函数。在识别过程中,可通过检测模型检测图像中的前景目标,并输出各个前景目标的位置。通过第二损失函数可计算每个预设前景类型对应的第二置信度。根据得到的第二置信度可以确定前景目标对应的前景分类结果。可以理解的是,上述分类模型和检测模型可以是两个独立的算法模型,分类模型可以是Mobilenet算法模型,检测模型可以是SSD算法模型,在此不做限定。FIG. 5 is a schematic diagram of a model for recognizing the foreground and background of an image in an embodiment. As shown in FIG. 5, the electronic device can train the classification model. Before training the model, the images are labeled with category labels, and the classification model is trained with the images and the corresponding category labels. After the classification model is trained, a first loss function can be obtained. In the recognition process, the background region in an image can be detected by the classification model, and the first confidence corresponding to each preset background type can be calculated through the obtained first loss function. According to the obtained first confidence, the background classification result corresponding to the background region can be determined. The electronic device can also train the detection model. Before training the model, the foreground targets included in the images are marked with rectangular frames, and the category corresponding to each foreground target is marked. The detection model is trained with these images. After the detection model is trained, a second loss function can be obtained. In the recognition process, the foreground targets in an image can be detected by the detection model, and the position of each foreground target can be output. The second confidence corresponding to each preset foreground type can be calculated through the second loss function. According to the obtained second confidence, the foreground classification result corresponding to the foreground target can be determined. It can be understood that the classification model and the detection model can be two independent algorithm models; for example, the classification model can be a MobileNet model and the detection model can be an SSD model, which is not limited herein.
图6为另一个实施例中识别图像前景和背景的模型示意图。如图6所示,该识别模型是一个神经网络模型,该神经网络的输入层接收带有图像类别标签的训练图像,通过基础网络(如CNN网络)进行特征提取,并将提取的图像特征输出给特征层,由该特征层对背景训练目标进行类别检测得到第一损失函数,对前景训练目标根据图像特征进行类别检测得到第二损失函数,对前景训练目标根据前景区域进行位置检测得到位置损失函数,将第一损失函数、第二损失函数和位置损失函数进行加权求和得到目标损失函数。该神经网络可为卷积神经网络。卷积神经网络包括数据输入层、卷积计算层、激活层、池化层和全连接层。数据输入层用于对原始图像数据进行预处理。该预处理可包括去均值、归一化、降维和白化处理。去均值是指将输入数据各个维度都中心化为0,目的是将样本的中心拉回到坐标系原点上。归一化是将幅度归一化到同样的范围。白化是指对数据各个特征轴上的幅度归一化。卷积计算层用于局部关联和窗口滑动。卷积计算层中每个滤波器连接数据窗的权重是固定的,每个滤波器关注一个图像特征,如垂直边缘、水平边缘、颜色、纹理等,将这些滤波器合在一起得到整张图像的特征提取器集合。一个滤波器是一个权重矩阵,通过一个权重矩阵可与不同窗口内数据做卷积。激活层用于将卷积层输出结果做非线性映射。激活层采用的激活函数可为ReLU(The Rectified Linear Unit,修正线性单元)。池化层可夹在连续的卷积层中间,用于压缩数据和参数的量,减小过拟合。池化层可采用最大值法或平均值法对数据降维。全连接层位于卷积神经网络的尾部,两层之间所有神经元都有权重连接。卷积神经网络的一部分卷积层级联到第一置信度输出节点,一部分卷积层级联到第二置信度输出节点,一部分卷积层级联到位置输出节点,根据第一置信度输出节点可以检测到图像的背景类型,根据第二置信度输出节点可以检测到图像的前景目标的类别,根据位置输出节点可以检测到前景目标所对应的位置。FIG. 6 is a schematic diagram of a model for recognizing the foreground and background of an image in another embodiment. As shown in FIG. 6, the recognition model is a neural network model. The input layer of the neural network receives training images with image category labels, performs feature extraction through a base network (such as a CNN), and outputs the extracted image features to the feature layer. The feature layer performs category detection on the background training target to obtain a first loss function, performs category detection on the foreground training target according to the image features to obtain a second loss function, and performs position detection on the foreground training target according to the foreground area to obtain a position loss function. The first loss function, the second loss function, and the position loss function are weighted and summed to obtain a target loss function. The neural network may be a convolutional neural network. A convolutional neural network includes a data input layer, convolution layers, activation layers, pooling layers, and a fully connected layer. The data input layer is used to pre-process the original image data. The pre-processing may include de-meaning, normalization, dimensionality reduction, and whitening. De-meaning refers to centering every dimension of the input data at 0, in order to pull the center of the samples back to the origin of the coordinate system. Normalization scales the amplitude to the same range. Whitening refers to normalizing the amplitude on each feature axis of the data. The convolution layer is used for local correlation and window sliding. The weights with which each filter in a convolution layer is connected to the data window are fixed; each filter focuses on one image feature, such as vertical edges, horizontal edges, color, or texture, and combining these filters yields the feature extractor set for the whole image. A filter is a weight matrix, which can be convolved with the data in different windows. The activation layer performs a non-linear mapping on the output of the convolution layer; the activation function used may be the ReLU (Rectified Linear Unit). Pooling layers can be sandwiched between consecutive convolution layers to compress the amount of data and parameters and reduce overfitting; a pooling layer can reduce the dimensionality of the data by the maximum method or the average method. The fully connected layer is located at the tail of the convolutional neural network, and all neurons between the two layers are connected by weights. Some convolution layers of the convolutional neural network are cascaded to a first confidence output node, some to a second confidence output node, and some to a position output node. The background type of the image can be detected according to the first confidence output node, the category of the foreground target in the image can be detected according to the second confidence output node, and the position corresponding to the foreground target can be detected according to the position output node.
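The target loss described above is a weighted sum of the two class losses and the position loss. The weight values below are hypothetical, since the text states only that a weighted sum is taken:

```python
def target_loss(background_loss, foreground_loss, position_loss,
                w1=1.0, w2=1.0, w3=1.0):
    """Weighted sum of the first, second, and position loss functions.

    The default unit weights are placeholders; the text does not specify
    concrete weight values.
    """
    return w1 * background_loss + w2 * foreground_loss + w3 * position_loss

total = target_loss(1.0, 2.0, 3.0)  # 6.0 with equal unit weights
```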
具体地,上述分类模型和检测模型可以预先存储在电子设备中,在获取到待处理图像时,通过上述分类模型和检测模型对待处理图像进行识别处理。可以理解的是,分类模型和检测模型一般会占用电子设备的存储空间,而且在对大量图像进行处理的时候,对电子设备的存储能力要求也比较高。在对终端上的待处理图像进行处理时,可通过终端本地存储的分类模型和检测模型进行处理,也可以将待处理图像发送到服务器,通过服务器上存储的分类模型和检测模型进行处理。Specifically, the classification model and the detection model may be stored in an electronic device in advance, and when an image to be processed is acquired, the image to be processed is identified through the classification model and the detection model. It can be understood that the classification model and the detection model generally occupy the storage space of the electronic device, and when a large number of images are processed, the storage capacity requirements of the electronic device are also relatively high. When processing the image to be processed on the terminal, the image can be processed through the classification model and detection model stored locally on the terminal, or the image to be processed can be sent to the server for processing through the classification model and detection model stored on the server.
由于终端的存储能力一般比较有限,所以服务器可以将分类模型和检测模型训练好之后,将训练好的分类模型和检测模型发送给终端,终端就无需再对上述模型进行训练。同时终端存储的分类模型和检测模型可以是经过压缩之后的模型,这样压缩之后的模型占用的资源就会比较小,但是相应的识别准确率就比较低。终端可以根据需要处理的待处理图像的数量决定在终端本地进行识别处理,还是在服务器上进行识别处理。终端在获取到待处理图像之后,统计待处理图像的图像数量,若图像数量超过预设上传数量,则将待处理图像上传至服务器,并在服务器上进行待处理图像的处理。服务器处理后,将处理结果发送给终端。Because the storage capacity of a terminal is generally limited, the server can train the classification model and the detection model and then send the trained models to the terminal, so the terminal does not need to train these models itself. Meanwhile, the classification model and detection model stored on the terminal can be compressed models; a compressed model occupies fewer resources, but its recognition accuracy is correspondingly lower. The terminal can decide, according to the number of images to be processed, whether to perform the recognition processing locally or on the server. After obtaining the images to be processed, the terminal counts their number; if the number exceeds a preset upload number, the terminal uploads the images to the server, where they are processed. After processing, the server sends the processing result to the terminal.
图7为一个实施例中生成图像分类标签的示意图。如图7所示,对图像背景区域进行识别,可以得到的图像分类标签包括风景、海滩、雪景、蓝天、绿地、夜景、黑暗、背光、日出/日落、室内、烟火、聚光灯等。对图像的前景目标进行识别,可得到的图像分类标签包括人像、婴儿、猫、狗、美食等。根据前景目标的目标面积决定是通过前景目标的识别结果生成图像分类标签,还是通过背景区域的识别结果生成图像分类标签。当前景目标所占的面积大于图像的1/2时,根据前景目标的识别结果生成图像分类标签;当背景区域所占的面积大于图像的1/2时,根据背景区域的识别结果生成图像分类标签。FIG. 7 is a schematic diagram of generating an image classification label in one embodiment. As shown in FIG. 7, when the background region of an image is identified, the obtainable image classification labels include landscape, beach, snow scene, blue sky, green space, night scene, darkness, backlight, sunrise/sunset, indoor, fireworks, spotlight, and the like. When the foreground targets of an image are identified, the obtainable image classification labels include portrait, baby, cat, dog, food, and the like. Whether the image classification label is generated from the recognition result of the foreground target or from the recognition result of the background region is determined according to the target area of the foreground target. When the area occupied by the foreground target is larger than 1/2 of the image, the image classification label is generated according to the recognition result of the foreground target; when the area occupied by the background region is larger than 1/2 of the image, the image classification label is generated according to the recognition result of the background region.
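The 1/2-of-image rule in FIG. 7 can be sketched as follows; the function name and return values are illustrative assumptions:

```python
def label_source(foreground_area, image_area):
    """Decide whether the image classification label is generated from the
    foreground recognition result or the background recognition result,
    using the 1/2-of-image area rule described above."""
    if foreground_area > image_area / 2:
        return "foreground"  # label from foreground recognition result
    return "background"      # label from background recognition result
```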
操作314,将各个前景目标识别得到的前景类型进行分类,根据每一种前景类型生成一个对应的图像分类标签。Operation 314: classify the foreground types identified by each foreground object, and generate a corresponding image classification label according to each foreground type.
具体地,根据前景分类结果生成图像分类标签时,若根据前景分类结果判断待处理图像中只包含一种前景类型,则可以直接根据该前景类型生成图像分类标签;若根据前景分类结果判断待处理图像中包含两种或两种以上前景类型的前景目标,则可以根据前景类型生成多级图像分类标签,即可以将得到的前景类型进行分类,根据每一种前景类型生成一个对应的图像分类标签。Specifically, when generating an image classification label according to the foreground classification result, if it is determined from the foreground classification result that the image to be processed contains only one foreground type, the image classification label may be generated directly according to that foreground type; if it is determined that the image to be processed contains foreground targets of two or more foreground types, multi-level image classification labels may be generated according to the foreground types, that is, the obtained foreground types may be classified, and a corresponding image classification label generated for each foreground type.
在本申请提供的实施例中,可以设定一个生成的标签数量的上限值,当前景类型的数量小于这个上限值时,可以根据每一类前景类型生成分类标签;当前景类型的数量超过这个上限值时,只对部分前景类型生成分类标签。具体的,操作314之后还包括:统计图像分类标签的标签数量;若标签数量超过数量上限值,则从上述图像分类标签中获取目标图像分类标签。电子设备可根据目标图像分类标签对图像进行标记。In the embodiments provided by this application, an upper limit on the number of generated labels may be set. When the number of foreground types is less than this upper limit, a classification label may be generated for each foreground type; when the number of foreground types exceeds this upper limit, classification labels are generated only for some of the foreground types. Specifically, after operation 314 the method further includes: counting the number of image classification labels; and if the number of labels exceeds the upper limit, obtaining target image classification labels from the above image classification labels. The electronic device can mark the image according to the target image classification labels.
例如,图像中可包含三个前景目标,对应的前景类型分别为“人”、“狗”、“猫”。根据每一种前景类型生成一个对应的图像分类标签,分别为“目标-人”、“目标-狗”、“目标-猫”。那么生成的图像分类标签的标签数量就为3个。假设数量上限值为2个,那么上述得到的标签数量就超过了数量上限值,则可以根据上述图像分类标签确定目标图像分类标签,为“目标-人”、“目标-狗”。For example, the image may contain three foreground targets whose foreground types are "person", "dog", and "cat" respectively. A corresponding image classification label is generated for each foreground type: "target-person", "target-dog", and "target-cat", so the number of generated image classification labels is three. Assuming the upper limit is two, the number of labels obtained above exceeds the upper limit, and the target image classification labels can then be determined from the above image classification labels as "target-person" and "target-dog".
具体的,生成的图像分类标签的数量超过数量上限值时,可以计算各个图像分类标签对应的前景目标的总面积,根据上述总面积从图像分类标签中获取目标图像分类标签。可获取对应总面积最大的图像分类标签作为目标图像分类标签,也可以根据总面积将图像分类标签进行排序,从排序后的图像分类标签中获取目标图像分类标签。Specifically, when the number of generated image classification tags exceeds the upper limit of the number, the total area of the foreground target corresponding to each image classification tag may be calculated, and the target image classification tag may be obtained from the image classification tags according to the total area. The image classification label corresponding to the largest total area can be obtained as the target image classification label, or the image classification labels can be sorted according to the total area, and the target image classification label can be obtained from the sorted image classification labels.
举例说明,图像中只包含前景类型为“人”的前景目标,则可以直接根据前景类型“人”生成图像分类标签为“Pic-人”。若图像中包含目标A、目标B和目标C,对应的前景类型分别为“人”、“猫”和“人”,则可以分别计算“人”对应的目标A和目标C在图像中占的总面积S1,以及“猫”对应的目标B在图像中占的总面积S2。若S1>S2,则将根据前景类型“人”生成图像分类标签;若S1<S2,则将根据前景类型“猫”生成图像分类标签。For example, if the image contains only foreground targets whose foreground type is "person", the image classification label "Pic-person" can be generated directly from the foreground type "person". If the image contains target A, target B, and target C, whose foreground types are "person", "cat", and "person" respectively, the total area S1 occupied in the image by targets A and C corresponding to "person" and the total area S2 occupied by target B corresponding to "cat" can be calculated separately. If S1 > S2, the image classification label is generated according to the foreground type "person"; if S1 < S2, it is generated according to the foreground type "cat".
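A minimal sketch of the total-area selection just described, assuming each detected target is given as a (foreground type, area) pair; the function name and label prefix are illustrative:

```python
from collections import defaultdict

def target_labels_by_area(targets, limit):
    """Sum the area per foreground type and keep the `limit` types with
    the largest total area, returning one label per kept type."""
    totals = defaultdict(float)
    for ftype, area in targets:
        totals[ftype] += area
    # rank foreground types by their total occupied area, largest first
    ranked = sorted(totals, key=totals.get, reverse=True)
    return ["target-" + t for t in ranked[:limit]]
```

With targets A ("person", area 30), B ("cat", 25), and C ("person", 20), the "person" total S1 = 50 exceeds the "cat" total S2 = 25, so "target-person" is ranked first.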
在本申请提供的其他实施例中,生成的图像分类标签的数量超过数量上限值时,还可以统计各个图像分类标签对应的前景目标的目标数量,根据上述目标数量从图像分类标签中获取目标图像分类标签。可获取对应目标数量最多的图像分类标签作为目标图像分类标签,也可以根据目标数量将图像分类标签进行排序,从排序后的图像分类标签中获取目标图像分类标签。In other embodiments provided by this application, when the number of generated image classification labels exceeds the upper limit, the number of foreground targets corresponding to each image classification label may also be counted, and the target image classification labels obtained from the image classification labels according to that target count. The image classification label with the largest corresponding target count may be taken as the target image classification label, or the image classification labels may be sorted by target count and the target image classification labels obtained from the sorted labels.
举例说明,待处理图像中包含目标A、目标B、目标C、目标D、目标E和目标F,对应的前景类型分别为“人”、“狗”、“人”、“人”、“猫”和“狗”。则该待处理图像对应的前景类型就包括“人”、“狗”和“猫”,根据前景类型生成的图像分类标签分别为“目标_人”、“目标_狗”和“目标_猫”,对应的前景目标的目标数量分别为3、2、1。那么可以根据目标数量排序,取前两位的图像分类标签“目标_人”、“目标_狗”作为目标图像分类标签。For example, the image to be processed contains targets A, B, C, D, E, and F, whose foreground types are "person", "dog", "person", "person", "cat", and "dog" respectively. The foreground types of the image to be processed thus include "person", "dog", and "cat", the image classification labels generated from them are "target_person", "target_dog", and "target_cat", and the corresponding target counts are 3, 2, and 1. The top two labels by target count, "target_person" and "target_dog", can then be taken as the target image classification labels.
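The count-based variant in the worked example can be sketched the same way; the helper name is assumed, and the label prefix mirrors the "target_…" labels above:

```python
from collections import Counter

def target_labels_by_count(foreground_types, limit):
    """Rank foreground types by how many detected targets they cover and
    keep the top `limit` of them as target image classification labels."""
    ranked = [t for t, _ in Counter(foreground_types).most_common(limit)]
    return ["target_" + t for t in ranked]
```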
在一个实施例中,对前景目标进行识别的操作还包括:In one embodiment, the operation of identifying the foreground target further includes:
操作802,获取检测到的各个前景目标的深度数据,深度数据用于表示前景目标到图像采集装置之间的距离。Operation 802: Obtain depth data of each detected foreground object, and the depth data is used to represent a distance between the foreground object and the image acquisition device.
深度数据用于表示前景目标到图像采集装置之间的距离,可以认为前景目标离图像采集装置越近,越被用户关注。深度数据可以但不限于是通过结构光、双摄像头测距等方式进行获取。一般地,电子设备在获取深度数据的时候,可以得到待处理图像中每一个像素点对应的深度数据,也就是前景目标中包含的所有像素点都有对应的深度数据。前景目标对应的深度数据,可以是前景目标中任意一个像素点对应的深度数据,也可以是前景目标中包含的所有像素点对应的深度数据的平均值,在此不做限定。The depth data is used to indicate the distance between a foreground target and the image acquisition device; it can be assumed that the closer a foreground target is to the image acquisition device, the more attention the user pays to it. The depth data may be obtained by, but is not limited to, structured light, dual-camera ranging, and the like. Generally, when acquiring depth data, the electronic device can obtain the depth data corresponding to every pixel in the image to be processed, that is, all pixels contained in a foreground target have corresponding depth data. The depth data corresponding to a foreground target may be the depth data of any single pixel in the foreground target, or the average of the depth data of all pixels contained in it, which is not limited here.
操作804,对深度数据小于深度阈值的前景目标进行识别。Operation 804: Identify a foreground target whose depth data is less than a depth threshold.
在获取到深度数据之后,可以通过深度数据来筛选需要进行识别的前景目标。距离较近的前景目标,可以认为是用户比较关注的前景目标。具体的,当深度数据小于深度阈值时,则认为该前景目标为用户比较关注的前景目标,可以只对该深度数据小于深度阈值的前景目标进行识别。After the depth data is obtained, the foreground targets that need to be identified can be filtered according to the depth data. A closer foreground target can be regarded as one the user pays more attention to. Specifically, when its depth data is less than the depth threshold, a foreground target is considered to be one the user is more concerned about, and only the foreground targets whose depth data is less than the depth threshold may be identified.
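Operations 802 and 804 can be sketched as below, taking a target's depth as the mean of its pixel depths (one of the two options the text allows); all names are illustrative:

```python
def target_depth(pixel_depths):
    """Depth of a foreground target, taken here as the mean depth of all
    pixels it contains (the text also allows using any single pixel)."""
    return sum(pixel_depths) / len(pixel_depths)

def targets_to_recognize(depth_by_target, depth_threshold):
    """Keep only the foreground targets whose depth data is less than the
    depth threshold, i.e. those assumed closest to the camera and most
    likely to be of interest to the user."""
    return [tid for tid, depth in depth_by_target.items()
            if depth < depth_threshold]
```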
在本申请提供的其他实施例中,对前景目标进行识别的操作还可以包括:获取检测到的各个前景目标的目标清晰度,对目标清晰度大于清晰度阈值的前景目标进行识别。在对待处理图像进行目标检测的时候,可能从待处理图像中检测到多个前景目标。在检测到两个或两个以上的前景目标时,可以分别对每一个前景目标进行识别,得到每一个前景目标的前景类型,也可以选取其中的一个或多个前景目标进行识别,得到前景识别结果。In other embodiments provided by this application, the operation of identifying the foreground targets may further include: obtaining the target sharpness of each detected foreground target, and identifying the foreground targets whose target sharpness is greater than a sharpness threshold. When target detection is performed on the image to be processed, multiple foreground targets may be detected. When two or more foreground targets are detected, each may be identified separately to obtain its foreground type, or one or more of them may be selected for identification to obtain a foreground recognition result.
电子设备检测到待处理图像中的前景目标之后,可以计算各个前景目标对应的目标清晰度。目标清晰度可以反应前景目标的边缘细节等纹理的清晰程度,在一定程度上可以反映各个前景物体的重要性,因此可以根据目标清晰度来获取进行识别的前景目标。例如,用户在拍摄的时候,会将焦点聚焦在比较关注的物体上,并将其他物体进行模糊化处理。在对前景目标进行识别的时候,可以只对清晰度较高的前景目标进行识别,清晰度较低的前景目标不做识别处理。After the electronic device detects the foreground target in the image to be processed, it can calculate the target sharpness corresponding to each foreground target. The target sharpness can reflect the sharpness of textures such as the edge details of the foreground target, and can reflect the importance of each foreground object to a certain extent. Therefore, the foreground target for recognition can be obtained according to the target sharpness. For example, when shooting, the user will focus on the object of interest and blur the other objects. When identifying foreground objects, only foreground objects with higher definition can be identified, and foreground objects with lower definition are not identified.
前景目标中是可以包括若干个像素点的,则可以通过各个像素点的灰度差来计算得到前景目标的清晰度。一般清晰度越高,像素点之间的灰度差越大;清晰度越低,像素点之间的灰度差越小。在一个实施例中,具体可以是根据Brenner梯度法、Tenengrad梯度法、Laplace梯度法、方差法、能量梯度法等算法计算的目标清晰度,但不限于此。A foreground target may contain a number of pixels, and its sharpness can be calculated from the gray-level differences between those pixels. Generally, the higher the sharpness, the larger the gray-level differences between pixels; the lower the sharpness, the smaller the differences. In one embodiment, the target sharpness may specifically be calculated according to algorithms such as the Brenner gradient method, the Tenengrad gradient method, the Laplace gradient method, the variance method, or the energy gradient method, but is not limited thereto.
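As one concrete instance of the gradient measures named above, the Brenner gradient can be sketched in plain Python on a grayscale region given as a list of pixel rows (no external imaging library is assumed):

```python
def brenner_sharpness(gray):
    """Brenner gradient: the sum of squared differences between pixels
    two columns apart. Larger values mean larger gray-level differences,
    i.e. a sharper region."""
    total = 0
    for row in gray:
        for x in range(len(row) - 2):
            d = row[x + 2] - row[x]
            total += d * d
    return total
```

A flat region scores 0, while any gray-level variation raises the score, matching the statement that higher sharpness goes with larger gray differences between pixels.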
电子设备在检测到前景目标之后,可以对每一个前景目标赋予一个前景标识,用于区分不同的前景目标。然后建立前景标识和前景坐标的对应关系,通过前景标识可以对各个前景目标进行标记,通过前景坐标定位到各个前景目标在待处理图像中的位置。电子设备可以通过前景坐标提取前景目标,并对提取的前景目标进行识别。After the electronic device detects the foreground target, it can assign a foreground identifier to each foreground target to distinguish different foreground targets. Then, the corresponding relationship between the foreground identifier and the foreground coordinate is established. Each foreground target can be marked by the foreground identifier, and the position of each foreground target in the image to be processed can be located by the foreground coordinate. The electronic device can extract the foreground target through the foreground coordinates and identify the extracted foreground target.
当前景目标的目标清晰度大于清晰度阈值时,认为该前景目标的清晰度比较高,可以看做是用户比较关注的目标物体。前景目标的目标清晰度较高时,相应的识别准确性也比较高,得到的识别结果更可靠。具体的,清晰度阈值可以是预先设定的固定不变的值,也可以是动态变化的值,在此不做限定。例如,可以是预先存储在电子设备中的一个固定的值,也可以是用户输入的,根据需要进行动态调节的值,还可以是根据获取的各个目标清晰度进行计算的值。When the target sharpness of the foreground target is greater than the sharpness threshold, the sharpness of the foreground target is considered to be relatively high, and it can be regarded as a target object that the user is more concerned about. When the foreground target has a higher definition, the corresponding recognition accuracy is also higher, and the obtained recognition result is more reliable. Specifically, the sharpness threshold may be a preset fixed value or a dynamically changing value, which is not limited herein. For example, it may be a fixed value stored in the electronic device in advance, or a value input by a user and dynamically adjusted as required, or a value calculated according to the acquired target sharpness.
可理解的是,可以同时根据深度数据和目标清晰度对前景目标进行识别。具体地,获取检测到的各个前景目标的目标清晰度,获取目标清晰度大于清晰度阈值的前景目标所对应的深度数据;对深度数据小于深度阈值的前景目标进行识别。It can be understood that the foreground targets may be identified according to both the depth data and the target sharpness. Specifically, the target sharpness of each detected foreground target is obtained, the depth data corresponding to the foreground targets whose sharpness is greater than the sharpness threshold is obtained, and the foreground targets whose depth data is less than the depth threshold are identified.
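The combined rule can be sketched by chaining the two filters; the per-target record layout used here is an assumption for illustration:

```python
def recognize_candidates(targets, sharpness_threshold, depth_threshold):
    """targets: list of dicts with 'id', 'sharpness' and 'depth' keys.
    First keep targets sharper than the sharpness threshold, then keep
    those whose depth data is below the depth threshold, as described
    above."""
    sharp_enough = [t for t in targets if t["sharpness"] > sharpness_threshold]
    return [t["id"] for t in sharp_enough if t["depth"] < depth_threshold]
```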
上述实施例提供的图像处理方法,可获取待处理图像,并对待处理图像进行目标检测,获取前景目标。在前景目标所占的目标面积大于面积阈值时,对前景目标进行识别,并根据对前景目标的识别结果生成图像分类标签。在前景目标所占面积比较大的时候,可以更准确地对前景目标进行识别,这样通过前景目标的识别结果来生成图像分类标签,可以对图像进行更准确地分类。The image processing method provided in the foregoing embodiment may acquire an image to be processed, and perform target detection on the image to be processed to obtain a foreground target. When the target area occupied by the foreground target is greater than the area threshold, the foreground target is identified, and an image classification label is generated based on the recognition result of the foreground target. When the area occupied by the foreground target is relatively large, the foreground target can be identified more accurately. In this way, an image classification label is generated based on the recognition result of the foreground target, and the image can be classified more accurately.
应该理解的是,虽然图2、图3、图8的流程图中的各个操作按照箭头的指示依次显示,但是这些操作并不是必然按照箭头指示的顺序依次执行。除非本文中有明确的说明,这些操作的执行并没有严格的顺序限制,这些操作可以以其它的顺序执行。而且图2、图3、图8中的至少一部分操作可以包括多个子操作或者多个阶段,这些子操作或者阶段并不必然是在同一时刻执行完成,而是可以在不同的时刻执行,这些子操作或者阶段的执行顺序也不必然是依次进行,而是可以与其它操作或者其它操作的子操作或者阶段的至少一部分轮流或者交替地执行。It should be understood that although the operations in the flowcharts of FIG. 2, FIG. 3, and FIG. 8 are sequentially displayed according to the directions of the arrows, these operations are not necessarily performed sequentially in the order indicated by the arrows. Unless explicitly stated in this article, there is no strict order in which these operations can be performed, and these operations can be performed in other orders. Moreover, at least a part of the operations in FIG. 2, FIG. 3, and FIG. 8 may include multiple sub-operations or multiple phases. These sub-operations or phases are not necessarily executed at the same time, but may be performed at different times. These sub-operations The execution order of the operations or phases is not necessarily performed sequentially, but may be performed in turn or alternately with at least a part of other operations or sub-operations or phases of other operations.
图9为一个实施例中图像处理装置的结构示意图。如图9所示,该图像处理装置900包括图像获取模块902、目标检测模块904、目标识别模块906和图像分类模块908。其中:FIG. 9 is a schematic structural diagram of an image processing apparatus according to an embodiment. As shown in FIG. 9, the image processing apparatus 900 includes an image acquisition module 902, a target detection module 904, a target recognition module 906, and an image classification module 908. Wherein:
图像获取模块902,用于获取待处理图像。The image acquisition module 902 is configured to acquire an image to be processed.
目标检测模块904,用于对所述待处理图像进行目标检测,获取所述待处理图像中的前景目标。A target detection module 904 is configured to perform target detection on the image to be processed, and obtain a foreground target in the image to be processed.
目标识别模块906,用于若所述前景目标在所述待处理图像中所占的目标面积大于面积阈值,则对所述前景目标进行识别。A target recognition module 906 is configured to identify the foreground target if the target area occupied by the foreground target in the image to be processed is greater than an area threshold.
图像分类模块908,用于根据对所述前景目标的识别结果生成图像分类标签。An image classification module 908 is configured to generate an image classification label according to a recognition result of the foreground object.
上述实施例提供的图像处理装置,可获取待处理图像,并对待处理图像进行目标检测,获取前景目标。在前景目标所占的目标面积大于面积阈值时,对前景目标进行识别,并根据对前景目标的识别结果生成图像分类标签。在前景目标所占面积比较大的时候,可以更准确地对前景目标进行识别,这样通过前景目标的识别结果来生成图像分类标签,可以对图像进行更准确地分类。The image processing apparatus provided in the foregoing embodiment may acquire an image to be processed, and perform target detection on the image to be processed to obtain a foreground target. When the target area occupied by the foreground target is greater than the area threshold, the foreground target is identified, and an image classification label is generated based on the recognition result of the foreground target. When the area occupied by the foreground target is relatively large, the foreground target can be identified more accurately. In this way, an image classification label is generated based on the recognition result of the foreground target, and the image can be classified more accurately.
在一个实施例中,图像获取模块902还用于获取包含至少一张目标图像的图像集合,并计算任意两张目标图像之间的相似度;根据所述相似度将所述目标图像进行分类;其中,同一类目标图像中任意两张目标图像之间的相似度大于相似度阈值;分别从每一类目标图像中获取一张目标图像作为待处理图像。In one embodiment, the image acquisition module 902 is further configured to acquire an image set including at least one target image, and calculate the similarity between any two target images; classify the target image according to the similarity; The similarity between any two target images in the same type of target image is greater than the similarity threshold; one target image is obtained from each type of target image as the image to be processed.
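The similarity grouping performed by the image acquisition module can be sketched greedily; the similarity function is caller-supplied, and this single-pass strategy is one possible reading of the requirement that any two images in a group exceed the similarity threshold:

```python
def representatives_by_similarity(images, similarity, threshold):
    """Group images so that the similarity between any two members of a
    group exceeds `threshold`, then return one image per group as the
    image to be processed. Greedy single-pass sketch."""
    groups = []
    for img in images:
        for group in groups:
            # join a group only if similar to every existing member
            if all(similarity(img, member) > threshold for member in group):
                group.append(img)
                break
        else:
            groups.append([img])  # start a new group
    return [group[0] for group in groups]
```

Near-duplicate photos then collapse into one representative, so only one image per group is classified downstream.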
在一个实施例中,目标识别模块906还用于若从所述待处理图像中检测到两个或两个以上的前景目标,则将所述待处理图像中包含的所有前景目标的总面积作为目标面积;若所述目标面积大于面积阈值,则对所述前景目标进行识别。In one embodiment, the target recognition module 906 is further configured to: if two or more foreground targets are detected from the image to be processed, take the total area of all foreground targets included in the image to be processed as the target area; and if the target area is greater than the area threshold, identify the foreground targets.
在一个实施例中,目标识别模块906还用于获取检测到的各个前景目标的目标清晰度,对所述目标清晰度大于清晰度阈值的前景目标进行识别。In one embodiment, the target recognition module 906 is further configured to obtain the target sharpness of each detected foreground target, and identify a foreground target whose target sharpness is greater than a sharpness threshold.
在一个实施例中,目标识别模块906还用于获取检测到的各个前景目标的深度数据,所述深度数据用于表示前景目标到图像采集装置之间的距离;对所述深度数据小于深度阈值的前景目标进行识别。In one embodiment, the target recognition module 906 is further configured to obtain detected depth data of each foreground target, where the depth data is used to represent the distance between the foreground target and the image acquisition device; the depth data is less than a depth threshold The foreground target is identified.
在一个实施例中,图像分类模块908还用于若从所述待处理图像中检测到两个或两个以上的前景目标,则将各个所述前景目标识别得到的前景类型进行分类;根据每一种前景类型生成一个对应的图像分类标签。In one embodiment, the image classification module 908 is further configured to: if two or more foreground targets are detected from the image to be processed, classify the foreground types identified from the foreground targets, and generate a corresponding image classification label according to each foreground type.
在一个实施例中,图像分类模块908还用于若所述目标面积小于或等于面积阈值,则对所述待处理图像中除前景目标之外的背景区域进行识别;根据对所述背景区域的识别结果生成图像分类标签。In one embodiment, the image classification module 908 is further configured to: if the target area is less than or equal to the area threshold, identify the background region other than the foreground targets in the image to be processed, and generate an image classification label according to the recognition result of the background region.
上述图像处理装置中各个模块的划分仅用于举例说明,在其他实施例中,可将图像处理装置按照需要划分为不同的模块,以完成上述图像处理装置的全部或部分功能。The division of each module in the above image processing apparatus is for illustration only. In other embodiments, the image processing apparatus may be divided into different modules as needed to complete all or part of the functions of the above image processing apparatus.
本申请实施例还提供了一种计算机可读存储介质。一个或多个包含计算机可执行指令的非易失性计算机可读存储介质,当所述计算机可执行指令被一个或多个处理器执行时,使得所述处理器执行上述实施例提供的图像处理方法。An embodiment of the present application further provides a computer-readable storage medium. One or more non-volatile computer-readable storage media containing computer-executable instructions, when the computer-executable instructions are executed by one or more processors, causing the processors to perform the image processing provided by the foregoing embodiments method.
一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行上述实施例提供的图像处理方法。A computer program product containing instructions, which when run on a computer, causes the computer to execute the image processing method provided by the above embodiments.
本申请实施例还提供一种电子设备。上述电子设备中包括图像处理电路,图像处理电路可以利用硬件和/或软件组件实现,可包括定义ISP(Image Signal Processing,图像信号处理)管线的各种处理单元。图10为一个实施例中图像处理电路的示意图。如图10所示,为便于说明,仅示出与本申请实施例相关的图像处理技术的各个方面。An embodiment of the present application further provides an electronic device. The above electronic device includes an image processing circuit. The image processing circuit may be implemented by hardware and / or software components, and may include various processing units that define an ISP (Image Signal Processing) pipeline. FIG. 10 is a schematic diagram of an image processing circuit in an embodiment. As shown in FIG. 10, for convenience of explanation, only aspects of the image processing technology related to the embodiments of the present application are shown.
如图10所示,图像处理电路包括ISP处理器1040和控制逻辑器1050。成像设备1010捕捉的图像数据首先由ISP处理器1040处理,ISP处理器1040对图像数据进行分析以捕捉可用于确定成像设备1010的一个或多个控制参数的图像统计信息。成像设备1010可包括具有一个或多个透镜1012和图像传感器1014的照相机。图像传感器1014可包括色彩滤镜阵列(如Bayer滤镜),图像传感器1014可获取用图像传感器1014的每个成像像素捕捉的光强度和波长信息,并提供可由ISP处理器1040处理的一组原始图像数据。传感器1020(如陀螺仪)可基于传感器1020接口类型把采集的图像处理的参数(如防抖参数)提供给ISP处理器1040。传感器1020接口可以利用SMIA(Standard Mobile Imaging Architecture,标准移动成像架构)接口、其它串行或并行照相机接口或上述接口的组合。As shown in FIG. 10, the image processing circuit includes an ISP processor 1040 and a control logic 1050. Image data captured by the imaging device 1010 is first processed by the ISP processor 1040, which analyzes the image data to capture image statistics that can be used to determine one or more control parameters of the imaging device 1010. The imaging device 1010 may include a camera having one or more lenses 1012 and an image sensor 1014. The image sensor 1014 may include a color filter array (such as a Bayer filter); it can obtain the light intensity and wavelength information captured by each of its imaging pixels and provide a set of raw image data that can be processed by the ISP processor 1040. A sensor 1020 (such as a gyroscope) may provide acquired image-processing parameters (such as image stabilization parameters) to the ISP processor 1040 based on the sensor 1020 interface type. The sensor 1020 interface may be an SMIA (Standard Mobile Imaging Architecture) interface, another serial or parallel camera interface, or a combination of the above interfaces.
此外,图像传感器1014也可将原始图像数据发送给传感器1020,传感器1020可基于传感器1020接口类型把原始图像数据提供给ISP处理器1040,或者传感器1020将原始图像数据存储到图像存储器1030中。In addition, the image sensor 1014 may also send the original image data to the sensor 1020, and the sensor 1020 may provide the original image data to the ISP processor 1040 based on the interface type of the sensor 1020, or the sensor 1020 stores the original image data in the image memory 1030.
ISP处理器1040按多种格式逐个像素地处理原始图像数据。例如,每个图像像素可具有8、10、12或14比特的位深度,ISP处理器1040可对原始图像数据进行一个或多个图像处理操作、收集关于图像数据的统计信息。其中,图像处理操作可按相同或不同的位深度精度进行。The ISP processor 1040 processes the original image data pixel by pixel in a variety of formats. For example, each image pixel may have a bit depth of 8, 10, 12, or 14 bits, and the ISP processor 1040 may perform one or more image processing operations on the original image data and collect statistical information about the image data. The image processing operations may be performed with the same or different bit depth accuracy.
ISP处理器1040还可从图像存储器1030接收图像数据。例如,传感器1020接口将原始图像数据发送给图像存储器1030,图像存储器1030中的原始图像数据再提供给ISP处理器1040以供处理。图像存储器1030可为存储器装置的一部分、存储设备、或电子设备内的独立的专用存储器,并可包括DMA(Direct Memory Access,直接存储器存取)特征。The ISP processor 1040 may also receive image data from the image memory 1030. For example, the sensor 1020 interface sends raw image data to the image memory 1030, and the raw image data in the image memory 1030 is then provided to the ISP processor 1040 for processing. The image memory 1030 may be part of a memory device, a storage device, or an independent dedicated memory within the electronic device, and may include DMA (Direct Memory Access) features.
当接收到来自图像传感器1014接口或来自传感器1020接口或来自图像存储器1030的原始图像数据时,ISP处理器1040可进行一个或多个图像处理操作,如时域滤波。处理后的图像数据可发送给图像存储器1030,以便在被显示之前进行另外的处理。ISP处理器1040从图像存储器1030接收处理数据,并对所述处理数据进行原始域中以及RGB和YCbCr颜色空间中的图像数据处理。ISP处理器1040处理后的图像数据可输出给显示器1070,以供用户观看和/或由图形引擎或GPU(Graphics Processing Unit,图形处理器)进一步处理。此外,ISP处理器1040的输出还可发送给图像存储器1030,且显示器1070可从图像存储器1030读取图像数据。在一个实施例中,图像存储器1030可被配置为实现一个或多个帧缓冲器。此外,ISP处理器1040的输出可发送给编码器/解码器1060,以便编码/解码图像数据。编码的图像数据可被保存,并在显示于显示器1070设备上之前解压缩。编码器/解码器1060可由CPU或GPU或协处理器实现。When receiving raw image data from the image sensor 1014 interface, from the sensor 1020 interface, or from the image memory 1030, the ISP processor 1040 may perform one or more image processing operations, such as time-domain filtering. The processed image data may be sent to the image memory 1030 for further processing before being displayed. The ISP processor 1040 receives processed data from the image memory 1030 and performs image data processing on it in the raw domain and in the RGB and YCbCr color spaces. The image data processed by the ISP processor 1040 may be output to a display 1070 for viewing by a user and/or further processed by a graphics engine or a GPU (Graphics Processing Unit). In addition, the output of the ISP processor 1040 can also be sent to the image memory 1030, and the display 1070 can read image data from the image memory 1030. In one embodiment, the image memory 1030 may be configured to implement one or more frame buffers. Furthermore, the output of the ISP processor 1040 may be sent to an encoder/decoder 1060 to encode/decode the image data. The encoded image data can be saved and decompressed before being displayed on the display 1070. The encoder/decoder 1060 may be implemented by a CPU, a GPU, or a coprocessor.
ISP处理器1040确定的统计数据可发送给控制逻辑器1050单元。例如,统计数据可包括自动曝光、自动白平衡、自动聚焦、闪烁检测、黑电平补偿、透镜1012阴影校正等图像传感器1014统计信息。控制逻辑器1050可包括执行一个或多个例程(如固件)的处理器和/或微控制器,一个或多个例程可根据接收的统计数据,确定成像设备1010的控制参数及ISP处理器1040的控制参数。例如,成像设备1010的控制参数可包括传感器1020控制参数(例如增益、曝光控制的积分时间、防抖参数等)、照相机闪光控制参数、透镜1012控制参数(例如聚焦或变焦用焦距)、或这些参数的组合。ISP控制参数可包括用于自动白平衡和颜色调整(例如,在RGB处理期间)的增益水平和色彩校正矩阵,以及透镜1012阴影校正参数。The statistical data determined by the ISP processor 1040 may be sent to the control logic 1050. For example, the statistical data may include image sensor 1014 statistics such as auto exposure, auto white balance, auto focus, flicker detection, black level compensation, and lens 1012 shading correction. The control logic 1050 may include a processor and/or microcontroller executing one or more routines (such as firmware), and the one or more routines may determine control parameters of the imaging device 1010 and control parameters of the ISP processor 1040 based on the received statistical data. For example, the control parameters of the imaging device 1010 may include sensor 1020 control parameters (such as gain, integration time for exposure control, image stabilization parameters, etc.), camera flash control parameters, lens 1012 control parameters (such as focal length for focusing or zooming), or a combination of these parameters. The ISP control parameters may include gain levels and color correction matrices for automatic white balance and color adjustment (for example, during RGB processing), as well as lens 1012 shading correction parameters.
以下为运用图10中图像处理技术实现上述实施例提供的图像处理方法。The following is an implementation of the image processing method provided by the foregoing embodiment by using the image processing technology in FIG. 10.
本申请所使用的对存储器、存储、数据库或其它介质的任何引用可包括非易失性和/或易失性存储器。合适的非易失性存储器可包括只读存储器(ROM)、可编程ROM(PROM)、电可编程ROM(EPROM)、电可擦除可编程ROM(EEPROM)或闪存。易失性存储器可包括随机存取存储器(RAM),它用作外部高速缓冲存储器。作为说明而非局限,RAM以多种形式可得,诸如静态RAM(SRAM)、动态RAM(DRAM)、同步DRAM(SDRAM)、双数据率SDRAM(DDR SDRAM)、增强型SDRAM(ESDRAM)、同步链路(Synchlink)DRAM(SLDRAM)、存储器总线(Rambus)直接RAM(RDRAM)、直接存储器总线动态RAM(DRDRAM)、以及存储器总线动态RAM(RDRAM)。Any reference to memory, storage, a database, or other media used in this application may include non-volatile and/or volatile memory. Suitable non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM), which serves as external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
以上所述实施例仅表达了本申请的几种实施方式,其描述较为具体和详细,但并不能因此而理解为对本申请专利范围的限制。应当指出的是,对于本领域的普通技术人员来说,在不脱离本申请构思的前提下,还可以做出若干变形和改进,这些都属于本申请的保护范围。因此,本申请专利的保护范围应以所附权利要求为准。The above-mentioned embodiments only express several implementation manners of the present application, and their descriptions are more specific and detailed, but they should not be construed as limiting the patent scope of the present application. It should be noted that, for those of ordinary skill in the art, without departing from the concept of the present application, several modifications and improvements can be made, and these all belong to the protection scope of the present application. Therefore, the protection scope of this application patent shall be subject to the appended claims.

Claims (20)

  1. An image processing method, comprising:
    acquiring an image to be processed;
    performing target detection on the image to be processed to obtain a foreground target in the image to be processed;
    if a target area occupied by the foreground target in the image to be processed is greater than an area threshold, recognizing the foreground target; and
    generating an image classification label according to a result of recognizing the foreground target.
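The flow of claim 1 can be sketched as a short program. This is a minimal illustration only, not the patent's implementation: the helper names (`foreground_area_ratio`, `label_image`), the box-based model of foreground detection, and the threshold value are all assumptions made for the example.

```python
# Sketch of claim 1: recognize the foreground only when it occupies
# enough of the image, then emit classification labels.
# Foreground detection is modeled as a list of bounding boxes (x, y, w, h).

AREA_THRESHOLD = 0.05  # fraction of image area; illustrative value only

def foreground_area_ratio(boxes, img_w, img_h):
    """Total foreground box area as a fraction of the image area."""
    total = sum(w * h for (_, _, w, h) in boxes)
    return total / float(img_w * img_h)

def label_image(boxes, img_w, img_h, recognize):
    """Recognize the foreground only when its area exceeds the threshold."""
    if foreground_area_ratio(boxes, img_w, img_h) > AREA_THRESHOLD:
        return [recognize(b) for b in boxes]  # image classification labels
    return []  # claim 7 would fall back to background recognition here

# Example with a stub recognizer: the 200x200 box covers ~13% of the image.
print(label_image([(10, 10, 200, 200)], 640, 480, lambda b: "dog"))  # → ['dog']
```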
  2. The method according to claim 1, wherein acquiring the image to be processed comprises:
    obtaining an image set containing at least one target image, and calculating a similarity between any two target images;
    classifying the target images according to the similarity, wherein the similarity between any two target images of a same class is greater than a similarity threshold; and
    obtaining one target image from each class of target images as an image to be processed.
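The grouping step of claim 2 can be illustrated with a toy sketch. The scalar "feature" similarity metric and the greedy grouping strategy below are stand-ins chosen for brevity; the patent does not specify either.

```python
# Sketch of claim 2: group images whose pairwise similarity exceeds a
# threshold, then take one representative per group for processing.

SIMILARITY_THRESHOLD = 0.9

def similarity(a, b):
    """Toy metric on scalar features in [0, 1]; a real system would
    compare image descriptors instead."""
    return 1.0 - abs(a - b)

def group_by_similarity(features, threshold=SIMILARITY_THRESHOLD):
    """Greedy grouping: an image joins a group only if it is similar to
    every member, mirroring the 'any two target images' condition."""
    groups = []
    for f in features:
        for g in groups:
            if all(similarity(f, m) > threshold for m in g):
                g.append(f)
                break
        else:
            groups.append([f])
    return groups

def representatives(features):
    """One image per group becomes an image to be processed."""
    return [g[0] for g in group_by_similarity(features)]

print(representatives([0.10, 0.12, 0.80, 0.82]))  # two groups expected
```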
  3. The method according to claim 1, wherein recognizing the foreground target if the target area occupied by the foreground target in the image to be processed is greater than the area threshold comprises:
    if two or more foreground targets are detected in the image to be processed, taking a total area of all foreground targets contained in the image to be processed as the target area; and
    if the target area is greater than the area threshold, recognizing the foreground targets.
  4. The method according to claim 3, wherein recognizing the foreground target comprises:
    obtaining a target sharpness of each detected foreground target, and recognizing foreground targets whose target sharpness is greater than a sharpness threshold.
  5. The method according to claim 3, wherein recognizing the foreground target comprises:
    obtaining depth data of each detected foreground target, the depth data representing a distance between the foreground target and an image acquisition device; and
    recognizing foreground targets whose depth data is less than a depth threshold.
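The per-target filters of claims 4 and 5 amount to simple threshold checks before recognition. The sketch below combines both; the threshold values and the dictionary fields are illustrative assumptions, not values from the patent.

```python
# Sketch of claims 4 and 5: pass a foreground target to recognition only
# if it is sharp enough (claim 4) and close enough to the camera (claim 5).

SHARPNESS_THRESHOLD = 0.5
DEPTH_THRESHOLD = 3.0  # distance to the image acquisition device, in meters

def targets_to_recognize(targets):
    """targets: list of dicts with 'sharpness' and 'depth' keys."""
    return [
        t for t in targets
        if t["sharpness"] > SHARPNESS_THRESHOLD  # claim 4: sharpness filter
        and t["depth"] < DEPTH_THRESHOLD         # claim 5: depth filter
    ]

detected = [
    {"name": "person", "sharpness": 0.8, "depth": 1.5},
    {"name": "tree",   "sharpness": 0.2, "depth": 1.0},  # too blurry
    {"name": "car",    "sharpness": 0.9, "depth": 9.0},  # too far away
]
print([t["name"] for t in targets_to_recognize(detected)])  # → ['person']
```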
  6. The method according to claim 1, wherein generating the image classification label according to the result of recognizing the foreground target comprises:
    classifying the foreground types obtained by recognizing each foreground target, and generating a corresponding image classification label for each foreground type.
  7. The method according to any one of claims 1 to 6, further comprising:
    if the target area is less than or equal to the area threshold, recognizing a background region of the image to be processed other than the foreground target; and
    generating an image classification label according to a result of recognizing the background region.
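Claims 3, 6, and 7 together describe one decision: sum the foreground areas, label per distinct foreground type when the sum is large enough, and otherwise label from the background. A hedged sketch, with a hypothetical `recognize_background` callback and an illustrative pixel threshold:

```python
# Sketch of claims 3, 6, and 7: total foreground area decides whether
# labels come from the foreground types or from the background region.

AREA_THRESHOLD = 1000  # pixels; illustrative value only

def generate_labels(detections, recognize_background):
    """detections: list of (foreground_type, area) pairs."""
    total_area = sum(area for _, area in detections)  # claim 3: summed area
    if total_area > AREA_THRESHOLD:
        # Claim 6: one label per distinct foreground type, in detection order.
        seen, labels = set(), []
        for kind, _ in detections:
            if kind not in seen:
                seen.add(kind)
                labels.append(kind)
        return labels
    # Claim 7: area too small, so recognize the background region instead.
    return [recognize_background()]

print(generate_labels([("dog", 800), ("dog", 400)], lambda: "beach"))  # → ['dog']
print(generate_labels([("dog", 300)], lambda: "beach"))  # → ['beach']
```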
  8. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the following operations:
    acquiring an image to be processed;
    performing target detection on the image to be processed to obtain a foreground target in the image to be processed;
    if a target area occupied by the foreground target in the image to be processed is greater than an area threshold, recognizing the foreground target; and
    generating an image classification label according to a result of recognizing the foreground target.
  9. The computer-readable storage medium according to claim 8, wherein when the computer program is executed by the processor to acquire the image to be processed, the following operations are further performed:
    obtaining an image set containing at least one target image, and calculating a similarity between any two target images;
    classifying the target images according to the similarity, wherein the similarity between any two target images of a same class is greater than a similarity threshold; and
    obtaining one target image from each class of target images as an image to be processed.
  10. The computer-readable storage medium according to claim 8, wherein when the computer program is executed by the processor to recognize the foreground target if the target area occupied by the foreground target in the image to be processed is greater than the area threshold, the following operations are further performed:
    if two or more foreground targets are detected in the image to be processed, taking a total area of all foreground targets contained in the image to be processed as the target area; and
    if the target area is greater than the area threshold, recognizing the foreground targets.
  11. The computer-readable storage medium according to claim 10, wherein when the computer program is executed by the processor to recognize the foreground target, the following operations are further performed:
    obtaining a target sharpness of each detected foreground target, and recognizing foreground targets whose target sharpness is greater than a sharpness threshold.
  12. The computer-readable storage medium according to claim 10, wherein when the computer program is executed by the processor to recognize the foreground target, the following operations are further performed:
    obtaining depth data of each detected foreground target, the depth data representing a distance between the foreground target and an image acquisition device; and
    recognizing foreground targets whose depth data is less than a depth threshold.
  13. The computer-readable storage medium according to claim 8, wherein when the computer program is executed by the processor to generate the image classification label according to the result of recognizing the foreground target, the following operations are further performed:
    classifying the foreground types obtained by recognizing each foreground target, and generating a corresponding image classification label for each foreground type.
  14. The computer-readable storage medium according to any one of claims 8 to 13, wherein when the computer program is executed by the processor, the following operations are further performed:
    if the target area is less than or equal to the area threshold, recognizing a background region of the image to be processed other than the foreground target; and
    generating an image classification label according to a result of recognizing the background region.
  15. An electronic device, comprising a memory and a processor, the memory storing computer-readable instructions which, when executed by the processor, cause the processor to perform the following operations:
    acquiring an image to be processed;
    performing target detection on the image to be processed to obtain a foreground target in the image to be processed;
    if a target area occupied by the foreground target in the image to be processed is greater than an area threshold, recognizing the foreground target; and
    generating an image classification label according to a result of recognizing the foreground target.
  16. The electronic device according to claim 15, wherein when the processor acquires the image to be processed, the following operations are further performed:
    obtaining an image set containing at least one target image, and calculating a similarity between any two target images;
    classifying the target images according to the similarity, wherein the similarity between any two target images of a same class is greater than a similarity threshold; and
    obtaining one target image from each class of target images as an image to be processed.
  17. The electronic device according to claim 15, wherein when the processor recognizes the foreground target if the target area occupied by the foreground target in the image to be processed is greater than the area threshold, the following operations are further performed:
    if two or more foreground targets are detected in the image to be processed, taking a total area of all foreground targets contained in the image to be processed as the target area; and
    if the target area is greater than the area threshold, recognizing the foreground targets.
  18. The electronic device according to claim 17, wherein when the processor recognizes the foreground target, the following operations are further performed:
    obtaining a target sharpness of each detected foreground target, and recognizing foreground targets whose target sharpness is greater than a sharpness threshold.
  19. The electronic device according to claim 15, wherein when the processor generates the image classification label according to the result of recognizing the foreground target, the following operations are further performed:
    classifying the foreground types obtained by recognizing each foreground target, and generating a corresponding image classification label for each foreground type.
  20. The electronic device according to any one of claims 15 to 19, wherein the processor further performs the following operations:
    if the target area is less than or equal to the area threshold, recognizing a background region of the image to be processed other than the foreground target; and
    generating an image classification label according to a result of recognizing the background region.
PCT/CN2019/087590 2018-06-08 2019-05-20 Image processing method, computer readable storage medium and electronic device WO2019233266A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810587091.1A CN108960290A (en) 2018-06-08 2018-06-08 Image processing method, device, computer readable storage medium and electronic equipment
CN201810587091.1 2018-06-08

Publications (1)

Publication Number Publication Date
WO2019233266A1 true WO2019233266A1 (en) 2019-12-12

Family

ID=64493527

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/087590 WO2019233266A1 (en) 2018-06-08 2019-05-20 Image processing method, computer readable storage medium and electronic device

Country Status (2)

Country Link
CN (1) CN108960290A (en)
WO (1) WO2019233266A1 (en)


Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108960290A (en) * 2018-06-08 2018-12-07 Oppo广东移动通信有限公司 Image processing method, device, computer readable storage medium and electronic equipment
CN111435447A (en) * 2019-01-14 2020-07-21 珠海格力电器股份有限公司 Method and device for identifying germ-remaining rice and cooking utensil
CN110163810B (en) * 2019-04-08 2023-04-25 腾讯科技(深圳)有限公司 Image processing method, device and terminal
CN110334635B (en) * 2019-06-28 2021-08-31 Oppo广东移动通信有限公司 Subject tracking method, apparatus, electronic device and computer-readable storage medium
CN111210440B (en) * 2019-12-31 2023-12-22 联想(北京)有限公司 Skin object identification method and device and electronic equipment
CN111274426B (en) * 2020-01-19 2023-09-12 深圳市商汤科技有限公司 Category labeling method and device, electronic equipment and storage medium
CN113705285A (en) * 2020-05-22 2021-11-26 珠海金山办公软件有限公司 Subject recognition method, apparatus, and computer-readable storage medium
CN111738354A (en) * 2020-07-20 2020-10-02 深圳市天和荣科技有限公司 Automatic recognition training method, system, storage medium and computer equipment
CN112560698B (en) * 2020-12-18 2024-01-16 北京百度网讯科技有限公司 Image processing method, device, equipment and medium
CN113283436B (en) * 2021-06-11 2024-01-23 北京有竹居网络技术有限公司 Picture processing method and device and electronic equipment
CN114220111B (en) * 2021-12-22 2022-09-16 深圳市伊登软件有限公司 Image-text batch identification method and system based on cloud platform
CN117372738A (en) * 2022-07-01 2024-01-09 顺丰科技有限公司 Target object quantity detection method and device, electronic equipment and storage medium
CN116563170B (en) * 2023-07-10 2023-09-15 中国人民解放军空军特色医学中心 Image data processing method and system and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110007366A1 (en) * 2009-07-10 2011-01-13 Palo Alto Research Center Incorporated System and method for classifying connected groups of foreground pixels in scanned document images according to the type of marking
CN103985114A (en) * 2014-03-21 2014-08-13 南京大学 Surveillance video person foreground segmentation and classification method
CN107133352A (en) * 2017-05-24 2017-09-05 北京小米移动软件有限公司 Photo display methods and device
CN108960290A (en) * 2018-06-08 2018-12-07 Oppo广东移动通信有限公司 Image processing method, device, computer readable storage medium and electronic equipment

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4808267B2 (en) * 2009-05-27 2011-11-02 シャープ株式会社 Image processing apparatus, image forming apparatus, image processing method, computer program, and recording medium
CN102968802A (en) * 2012-11-28 2013-03-13 无锡港湾网络科技有限公司 Moving target analyzing and tracking method and system based on video monitoring
CN103745230B (en) * 2014-01-14 2017-05-10 四川大学 Adaptive abnormal crowd behavior analysis method
CN104658030B (en) * 2015-02-05 2018-08-10 福建天晴数码有限公司 The method and apparatus of secondary image mixing
CN105913082B (en) * 2016-04-08 2020-11-27 北京邦视科技有限公司 Method and system for classifying targets in image
CN107657051B (en) * 2017-10-16 2020-03-17 Oppo广东移动通信有限公司 Picture label generation method, terminal device and storage medium
CN108038491B (en) * 2017-11-16 2020-12-11 深圳市华尊科技股份有限公司 Image classification method and device


Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111222419A (en) * 2019-12-24 2020-06-02 深圳市优必选科技股份有限公司 Object identification method, robot and computer readable storage medium
CN111539962A (en) * 2020-01-10 2020-08-14 济南浪潮高新科技投资发展有限公司 Target image classification method, device and medium
CN111833303A (en) * 2020-06-05 2020-10-27 北京百度网讯科技有限公司 Product detection method and device, electronic equipment and storage medium
CN111833303B (en) * 2020-06-05 2023-07-25 北京百度网讯科技有限公司 Product detection method and device, electronic equipment and storage medium
CN111797934A (en) * 2020-07-10 2020-10-20 北京嘉楠捷思信息技术有限公司 Road sign identification method and device
CN112132206A (en) * 2020-09-18 2020-12-25 青岛商汤科技有限公司 Image recognition method, training method of related model, related device and equipment
CN112182272A (en) * 2020-09-23 2021-01-05 创新奇智(成都)科技有限公司 Image retrieval method and device, electronic device and storage medium
CN112182272B (en) * 2020-09-23 2023-07-28 创新奇智(成都)科技有限公司 Image retrieval method and device, electronic equipment and storage medium
CN113884504A (en) * 2021-08-24 2022-01-04 湖南云眼智能装备有限公司 Capacitor appearance detection control method and device
CN116468882A (en) * 2022-01-07 2023-07-21 荣耀终端有限公司 Image processing method, apparatus, device, storage medium, and program product
CN116468882B (en) * 2022-01-07 2024-03-15 荣耀终端有限公司 Image processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN108960290A (en) 2018-12-07

Similar Documents

Publication Publication Date Title
WO2019233266A1 (en) Image processing method, computer readable storage medium and electronic device
US10896323B2 (en) Method and device for image processing, computer readable storage medium, and electronic device
CN108764370B (en) Image processing method, image processing device, computer-readable storage medium and computer equipment
WO2019233297A1 (en) Data set construction method, mobile terminal and readable storage medium
US11138478B2 (en) Method and apparatus for training classification model, mobile terminal, and readable storage medium
WO2019233394A1 (en) Image processing method and apparatus, storage medium and electronic device
CN108777815B (en) Video processing method and device, electronic equipment and computer readable storage medium
WO2019233393A1 (en) Image processing method and apparatus, storage medium, and electronic device
US11457138B2 (en) Method and device for image processing, method for training object detection model
WO2019233262A1 (en) Video processing method, electronic device, and computer readable storage medium
CN108897786B (en) Recommendation method and device of application program, storage medium and mobile terminal
CN108810418B (en) Image processing method, image processing device, mobile terminal and computer readable storage medium
WO2020259264A1 (en) Subject tracking method, electronic apparatus, and computer-readable storage medium
CN108961302B (en) Image processing method, image processing device, mobile terminal and computer readable storage medium
CN101416219B (en) Foreground/background segmentation in digital images
WO2019237887A1 (en) Image processing method, electronic device, and computer readable storage medium
CN108765033B (en) Advertisement information pushing method and device, storage medium and electronic equipment
CN108875619B (en) Video processing method and device, electronic equipment and computer readable storage medium
WO2019233392A1 (en) Image processing method and apparatus, electronic device, and computer-readable storage medium
WO2020001196A1 (en) Image processing method, electronic device, and computer readable storage medium
WO2019233271A1 (en) Image processing method, computer readable storage medium and electronic device
CN108241645B (en) Image processing method and device
CN109712177B (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN108717530B (en) Image processing method, image processing device, computer-readable storage medium and electronic equipment
CN108804658B (en) Image processing method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19815183

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19815183

Country of ref document: EP

Kind code of ref document: A1