WO2019233394A1 - Image processing method and apparatus, storage medium, and electronic device - Google Patents

Image processing method and apparatus, storage medium, and electronic device

Info

Publication number
WO2019233394A1
WO2019233394A1 (PCT/CN2019/089914)
Authority
WO
WIPO (PCT)
Prior art keywords
image
label
detected
scene recognition
scene
Prior art date
Application number
PCT/CN2019/089914
Other languages
English (en)
French (fr)
Inventor
陈岩
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Publication of WO2019233394A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/20 Scenes; Scene-specific elements in augmented reality scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Definitions

  • the present application relates to the field of computer technology, and in particular, to an image processing method and device, a storage medium, and an electronic device.
  • the mobile terminal may perform scene recognition on the image to provide a smart experience for the user.
  • the embodiments of the present application provide an image processing method and device, a storage medium, and an electronic device, which can improve the accuracy of scene recognition on an image.
  • An image processing method includes: acquiring an image to be detected; performing scene recognition on the image to be detected according to a multi-label classification model to obtain labels corresponding to the image to be detected, the multi-label classification model being obtained from multi-label images containing multiple scene elements; and outputting the labels corresponding to the image to be detected as the result of scene recognition.
  • An image processing device includes:
  • an image acquisition module configured to acquire an image to be detected;
  • a scene recognition module configured to perform scene recognition on the to-be-detected image according to a multi-label classification model to obtain labels corresponding to the to-be-detected image, the multi-label classification model being obtained from multi-label images containing multiple scene elements; and
  • an output module configured to output the labels corresponding to the image to be detected as the result of scene recognition.
  • a computer-readable storage medium has stored thereon a computer program that, when executed by a processor, implements the operations of the image processing method described above.
  • An electronic device includes a memory, a processor, and a computer program stored on the memory and executable on the processor.
  • When the processor executes the computer program, the operations of the image processing method described above are performed.
  • With the foregoing scene recognition method and apparatus, storage medium, and electronic device, an image to be detected is acquired, scene recognition is performed on it according to a multi-label classification model, and labels corresponding to the image to be detected are obtained.
  • The multi-label classification model is obtained from multi-label images containing multiple scene elements.
  • The labels corresponding to the image to be detected are output as the result of scene recognition.
  • FIG. 1 is an internal structural diagram of an electronic device in an embodiment
  • FIG. 2 is a flowchart of an image processing method according to an embodiment
  • FIG. 3A is a flowchart of an image processing method according to another embodiment;
  • FIG. 3B is a schematic structural diagram of a neural network in an embodiment;
  • FIG. 4 is a flowchart of the method in FIG. 2 for performing scene recognition on an image according to the multi-label classification model to obtain labels corresponding to the image;
  • FIG. 5 is a flowchart of an image processing method according to still another embodiment;
  • FIG. 6 is a schematic structural diagram of an image processing apparatus according to an embodiment
  • FIG. 7 is a schematic structural diagram of an image processing apparatus according to another embodiment.
  • FIG. 8 is a schematic structural diagram of a scene recognition module in FIG. 6;
  • FIG. 9 is a block diagram of a partial structure of a mobile phone related to an electronic device according to an embodiment.
  • FIG. 1 is a schematic diagram of an internal structure of an electronic device in an embodiment.
  • the electronic device includes a processor, a memory, and a network interface connected through a system bus.
  • the processor is used to provide computing and control capabilities to support the operation of the entire electronic device.
  • the memory is used to store data, programs, and the like. At least one computer program is stored on the memory, and the computer program can be executed by a processor to implement the image processing method applicable to the electronic device provided in the embodiments of the present application.
  • the memory may include a non-volatile storage medium such as a magnetic disk, an optical disc, or a read-only memory (ROM), or a random-access memory (RAM).
  • the memory includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and a computer program.
  • the computer program can be executed by a processor to implement an image processing method provided by each of the following embodiments.
  • the internal memory provides a cached operating environment for the operating system and computer programs in the non-volatile storage medium.
  • the network interface may be an Ethernet card or a wireless network card, and is used to communicate with external electronic devices.
  • the electronic device may be a mobile phone, a tablet computer, a personal digital assistant, a wearable device, or the like.
  • an image processing method is provided.
  • the method is applied to the electronic device in FIG. 1 as an example, and includes:
  • Operation 220 Acquire an image to be detected.
  • the user uses an electronic device (with a photographing function) to take a picture and obtain an image to be detected.
  • the image to be detected may be a photo preview screen, or a photo saved to an electronic device after the photo is taken.
  • the image to be detected refers to an image requiring scene recognition; it may contain only a single scene element or multiple scene elements (two or more).
  • In general, the scene elements in an image include landscape, beach, blue sky, green grass, snow, night scene, dark, backlight, sunrise/sunset, fireworks, spotlight, indoor, macro, text document, portrait, baby, cat, dog, food, and so on.
  • This list is not exhaustive; many other categories of scene elements exist.
  • Operation 240 Perform scene recognition according to the multi-label classification model to obtain tags corresponding to the image to be detected, and the multi-label classification model is obtained from a multi-label image including multiple scene elements.
  • scene recognition is performed on the image to be detected.
  • a pre-trained multi-label classification model is used to perform scene recognition on the image to obtain tags corresponding to the scene included in the image.
  • the multi-label classification model is obtained based on a multi-label image including multiple scene elements. That is, the multi-label classification model is a scene recognition model obtained after scene recognition training using an image containing multiple scene elements. After the multi-label classification model performs scene recognition on the images to be detected, labels corresponding to the scenes contained in the images to be detected are obtained.
  • For example, after performing scene recognition through the multi-label classification model on a to-be-detected image that simultaneously contains scene elements such as beach, blue sky, and portrait, the labels of the image can be directly output as beach, blue sky, and portrait.
  • the beach, blue sky, and portrait are labels corresponding to the scene in the image to be detected.
  • Operation 260: Output the labels corresponding to the image to be detected as the result of scene recognition.
  • After the labels corresponding to the scenes contained in the to-be-detected image are obtained, these labels constitute the result of scene recognition, which is then output.
  • In the embodiments of this application, an image requiring scene recognition is acquired, and scene recognition is performed on the image to be detected according to a multi-label classification model to obtain the labels corresponding to the image.
  • The multi-label classification model is obtained from multi-label images containing multiple scene elements.
  • The labels corresponding to the image to be detected are output as the result of scene recognition. Because the multi-label classification model is a scene recognition model obtained from multi-label images containing multiple scene elements, after performing scene recognition on images containing different scene elements, the labels corresponding to the multiple scenes in the image can be output directly and relatively accurately. This improves the accuracy of scene recognition for images containing different scene elements and also improves the efficiency of scene recognition.
  • In one embodiment, before acquiring the image to be detected, the method includes:
  • Operation 320 Obtain a multi-label image including multiple scene elements.
  • An image containing multiple scene elements is called a multi-label image in this embodiment because, after scene recognition is performed on an image containing multiple scenes, each scene corresponds to one label and all of the labels together form the label set of the image, hence a multi-label image.
  • Operation 340 Train a multi-label classification model using a multi-label image including multiple scene elements.
  • Some multi-label image samples are obtained, and scene recognition may be performed on them manually in advance to obtain the label corresponding to each sample, called a standard label. The images in these multi-label image samples are then used one by one for scene recognition training until the error between the predicted scene recognition results and the standard labels becomes smaller and smaller. The model obtained after this training is the multi-label classification model, capable of performing scene recognition on multi-label images.
  • Because the multi-label classification model is a scene recognition model trained using multi-label images containing multiple scene elements,
  • images containing different scene elements can, after scene recognition, directly and relatively accurately yield the labels corresponding to the multiple scenes in the image.
  • This improves the accuracy of multi-label image recognition and also improves its efficiency.
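  • The training procedure just described can be sketched in a few lines. The following is a minimal illustration only, assuming a PyTorch-style setup; the model, the data loader, and the number of scene labels are hypothetical placeholders rather than anything specified by the patent:

```python
# Minimal sketch of multi-label scene-recognition training (assumed
# PyTorch setup; `model` and `loader` are illustrative placeholders).
import torch
import torch.nn as nn

NUM_SCENES = 19  # e.g. landscape, beach, blue sky, ..., food

def train_multilabel(model, loader, epochs=10, lr=1e-3):
    # One independent binary decision per scene label, so binary
    # cross-entropy over sigmoid outputs rather than a softmax.
    criterion = nn.BCEWithLogitsLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for images, targets in loader:   # targets: (B, NUM_SCENES) multi-hot
            optimizer.zero_grad()
            logits = model(images)       # (B, NUM_SCENES)
            loss = criterion(logits, targets.float())
            loss.backward()              # shrink the error w.r.t. the standard labels
            optimizer.step()
    return model
```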
  • the multi-label classification model is constructed based on a neural network model.
  • The specific training method of the multi-label classification model is as follows: a training image containing a background training target and a foreground training target is input into a neural network, and a first loss function reflecting the difference between the first prediction confidence and the first true confidence of each pixel in the background region of the training image is obtained,
  • along with a second loss function reflecting the difference between the second prediction confidence and the second true confidence of each pixel in the foreground region of the training image.
  • The first prediction confidence is the confidence, predicted by the neural network, that a pixel in the background region belongs to the background training target;
  • the first true confidence represents the confidence that a pixel pre-labeled in the training image belongs to the background training target.
  • The second prediction confidence is the confidence, predicted by the neural network, that a pixel in the foreground region belongs to the foreground training target; the second true confidence represents the confidence that a pixel pre-labeled in the training image belongs to the foreground training target.
  • The first loss function and the second loss function are weighted and summed to obtain a target loss function; the parameters of the neural network are adjusted according to the target loss function, and the neural network is trained to finally obtain the multi-label classification model.
  • The background training target of the training image has corresponding labels, and the foreground training target also has labels.
  • FIG. 3B is a schematic structural diagram of a neural network model in an embodiment.
  • The input layer of the neural network receives training images with image category labels, performs feature extraction through a base network (such as a CNN), and outputs the extracted image features to a feature layer.
  • The feature layer performs category detection on the background training target to obtain the first loss function,
  • performs category detection on the foreground training target based on the image features to obtain the second loss function,
  • and performs position detection on the foreground training target based on the foreground region to obtain a position loss function.
  • The first loss function, the second loss function, and the position loss function are weighted and summed to obtain the target loss function.
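  • As a rough sketch of how such a weighted target loss might be assembled (the concrete loss functions and weights below are assumptions made for illustration; the patent does not fix them):

```python
# Illustrative weighted target loss combining the two category losses
# and the position loss described above. Loss forms and weights are
# assumptions, not values from the patent.
import torch.nn as nn

bg_criterion = nn.BCEWithLogitsLoss()  # first loss: background category
fg_criterion = nn.BCEWithLogitsLoss()  # second loss: foreground category
pos_criterion = nn.SmoothL1Loss()      # position loss: foreground location

def target_loss(bg_logits, bg_true, fg_logits, fg_true,
                pos_pred, pos_true, w_bg=1.0, w_fg=1.0, w_pos=0.5):
    first_loss = bg_criterion(bg_logits, bg_true)
    second_loss = fg_criterion(fg_logits, fg_true)
    position_loss = pos_criterion(pos_pred, pos_true)
    # Weighted sum gives the target loss used to adjust the network.
    return w_bg * first_loss + w_fg * second_loss + w_pos * position_loss
```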
  • the neural network may be a convolutional neural network.
  • Convolutional neural networks include a data input layer, a convolutional calculation layer, an activation layer, a pooling layer, and a fully connected layer.
  • the data input layer is used to pre-process the original image data.
  • the pre-processing may include de-averaging, normalization, dimensionality reduction, and whitening processes.
  • De-averaging refers to centering each dimension of the input data at 0; the purpose is to pull the center of the samples back to the origin of the coordinate system.
  • Normalization is normalizing the amplitude to the same range.
  • Whitening refers to normalizing the amplitude on each characteristic axis of the data.
  • The convolution calculation layer is used for local association and window sliding. The weights of each filter connected to the data window in the convolution calculation layer are fixed.
  • Each filter focuses on one image feature, such as vertical edges, horizontal edges, color, or texture, and combining these filters yields the set of feature extractors for the entire image.
  • a filter is a weight matrix.
  • a weight matrix can be used to convolve with data in different windows.
  • the activation layer is used to non-linearly map the output of the convolution layer.
  • the activation function used by the activation layer may be ReLU (The Rectified Linear Unit).
  • the pooling layer can be sandwiched between consecutive convolutional layers to compress the amount of data and parameters and reduce overfitting.
  • the pooling layer can use the maximum method or average method to reduce the dimensionality of the data.
  • The fully connected layer is located at the tail of the convolutional neural network, and all neurons between the two layers are connected by weights.
  • Some convolution layers of the convolutional neural network are cascaded to a first confidence output node, some to a second confidence output node, and some to a position output node.
  • The background classification of the image can be detected according to the first confidence output node,
  • the category of the foreground target of the image can be detected according to the second confidence output node, and the position corresponding to the foreground target can be detected according to the position output node.
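  • A compact sketch of such a multi-head network is given below, assuming PyTorch; the base network, layer sizes, and head shapes are illustrative, not taken from the patent:

```python
# Illustrative shared base CNN with three output heads, mirroring the
# structure in FIG. 3B. All layer sizes are assumptions.
import torch.nn as nn

class SceneNet(nn.Module):
    def __init__(self, num_bg_classes, num_fg_classes):
        super().__init__()
        self.base = nn.Sequential(       # data input -> conv -> ReLU -> pooling
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.bg_head = nn.Linear(64, num_bg_classes)  # first confidence output node
        self.fg_head = nn.Linear(64, num_fg_classes)  # second confidence output node
        self.pos_head = nn.Linear(64, 4)              # position output node (x, y, w, h)

    def forward(self, x):
        feat = self.base(x)
        return self.bg_head(feat), self.fg_head(feat), self.pos_head(feat)
```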
  • Artificial neural networks (ANNs) are also referred to as neural networks (NNs) or connection models. They abstract the neuron network of the human brain from the perspective of information processing, establish a simple model, and form different networks according to different connection methods. In engineering and academia they are often simply called neural networks or neural-like networks. An artificial neural network can be understood as a mathematical model that processes information using a structure similar to the synaptic connections of the brain.
  • Neural networks are often used for classification, for example, the classification of spam, the classification of cats and dogs in images, and so on.
  • This kind of machine that can automatically classify the input variables is called a classifier.
  • the input to the classifier is a numeric vector called a feature (vector).
  • Before a classifier is used, it needs to be trained; that is, the neural network needs to be trained first.
  • The training of artificial neural networks relies on the back-propagation algorithm. A feature vector is first input at the input layer, and an output is obtained through network computation. When the output layer finds that the output is inconsistent with the correct class, it has the last layer of neurons adjust their parameters, and those neurons in turn order the second-to-last layer of neurons connected to them to adjust theirs, so the adjustment propagates backward layer by layer. The adjusted network then continues to be tested on the samples; if the output is still wrong, the next round of backward adjustment proceeds, until the output of the neural network is as consistent as possible with the correct result.
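  • As a toy illustration of this backward-adjustment loop, here is a single-layer classifier trained by gradient descent (purely illustrative and far simpler than the multi-layer case the text describes):

```python
# Toy back-propagation-style training: nudge the weights whenever the
# output disagrees with the correct label. Illustrative only.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_classifier(X, y, lr=0.1, epochs=200):
    # X: (n_samples, n_features) feature vectors; y: (n_samples,) 0/1 labels
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)        # forward pass: the network's output
        err = p - y                   # mismatch with the correct class
        w -= lr * X.T @ err / len(y)  # adjust parameters against the gradient
        b -= lr * err.mean()
    return w, b
```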
  • the neural network model includes an input layer, a hidden layer, and an output layer.
  • Feature vectors are extracted from multi-label images containing multiple scene elements and input into the hidden layer to compute the loss function; the parameters of the neural network model are then adjusted according to the loss function so that it keeps converging, and the neural network model is thereby trained into the multi-label classification model.
  • the multi-label classification model can implement scene recognition on the input image to obtain tags for each scene included in the image, and output these tags as the result of scene recognition.
  • The target loss function is obtained by a weighted sum of the first loss function corresponding to the background training target and the second loss function corresponding to the foreground training target, and the parameters of the neural network are adjusted according to the target loss function, so that the trained multi-label classification model can subsequently identify both the background category and the labels of foreground targets, obtaining more information and improving recognition efficiency.
  • In one embodiment, operation 240, performing scene recognition on the image to be detected according to the multi-label classification model to obtain the labels corresponding to the image, includes:
  • Operation 242: Perform scene recognition according to the multi-label classification model to obtain initial labels of the image to be detected and the confidences corresponding to the initial labels;
  • Operation 244: Determine whether the confidence of each initial label is greater than a preset threshold;
  • Operation 246: When the determination result is yes, use the initial labels whose confidence is greater than the preset threshold as the labels corresponding to the image to be detected.
  • Even with the multi-label classification model obtained through training, the output during actual scene recognition may still contain some error, which needs to be further reduced. In general, when the trained multi-label classification model is used to perform scene recognition on a to-be-detected image containing multiple scene elements, multiple initial labels of the image and the confidences corresponding to them are obtained. For example, for an image containing a beach, blue sky, and a portrait:
  • the confidence that the initial label of the image to be detected is beach is 0.6
  • the confidence that the initial label of the image to be detected is blue sky is 0.7
  • the confidence that the initial label of the image to be detected is a portrait is 0.8
  • the confidence that the initial label of the image to be detected is a dog is 0.4
  • the confidence that the initial label of the image to be detected is snow is 0.3.
  • The initial labels in the recognition result are then filtered; specifically, it is determined whether the confidence of each initial label is greater than a preset threshold.
  • The preset threshold may be a confidence threshold derived during the earlier training of the multi-label classification model, based on a large number of training samples, at the point where the loss function is relatively small and the predicted results are close to the actual results. For example, if the confidence threshold derived from a large number of training samples is 0.5, then in the above example it is determined whether the confidence of each initial label is greater than the preset threshold, and initial labels with confidence greater than the preset threshold are used as the labels corresponding to the image.
  • The resulting labels of the image to be detected are beach, blue sky, and portrait; the two interference terms with confidence below the threshold, dog and snow, are discarded.
  • In this embodiment, scene recognition is performed on the image to be detected according to the multi-label classification model to obtain the initial labels of the image and their corresponding confidences. Because the initial labels obtained from scene recognition are not necessarily the true labels of the image, the confidence of each initial label is used to filter them, and the initial labels above the confidence threshold are selected as the scene recognition result for the image. This improves the accuracy of the scene recognition result to a certain extent.
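  • Operations 242 through 246 can be sketched as follows, assuming a trained model that outputs one logit per scene label; the label list and the 0.5 threshold follow the example above:

```python
# Sketch of initial-label recognition plus confidence filtering.
# `model` and SCENE_LABELS are illustrative placeholders.
import torch

SCENE_LABELS = ["beach", "blue sky", "portrait", "dog", "snow"]

def recognize_scenes(model, image, threshold=0.5):
    with torch.no_grad():
        logits = model(image.unsqueeze(0))[0]
        conf = torch.sigmoid(logits)  # each label scored independently in [0, 1]
    # Keep only initial labels whose confidence exceeds the preset threshold.
    return [(SCENE_LABELS[i], float(c)) for i, c in enumerate(conf) if c > threshold]

# With confidences (0.6, 0.7, 0.8, 0.4, 0.3) this returns
# [("beach", 0.6), ("blue sky", 0.7), ("portrait", 0.8)];
# the interference terms "dog" and "snow" are discarded.
```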
  • the range of confidence corresponding to each initial label is [0,1].
  • Because the multi-label classification model is a scene recognition model trained on multi-label images containing multiple scene elements, after scene recognition is performed on a to-be-detected image containing different scene elements, the labels corresponding to the multiple scenes in the image can be output directly and relatively accurately.
  • The recognition process for each label is independent, so the probability of each recognized label can lie anywhere in [0, 1].
  • the recognition processes of different tags do not affect each other, so it is possible to comprehensively identify all the scenes included in the image to be detected and avoid omissions.
  • In one embodiment, after outputting the labels corresponding to the image to be detected as the result of scene recognition, the method includes:
  • Operation 520: Obtain position information from when the image to be detected was captured;
  • Operation 540: Correct the result of scene recognition according to the position information to obtain the corrected final result of scene recognition.
  • Generally, the electronic device records the location where each picture is taken, usually using GPS (Global Positioning System) to record address information. The address information recorded by the electronic device is obtained, and the position information of the image to be detected is acquired from it. Corresponding scene categories and the weights of those categories are matched to different address information in advance; specifically, this may be done according to the results of statistical analysis of a large number of image materials.
  • For example, such analysis may show that when the address information reads "XXX grassland", the scene "green grass" has a weight of 9, "snow" a weight of 7, "landscape" a weight of 4, "blue sky" a weight of 6, and "beach" a weight of -8, with weights ranging over [-10, 10]; the larger the weight, the higher the probability that the scene appears in the image.
  • The result of scene recognition can then be corrected according to the address information at the time the image was shot and the probability of the scenes corresponding to that address information, to obtain the corrected final result of scene recognition.
  • If the address information of the picture is "XXX grassland", the scenes corresponding to it, such as "green grass", "snow", and "blue sky", have higher weights and therefore a higher probability of appearing. The result of scene recognition is corrected accordingly: if "green grass", "snow", or "blue sky" appears in the result, it can be taken as the final result. If the scene "beach" appears in the result, it should be filtered out according to the address information from when the image was shot, to avoid obtaining an incorrect, implausible scene category.
  • position information at the time of shooting an image to be detected is acquired, and a result of scene recognition is corrected according to the position information to obtain a final result of scene recognition after correction.
  • The scene categories of the image to be detected, obtained from its shooting address information, can thus be used to calibrate the result of scene recognition, ultimately improving the accuracy of scene detection.
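  • A minimal sketch of this correction step is shown below; the weight table mirrors the "XXX grassland" example above, and the filtering rule (drop any scene with a negative weight) is an assumption made for illustration:

```python
# Illustrative location-based correction of the scene recognition result.
# Weights follow the "XXX grassland" example, on the scale [-10, 10].
LOCATION_SCENE_WEIGHTS = {
    "XXX grassland": {"green grass": 9, "snow": 7, "landscape": 4,
                      "blue sky": 6, "beach": -8},
}

def correct_by_location(labels, address):
    weights = LOCATION_SCENE_WEIGHTS.get(address)
    if weights is None:
        return labels  # no statistics for this address; keep the result as-is
    # Filter out scenes that are implausible at this address.
    return [label for label in labels if weights.get(label, 0) >= 0]

# correct_by_location(["green grass", "blue sky", "beach"], "XXX grassland")
# -> ["green grass", "blue sky"]  (the incompatible "beach" is removed)
```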
  • In one embodiment, after outputting the labels corresponding to the image to be detected as the result of scene recognition, the method further includes:
  • performing, on the image to be detected, image processing corresponding to the result of scene recognition.
  • After scene recognition is performed on the image to be detected through the multi-label classification model, the labels corresponding to the image are obtained and output as the result of scene recognition.
  • The result of scene recognition can be used as the basis for image post-processing, and targeted image processing can be performed according to that result, greatly improving image quality. For example, if the scene category of the image is recognized as a night scene, the image can be processed in a manner suitable for night scenes, such as increasing brightness. If the scene category is recognized as backlit, the image can be processed in a manner suitable for backlighting.
  • If the image is recognized as multi-label, for example containing beach, green grass, and blue sky, the beach region can be processed in a manner suited to beaches and the green grass region in a manner suited to green grass,
  • while the blue sky region is processed in a manner suited to blue sky, so that the entire image looks good.
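  • This scene-targeted post-processing can be sketched as a simple dispatch; the enhancement functions below are empty placeholders standing in for real adjustments (for instance, raising brightness for night scenes), not an actual implementation:

```python
# Illustrative dispatch from recognized scene labels to enhancements.
def enhance_night(image):      # placeholder: e.g. increase brightness
    return image

def enhance_backlight(image):  # placeholder: e.g. lift shadows
    return image

ENHANCERS = {"night scene": enhance_night, "backlight": enhance_backlight}

def post_process(image, labels):
    for label in labels:
        enhancer = ENHANCERS.get(label)
        if enhancer is not None:
            image = enhancer(image)  # apply the processing suited to this scene
    return image
```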
  • an image processing method is provided.
  • the method is applied to the electronic device in FIG. 1 as an example, and includes:
  • Operation 1: Obtain multi-label images containing multiple scene elements, and use them to train a neural network model to obtain a multi-label classification model; that is, the multi-label classification model is based on a neural network architecture.
  • Operation 2: Perform scene recognition on the image to be detected according to the multi-label classification model to obtain the initial labels of the image and the confidences corresponding to them.
  • Operation 3: Determine whether the confidence of each initial label is greater than a preset threshold; when the determination result is yes, use the initial labels whose confidence is greater than the preset threshold as the labels corresponding to the image to be detected, and output those labels as the result of scene recognition.
  • Operation 4: Obtain position information from when the image to be detected was captured, and correct the result of scene recognition according to the position information to obtain the corrected final result of scene recognition.
  • Operation 5: According to the result of scene recognition, perform image processing corresponding to that result on the image to be detected to obtain a processed image.
  • Because the multi-label classification model is a scene recognition model obtained from multi-label images containing multiple scene elements, after scene recognition is performed on to-be-detected images containing different scene elements, the labels corresponding to the multiple scenes in the image can be output directly and relatively accurately. This improves the accuracy of scene recognition on images containing different scene elements and also improves the efficiency of scene recognition.
  • the result of scene recognition is corrected according to the position information when the image to be detected is captured, to obtain the final result of scene recognition after correction.
  • the scene classification of the to-be-detected image obtained by using the shooting address information of the to-be-detected image can be implemented to calibrate the result of scene recognition, thereby ultimately improving the accuracy of scene detection.
  • the result of scene recognition can be used as the basis for image post-processing, and the image can be targeted for image processing according to the result of scene recognition, thereby greatly improving the quality of the image.
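  • Tying operations 1 through 5 together, a hypothetical end-to-end flow, reusing the illustrative helpers sketched earlier, might look like this:

```python
# Hypothetical composition of the whole pipeline. `recognize_scenes`,
# `correct_by_location`, and `post_process` are the sketches above,
# and a trained `model` (operation 1) is assumed.
def process_image(model, image, address):
    labels = [name for name, _ in recognize_scenes(model, image)]  # operations 2-3
    labels = correct_by_location(labels, address)                  # operation 4
    return post_process(image, labels), labels                     # operation 5
```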
  • In one embodiment, an image processing apparatus 600 is provided, including an image acquisition module 610, a scene recognition module 620, and an output module 630, wherein:
  • An image acquisition module 610 configured to acquire an image to be detected
  • a scene recognition module 620 is configured to perform scene recognition according to a multi-label classification model to obtain a label corresponding to the image to be detected, and the multi-label classification model is obtained from a multi-label image including multiple scene elements;
  • An output module 630 is configured to output a label corresponding to the image to be detected as a result of scene recognition.
  • an image processing apparatus 600 is provided, and the apparatus further includes:
  • a multi-label image acquisition module 640 configured to acquire a multi-label image including multiple scene elements
  • a multi-label classification model training module 650 is configured to train a multi-label classification model using a multi-label image including multiple scene elements.
  • the scene recognition module 620 includes:
  • An initial label acquisition module 622 is configured to perform scene recognition based on a multi-label classification model to obtain an initial label of the image to be detected and a confidence level corresponding to the initial label;
  • a determining module 624 configured to determine whether the confidence level of the initial label is greater than a preset threshold
  • the image label generation module 626 is configured to, when the determination result is yes, use an initial label with a confidence level greater than a preset threshold as a label corresponding to the image to be detected.
  • an image processing device 600 is provided, which is further configured to obtain position information when an image to be detected is taken; and correct the scene recognition result according to the position information to obtain a final scene recognition result after the correction.
  • an image processing device 600 is provided, and further configured to perform image processing corresponding to a scene recognition result on an image to be detected according to a result of scene recognition.
  • The division of the modules in the above image processing apparatus is for illustration only. In other embodiments, the image processing apparatus may be divided into different modules as needed to complete all or part of its functions.
  • Each module in the image processing apparatus may be implemented in whole or in part by software, hardware, and a combination thereof.
  • the network interface may be an Ethernet card or a wireless network card.
  • The above modules may be embedded in, or independent of, the processor in the server in the form of hardware, or stored in the memory of the server in the form of software, so that the processor can invoke and perform the operations corresponding to the above modules.
  • a computer-readable storage medium on which a computer program is stored.
  • the computer program is executed by a processor, the operations of the image processing methods provided by the foregoing embodiments are implemented.
  • An electronic device is provided, including a memory, a processor, and a computer program stored on the memory and executable on the processor.
  • When the processor executes the computer program, the operations of the image processing method provided by the foregoing embodiments are performed.
  • An embodiment of the present application further provides a computer program product, which when executed on a computer, causes the computer to perform operations of the image processing methods provided by the foregoing embodiments.
  • An embodiment of the present application further provides an electronic device.
  • the above electronic device includes an image processing circuit.
  • the image processing circuit may be implemented by hardware and/or software components, and may include various processing units that define an ISP (Image Signal Processing) pipeline.
  • FIG. 9 is a schematic diagram of an image processing circuit in one embodiment. As shown in FIG. 9, for ease of description, only aspects of the image processing technology related to the embodiments of the present application are shown.
  • the image processing circuit includes an ISP processor 940 and a control logic 950.
  • The image data captured by the imaging device 910 is first processed by the ISP processor 940, which analyzes the image data to capture image statistics that can be used to determine one or more control parameters of the imaging device 910.
  • the imaging device 910 may include a camera having one or more lenses 912 and an image sensor 914.
  • The image sensor 914 may include a color filter array (such as a Bayer filter). The image sensor 914 may obtain the light intensity and wavelength information captured by each of its imaging pixels and provide a set of raw image data that can be processed by the ISP processor 940.
  • A sensor 920 (such as a gyroscope) may provide acquired image processing parameters (such as image stabilization parameters) to the ISP processor 940 based on the sensor 920 interface type.
  • the sensor 920 interface may use a SMIA (Standard Mobile Imaging Architecture) interface, other serial or parallel camera interfaces, or a combination of the foregoing interfaces.
  • the image sensor 914 may also send the original image data to the sensor 920, and the sensor 920 may provide the original image data to the ISP processor 940 based on the interface type of the sensor 920, or the sensor 920 stores the original image data in the image memory 930.
  • the ISP processor 940 processes the original image data pixel by pixel in a variety of formats.
  • each image pixel may have a bit depth of 8, 10, 12, or 14 bits, and the ISP processor 940 may perform one or more image processing operations on the original image data and collect statistical information about the image data.
  • the image processing operations may be performed with the same or different bit depth accuracy.
  • the ISP processor 940 may also receive image data from the image memory 930.
  • the sensor 920 interface sends the original image data to the image memory 930, and the original image data in the image memory 930 is then provided to the ISP processor 940 for processing.
  • the image memory 930 may be a part of a memory device, a storage device, or a separate dedicated memory in an electronic device, and may include a DMA (Direct Memory Access) feature.
  • When receiving raw image data from the image sensor 914 interface, from the sensor 920 interface, or from the image memory 930, the ISP processor 940 may perform one or more image processing operations, such as temporal filtering.
  • the processed image data may be sent to the image memory 930 for further processing before being displayed.
  • The ISP processor 940 receives the data to be processed from the image memory 930 and processes the image data in the raw domain and in the RGB and YCbCr color spaces.
  • the image data processed by the ISP processor 940 may be output to the display 970 for viewing by the user and/or further processed by a graphics engine or a GPU (Graphics Processing Unit).
  • the output of the ISP processor 940 can also be sent to the image memory 930, and the display 970 can read image data from the image memory 930.
  • the image memory 930 may be configured to implement one or more frame buffers.
  • the output of the ISP processor 940 may be sent to an encoder / decoder 960 to encode / decode image data.
  • The encoded image data can be saved and decompressed before being displayed on the display 970.
  • the encoder / decoder 960 may be implemented by a CPU or a GPU or a coprocessor.
  • the statistical data determined by the ISP processor 940 may be sent to the control logic 950 unit.
  • the statistical data may include image information of the image sensor 914 such as auto exposure, auto white balance, auto focus, flicker detection, black level compensation, and lens 912 shading correction.
  • The control logic 950 may include a processor and/or microcontroller that executes one or more routines (such as firmware), which may determine the control parameters of the imaging device 910 and the control parameters of the ISP processor 940 according to the received statistical data.
  • Control parameters of the imaging device 910 may include sensor 920 control parameters (such as gain, integration time for exposure control, and image stabilization parameters), camera flash control parameters, lens 912 control parameters (such as focal length for focusing or zooming), or a combination of these parameters.
  • ISP control parameters may include gain levels and color correction matrices for automatic white balance and color adjustment (eg, during RGB processing), and lens 912 shading correction parameters.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM), which is used as external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
  • the program can be stored in a non-volatile computer-readable storage medium.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

This application relates to an image processing method and apparatus, an electronic device, and a computer-readable storage medium. An image to be detected is acquired, and scene recognition is performed on the image to be detected according to a multi-label classification model to obtain labels corresponding to the image to be detected, the multi-label classification model being obtained from multi-label images containing multiple scene elements. The labels corresponding to the image to be detected are output as the result of scene recognition.

Description

Image processing method and apparatus, storage medium, and electronic device
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese Patent Application No. 201810585679.3, filed with the Chinese Patent Office on June 8, 2018 and entitled "Image processing method and apparatus, storage medium, and electronic device", the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
This application relates to the field of computer technology, and in particular to an image processing method and apparatus, a storage medium, and an electronic device.
BACKGROUND
With the popularity of mobile terminals and the rapid development of the mobile Internet, mobile terminals are used by more and more users. Photographing has become one of the functions commonly used by mobile terminal users. During or after photographing, a mobile terminal may perform scene recognition on the image to provide the user with an intelligent experience.
SUMMARY
Embodiments of this application provide an image processing method and apparatus, a storage medium, and an electronic device, which can improve the accuracy of scene recognition on images.
An image processing method includes:
acquiring an image to be detected;
performing scene recognition on the image to be detected according to a multi-label classification model to obtain labels corresponding to the image to be detected, the multi-label classification model being obtained from multi-label images containing multiple scene elements; and
outputting the labels corresponding to the image to be detected as the result of scene recognition.
An image processing apparatus includes:
an image acquisition module configured to acquire an image to be detected;
a scene recognition module configured to perform scene recognition on the image to be detected according to a multi-label classification model to obtain labels corresponding to the image to be detected, the multi-label classification model being obtained from multi-label images containing multiple scene elements; and
an output module configured to output the labels corresponding to the image to be detected as the result of scene recognition.
A computer-readable storage medium has a computer program stored thereon which, when executed by a processor, implements the operations of the image processing method described above.
An electronic device includes a memory, a processor, and a computer program stored on the memory and executable on the processor; when executing the computer program, the processor performs the operations of the image processing method described above.
With the above scene recognition method and apparatus, storage medium, and electronic device, an image to be detected is acquired, scene recognition is performed on it according to a multi-label classification model, and labels corresponding to the image to be detected are obtained, the multi-label classification model being obtained from multi-label images containing multiple scene elements. The labels corresponding to the image to be detected are output as the result of scene recognition.
BRIEF DESCRIPTION OF THE DRAWINGS
To describe the technical solutions in the embodiments of this application or the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are merely some embodiments of this application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a diagram of the internal structure of an electronic device in an embodiment;
FIG. 2 is a flowchart of an image processing method in an embodiment;
FIG. 3A is a flowchart of an image processing method in another embodiment;
FIG. 3B is a schematic diagram of the architecture of a neural network in an embodiment;
FIG. 4 is a flowchart of the method in FIG. 2 for performing scene recognition on an image according to the multi-label classification model to obtain labels corresponding to the image;
FIG. 5 is a flowchart of an image processing method in still another embodiment;
FIG. 6 is a schematic structural diagram of an image processing apparatus in an embodiment;
FIG. 7 is a schematic structural diagram of an image processing apparatus in another embodiment;
FIG. 8 is a schematic structural diagram of the scene recognition module in FIG. 6;
FIG. 9 is a block diagram of part of the structure of a mobile phone related to the electronic device provided in an embodiment.
DETAILED DESCRIPTION
To make the objectives, technical solutions, and advantages of this application clearer, this application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are merely intended to explain this application and are not intended to limit it.
FIG. 1 is a schematic diagram of the internal structure of an electronic device in an embodiment. As shown in FIG. 1, the electronic device includes a processor, a memory, and a network interface connected through a system bus. The processor is used to provide computing and control capabilities to support the operation of the entire electronic device. The memory is used to store data, programs, and the like; at least one computer program is stored on the memory and can be executed by the processor to implement the image processing method applicable to the electronic device provided in the embodiments of this application. The memory may include a non-volatile storage medium such as a magnetic disk, an optical disc, or a read-only memory (ROM), or a random-access memory (RAM). For example, in an embodiment, the memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The computer program can be executed by the processor to implement the image processing method provided by each of the following embodiments. The internal memory provides a cached operating environment for the operating system and computer programs in the non-volatile storage medium. The network interface may be an Ethernet card, a wireless network card, or the like, and is used to communicate with external electronic devices. The electronic device may be a mobile phone, a tablet computer, a personal digital assistant, a wearable device, or the like.
In an embodiment, as shown in FIG. 2, an image processing method is provided. Taking the application of the method to the electronic device in FIG. 1 as an example, the method includes:
Operation 220: Acquire an image to be detected.
The user takes a picture with an electronic device (having a photographing function) to obtain an image to be detected. The image to be detected may be a photo preview screen or a photo saved to the electronic device after shooting. The image to be detected refers to an image requiring scene recognition, and includes both images containing only a single scene element and images containing multiple scene elements (two or more). In general, the scene elements in an image include landscape, beach, blue sky, green grass, snow, night scene, dark, backlight, sunrise/sunset, fireworks, spotlight, indoor, macro, text document, portrait, baby, cat, dog, food, and so on. Of course, this list is not exhaustive; there are many other categories of scene elements.
Operation 240: Perform scene recognition on the image to be detected according to a multi-label classification model to obtain labels corresponding to the image to be detected, the multi-label classification model being obtained from multi-label images containing multiple scene elements.
After the image to be detected is acquired, scene recognition is performed on it. Specifically, a pre-trained multi-label classification model is used to perform scene recognition on the image to obtain the labels corresponding to the scenes contained in it. The multi-label classification model is obtained from multi-label images containing multiple scene elements; that is, it is a scene recognition model obtained after scene recognition training using images containing multiple scene elements. After the multi-label classification model performs scene recognition on the image to be detected, the labels corresponding to the scenes contained in the image are obtained. For example, performing scene recognition through the multi-label classification model on an image that simultaneously contains scene elements such as beach, blue sky, and portrait can directly output the labels of the image as beach, blue sky, and portrait. Beach, blue sky, and portrait are the labels corresponding to the scenes in the image to be detected.
Operation 260: Output the labels corresponding to the image to be detected as the result of scene recognition.
After scene recognition is performed on the image to be detected through the multi-label classification model and the labels corresponding to the scenes contained in the image are obtained, these labels are the result of scene recognition and are output.
In this embodiment of the application, an image requiring scene recognition is acquired, scene recognition is performed on it according to a multi-label classification model, and the labels corresponding to the image are obtained, the multi-label classification model being obtained from multi-label images containing multiple scene elements. The labels corresponding to the image to be detected are output as the result of scene recognition. Because the multi-label classification model is a scene recognition model obtained from multi-label images containing multiple scene elements, after scene recognition is performed on an image containing different scene elements, the labels corresponding to the multiple scenes in the image can be output directly and relatively accurately. This improves the accuracy of scene recognition for images containing different scene elements and also improves the efficiency of scene recognition.
In an embodiment, as shown in FIG. 3A, before acquiring the image to be detected, the method includes:
Operation 320: Obtain multi-label images containing multiple scene elements.
An image containing multiple scene elements is called a multi-label image in this embodiment because, after scene recognition is performed on an image containing multiple scenes, each scene corresponds to one label, and all the labels together constitute the label set of the image, hence a multi-label image.
Operation 340: Train the multi-label classification model using multi-label images containing multiple scene elements.
Some multi-label image samples are obtained, and scene recognition may be performed on them manually in advance to obtain the labels corresponding to each multi-label image sample, called standard labels. The images in the multi-label image samples are then used one by one for scene recognition training until the error between the trained scene recognition results and the standard labels becomes smaller and smaller. The model obtained after this training is the multi-label classification model, capable of performing scene recognition on multi-label images.
In this embodiment of the application, because the multi-label classification model is a scene recognition model trained using multi-label images containing multiple scene elements, after scene recognition is performed on images containing different scene elements, the labels corresponding to the multiple scenes in the image can be output directly and relatively accurately. This improves the accuracy of multi-label image recognition and also improves its efficiency.
In an embodiment, the multi-label classification model is constructed based on a neural network model.
The specific training method of the multi-label classification model is as follows: a training image containing a background training target and a foreground training target is input into a neural network to obtain a first loss function reflecting the difference between the first prediction confidence and the first true confidence of each pixel in the background region of the training image, and a second loss function reflecting the difference between the second prediction confidence and the second true confidence of each pixel in the foreground region of the training image. The first prediction confidence is the confidence, predicted by the neural network, that a pixel in the background region of the training image belongs to the background training target, and the first true confidence represents the confidence that a pixel pre-labeled in the training image belongs to the background training target. The second prediction confidence is the confidence, predicted by the neural network, that a pixel in the foreground region of the training image belongs to the foreground training target, and the second true confidence represents the confidence that a pixel pre-labeled in the training image belongs to the foreground training target.
The first loss function and the second loss function are weighted and summed to obtain a target loss function.
The parameters of the neural network are adjusted according to the target loss function, and the neural network is trained to finally obtain the multi-label classification model. The background training target of the training image has corresponding labels, and the foreground training target also has labels.
FIG. 3B is a schematic diagram of the architecture of a neural network model in an embodiment. As shown in FIG. 3B, the input layer of the neural network receives training images with image category labels, performs feature extraction through a base network (such as a CNN), and outputs the extracted image features to a feature layer. The feature layer performs category detection on the background training target to obtain the first loss function, performs category detection on the foreground training target based on the image features to obtain the second loss function, and performs position detection on the foreground training target based on the foreground region to obtain a position loss function; the first loss function, the second loss function, and the position loss function are weighted and summed to obtain the target loss function. The neural network may be a convolutional neural network. A convolutional neural network includes a data input layer, convolution calculation layers, activation layers, pooling layers, and a fully connected layer. The data input layer is used to pre-process the raw image data; the pre-processing may include de-averaging, normalization, dimensionality reduction, and whitening. De-averaging means centering each dimension of the input data at 0, with the aim of pulling the center of the samples back to the origin of the coordinate system. Normalization means normalizing the amplitudes to the same range. Whitening means normalizing the amplitude on each feature axis of the data. The convolution calculation layer is used for local association and window sliding. The weights of each filter connected to the data window in the convolution calculation layer are fixed; each filter focuses on one image feature, such as vertical edges, horizontal edges, color, or texture, and combining these filters yields the set of feature extractors for the entire image. A filter is a weight matrix, which can be convolved with the data in different windows. The activation layer non-linearly maps the output of the convolution layer; the activation function used may be ReLU (Rectified Linear Unit). Pooling layers can be sandwiched between consecutive convolution layers to compress the amount of data and parameters and reduce overfitting; a pooling layer may use the max or average method to reduce the dimensionality of the data. The fully connected layer is located at the tail of the convolutional neural network, and all neurons between the two layers are connected by weights. Some convolution layers of the convolutional neural network are cascaded to a first confidence output node, some to a second confidence output node, and some to a position output node; the background classification of the image can be detected from the first confidence output node, the category of the foreground target from the second confidence output node, and the position corresponding to the foreground target from the position output node.
Specifically, artificial neural networks (ANNs), also called neural networks (NNs) or connection models, abstract the neuron network of the human brain from the perspective of information processing, establish a simple model, and form different networks according to different connection methods. In engineering and academia they are often simply called neural networks or neural-like networks. An artificial neural network can be understood as a mathematical model that processes information using a structure similar to the synaptic connections of the brain.
Neural networks are often used for classification, for example, recognizing and classifying spam or classifying cats and dogs in images. A machine that can automatically classify input variables is called a classifier. The input to a classifier is a numerical vector called a feature (vector). Before using a classifier, it needs to be trained; that is, the neural network needs to be trained first.
The training of artificial neural networks relies on the back-propagation algorithm. A feature vector is first input at the input layer, and an output is obtained through network computation. When the output layer finds that the output is inconsistent with the correct class, it has the last layer of neurons adjust their parameters, and those neurons in turn order the second-to-last layer of neurons connected to them to adjust theirs, so the adjustment propagates backward layer by layer. The adjusted network continues to be tested on the samples; if the output is still wrong, the next round of backward adjustment proceeds, until the output of the neural network is as consistent as possible with the correct result.
In this embodiment of the application, the neural network model includes an input layer, a hidden layer, and an output layer. Feature vectors are extracted from multi-label images containing multiple scene elements and input into the hidden layer to compute the loss function, and the parameters of the neural network model are then adjusted according to the loss function so that it keeps converging, thereby training the neural network model into the multi-label classification model. The multi-label classification model can perform scene recognition on an input image, obtain the labels of each scene contained in the image, and output these labels as the result of scene recognition. The target loss function is obtained by a weighted sum of the first loss function corresponding to the background training target and the second loss function corresponding to the foreground training target, and the parameters of the neural network are adjusted according to the target loss function, so that the trained multi-label classification model can subsequently identify both the background category and the labels of the foreground target, obtaining more information and improving recognition efficiency.
In an embodiment, as shown in FIG. 4, operation 240, performing scene recognition on the image to be detected according to the multi-label classification model to obtain the labels corresponding to the image to be detected, includes:
Operation 242: Perform scene recognition on the image to be detected according to the multi-label classification model to obtain initial labels of the image to be detected and the confidences corresponding to the initial labels;
Operation 244: Determine whether the confidence of each initial label is greater than a preset threshold;
Operation 246: When the determination result is yes, use the initial labels whose confidence is greater than the preset threshold as the labels corresponding to the image to be detected.
Even with the multi-label classification model obtained through training, the output during actual image scene recognition may still contain some error, so the error needs to be further reduced. In general, if the trained multi-label classification model is used to perform scene recognition on an image to be detected containing multiple scene elements, multiple initial labels of the image and the confidences corresponding to them will be obtained. For example, for an image containing a beach, blue sky, and a portrait, scene recognition may give a confidence of 0.6 that an initial label of the image is beach, 0.7 for blue sky, 0.8 for portrait, 0.4 for dog, and 0.3 for snow.
The initial labels in the recognition result are then filtered; specifically, it is determined whether the confidence of each initial label is greater than a preset threshold. The preset threshold may be a confidence threshold derived when the multi-label classification model was trained earlier, based on a large number of training samples, at the point where the loss function was relatively small and the predicted results were close to the actual results. For example, if the confidence threshold derived from a large number of training samples is 0.5, then in the above example it is determined whether the confidence of each initial label is greater than the preset threshold, and initial labels with confidence greater than the preset threshold are used as the labels corresponding to the image. The resulting labels of the image to be detected are beach, blue sky, and portrait, and the two interference terms with confidence below the threshold, dog and snow, are discarded.
In this embodiment of the application, scene recognition is performed on the image to be detected according to the multi-label classification model to obtain the initial labels of the image and the confidences corresponding to them. Because the initial labels obtained from scene recognition are not necessarily the true labels of the image to be detected, the confidence of each initial label is used to filter them, and the initial labels above the confidence threshold are selected as the scene recognition result for the image. This improves the accuracy of the scene recognition result to a certain extent.
In an embodiment, the confidence corresponding to each initial label ranges over [0, 1].
Specifically, because the multi-label classification model is a scene recognition model trained on multi-label images containing multiple scene elements, after scene recognition is performed on an image to be detected containing different scene elements, the labels corresponding to the multiple scenes in the image can be output directly and relatively accurately. In the multi-label classification model, the recognition process for each label is independent, so the probability of each recognized label can lie anywhere in [0, 1]. In this embodiment of the application, the recognition processes of different labels do not affect each other, so all the scenes contained in the image to be detected can be comprehensively recognized, avoiding omissions.
In an embodiment, as shown in FIG. 5, after outputting the labels corresponding to the image to be detected as the result of scene recognition, the method includes:
Operation 520: Obtain position information from when the image to be detected was captured;
Operation 540: Correct the result of scene recognition according to the position information to obtain the corrected final result of scene recognition.
Specifically, an electronic device generally records the location of each shot, usually using GPS (Global Positioning System) to record address information. The address information recorded by the electronic device is obtained, and the position information of the image to be detected is acquired from it. Corresponding scene categories and the weights of those categories are matched to different address information in advance. Specifically, this may be done according to the results of statistical analysis of a large number of image materials, with the corresponding scene categories and weights matched to different address information accordingly. For example, statistical analysis of a large number of image materials may show that when the address information reads "XXX grassland", the scene "green grass" corresponding to the "grassland" address has a weight of 9, "snow" a weight of 7, "landscape" a weight of 4, "blue sky" a weight of 6, and "beach" a weight of -8, with weights ranging over [-10, 10]. The larger the weight, the higher the probability that the scene appears in the image; the smaller the weight, the lower the probability. In this way, the result of scene recognition can be corrected according to the address information at the time the image was shot and the probability of the scenes corresponding to that address information, to obtain the corrected final result of scene recognition. For example, if the address information of a picture is "XXX grassland", the scenes "green grass", "snow", and "blue sky" corresponding to "XXX grassland" have higher weights, so these scenes have a higher probability of appearing. The result of scene recognition is therefore corrected: if "green grass", "snow", or "blue sky" appears in the result of scene recognition, it can be taken as the final result. If the scene "beach" appears in the result, the "beach" scene should be filtered out according to the address information from when the image was shot, to avoid obtaining incorrect, implausible scene categories.
In this embodiment of the application, position information from when the image to be detected was captured is obtained, and the result of scene recognition is corrected according to the position information to obtain the corrected final result. The scene categories of the image to be detected, obtained from its shooting address information, can be used to calibrate the result of scene recognition, ultimately improving the accuracy of scene detection.
In an embodiment, after outputting the labels corresponding to the image to be detected as the result of scene recognition, the method further includes:
performing, on the image to be detected, image processing corresponding to the result of scene recognition.
In this embodiment of the application, after scene recognition is performed on the image to be detected through the multi-label classification model, the labels corresponding to the image are obtained and output as the result of scene recognition. The result of scene recognition can be used as the basis for image post-processing, and targeted image processing can be performed on the image according to that result, greatly improving image quality. For example, if the scene category of the image to be detected is recognized as a night scene, the image can be processed in a manner suitable for night scenes, such as increasing brightness. If the scene category is recognized as backlit, the image can be processed in a manner suitable for backlighting. Of course, if the image is recognized as multi-label, for example containing beach, green grass, and blue sky, the beach region can be processed in a manner suited to beaches, the green grass region in a manner suited to green grass, and the blue sky region in a manner suited to blue sky, so that the entire image looks very good.
In a specific embodiment, an image processing method is provided. Taking the application of the method to the electronic device in FIG. 1 as an example, the method includes:
Operation 1: Obtain multi-label images containing multiple scene elements, and use them to train a neural network model to obtain a multi-label classification model; that is, the multi-label classification model is based on a neural network architecture.
Operation 2: Perform scene recognition on the image to be detected according to the multi-label classification model to obtain the initial labels of the image and the confidences corresponding to them.
Operation 3: Determine whether the confidence of each initial label is greater than a preset threshold; when the determination result is yes, use the initial labels whose confidence is greater than the preset threshold as the labels corresponding to the image to be detected, and output those labels as the result of scene recognition.
Operation 4: Obtain position information from when the image to be detected was captured, and correct the result of scene recognition according to the position information to obtain the corrected final result of scene recognition.
Operation 5: According to the result of scene recognition, perform image processing corresponding to that result on the image to be detected to obtain a processed image.
In this embodiment of the application, because the multi-label classification model is a scene recognition model obtained from multi-label images containing multiple scene elements, after scene recognition is performed on to-be-detected images containing different scene elements, the labels corresponding to the multiple scenes in the image can be output directly and relatively accurately. This improves the accuracy of scene recognition on images containing different scene elements and also improves the efficiency of scene recognition. The result of scene recognition is corrected according to the position information from when the image was captured to obtain the corrected final result, so that the scene categories of the image obtained from its shooting address information can be used to calibrate the result of scene recognition, ultimately improving the accuracy of scene detection. The result of scene recognition can also serve as the basis for image post-processing, allowing targeted image processing according to that result and thus greatly improving image quality.
It should be understood that although the operations in the above flowcharts are displayed in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict order restriction on the execution of these operations, and they may be executed in other orders. Moreover, at least some of the operations in the above figures may include multiple sub-operations or stages, which are not necessarily completed at the same time but may be executed at different times; their execution order is not necessarily sequential, and they may be executed in turn or alternately with other operations or with at least part of the sub-operations or stages of other operations.
In an embodiment, as shown in FIG. 6, an image processing apparatus 600 is provided, including an image acquisition module 610, a scene recognition module 620, and an output module 630, wherein:
the image acquisition module 610 is configured to acquire an image to be detected;
the scene recognition module 620 is configured to perform scene recognition on the image to be detected according to a multi-label classification model to obtain labels corresponding to the image to be detected, the multi-label classification model being obtained from multi-label images containing multiple scene elements; and
the output module 630 is configured to output the labels corresponding to the image to be detected as the result of scene recognition.
In an embodiment, as shown in FIG. 7, an image processing apparatus 600 is provided, further including:
a multi-label image acquisition module 640 configured to obtain multi-label images containing multiple scene elements; and
a multi-label classification model training module 650 configured to train the multi-label classification model using multi-label images containing multiple scene elements.
In an embodiment, as shown in FIG. 8, the scene recognition module 620 includes:
an initial label acquisition module 622 configured to perform scene recognition on the image to be detected according to the multi-label classification model to obtain initial labels of the image and the confidences corresponding to them;
a determination module 624 configured to determine whether the confidence of each initial label is greater than a preset threshold; and
an image label generation module 626 configured to, when the determination result is yes, use the initial labels whose confidence is greater than the preset threshold as the labels corresponding to the image to be detected.
In an embodiment, an image processing apparatus 600 is provided that is further configured to obtain position information from when the image to be detected was captured, and to correct the result of scene recognition according to the position information to obtain the corrected final result of scene recognition.
In an embodiment, an image processing apparatus 600 is provided that is further configured to perform, on the image to be detected, image processing corresponding to the result of scene recognition according to that result.
The division of the modules in the above image processing apparatus is for illustration only; in other embodiments, the image processing apparatus may be divided into different modules as needed to complete all or part of its functions.
Each module in the above image processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The network interface may be an Ethernet card, a wireless network card, or the like. The above modules may be embedded in, or independent of, the processor in the server in hardware form, or stored in the memory of the server in software form, so that the processor can invoke and perform the operations corresponding to each module.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements the operations of the image processing method provided by the above embodiments.
In an embodiment, an electronic device is provided, including a memory, a processor, and a computer program stored on the memory and executable on the processor; when executing the computer program, the processor implements the operations of the image processing method provided by the above embodiments.
An embodiment of this application further provides a computer program product which, when run on a computer, causes the computer to perform the operations of the image processing method provided by the above embodiments.
An embodiment of this application further provides an electronic device. The electronic device includes an image processing circuit, which may be implemented using hardware and/or software components and may include various processing units defining an ISP (Image Signal Processing) pipeline. FIG. 9 is a schematic diagram of an image processing circuit in an embodiment. As shown in FIG. 9, for ease of description, only the aspects of the image processing technology related to the embodiments of this application are shown.
As shown in FIG. 9, the image processing circuit includes an ISP processor 940 and control logic 950. Image data captured by an imaging device 910 is first processed by the ISP processor 940, which analyzes the image data to capture image statistics that can be used to determine one or more control parameters of the imaging device 910. The imaging device 910 may include a camera having one or more lenses 912 and an image sensor 914. The image sensor 914 may include a color filter array (such as a Bayer filter); it may acquire the light intensity and wavelength information captured by each of its imaging pixels and provide a set of raw image data that can be processed by the ISP processor 940. A sensor 920 (such as a gyroscope) may provide acquired image processing parameters (such as image stabilization parameters) to the ISP processor 940 based on the sensor 920 interface type. The sensor 920 interface may be an SMIA (Standard Mobile Imaging Architecture) interface, another serial or parallel camera interface, or a combination of these interfaces.
In addition, the image sensor 914 may also send raw image data to the sensor 920, which may provide it to the ISP processor 940 based on the sensor 920 interface type, or store it in an image memory 930.
The ISP processor 940 processes the raw image data pixel by pixel in a variety of formats. For example, each image pixel may have a bit depth of 8, 10, 12, or 14 bits, and the ISP processor 940 may perform one or more image processing operations on the raw image data and collect statistics about the image data. The image processing operations may be performed at the same or different bit depth precisions.
The ISP processor 940 may also receive image data from the image memory 930. For example, the sensor 920 interface sends raw image data to the image memory 930, and the raw image data in the image memory 930 is then provided to the ISP processor 940 for processing. The image memory 930 may be part of a memory device, a storage device, or a separate dedicated memory within the electronic device, and may include DMA (Direct Memory Access) features.
When receiving raw image data from the image sensor 914 interface, from the sensor 920 interface, or from the image memory 930, the ISP processor 940 may perform one or more image processing operations, such as temporal filtering. The processed image data may be sent to the image memory 930 for further processing before being displayed. The ISP processor 940 receives the data to be processed from the image memory 930 and processes the image data in the raw domain and in the RGB and YCbCr color spaces. The image data processed by the ISP processor 940 may be output to a display 970 for viewing by the user and/or further processed by a graphics engine or GPU (Graphics Processing Unit). In addition, the output of the ISP processor 940 may also be sent to the image memory 930, and the display 970 may read image data from the image memory 930. In an embodiment, the image memory 930 may be configured to implement one or more frame buffers. The output of the ISP processor 940 may also be sent to an encoder/decoder 960 to encode/decode the image data. The encoded image data may be saved and decompressed before being displayed on the display 970. The encoder/decoder 960 may be implemented by a CPU, a GPU, or a coprocessor.
The statistics determined by the ISP processor 940 may be sent to the control logic 950 unit. For example, the statistics may include image sensor 914 statistics such as auto exposure, auto white balance, auto focus, flicker detection, black level compensation, and lens 912 shading correction. The control logic 950 may include a processor and/or microcontroller executing one or more routines (such as firmware), which may determine the control parameters of the imaging device 910 and the control parameters of the ISP processor 940 according to the received statistics. For example, the control parameters of the imaging device 910 may include sensor 920 control parameters (such as gain, integration time for exposure control, and image stabilization parameters), camera flash control parameters, lens 912 control parameters (such as focal length for focusing or zooming), or a combination of these parameters. The ISP control parameters may include gain levels and color correction matrices for automatic white balance and color adjustment (for example, during RGB processing), as well as lens 912 shading correction parameters.
Any reference to memory, storage, a database, or other media used in this application may include non-volatile and/or volatile memory. Suitable non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM), which serves as external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by instructing the relevant hardware through a computer program, which can be stored in a non-volatile computer-readable storage medium. When executed, the program may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), or the like.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments have been described; however, as long as there is no contradiction in a combination of these technical features, it should be considered within the scope of this specification.
The above embodiments express only several implementations of this application, and their description is relatively specific and detailed, but they should not therefore be understood as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art can make several modifications and improvements without departing from the concept of this application, all of which fall within its scope of protection. Therefore, the scope of protection of this patent application shall be subject to the appended claims.

Claims (16)

  1. An image processing method, characterized by comprising:
    acquiring an image to be detected;
    performing scene recognition on the image to be detected according to a multi-label classification model to obtain labels corresponding to the image to be detected, the multi-label classification model being obtained from multi-label images containing multiple scene elements; and
    outputting the labels corresponding to the image to be detected as the result of scene recognition.
  2. The method according to claim 1, characterized in that, before acquiring the image to be detected, the method comprises:
    obtaining multi-label images containing multiple scene elements; and
    training the multi-label classification model using the multi-label images containing multiple scene elements.
  3. The method according to claim 2, characterized in that the multi-label classification model is constructed based on a neural network model.
  4. The method according to claim 1, characterized in that performing scene recognition on the image to be detected according to the multi-label classification model to obtain the labels corresponding to the image to be detected comprises:
    performing scene recognition on the image to be detected according to the multi-label classification model to obtain initial labels of the image to be detected and confidences corresponding to the initial labels;
    determining whether the confidence of each initial label is greater than a preset threshold; and
    when the determination result is yes, using the initial labels whose confidence is greater than the preset threshold as the labels corresponding to the image to be detected.
  5. The method according to claim 4, characterized in that the confidence corresponding to each initial label ranges over [0, 1].
  6. The method according to claim 1, characterized in that, after outputting the labels corresponding to the image to be detected as the result of scene recognition, the method comprises:
    obtaining position information from when the image to be detected was captured; and
    correcting the result of scene recognition according to the position information to obtain the corrected final result of scene recognition.
  7. The method according to claim 1, characterized in that, after outputting the labels corresponding to the image to be detected as the result of scene recognition, the method further comprises:
    performing, on the image to be detected, image processing corresponding to the result of scene recognition according to that result.
  8. An image processing apparatus, characterized in that the apparatus comprises:
    an image acquisition module configured to acquire an image to be detected;
    a scene recognition module configured to perform scene recognition on the image to be detected according to a multi-label classification model to obtain labels corresponding to the image to be detected, the multi-label classification model being obtained from multi-label images containing multiple scene elements; and
    an output module configured to output the labels corresponding to the image to be detected as the result of scene recognition.
  9. A computer-readable storage medium on which a computer program is stored, characterized in that, when executed by a processor, the computer program implements the operations of the image processing method according to any one of claims 1 to 7.
  10. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that, when executing the computer program, the processor implements the following operations:
    acquiring an image to be detected;
    performing scene recognition on the image to be detected according to a multi-label classification model to obtain labels corresponding to the image to be detected, the multi-label classification model being obtained from multi-label images containing multiple scene elements; and
    outputting the labels corresponding to the image to be detected as the result of scene recognition.
  11. The electronic device according to claim 10, characterized in that, before acquiring the image to be detected, the operations comprise:
    obtaining multi-label images containing multiple scene elements; and
    training the multi-label classification model using the multi-label images containing multiple scene elements.
  12. The electronic device according to claim 11, characterized in that the multi-label classification model is constructed based on a neural network model.
  13. The electronic device according to claim 10, characterized in that performing scene recognition on the image to be detected according to the multi-label classification model to obtain the labels corresponding to the image to be detected comprises:
    performing scene recognition on the image to be detected according to the multi-label classification model to obtain initial labels of the image to be detected and confidences corresponding to the initial labels;
    determining whether the confidence of each initial label is greater than a preset threshold; and
    when the determination result is yes, using the initial labels whose confidence is greater than the preset threshold as the labels corresponding to the image to be detected.
  14. The electronic device according to claim 13, characterized in that the confidence corresponding to each initial label ranges over [0, 1].
  15. The electronic device according to claim 10, characterized in that, after outputting the labels corresponding to the image to be detected as the result of scene recognition, the operations comprise:
    obtaining position information from when the image to be detected was captured; and
    correcting the result of scene recognition according to the position information to obtain the corrected final result of scene recognition.
  16. The electronic device according to claim 10, characterized in that, after outputting the labels corresponding to the image to be detected as the result of scene recognition, the operations further comprise:
    performing, on the image to be detected, image processing corresponding to the result of scene recognition according to that result.
PCT/CN2019/089914 2018-06-08 2019-06-04 Image processing method and apparatus, storage medium, and electronic device WO2019233394A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810585679.3A CN108764208B (zh) 2018-06-08 2018-06-08 Image processing method and apparatus, storage medium, and electronic device
CN201810585679.3 2018-06-08

Publications (1)

Publication Number Publication Date
WO2019233394A1 (zh)

Family

ID=64000474

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/089914 WO2019233394A1 (zh) Image processing method and apparatus, storage medium, and electronic device

Country Status (2)

Country Link
CN (1) CN108764208B (zh)
WO (1) WO2019233394A1 (zh)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111008145A (zh) * 2019-12-19 2020-04-14 中国银行股份有限公司 Test information collection method and apparatus
CN111125177A (zh) * 2019-12-26 2020-05-08 北京奇艺世纪科技有限公司 Method and apparatus for generating data labels, electronic device, and readable storage medium
CN111128348A (zh) * 2019-12-27 2020-05-08 上海联影智能医疗科技有限公司 Medical image processing method and apparatus, storage medium, and computer device
CN111160289A (zh) * 2019-12-31 2020-05-15 欧普照明股份有限公司 Method, apparatus, and electronic device for detecting accidents involving a target user
CN111291800A (zh) * 2020-01-21 2020-06-16 青梧桐有限责任公司 House decoration type analysis method and system, electronic device, and readable storage medium
CN111292331A (zh) * 2020-02-23 2020-06-16 华为技术有限公司 Image processing method and apparatus
CN111353549A (zh) * 2020-03-10 2020-06-30 创新奇智(重庆)科技有限公司 Image label verification method and apparatus, electronic device, and storage medium
CN111461260A (zh) * 2020-04-29 2020-07-28 上海东普信息科技有限公司 Feature-fusion-based target detection method, apparatus, device, and storage medium
CN111612034A (zh) * 2020-04-15 2020-09-01 中国科学院上海微***与信息技术研究所 Method and apparatus for determining an object recognition model, electronic device, and storage medium
CN111709371A (zh) * 2020-06-17 2020-09-25 腾讯科技(深圳)有限公司 Artificial-intelligence-based classification method, apparatus, server, and storage medium
CN111985449A (zh) * 2020-09-03 2020-11-24 深圳壹账通智能科技有限公司 Rescue scene image recognition method, apparatus, device, and computer medium
CN112023400A (zh) * 2020-07-24 2020-12-04 上海米哈游天命科技有限公司 Height map generation method, apparatus, device, and storage medium
CN112579587A (zh) * 2020-12-29 2021-03-30 北京百度网讯科技有限公司 Data cleaning method, apparatus, device, and storage medium
CN112926158A (zh) * 2021-03-16 2021-06-08 上海设序科技有限公司 General design method based on parameter fine-tuning for industrial machinery design scenarios
CN113065513A (зh) * 2021-01-27 2021-07-02 武汉星巡智能科技有限公司 Method, apparatus, and device for optimizing self-trained confidence thresholds of a smart camera
CN113177498A (zh) * 2021-05-10 2021-07-27 清华大学 Image recognition method and apparatus based on true object size and object features
CN113221800A (zh) * 2021-05-24 2021-08-06 珠海大横琴科技发展有限公司 Monitoring and judgment method and system for targets to be detected
CN113329173A (zh) * 2021-05-19 2021-08-31 Tcl通讯(宁波)有限公司 Image optimization method and apparatus, storage medium, and terminal device
CN113569593A (zh) * 2020-04-28 2021-10-29 京东方科技集团股份有限公司 Smart vase system, flower recognition and display method, and electronic device
CN113642595A (zh) * 2020-05-11 2021-11-12 北京金山数字娱乐科技有限公司 Picture-based information extraction method and apparatus
CN114049420A (zh) * 2021-10-29 2022-02-15 马上消费金融股份有限公司 Model training method, image rendering method, apparatus, and electronic device
CN114118114A (zh) * 2020-08-26 2022-03-01 顺丰科技有限公司 Image detection method, apparatus, and storage medium therefor
CN114255381A (zh) * 2021-12-23 2022-03-29 北京瑞莱智慧科技有限公司 Training method for an image recognition model, image recognition method, apparatus, and medium
CN115100419A (zh) * 2022-07-20 2022-09-23 中国科学院自动化研究所 Target detection method and apparatus, electronic device, and storage medium

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764208B (zh) * 2018-06-08 2021-06-08 Oppo广东移动通信有限公司 Image processing method and apparatus, storage medium, and electronic device
CN109635701B (zh) * 2018-12-05 2023-04-18 宽凳(北京)科技有限公司 Lane passability attribute acquisition method, apparatus, and computer-readable storage medium
CN109657517B (zh) * 2018-12-21 2021-12-03 深圳智可德科技有限公司 Miniature QR code recognition method, apparatus, readable storage medium, and code scanner
CN109741288B (zh) * 2019-01-04 2021-07-13 Oppo广东移动通信有限公司 Image processing method and apparatus, storage medium, and electronic device
CN109831629B (zh) * 2019-03-14 2021-07-02 Oppo广东移动通信有限公司 Method and apparatus for adjusting a terminal photographing mode, terminal, and storage medium
CN109831628B (zh) * 2019-03-14 2021-07-16 Oppo广东移动通信有限公司 Method and apparatus for adjusting a terminal photographing mode, terminal, and storage medium
CN110348291A (zh) * 2019-05-28 2019-10-18 华为技术有限公司 Scene recognition method, scene recognition apparatus, and electronic device
CN110266946B (zh) * 2019-06-25 2021-06-25 普联技术有限公司 Automatic photographing-effect optimization method, apparatus, storage medium, and terminal device
CN110796715B (zh) * 2019-08-26 2023-11-24 腾讯科技(深圳)有限公司 Electronic map annotation method, apparatus, server, and storage medium
CN110704650B (zh) * 2019-09-29 2023-04-25 携程计算机技术(上海)有限公司 OTA picture label recognition method, electronic device, and medium
CN110781834A (zh) * 2019-10-28 2020-02-11 上海眼控科技股份有限公司 Traffic anomaly image detection method, apparatus, computer device, and storage medium
CN111191706A (zh) * 2019-12-25 2020-05-22 深圳市赛维网络科技有限公司 Picture recognition method, apparatus, device, and storage medium
CN111212243B (zh) * 2020-02-19 2022-05-20 深圳英飞拓智能技术有限公司 Automatic exposure adjustment system for mixed-traffic detection
CN111523390B (zh) * 2020-03-25 2023-11-03 杭州易现先进科技有限公司 Image recognition method and augmented reality (AR) icon recognition system
CN111597921B (zh) * 2020-04-28 2024-06-18 深圳市人工智能与机器人研究院 Scene recognition method and apparatus, computer device, and storage medium
CN111709283A (zh) * 2020-05-07 2020-09-25 顺丰科技有限公司 Method and apparatus for detecting the status of logistics items
CN111613212B (zh) * 2020-05-13 2023-10-31 携程旅游信息技术(上海)有限公司 Speech recognition method, system, electronic device, and storage medium
CN111626353A (zh) * 2020-05-26 2020-09-04 Oppo(重庆)智能科技有限公司 Image processing method, terminal, and storage medium
CN111915598B (zh) * 2020-08-07 2023-10-13 温州医科大学 Deep-learning-based medical image processing method and apparatus
CN112163110B (zh) * 2020-09-27 2023-01-03 Oppo(重庆)智能科技有限公司 Image classification method and apparatus, electronic device, and computer-readable storage medium
CN112329725B (zh) * 2020-11-27 2022-03-25 腾讯科技(深圳)有限公司 Road scene element recognition method, apparatus, device, and storage medium
CN112651332A (zh) * 2020-12-24 2021-04-13 携程旅游信息技术(上海)有限公司 Photo-library-based scene facility recognition method, system, device, and storage medium
CN112686316A (zh) * 2020-12-30 2021-04-20 上海掌门科技有限公司 Method and device for determining labels
CN112906811B (zh) * 2021-03-09 2023-04-18 西安电子科技大学 Automatic image classification method for engineering vehicle-mounted equipment based on an Internet of Things architecture
CN113222058B (zh) * 2021-05-28 2024-05-10 芯算一体(深圳)科技有限公司 Image classification method and apparatus, electronic device, and storage medium
CN113222055B (zh) * 2021-05-28 2023-01-10 新疆爱华盈通信息技术有限公司 Image classification method and apparatus, electronic device, and storage medium
CN113065615A (zh) * 2021-06-02 2021-07-02 南京甄视智能科技有限公司 Scene-based edge analysis algorithm delivery method, apparatus, and storage medium
CN114998357B (зh) * 2022-08-08 2022-11-15 长春摩诺维智能光电科技有限公司 Industrial inspection method, system, terminal, and medium based on multi-information analysis
CN116310665B (зh) * 2023-05-17 2023-08-15 济南博观智能科技有限公司 Image environment analysis method, device, and medium
CN117671497B (зh) * 2023-12-04 2024-05-28 广东筠诚建筑科技有限公司 Digital-image-based construction waste classification method and apparatus

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107622281B * 2017-09-20 2021-02-05 Oppo广东移动通信有限公司 Image classification method and device, storage medium, and mobile terminal

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845549A * 2017-01-22 2017-06-13 珠海习悦信息技术有限公司 Multi-task-learning-based scene and target recognition method and device
CN106951911A * 2017-02-13 2017-07-14 北京飞搜科技有限公司 Fast multi-label image retrieval system and implementation method
CN108052966A * 2017-12-08 2018-05-18 重庆邮电大学 Automatic extraction and classification method for remote-sensing image scenes based on convolutional neural networks
CN108090497A * 2017-12-28 2018-05-29 广东欧珀移动通信有限公司 Video classification method and device, storage medium, and electronic device
CN108764208A * 2018-06-08 2018-11-06 Oppo广东移动通信有限公司 Image processing method and device, storage medium, and electronic device

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111008145B (zh) * 2019-12-19 2023-09-22 中国银行股份有限公司 Test information collection method and device
CN111008145A (zh) * 2019-12-19 2020-04-14 中国银行股份有限公司 Test information collection method and device
CN111125177B (zh) * 2019-12-26 2024-01-16 北京奇艺世纪科技有限公司 Method and device for generating data labels, electronic device, and readable storage medium
CN111125177A (zh) * 2019-12-26 2020-05-08 北京奇艺世纪科技有限公司 Method and device for generating data labels, electronic device, and readable storage medium
CN111128348A (zh) * 2019-12-27 2020-05-08 上海联影智能医疗科技有限公司 Medical image processing method and device, storage medium, and computer device
CN111128348B (zh) * 2019-12-27 2024-03-26 上海联影智能医疗科技有限公司 Medical image processing method and device, storage medium, and computer device
CN111160289A (zh) * 2019-12-31 2020-05-15 欧普照明股份有限公司 Method and device for detecting accidents involving a target user, and electronic device
CN111291800A (zh) * 2020-01-21 2020-06-16 青梧桐有限责任公司 House decoration type analysis method, system, electronic device, and readable storage medium
CN111292331A (zh) * 2020-02-23 2020-06-16 华为技术有限公司 Image processing method and device
CN111292331B (zh) * 2020-02-23 2023-09-12 华为云计算技术有限公司 Image processing method and device
CN111353549A (zh) * 2020-03-10 2020-06-30 创新奇智(重庆)科技有限公司 Image label verification method and device, electronic device, and storage medium
CN111612034A (zh) * 2020-04-15 2020-09-01 中国科学院上海微系统与信息技术研究所 Method and device for determining an object recognition model, electronic device, and storage medium
CN111612034B (zh) * 2020-04-15 2024-04-12 中国科学院上海微系统与信息技术研究所 Method and device for determining an object recognition model, electronic device, and storage medium
CN113569593A (zh) * 2020-04-28 2021-10-29 京东方科技集团股份有限公司 Smart vase system, flower recognition and display method, and electronic device
CN111461260B (zh) * 2020-04-29 2023-04-18 上海东普信息科技有限公司 Feature-fusion-based target detection method, device, equipment, and storage medium
CN111461260A (zh) * 2020-04-29 2020-07-28 上海东普信息科技有限公司 Feature-fusion-based target detection method, device, equipment, and storage medium
CN113642595A (zh) * 2020-05-11 2021-11-12 北京金山数字娱乐科技有限公司 Picture-based information extraction method and device
CN111709371A (zh) * 2020-06-17 2020-09-25 腾讯科技(深圳)有限公司 Artificial-intelligence-based classification method and device, server, and storage medium
CN111709371B (zh) * 2020-06-17 2023-12-22 腾讯科技(深圳)有限公司 Artificial-intelligence-based classification method and device, server, and storage medium
CN112023400A (zh) * 2020-07-24 2020-12-04 上海米哈游天命科技有限公司 Height map generation method, device, equipment, and storage medium
CN114118114A (zh) * 2020-08-26 2022-03-01 顺丰科技有限公司 Image detection method, device, and storage medium
CN111985449A (zh) * 2020-09-03 2020-11-24 深圳壹账通智能科技有限公司 Rescue scene image recognition method, device, equipment, and computer medium
CN112579587A (zh) * 2020-12-29 2021-03-30 北京百度网讯科技有限公司 Data cleaning method and device, equipment, and storage medium
CN113065513A (zh) * 2021-01-27 2021-07-02 武汉星巡智能科技有限公司 Method, device, and equipment for optimizing self-trained confidence thresholds of smart cameras
CN112926158A (zh) * 2021-03-16 2021-06-08 上海设序科技有限公司 General design method based on parameter fine-tuning in industrial machinery design scenarios
CN112926158B (zh) * 2021-03-16 2023-07-14 上海设序科技有限公司 General design method based on parameter fine-tuning in industrial machinery design scenarios
CN113177498A (zh) * 2021-05-10 2021-07-27 清华大学 Image recognition method and device based on real object size and object features
CN113177498B (zh) * 2021-05-10 2022-08-09 清华大学 Image recognition method and device based on real object size and object features
CN113329173A (zh) * 2021-05-19 2021-08-31 Tcl通讯(宁波)有限公司 Image optimization method and device, storage medium, and terminal device
CN113221800A (zh) * 2021-05-24 2021-08-06 珠海大横琴科技发展有限公司 Monitoring and judgment method and system for targets to be detected
CN114049420A (zh) * 2021-10-29 2022-02-15 马上消费金融股份有限公司 Model training method, image rendering method, device, and electronic device
CN114255381B (zh) * 2021-12-23 2023-05-12 北京瑞莱智慧科技有限公司 Image recognition model training method, image recognition method, device, and medium
CN114255381A (zh) * 2021-12-23 2022-03-29 北京瑞莱智慧科技有限公司 Image recognition model training method, image recognition method, device, and medium
CN115100419A (zh) * 2022-07-20 2022-09-23 中国科学院自动化研究所 Target detection method and device, electronic device, and storage medium

Also Published As

Publication number Publication date
CN108764208B (zh) 2021-06-08
CN108764208A (zh) 2018-11-06

Similar Documents

Publication Publication Date Title
WO2019233394A1 (zh) Image processing method and device, storage medium, and electronic device
WO2019233393A1 (zh) Image processing method and device, storage medium, and electronic device
US11138478B2 (en) Method and apparatus for training, classification model, mobile terminal, and readable storage medium
CN108764370B (zh) Image processing method and device, computer-readable storage medium, and computer device
CN108777815B (zh) Video processing method and device, electronic device, and computer-readable storage medium
WO2019233297A1 (zh) Data set construction method, mobile terminal, and readable storage medium
CN108921161B (zh) Model training method and device, electronic device, and computer-readable storage medium
US10896323B2 (en) Method and device for image processing, computer readable storage medium, and electronic device
WO2020001197A1 (zh) Image processing method, electronic device, and computer-readable storage medium
WO2019233266A1 (zh) Image processing method, computer-readable storage medium, and electronic device
CN108810413B (zh) Image processing method and device, electronic device, and computer-readable storage medium
EP3598736B1 (en) Method and apparatus for processing image
US11132771B2 (en) Bright spot removal using a neural network
WO2019233392A1 (zh) Image processing method and device, electronic device, and computer-readable storage medium
CN108804658B (zh) Image processing method and device, storage medium, and electronic device
WO2019233260A1 (zh) Advertisement information push method and device, storage medium, and electronic device
CN108961302B (zh) Image processing method and device, mobile terminal, and computer-readable storage medium
CN108897786B (zh) Application recommendation method and device, storage medium, and mobile terminal
WO2019233271A1 (zh) Image processing method, computer-readable storage medium, and electronic device
WO2019223594A1 (zh) Neural network model processing method and device, image processing method, and mobile terminal
WO2020001196A1 (zh) Image processing method, electronic device, and computer-readable storage medium
WO2019223513A1 (zh) Image recognition method, electronic device, and storage medium
CN108848306B (zh) Image processing method and device, electronic device, and computer-readable storage medium
CN108717530A (zh) Image processing method and device, computer-readable storage medium, and electronic device
CN110956679B (zh) Image processing method and device, electronic device, and computer-readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19816116

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19816116

Country of ref document: EP

Kind code of ref document: A1