CN110619251A - Image processing method and device, storage medium and electronic equipment

Image processing method and device, storage medium and electronic equipment

Info

Publication number
CN110619251A
Authority
CN
China
Prior art keywords
scene
scene recognition
information
image
result
Prior art date
Legal status
Granted
Application number
CN201810629566.9A
Other languages
Chinese (zh)
Other versions
CN110619251B (en)
Inventor
刘耀勇
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201810629566.9A
Publication of CN110619251A
Application granted
Publication of CN110619251B
Current legal status: Expired - Fee Related


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/98Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to an image processing method and apparatus, an electronic device, and a computer-readable storage medium. Scene recognition is performed on an image to obtain an initial scene recognition result, the time information recorded when the image was shot is acquired, and the initial result is corrected according to that time information to obtain a corrected final scene recognition result. On top of the basic scene recognition method, the capture time of the image is analyzed: because the capture time influences which scenes can plausibly appear in the initial result, correcting the initial result with the capture time makes the final scene recognition result more reasonable and improves the accuracy of scene recognition.

Description

Image processing method and device, storage medium and electronic equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to an image processing method and apparatus, a storage medium, and an electronic device.
Background
With the popularization of mobile terminals and the rapid development of the mobile internet, mobile terminals are used ever more widely. Photographing has become one of the functions users rely on most. During or after shooting, the mobile terminal can perform scene recognition on the image to provide an intelligent experience for the user.
Disclosure of Invention
The embodiment of the application provides an image processing method and device, a storage medium and electronic equipment, which can improve the accuracy of scene recognition on an image.
An image processing method comprising:
carrying out scene recognition on the image to obtain a scene recognition initial result;
acquiring time information when the image is shot;
and correcting the initial scene recognition result according to the time information to obtain a corrected final scene recognition result.
An image processing apparatus, the apparatus comprising:
the scene recognition module is used for carrying out scene recognition on the image and acquiring a scene recognition initial result;
the time determining module is used for acquiring time information when the image is shot;
and the first correction module is used for correcting the initial scene recognition result according to the time information to obtain a corrected final scene recognition result.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the image processing method as described above.
An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor performing the steps of the image processing method as described above when executing the computer program.
The image processing method and apparatus, the storage medium, and the electronic device perform scene recognition on an image to obtain an initial scene recognition result, acquire the time information recorded when the image was shot, and correct the initial result according to that time information to obtain the corrected final scene recognition result. On top of the basic scene recognition method, the capture time of the image is analyzed: because different capture times influence which scenes can plausibly appear in the initial result, correcting the initial result with the capture time makes the final scene recognition result more reasonable and improves the accuracy of scene recognition.
Drawings
To illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description are briefly introduced below. The drawings show only some embodiments of the present application; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a diagram of the internal structure of an electronic device in one embodiment;
FIG. 2 is a flow diagram of a method of image processing in one embodiment;
FIG. 3 is a schematic diagram of an embodiment of a neural network;
FIG. 4 is a flowchart of an image processing method in another embodiment;
FIG. 5 is a flowchart of the method for correcting the initial result of scene recognition according to the time information and the location information to obtain the final result of scene recognition after correction in FIG. 4;
FIG. 6 is a flowchart of another method for correcting the initial scene recognition result according to the time information and the location information to obtain a corrected final scene recognition result in FIG. 4;
FIG. 7 is a diagram showing a configuration of an image processing apparatus according to an embodiment;
FIG. 8 is a schematic diagram showing a configuration of an image processing apparatus according to another embodiment;
FIG. 9 is a block diagram of a partial structure of a mobile phone related to an electronic device provided in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Fig. 1 is a schematic diagram of the internal structure of an electronic device in one embodiment. As shown in fig. 1, the electronic device includes a processor, a memory, and a network interface connected by a system bus. The processor provides computing and control capability and supports the operation of the entire electronic device. The memory stores data, programs, and the like; it holds at least one computer program that can be executed by the processor to implement the scene recognition method provided in the embodiments of the present application. The memory may include a non-volatile storage medium such as a magnetic disk, an optical disc, or a Read-Only Memory (ROM), as well as a Random Access Memory (RAM). For example, in one embodiment, the memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program; the computer program can be executed by the processor to implement the image processing method provided in the following embodiments. The internal memory provides a cached execution environment for the operating system and the computer programs in the non-volatile storage medium. The network interface may be an Ethernet card or a wireless network card and is used to communicate with external electronic devices. The electronic device may be a mobile phone, a tablet computer, a personal digital assistant, a wearable device, or the like.
In one embodiment, as shown in fig. 2, an image processing method is provided, which is described by taking the method as an example applied to the electronic device in fig. 1, and includes:
and step 220, carrying out scene recognition on the image to obtain a scene recognition initial result.
A user takes a picture with an electronic device that has a photographing function, and scene recognition is performed on the resulting image. The image used for scene recognition may be a shooting preview image, a photograph stored on the electronic device after shooting, or a video file obtained by shooting. Specifically, a conventional scene recognition algorithm performs scene recognition on the image and detects which scenes the image contains. The deep neural network model mainly used by the scene recognition algorithm is the Convolutional Neural Network (CNN). Scene categories may be, for example, landscape, beach, blue sky, green grass, snow scene, fireworks, spotlight, text, portrait, baby, cat, dog, food, and the like. After scene recognition is performed on the image, an initial scene recognition result is obtained.
Specifically, a neural network model performs the scene recognition, and the model is trained as follows. A training image containing a background training target and a foreground training target is input into the neural network to obtain a first loss function and a second loss function. The first loss function reflects the difference between a first prediction confidence and a first true confidence for each pixel in the background region of the training image: the first prediction confidence is the network's predicted confidence that a pixel in the background region belongs to the background training target, and the first true confidence is the pre-labeled confidence that the pixel belongs to the background training target. The second loss function likewise reflects the difference between a second prediction confidence and a second true confidence for each pixel in the foreground region, defined analogously for the foreground training target. The two loss functions are weighted and summed to obtain a target loss function, and the parameters of the neural network are adjusted according to this target loss to train the network. Once trained, the neural network model performs scene recognition on an image to obtain the scene category to which the image belongs.
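The weighted summation of the two losses can be expressed compactly. The following is a minimal PyTorch-style sketch, not the patent's reference implementation; the use of binary cross-entropy and the 0.5/0.5 weighting are illustrative assumptions.

```python
import torch.nn.functional as F

def target_loss(pred_bg, true_bg, pred_fg, true_fg, w_bg=0.5, w_fg=0.5):
    """Weighted sum of the background and foreground confidence losses.

    pred_bg / true_bg: per-pixel predicted vs. pre-labeled confidence that a
    background pixel belongs to the background training target; pred_fg /
    true_fg are the same for the foreground region. Binary cross-entropy and
    the 0.5/0.5 weights are illustrative choices, not specified by the patent.
    """
    first_loss = F.binary_cross_entropy(pred_bg, true_bg)    # first loss function
    second_loss = F.binary_cross_entropy(pred_fg, true_fg)   # second loss function
    return w_bg * first_loss + w_fg * second_loss            # target loss function
```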
FIG. 3 is an architecture diagram of a neural network model in one embodiment. As shown in fig. 3, the input layer of the neural network receives a training image with an image category label and performs feature extraction through a base network (e.g., a CNN). The extracted image features are output to a feature layer, which performs category detection on the background training target to obtain a first loss function, category detection on the foreground training target to obtain a second loss function, and position detection on the foreground training target according to the foreground region to obtain a position loss function. The first loss function, the second loss function, and the position loss function are weighted and summed to obtain the target loss function. The neural network may be a convolutional neural network, which comprises a data input layer, convolutional layers, activation layers, pooling layers, and a fully connected layer. The data input layer preprocesses the raw image data; preprocessing may include de-averaging, normalization, dimensionality reduction, and whitening. De-averaging centers each dimension of the input data on 0 so that the sample center is pulled back to the origin of the coordinate system. Normalization scales the amplitudes to the same range. Whitening normalizes the amplitude on each feature axis of the data. The convolutional layers perform local correlation through window sliding: the weights connecting each filter to the data window are fixed, each filter attends to one image feature (such as a vertical edge, a horizontal edge, color, or texture), and together the filters form a feature extractor set for the whole image. Each filter is a weight matrix, and convolution is performed between the weight matrix and the data in successive windows. The activation layer applies a nonlinear mapping to the convolution output; the activation function may be the ReLU (Rectified Linear Unit). A pooling layer may be sandwiched between successive convolutional layers to compress the amount of data and parameters and reduce overfitting; it may use a max or mean method to reduce the dimensionality of the data. The fully connected layer sits at the tail of the convolutional neural network, with all neurons between the two layers connected by weights. Some convolutional layers are cascaded to a first confidence output node, some to a second confidence output node, and some to a position output node, so that the background category of the image can be detected from the first confidence output node, the foreground target category from the second confidence output node, and the position of the foreground target from the position output node.
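To make the layer sequence concrete (data input, convolution, activation, pooling, fully connected), a minimal classifier might be sketched as follows; the channel counts, kernel sizes, input resolution, and the number of scene categories are illustrative assumptions, not values taken from the patent.

```python
import torch.nn as nn

class SceneCNN(nn.Module):
    """Minimal CNN in the layer order described above:
    convolution -> ReLU activation -> max pooling, then a fully connected head."""

    def __init__(self, num_scene_classes=13):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # convolutional layer
            nn.ReLU(),                                    # activation layer
            nn.MaxPool2d(2),                              # pooling layer
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # fully connected layer at the tail of the network
        self.classifier = nn.Linear(32 * 56 * 56, num_scene_classes)

    def forward(self, x):  # x: (N, 3, 224, 224), already preprocessed as described above
        return self.classifier(self.features(x).flatten(1))
```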
And step 240, acquiring time information when the image is shot.
Generally, the electronic device records time information for each shot, specifically the date and the time of day. For example, the recorded time information of one captured image is: 12:00 noon on May 1, 2018, Beijing time.
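In practice, the capture time can be read from the image's EXIF metadata. A sketch using Pillow follows, assuming a recent Pillow that exposes ExifTags.IFD; tag 36867 is the standard DateTimeOriginal and tag 306 the DateTime fallback.

```python
from PIL import Image, ExifTags

def capture_time(path):
    """Return the EXIF capture timestamp, e.g. '2018:05:01 12:00:00' (a sketch)."""
    exif = Image.open(path).getexif()
    # DateTimeOriginal (tag 36867) lives in the Exif sub-IFD and records the
    # moment of shooting; DateTime (tag 306) in the base IFD is a fallback.
    return exif.get_ifd(ExifTags.IFD.Exif).get(36867) or exif.get(306)
```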
And step 260, correcting the initial scene recognition result according to the time information to obtain a corrected final scene recognition result.
Since different times generally correspond to different scenes, the time information recorded when the image was captured can be used to correct the initial scene recognition result. For example, 12:00 noon Beijing time on May 1, 2018 falls in the daytime, so for an image captured at that time the probability of a daytime scene in the recognition result is higher than that of a night scene. Night scenes are therefore eliminated from the result, which improves the reasonableness and accuracy of the scene recognition result.
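The elimination described above reduces to a simple rule over the candidate categories; a minimal sketch, in which the daytime window and the category names are illustrative assumptions:

```python
DAYTIME_HOURS = range(7, 19)  # assumed local daytime window, 07:00-18:59

def drop_implausible_by_time(candidates, hour):
    """Remove initial categories that contradict the capture hour (a sketch).

    candidates: dict mapping scene category -> confidence from the initial result.
    """
    if hour in DAYTIME_HOURS:
        candidates.pop("night scene", None)  # daytime shot: a night scene is implausible
    else:
        candidates.pop("blue sky", None)     # night shot: a daytime-only scene is implausible
    return candidates
```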
In this embodiment, scene recognition is performed on the image to obtain an initial scene recognition result, the time information recorded when the image was shot is acquired, and the initial result is corrected according to that time information to obtain the corrected final scene recognition result. On top of the basic scene recognition method, the capture time is analyzed: because the time of shooting influences which scenes can plausibly appear, using it to correct the initial result makes the final scene recognition result more reasonable and improves the accuracy of scene recognition.
In one embodiment, as shown in fig. 4, after acquiring time information at the time of image capturing, the method includes:
in step 270, position information is obtained when the image is captured.
In general, the electronic device records position information for each shot, typically using the Global Positioning System (GPS). For example, when a user takes a picture in Lotus Mountain Park in Shenzhen, the address recorded with the photograph can be "Lotus Mountain Park, Shenzhen". Given this address, the probability that blue sky, green grass, portraits, or landscape appear in the captured image is relatively high, while the probability of beach, snow scenes, and the like is low.
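The GPS coordinates can likewise be read from the EXIF GPS block; converting them to an address such as "Lotus Mountain Park, Shenzhen" would additionally require a reverse-geocoding service. A sketch, again assuming a recent Pillow:

```python
from PIL import Image, ExifTags

def capture_location(path):
    """Return (latitude, longitude) in decimal degrees from the EXIF GPS block, if any."""
    gps = Image.open(path).getexif().get_ifd(ExifTags.IFD.GPSInfo)
    if not gps:
        return None

    def to_degrees(dms, ref):
        # dms is a (degrees, minutes, seconds) triple of EXIF rationals
        deg = float(dms[0]) + float(dms[1]) / 60 + float(dms[2]) / 3600
        return -deg if ref in ("S", "W") else deg

    return (to_degrees(gps[2], gps[1]),   # GPSLatitude / GPSLatitudeRef
            to_degrees(gps[4], gps[3]))   # GPSLongitude / GPSLongitudeRef
```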
And step 280, correcting the initial scene recognition result according to the time information and the position information to obtain a corrected final scene recognition result.
After the time information and the position information of the shot are acquired, the two can be combined and considered together, so that the initial scene recognition result is corrected along two dimensions to obtain the corrected final result. For example, if an image was shot at 12:00 noon Beijing time on March 14, 2018 and the recorded position is Lotus Mountain Park in Shenzhen, the scene at that time and place is likely to be spring in full bloom, with birdsong, flowers, and drooping willows, so the corresponding scenes may be blue sky, green grass, and flowers; that is, these scenes have a relatively high probability of appearing. If an image was shot at the same time, 12:00 noon Beijing time on March 14, 2018, but the recorded position is a park in Harbin, the scene may be one of drifting snowflakes, so the corresponding scene may be a snow scene; that is, a snow scene has a relatively high probability.
As can be seen from the above, each combination of time information and position information corresponds to specific scenes with a relatively high probability of occurrence. The initial scene recognition result is therefore corrected according to the time and position information to obtain the corrected final scene recognition result. For example, when an image was shot at 12:00 noon Beijing time on March 14, 2018 and the position information of the shot is Lotus Mountain Park in Shenzhen, scenes such as blue sky, green grass, and flowers have a high probability; if the initial scene recognition result contains a snow scene, the snow scene can be removed directly.
In the embodiment of the application, the time information and the position information during image shooting can be combined and comprehensively considered, so that the initial scene recognition result is corrected from two dimensions, and the final corrected scene recognition result is obtained. Because the initial result of scene recognition is corrected after the scene is limited or eliminated from two dimensions of time information and position information, the accuracy of the final result of the obtained scene recognition is greatly improved.
In one embodiment, as shown in fig. 5, the correcting the initial scene recognition result according to the time information and the position information to obtain the final corrected scene recognition result includes:
And step 282, determining shooting scene information of the image according to the time information and the position information, wherein the shooting scene information comprises season information.
Seasons are periods of the year in which the geographic landscape recurs cyclically with relatively large differences. Different regions divide the seasons differently, and the scenery each season presents also differs. In the temperate zone, and in particular in most of China's climate, the year is divided into four seasons: spring, summer, autumn, and winter, although the boundaries and lengths of the seasons vary from region to region; a tropical savanna climate has only a dry season and a rainy season. The season information at the time the image was shot can therefore be determined from the time and position information combined with the long-term local seasonal division. For example, Harbin is generally divided into spring, summer, and autumn (May to September) and winter (October to April of the following year); snowfall generally occurs in winter and rarely in the other seasons. Shenzhen is generally divided into spring (February to April), summer (April to October), and autumn-winter (November to February of the following year), and snow almost never falls there in any season (except in a few areas of higher altitude).
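The regional divisions above can be captured in a simple lookup keyed by region and month; a minimal sketch covering only the two example cities, with month boundaries approximated from the text:

```python
def season_of(city, month):
    """Local season for a capture month, following the regional divisions above (a sketch)."""
    if city == "Harbin":
        # spring/summer/autumn: May to September; winter: October to April of the next year
        return "spring/summer/autumn" if 5 <= month <= 9 else "winter"
    if city == "Shenzhen":
        if 2 <= month <= 4:
            return "spring"        # February to April
        if 5 <= month <= 10:
            return "summer"        # May to October
        return "autumn/winter"     # November to February of the next year
    return None  # unknown region: no seasonal prior available
```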
In step 284, a scene result corresponding to the season information is obtained.
Having acquired the time and position information of the shot, the shooting scene information at the time of taking the picture, such as the season information, can be determined. The shooting scene information may also include scene information derived from the season at the current position. For example, when the time information of the image is 12:00 noon Beijing time on March 14, 2018 and the position information is a park in Harbin, it can be determined that the season in Harbin is winter, so the probability of a snow scene is high while red flowers, green willows, and green grass are improbable. When the time information is the same but the position information is Lotus Mountain Park in Shenzhen, it can be determined that the season in Shenzhen is spring, so the probability of a snow scene is low while red flowers, green willows, and green grass are probable.
And 286, correcting the initial scene recognition result according to the scene result to obtain a corrected final scene recognition result.
After the scene results corresponding to the season information at the current position are obtained as described above, the initial scene recognition result is corrected according to them. For example, suppose scene recognition on an image yields the initial result green grass, blue sky, and snow scene, and the time and position information of the shot are acquired: the time information is 12:00 noon on March 14, 2018, Beijing time, and the position information is Lotus Mountain Park, Shenzhen. The shooting scene information determined from this time and position includes the season information: Shenzhen is in spring, so the probability of a snow scene is low while red flowers, green willows, green grass, and blue sky are probable. With the scene result corresponding to the season acquired, the initial scene recognition result can be corrected: because the probability of a snow scene is very low, the snow scene is removed, and the final scene recognition result obtained is green grass and blue sky.
In this embodiment, different regions divide the seasons differently, and the scenery each season presents also differs. The shooting scene information of the image, including the season information, is therefore determined from the time and position information recorded when the image was shot, and the scene result corresponding to that season is then acquired. Even for the same season, the scenes of different regions differ, so the scene results differ as well. Finally, the initial scene recognition result is corrected according to the scene result to obtain the corrected final result. In this way, the scene results that are most likely to appear in the image can be acquired accurately from the time and position of the shot, the initial scene recognition result can be corrected, and the accuracy of the corrected final result is improved.
In one embodiment, as shown in fig. 6, the correcting the initial scene recognition result according to the time information and the position information to obtain the final corrected scene recognition result includes:
and step 620, determining shooting scene information of the image according to the time information and the position information, wherein the shooting scene information comprises weather information.
Having acquired the time and position information of the shot, the shooting scene information at the time of taking the picture, such as the weather information, can be determined. The weather at the current position can be looked up from the time and position information; for example, when the time information of the image is 12:00 noon Beijing time on March 14, 2018 and the position information is a park in Harbin, the local weather at that moment may be sleet or snowfall.
And step 640, acquiring a scene result corresponding to the weather information.
After the weather information at the time of shooting is acquired, the scene result corresponding to that weather is obtained. Because each weather condition is combined with the time and position of the shot, the scene result corresponding to the weather at the time of shooting can be determined accurately. In the example above, it may well be snowing in a Harbin park at 12:00 noon Beijing time on March 14, 2018, so a snow scene has a relatively high probability while green grass and blue sky have a low probability.
And 660, correcting the initial scene recognition result according to the scene result to obtain a corrected final scene recognition result.
After the scene result corresponding to the weather information is acquired, the initial scene recognition result is corrected according to it. In the example above, the two initial categories green grass and blue sky can be eliminated, leaving the snow scene as the final scene recognition result.
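A weather-based correction can be sketched the same way; the weather lookup itself (mapping a time and position to an observed condition) is assumed to come from some weather-history service, and the prior values below are illustrative:

```python
# Scene priors per weather condition. The patent states only that each weather
# condition favors particular scene results; these values are illustrative.
WEATHER_SCENE_PRIOR = {
    "snow":  {"snow scene": +8, "green grass": -8, "blue sky": -5},
    "sunny": {"blue sky": +8},
    "rain":  {"blue sky": -6},
}

def correct_by_weather(candidates, weather):
    """Drop initial categories that the observed weather makes implausible (a sketch)."""
    prior = WEATHER_SCENE_PRIOR.get(weather, {})
    # A negative prior means the weather argues against the category, so eliminate it.
    return {scene: conf for scene, conf in candidates.items() if prior.get(scene, 0) >= 0}

# correct_by_weather({"green grass": 0.7, "blue sky": 0.7, "snow scene": 0.3}, "snow")
# -> {"snow scene": 0.3}, matching the example above.
```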
In this embodiment, once the time and position information of the shot is acquired, the shooting scene information, such as the weather information, can be determined. Since the weather at the time of shooting is a factual event that has already occurred, different weather favors particular scene results. Correcting the initial scene recognition result by means of the weather information therefore greatly improves the accuracy of the corrected final scene recognition result.
In one embodiment, the scene recognition initial result includes an initial category of the scene recognition and a confidence corresponding to the initial category of the scene recognition.
Specifically, the initial scene recognition result includes the initial categories of the scene recognition and a confidence corresponding to each initial category. For example, after scene recognition is performed on an image, the initial result may include green grass, blue sky, and snow scene, with a confidence of 70% for green grass, 70% for blue sky, and 30% for snow scene. The higher the confidence of an initial category, the higher the probability that the category actually appears in the image.
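Such an initial result can be represented as a simple mapping from category to confidence; a minimal sketch using the figures from this example:

```python
# Initial scene recognition result: initial category -> confidence that the
# category actually appears in the image (values from the example above).
initial_result = {
    "green grass": 0.70,
    "blue sky": 0.70,
    "snow scene": 0.30,
}
```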
In one embodiment, the image processing method further comprises: matching corresponding scene types and weights corresponding to the scene types for different time information and position information in advance.
Corresponding scene types, and a weight for each scene type, are matched in advance to different combinations of time information and position information, and the data are stored in a database so they can be queried at any time. Specifically, the local season for a given combination, the scene types appearing in that season, and the weights of those scene types can be determined from the combination of time and position information. Likewise, the local weather at the time of shooting can be determined from such a combination, and the scene types it favors, together with their weights, can be derived from that weather.
For example, the time information of a captured image is: 12:00 noon on March 14, 2018, Beijing time; and the position information is: Lotus Mountain Park, Shenzhen. The scene types of the image and the weights corresponding to those scene types are then determined from the combination of this time period and position. The matching is based on statistical analysis of a large number of images taken in the same period and at the same position. For example, a large number of images taken within about a week of March 14, 2018, whose position information lies in Shenzhen (or, more precisely, in Lotus Mountain Park, Shenzhen), can be analyzed statistically to obtain the scene types and their weights, for instance: green grass, weight 9; blue sky, weight 8.5; rainy day, weight 3; snow scene, weight -8. A larger weight indicates a higher probability of the scene occurring.
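The pre-matched data can be sketched as a mapping from a (position, month) combination to scene weights, using the illustrative values above; a real system would persist this in a database rather than an in-memory dict.

```python
# Scene weights pre-matched to a (city, month) combination and, per the text
# above, derived from statistical analysis of a large number of images.
# A larger weight means a more likely scene; values are the example's.
SCENE_WEIGHTS = {
    ("Shenzhen", 3): {"green grass": 9, "blue sky": 8.5, "rainy day": 3, "snow scene": -8},
}

def lookup_weights(city, month):
    """Fetch the pre-matched scene weights for a time/position combination (a sketch)."""
    return SCENE_WEIGHTS.get((city, month), {})
```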
In this embodiment, the scene types and their weights matched in advance to combinations of time and position information are the result of statistical analysis of a large number of images. First, because they come from statistics over a large corpus, they have good generality and accuracy. Second, predicting and calibrating the scene of an image against these statistical results ultimately improves the accuracy of scene recognition.
In one embodiment, the correcting the initial scene recognition result according to the time information and the position information to obtain a corrected final scene recognition result includes:
calculating confidence coefficient of the initial scene recognition result according to the scene category corresponding to the time information and the position information and the weight corresponding to the scene category;
and taking the scene recognition initial result with the confidence coefficient exceeding a preset threshold value as a scene recognition final result.
Specifically, if the weight of the scene type "green grass" obtained from the combination of time and position information is 9, and the confidence of "green grass" in the initial scene recognition result is 70%, then the recalculated confidence is 70% × (1 + 9%) = 0.763. That is, after "green grass" is corrected according to the time and position combination, its confidence is strengthened, and the recalculated confidence is 0.763. Similarly, if the weight of the scene type "snow scene" is -8 and its confidence in the initial result is 30%, then 30% × (1 - 8%) = 0.276; the correction weakens the confidence of "snow scene", and the recalculated confidence is 0.276. Each scene type in the initial scene recognition result is recalculated in turn in the same way.
In general, the preset confidence threshold may be set to 0.5, and the initial scene recognition results whose confidence exceeds the threshold are taken as the final scene recognition result. Of course, the preset threshold may be any other reasonable value.
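Putting the recalculation and the threshold together, a minimal sketch that reproduces the worked figures above:

```python
def correct(initial_result, weights, threshold=0.5):
    """Recalculate each confidence using its scene weight; keep those above the threshold.

    A weight w adjusts a confidence c to c * (1 + w / 100), which reproduces the
    worked example above: 0.70 * (1 + 9/100) = 0.763 and 0.30 * (1 - 8/100) = 0.276.
    """
    final = {}
    for scene, conf in initial_result.items():
        adjusted = conf * (1 + weights.get(scene, 0) / 100)
        if adjusted > threshold:
            final[scene] = adjusted  # kept as part of the final result
    return final

# correct({"green grass": 0.70, "blue sky": 0.70, "snow scene": 0.30},
#         {"green grass": 9, "blue sky": 8.5, "snow scene": -8})
# -> {"green grass": 0.763, "blue sky": 0.7595}; the snow scene (0.276) is dropped.
```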
This embodiment describes in detail the process of calculating the confidence of the initial scene recognition result from the scene types corresponding to the combination of time and position information and their weights. Recalculating the confidences in this way yields more accurate values, and the scene types whose confidence exceeds the preset threshold are screened out, so that the more accurate results in the initial scene recognition result are output as the final scene recognition result.
In a specific embodiment, an image processing method is provided, which is described by taking the application of the method to the electronic device in fig. 1 as an example, and includes:
the method comprises the following steps: matching corresponding scene types and weight values corresponding to the scene types for different time information and position information in advance, and storing the scene types and the weight values in a database;
step two: carrying out scene recognition on the image to obtain a scene recognition initial result, wherein the scene recognition initial result comprises an initial category of the scene recognition and a confidence coefficient corresponding to the initial category of the scene recognition;
step three: acquiring time information and position information when the image is shot;
step four: determining shooting scene information of the image according to the time information and the position information, wherein the shooting scene information comprises season information, weather information and the like;
step five: acquiring scene results corresponding to the season information and the weather information, wherein the scene results can be inquired from the database in the step one and comprise scene types and weights corresponding to the scene types;
Step six: according to the scene type and the weight value corresponding to the scene type, recalculating the confidence of the initial scene recognition result, and taking the initial scene recognition results whose recalculated confidence exceeds a preset threshold as the final scene recognition result.
In the embodiment of the application, the corresponding scene type and the weight corresponding to the scene type are matched for different time information and position information in advance and stored in the database. Then, after the time information and the position information when the image is shot are actually obtained, the scene type matched with the time information and the position information and the weight value corresponding to the scene type can be further obtained. Then, the confidence coefficient of the initial scene recognition result can be recalculated according to the scene type and the weight value corresponding to the scene type, and the initial scene recognition result with the confidence coefficient exceeding the preset threshold value obtained by recalculation is used as the final scene recognition result. Therefore, the scene result with high probability of occurrence in the image can be accurately acquired according to the time information and the position information during image shooting, and the correction of the initial scene recognition result is realized. The accuracy of the final result of scene recognition after correction is improved.
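Tying steps one to six together, an end-to-end sketch: recognize_scenes, city_of, and month_of are hypothetical helpers standing in for the CNN of step two and for reverse geocoding and timestamp parsing, while capture_time, capture_location, lookup_weights, and correct reuse the sketches above.

```python
def process_image(path):
    """End-to-end sketch of steps one to six, composing the helpers sketched above."""
    initial = recognize_scenes(path)              # step two: initial categories + confidences
    when = capture_time(path)                     # step three: time information (EXIF)
    where = capture_location(path)                # step three: position information (EXIF GPS)
    weights = lookup_weights(city_of(where),      # steps four and five: query the pre-matched
                             month_of(when))      # scene types and weights from the database
    return correct(initial, weights)              # step six: recalculate confidences, threshold
```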
In one embodiment, as shown in fig. 7, an image processing apparatus 700 is provided, including a scene recognition module 720, a time determination module 740, and a first correction module 760, wherein:
a scene recognition module 720, configured to perform scene recognition on the image, and obtain a scene recognition initial result;
a time determining module 740 for acquiring time information when the image is photographed;
the first correcting module 760 is configured to correct the initial scene recognition result according to the time information to obtain a final corrected scene recognition result.
In one embodiment, as shown in fig. 8, there is provided an image processing apparatus 700 further comprising:
a position determining module 750 for acquiring position information when the image is photographed;
and the second correcting module 770 is configured to correct the initial scene recognition result according to the time information and the position information, so as to obtain a final corrected scene recognition result.
In one embodiment, the second correction module 770 is further configured to determine shooting scene information of the image according to the time information and the position information, wherein the shooting scene information includes season information; acquiring a scene result corresponding to the season information; and correcting the initial scene recognition result according to the scene result to obtain a corrected final scene recognition result.
In one embodiment, the second correction module 770 is further configured to determine shooting scene information of the image according to the time information and the position information, wherein the shooting scene information includes weather information; acquiring a scene result corresponding to the weather information; and correcting the initial scene recognition result according to the scene result to obtain a corrected final scene recognition result.
In one embodiment, the image processing apparatus 700 is further configured to match, in advance, corresponding scene types and the weights corresponding to the scene types for different time information and position information.
In an embodiment, the second correcting module 770 is further configured to calculate a confidence level of the initial result of scene recognition according to the scene type corresponding to the time information and the position information and the weight value corresponding to the scene type; and taking the scene recognition initial result with the confidence coefficient exceeding a preset threshold value as a scene recognition final result.
The division of the modules in the image processing apparatus is only for illustration, and in other embodiments, the image processing apparatus may be divided into different modules as needed to complete all or part of the functions of the image processing apparatus.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, implements the steps of the image processing method provided by the above embodiments.
In one embodiment, an electronic device is provided, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the steps of the image processing method provided in the above embodiments are implemented.
The embodiments of the present application also provide a computer program product, which when run on a computer, causes the computer to execute the steps of the image processing method provided in the foregoing embodiments.
The embodiments of the present application also provide an electronic device. As shown in fig. 9, for convenience of explanation, only the parts related to the embodiments of the present application are shown; for technical details not disclosed here, please refer to the method part of the embodiments. The electronic device may be any terminal device, including a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), a POS (Point of Sales) terminal, a vehicle-mounted computer, a wearable device, and the like; a mobile phone is taken as the example:
fig. 9 is a block diagram of a partial structure of a mobile phone related to an electronic device provided in an embodiment of the present application. Referring to fig. 9, the handset includes: radio Frequency (RF) circuit 910, memory 920, input unit 930, display unit 940, sensor 950, audio circuit 990, wireless fidelity (WiFi) module 970, processor 980, and power supply 990. Those skilled in the art will appreciate that the handset configuration shown in fig. 9 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
The RF circuit 910 may be used to receive and transmit signals during information transmission or calls; it may receive downlink information from a base station and deliver it to the processor 980 for processing, and may also transmit uplink data to the base station. Typically, the RF circuit includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 910 may communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Messaging Service (SMS), and the like.
The memory 920 may be used to store software programs and modules, and the processor 980 may execute various functional applications and data processing of the mobile phone by operating the software programs and modules stored in the memory 920. The memory 920 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function (such as an application program for a sound playing function, an application program for an image playing function, and the like), and the like; the data storage area may store data (such as audio data, an address book, etc.) created according to the use of the mobile phone, and the like. Further, the memory 920 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The input unit 930 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the cellular phone 900. Specifically, the input unit 930 may include a touch panel 931 and other input devices 932. The touch panel 931, which may also be referred to as a touch screen, may collect a touch operation performed by a user on or near the touch panel 931 (e.g., a user operating the touch panel 931 or near the touch panel 931 by using a finger, a stylus, or any other suitable object or accessory), and drive the corresponding connection device according to a preset program. In one embodiment, the touch panel 931 may include two parts of a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 980, and can receive and execute commands sent by the processor 980. In addition, the touch panel 931 may be implemented by various types, such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. The input unit 930 may include other input devices 932 in addition to the touch panel 931. In particular, other input devices 932 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), and the like.
The display unit 940 may be used to display information input by the user or information provided to the user and various menus of the mobile phone. The display unit 940 may include a display panel 941. In one embodiment, the Display panel 941 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. In one embodiment, the touch panel 931 may overlay the display panel 941, and when the touch panel 931 detects a touch operation thereon or nearby, the touch operation is transmitted to the processor 980 to determine the type of touch event, and then the processor 980 provides a corresponding visual output on the display panel 941 according to the type of touch event. Although in fig. 9, the touch panel 931 and the display panel 941 are two independent components to implement the input and output functions of the mobile phone, in some embodiments, the touch panel 931 and the display panel 941 may be integrated to implement the input and output functions of the mobile phone.
The cell phone 900 may also include at least one sensor 950, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor, which adjusts the brightness of the display panel 941 according to the ambient light, and a proximity sensor, which turns off the display panel 941 and/or the backlight when the phone is moved to the ear. As one kind of motion sensor, an acceleration sensor can detect the magnitude of acceleration in each direction and, when the phone is stationary, the magnitude and direction of gravity; it can be used in applications that recognize the phone's attitude (such as switching between landscape and portrait) and in vibration-recognition functions (such as a pedometer or tap detection). The phone may also be provided with other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor.
The audio circuit 990, speaker 991, and microphone 992 may provide an audio interface between a user and a cell phone. The audio circuit 990 may convert the received audio data into an electrical signal, transmit the electrical signal to the speaker 991, and convert the electrical signal into an audio signal by the speaker 991 and output the audio signal; on the other hand, the microphone 992 converts the collected sound signal into an electrical signal, which is received by the audio circuit 990 and converted into audio data, and then the audio data is output to the processor 980, and then the audio data is transmitted to another mobile phone through the RF circuit 910, or the audio data is output to the memory 920 for subsequent processing.
WiFi belongs to short-distance wireless transmission technology, and the mobile phone can help a user to receive and send e-mails, browse webpages, access streaming media and the like through the WiFi module 970, and provides wireless broadband Internet access for the user. Although fig. 9 shows WiFi module 970, it is to be understood that it does not belong to the essential components of cell phone 900 and may be omitted as desired.
The processor 980 is a control center of the mobile phone, connects various parts of the entire mobile phone by using various interfaces and lines, and performs various functions of the mobile phone and processes data by operating or executing software programs and/or modules stored in the memory 920 and calling data stored in the memory 920, thereby integrally monitoring the mobile phone. In one embodiment, processor 980 may include one or more processing units. In one embodiment, the processor 980 may integrate an application processor and a modem processor, wherein the application processor primarily handles operating systems, user interfaces, applications, and the like; the modem processor handles primarily wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 980.
The handset 900 also includes a power supply 990 (e.g., a battery) for supplying power to various components, which may preferably be logically connected to the processor 980 via a power management system, such that the power management system may be used to manage charging, discharging, and power consumption.
In one embodiment, the cell phone 900 may also include a camera, a bluetooth module, and the like.
Any reference to memory, storage, a database, or another medium used herein may include non-volatile and/or volatile memory. Suitable non-volatile memory can include Read-Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM), which acts as external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
The above embodiments express only several implementations of the present application, and while their description is specific and detailed, they should not be construed as limiting the scope of the application. Several variations and improvements can be made by those of ordinary skill in the art without departing from the concept of the present application, and all of these fall within the protection scope of the application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. An image processing method, comprising:
carrying out scene recognition on the image to obtain a scene recognition initial result;
acquiring time information when the image is shot;
and correcting the initial scene recognition result according to the time information to obtain a corrected final scene recognition result.
2. The method according to claim 1, wherein after acquiring the time information when the image is shot, the method comprises:
acquiring position information when the image is shot;
and correcting the initial scene recognition result according to the time information and the position information to obtain a corrected final scene recognition result.
3. The method of claim 2, wherein correcting the initial scene recognition result according to the time information and the position information to obtain a final corrected scene recognition result comprises:
determining shooting scene information of the image according to the time information and the position information, wherein the shooting scene information comprises season information;
acquiring a scene result corresponding to the season information;
and correcting the initial scene recognition result according to the scene result to obtain a corrected final scene recognition result.
4. The method of claim 2, wherein correcting the initial scene recognition result according to the time information and the position information to obtain a final corrected scene recognition result comprises:
determining shooting scene information of the image according to the time information and the position information, wherein the shooting scene information comprises weather information;
acquiring a scene result corresponding to the weather information;
and correcting the initial scene recognition result according to the scene result to obtain a corrected final scene recognition result.
5. The method of claim 1, wherein the initial result of scene recognition comprises an initial class of scene recognition and a confidence corresponding to the initial class of scene recognition.
6. The method of claim 2, further comprising:
matching corresponding scene types and weights corresponding to the scene types for different time information and position information in advance.
7. The method according to claim 6, wherein the correcting the initial scene recognition result according to the time information and the position information to obtain a final corrected scene recognition result comprises:
calculating confidence coefficient of the scene recognition initial result according to the scene type corresponding to the time information and the position information and the weight corresponding to the scene type;
and taking the scene recognition initial result with the confidence coefficient exceeding a preset threshold value as a scene recognition final result.
8. An image processing apparatus, comprising:
a scene recognition module configured to perform scene recognition on an image to obtain an initial scene recognition result;
a time determining module configured to acquire time information indicating when the image was captured; and
a first correction module configured to correct the initial scene recognition result according to the time information to obtain a corrected final scene recognition result.
9. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the image processing method according to any one of claims 1 to 7.
10. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the image processing method according to any one of claims 1 to 7.
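
(Illustrative note, not part of the disclosure.) Claims 1 to 4 above describe correcting an initial scene recognition result against shooting scene information, such as season information, derived from the time and position of capture. A minimal Python sketch of that idea follows; the season rule, the scene table, and the penalty factor are all assumptions made for illustration, and none of the names come from the patent.

from dataclasses import dataclass
from datetime import datetime

@dataclass
class SceneResult:
    category: str      # e.g. "snow", "beach", "night"
    confidence: float  # in [0.0, 1.0]

# Hypothetical mapping from a season to scene categories that remain
# plausible in it; the patent describes such a correspondence abstractly.
SEASON_SCENES = {
    "winter": {"snow", "indoor", "night"},
    "summer": {"beach", "green_plants", "sunset"},
}

def season_of(shot_time: datetime) -> str:
    # Simplified northern-hemisphere rule; per claim 2, the position
    # information could refine this (e.g. pick the hemisphere).
    if shot_time.month in (6, 7, 8):
        return "summer"
    if shot_time.month in (12, 1, 2):
        return "winter"
    return "other"

def correct_initial_result(initial: SceneResult, shot_time: datetime) -> SceneResult:
    # Claims 1 and 3: if the recognized category is implausible for the
    # season derived from the shooting time, lower its confidence.
    plausible = SEASON_SCENES.get(season_of(shot_time))
    if plausible is not None and initial.category not in plausible:
        return SceneResult(initial.category, initial.confidence * 0.5)
    return initial

# A "snow" result shot in July is down-weighted to confidence 0.45.
print(correct_initial_result(SceneResult("snow", 0.9), datetime(2018, 7, 15)))

Claim 4 would replace the season lookup with an analogous table keyed on weather information obtained for the shooting time and position.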
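
(Illustrative note, not part of the disclosure.) Claims 6 and 7 pre-match scene types and weights to time and position information, then keep only corrected results whose confidence exceeds a preset threshold. The sketch below, under the same caveats, reuses the SceneResult type defined above; the bucket keys, the weights, and the 0.7 threshold are invented for the example.

from typing import Optional

# Claim 6: weights pre-matched to (time, position) combinations.
SCENE_WEIGHTS = {
    ("night", "urban"): {"night": 1.2, "indoor": 1.1, "beach": 0.6},
    ("day", "coastal"): {"beach": 1.3, "sunset": 0.8},
}

THRESHOLD = 0.7  # the "preset threshold" of claim 7; value assumed

def final_result(initial: SceneResult, time_bucket: str,
                 position_bucket: str) -> Optional[SceneResult]:
    # Claim 7: scale the initial confidence by the pre-matched weight and
    # keep the result only if the corrected confidence exceeds the threshold.
    weights = SCENE_WEIGHTS.get((time_bucket, position_bucket), {})
    corrected = min(initial.confidence * weights.get(initial.category, 1.0), 1.0)
    if corrected > THRESHOLD:
        return SceneResult(initial.category, corrected)
    return None  # confidence too low; no final result is emitted

# A "beach" guess at night in a city fails the check (0.8 * 0.6 = 0.48):
print(final_result(SceneResult("beach", 0.8), "night", "urban"))  # None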
CN201810629566.9A 2018-06-19 2018-06-19 Image processing method and device, storage medium and electronic equipment Expired - Fee Related CN110619251B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810629566.9A CN110619251B (en) 2018-06-19 2018-06-19 Image processing method and device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN110619251A true CN110619251A (en) 2019-12-27
CN110619251B CN110619251B (en) 2022-06-10

Family

ID=68920061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810629566.9A Expired - Fee Related CN110619251B (en) 2018-06-19 2018-06-19 Image processing method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN110619251B (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103563351A (en) * 2011-05-31 2014-02-05 富士胶片株式会社 Image-capturing device and program
US20160142626A1 (en) * 2014-11-17 2016-05-19 International Business Machines Corporation Location aware photograph recommendation notification
CN104660908A (en) * 2015-03-09 2015-05-27 深圳市中兴移动通信有限公司 Shooting device and automatic matching method of shooting modes thereof
CN104820718A (en) * 2015-05-22 2015-08-05 哈尔滨工业大学 Image classification and searching method based on geographic position characteristics and overall situation vision characteristics
CN105138700A (en) * 2015-09-28 2015-12-09 北京橙鑫数据科技有限公司 Background picture display method and device
CN105302872A (en) * 2015-09-30 2016-02-03 努比亚技术有限公司 Image processing device and method
CN105872351A (en) * 2015-12-08 2016-08-17 乐视移动智能信息技术(北京)有限公司 Method and device for shooting picture in backlight scene
CN107622281A (en) * 2017-09-20 2018-01-23 广东欧珀移动通信有限公司 Image classification method, device, storage medium and mobile terminal
CN107566728A (en) * 2017-09-25 2018-01-09 维沃移动通信有限公司 A kind of image pickup method, mobile terminal and computer-readable recording medium
CN108156435A (en) * 2017-12-25 2018-06-12 广东欧珀移动通信有限公司 Image processing method and device, computer readable storage medium and computer equipment

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111460987A (en) * 2020-03-31 2020-07-28 北京奇艺世纪科技有限公司 Scene recognition and correction model training method and device
CN111400621A (en) * 2020-04-06 2020-07-10 中信银行股份有限公司 Method and device for verifying authenticity of position information and electronic equipment
CN111400621B (en) * 2020-04-06 2023-06-06 中信银行股份有限公司 Position information authenticity verification method and device and electronic equipment

Also Published As

Publication number Publication date
CN110619251B (en) 2022-06-10

Similar Documents

Publication Publication Date Title
CN107635101B (en) Shooting method, shooting device, storage medium and electronic equipment
CN110147705B (en) Vehicle positioning method based on visual perception and electronic equipment
CN107959883B (en) Video editing and pushing method and system and intelligent mobile terminal
CN107038681B (en) Image blurring method and device, computer readable storage medium and computer device
CN108537122B (en) Image fusion acquisition system containing meteorological parameters and image storage method
CN107679559B (en) Image processing method, image processing device, computer-readable storage medium and mobile terminal
CN107948748B (en) Method, device, mobile terminal and computer storage medium for recommending videos
CN109002787B (en) Image processing method and device, storage medium and electronic equipment
CN107784271B (en) Fingerprint identification method and related product
CN108229574B (en) Picture screening method and device and mobile terminal
CN109086761B (en) Image processing method and device, storage medium and electronic equipment
CN111612093A (en) Video classification method, video classification device, electronic equipment and storage medium
CN113192470B (en) Screen adjusting method and device, storage medium and electronic equipment
CN107451454B (en) Unlocking control method and related product
CN114936330A (en) Method and related device for pushing information in vehicle driving scene
CN110619251B (en) Image processing method and device, storage medium and electronic equipment
CN107995422A (en) Image capturing method and device, computer equipment, computer-readable recording medium
CN107613550B (en) Unlocking control method and related product
CN110830706A (en) Image processing method and device, storage medium and electronic equipment
CN107729889A (en) Image processing method and device, electronic equipment, computer-readable recording medium
CN111737520B (en) Video classification method, video classification device, electronic equipment and storage medium
CN107707824A (en) Image pickup method, device, storage medium and electronic equipment
CN112802030A (en) Image processing method, device and storage medium
CN111383198B (en) Image processing method and related product
CN113421211B (en) Method for blurring light spots, terminal equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220610