CN107480684B - Image processing method and device

Image processing method and device

Info

Publication number: CN107480684B
Authority: CN (China)
Prior art keywords: image, input matrix, training, obtaining, training set
Legal status: Active (granted)
Application number: CN201710738880.6A
Filing date: 2017-08-24
Other languages: Chinese (zh)
Other versions: CN107480684A
Inventors: 杨阳, 黄秀, 杨子豪, 沈复民, 谢宁, 申恒涛
Current Assignee: Chengdu Aohaichuan Technology Co ltd
Original Assignee: Chengdu Aohaichuan Technology Co ltd
Publication of CN107480684A: 2017-12-15
Publication of CN107480684B (grant): 2020-06-05

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering


Abstract

An embodiment of the invention provides an image processing method and apparatus, relating to the field of image processing. The method comprises: acquiring an image from a user's microblog; obtaining an image input matrix corresponding to the image; obtaining a total input matrix based on the image input matrix and a preset training set input matrix; and obtaining the topic distribution in the image based on the total input matrix. By processing the image to obtain its topic distribution and then extracting the user's attributes, the method achieves high efficiency, high accuracy, and strong practicability.

Description

Image processing method and device
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image processing method and apparatus.
Background
In the prior art, approaches such as the Poisson Gamma Belief Network (PGBN) can obtain a user's attributes only by processing text content. Such approaches cannot be applied directly in a large-scale social media environment, where they are inefficient and inaccurate.
Disclosure of Invention
It is an object of the present invention to provide an image processing method and apparatus that address the above problems. To this end, the invention adopts the following technical solutions:
In a first aspect, an embodiment of the present invention provides an image processing method. The method includes: acquiring an image from a user's microblog; obtaining an image input matrix corresponding to the image; obtaining a total input matrix based on the image input matrix and a preset training set input matrix; and obtaining the topic distribution in the image based on the total input matrix.
In a second aspect, an embodiment of the present invention provides an image processing apparatus, which includes a first acquisition unit, a second acquisition unit, and a third acquisition unit. The first acquisition unit is configured to acquire an image from a user's microblog. The second acquisition unit is configured to obtain an image input matrix corresponding to the image. The third acquisition unit is configured to obtain a total input matrix based on the image input matrix and a preset training set input matrix, and to obtain the topic distribution in the image based on the total input matrix.
According to the image processing method and apparatus provided by the embodiments of the present invention, an image is acquired from a user's microblog; an image input matrix corresponding to the image is obtained; a total input matrix is obtained based on the image input matrix and a preset training set input matrix; and the topic distribution in the image is obtained based on the total input matrix. By processing the image to obtain its topic distribution and then extracting the user's attributes, high efficiency, high accuracy, and strong practicability are achieved.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be regarded as limiting the scope; those skilled in the art can derive other related drawings from them without inventive effort.
Fig. 1 is a block diagram of an electronic device according to an embodiment of the present invention;
Fig. 2 is a flowchart of an image processing method according to an embodiment of the present invention;
Fig. 3 is a block diagram of an image processing apparatus according to an embodiment of the present invention;
Fig. 4 is a block diagram of another image processing apparatus according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings. Evidently, the described embodiments are some, but not all, embodiments of the present invention. The components of the embodiments, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of configurations. Thus, the following detailed description of the embodiments is not intended to limit the claimed scope of the invention but merely represents selected embodiments. All other embodiments derived by a person skilled in the art from the given embodiments without creative effort shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Fig. 1 shows a block diagram of an electronic device 100 applicable to an embodiment of the present invention. As shown in FIG. 1, electronic device 100 may include a memory 102, a memory controller 104, one or more processors 106 (only one shown in FIG. 1), a peripherals interface 108, an input-output module 110, an audio module 112, a display module 114, a radio frequency module 116, and an image processing apparatus.
The memory 102, the memory controller 104, the processor 106, the peripheral interface 108, the input/output module 110, the audio module 112, the display module 114, and the radio frequency module 116 are electrically connected to one another, directly or indirectly, to enable data transmission and interaction. For example, these components may be electrically connected through one or more communication or signal buses. The image processing apparatus includes at least one software functional module that may be stored in the memory 102 in the form of software or firmware, for example as a software functional module or a computer program of the image processing apparatus.
The memory 102 may store various software programs and modules, such as program instructions/modules corresponding to the image processing method and apparatus provided in the embodiments of the present application. The processor 106 executes various functional applications and data processing by executing software programs and modules stored in the memory 102, that is, implements the image processing method in the embodiment of the present application.
The memory 102 may include, but is not limited to, Random Access Memory (RAM), Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 106 may be an integrated circuit chip with signal processing capability. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), or other programmable logic device, discrete gate or transistor logic, or discrete hardware components, and may implement or perform the methods, steps, and logical blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The peripherals interface 108 couples various input/output devices to the processor 106 and the memory 102. In some embodiments, the peripheral interface 108, the processor 106, and the memory controller 104 may be implemented in a single chip. In other embodiments, they may each be implemented by a separate chip.
The input/output module 110 receives data input by the user, enabling the user to interact with the electronic device 100. The input/output module 110 may be, but is not limited to, a mouse, a keyboard, and the like.
Audio module 112 provides an audio interface to a user that may include one or more microphones, one or more speakers, and audio circuitry.
The display module 114 provides an interactive interface (e.g., a user interface) between the electronic device 100 and the user, or displays image data for the user's reference. In this embodiment, the display module 114 may be a liquid crystal display or a touch display. A touch display may be a capacitive or resistive touch screen supporting single-point and multi-point touch operations; that is, the touch display can sense touch operations at one or more locations on its surface simultaneously and send the sensed touch operations to the processor 106 for calculation and processing.
The radio frequency module 116 receives and transmits electromagnetic waves and performs interconversion between electromagnetic waves and electrical signals, so as to communicate with a communication network or other devices.
It will be appreciated that the configuration shown in FIG. 1 is merely illustrative and that electronic device 100 may include more or fewer components than shown in FIG. 1 or have a different configuration than shown in FIG. 1. The components shown in fig. 1 may be implemented in hardware, software, or a combination thereof.
In the embodiment of the invention, the electronic device 100 may be a user terminal or a server. The user terminal may be a PC (personal computer), a tablet computer, a mobile phone, a notebook computer, a smart television, a set-top box, a vehicle-mounted terminal, or another terminal device.
Referring to Fig. 2, an embodiment of the present invention provides an image processing method, including step S200, step S210, and step S220.
Step S200: acquiring an image from the user's microblog.
The image in the user's microblog may be acquired from the Sina Weibo (Xinlang microblog) platform.
Step S210: obtaining an image input matrix corresponding to the image.
Based on step S210, further, SIFT feature extraction is performed on the image to obtain a first feature vector corresponding to the image, and an image input matrix corresponding to the image is obtained based on the first feature vector.
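The patent itself gives no code; the following Python sketch shows one way step S210 could look, assuming OpenCV is available and the K-means cluster centers (`centers`) from the training stage described below are already computed. The function name and array layout are illustrative only, not part of the claimed method.

```python
import cv2
import numpy as np

def image_input_vector(image_path: str, centers: np.ndarray) -> np.ndarray:
    """Quantize an image's SIFT descriptors against K cluster centers and
    return a K-dimensional count histogram (one column of the image input
    matrix)."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    _, descriptors = sift.detectAndCompute(gray, None)
    K = centers.shape[0]
    hist = np.zeros(K, dtype=np.int64)
    if descriptors is None:  # image with no detectable keypoints
        return hist
    # Assign each descriptor to its nearest cluster center (feature class).
    dists = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
    for k in np.argmin(dists, axis=1):
        hist[k] += 1
    return hist
```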
Step S220: obtaining a total input matrix based on the image input matrix and a preset training set input matrix, and obtaining the topic distribution in the image based on the total input matrix.
Based on step S220, further, the image input matrix is spliced with a preset training set input matrix to obtain the total input matrix.
Prior to step S220, the method may further include:
acquiring training images from a plurality of microblogs, and obtaining the training set input matrix based on the training images.
Further, SIFT feature extraction is performed on each training image to obtain a second feature vector corresponding to each training image; based on a preset clustering algorithm and the second feature vector corresponding to each training image, the cluster center of each class and the image features contained in each class are obtained; and the number of image features of each class contained in each training image is counted to obtain a training set input matrix corresponding to the plurality of training images.
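As a sketch of this training pipeline only, the snippet below uses scikit-learn's K-means. The name `descriptor_lists` and the matrix orientation (feature classes as rows, images as columns, matching X_v(i, j) as described later) are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_training_matrix(descriptor_lists, K: int):
    """descriptor_lists: one (n_i x 128) SIFT descriptor array per training
    image.  Returns (centers, X_v), where X_v[i, j] counts how often the
    i-th feature class appears in the j-th image."""
    all_desc = np.vstack(descriptor_lists)             # concatenate all descriptors
    kmeans = KMeans(n_clusters=K, n_init=10).fit(all_desc)
    X_v = np.zeros((K, len(descriptor_lists)), dtype=np.int64)
    for j, desc in enumerate(descriptor_lists):
        for k in kmeans.predict(desc):                 # nearest center per descriptor
            X_v[k, j] += 1
    return kmeans.cluster_centers_, X_v
```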
Further, after obtaining the training set input matrix based on the training image, the method may further include:
setting the maximum number of topics for the bottommost layer of the Poisson gamma belief network;
randomly assigning a topic to each training image based on the training set input matrix, obtaining an initialization matrix and generating an initial value of each probability parameter;
and iteratively training the Poisson gamma belief network based on the initialization matrix and the initial values of the probability parameters to obtain the topic distribution in the training images.
Specifically, after the image input matrix and the preset training set input matrix are spliced along the image-index dimension, a spliced matrix is obtained; the spliced matrix is then partitioned along that dimension and summed per user to obtain the total input matrix. In the total input matrix, each column represents a user, aggregating that user's images.
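A minimal NumPy sketch of this splicing-and-summing step follows; the `user_of_column` mapping from spliced columns to user identifiers is an assumed input, since the patent does not specify how images are indexed to users.

```python
import numpy as np

def total_input_matrix(X_train, X_test, user_of_column):
    """Splice the training-set and test count matrices along the image-index
    dimension (columns), then sum the columns belonging to the same user."""
    X = np.concatenate([X_train, X_test], axis=1)      # columns are images
    users = sorted(set(user_of_column))
    X_total = np.zeros((X.shape[0], len(users)), dtype=X.dtype)
    for col, u in enumerate(user_of_column):
        X_total[:, users.index(u)] += X[:, col]        # one column per user
    return X_total, users
```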
Specifically, more than 1000 training images are obtained from microblogs on Sina Weibo (the Xinlang microblog) and used as the training set. SIFT features are extracted from all training images in turn, yielding a plurality of second feature vectors, which are concatenated; the preset clustering algorithm, the K-means algorithm, then partitions the second feature vectors into K clusters, giving the cluster center of each class and the image features contained in each class. The number of image features of each class contained in each training image is counted to obtain the training set input matrix X_v, where X_v(i, j) is the frequency with which the i-th feature class appears in the j-th image.
The maximum number of topics K_0max for the bottommost layer of the Poisson gamma belief network is set; this determines an upper limit on the number of topics extracted by the first layer (the number of topics decreases from the lowest layer to the highest layer, i.e., higher-layer topics are more general). In the training set input matrix X_v, each occurrence of a feature class in each image is randomly assigned one of the K_0max topics, yielding the initialized matrices: the assignment counts Z(i, j, k), the frequency (number of occurrences) with which the i-th feature class in the j-th image of each microblog is assigned to the k-th lowest-layer topic; the bottom-layer topic-feature-class matrix Φ^(1), where Φ^(1)(i, k) is the proportion of the i-th feature class under the k-th topic, considering all images; and the proportion matrix θ^(1) of the topics contained in each image of each microblog, whose j-th column gives the proportion of each topic in the j-th image. Initial values of the probability parameters are also generated (these have no practical significance and only participate in the computation). In the following process the meaning of each matrix does not change, but its values do.
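Under the shapes assumed above (Z as a class x image x topic count array), the random initialization might be sketched as follows; the normalizations used for Φ^(1) and θ^(1) are one plausible reading of "proportion", not a definitive implementation.

```python
import numpy as np

def initialize_pgbn(X_v: np.ndarray, K0_max: int, rng=np.random.default_rng()):
    """Randomly assign each feature-class occurrence in each image to one of
    K0_max bottom-layer topics; derive initial topic-feature-class
    proportions Phi1 and per-image topic proportions Theta1 from counts."""
    n_classes, n_images = X_v.shape
    Z = np.zeros((n_classes, n_images, K0_max), dtype=np.int64)
    for i in range(n_classes):
        for j in range(n_images):
            for _ in range(X_v[i, j]):                 # one draw per occurrence
                Z[i, j, rng.integers(K0_max)] += 1
    Phi1 = Z.sum(axis=1).astype(float)                 # (n_classes, K0_max)
    Phi1 /= Phi1.sum(axis=0, keepdims=True) + 1e-12    # class share per topic
    Theta1 = Z.sum(axis=0).T.astype(float)             # (K0_max, n_images)
    Theta1 /= Theta1.sum(axis=0, keepdims=True) + 1e-12
    return Z, Phi1, Theta1
```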
Let the current outer layer be T. Proceeding from the bottom layer to the top layer, a certain number of iterations (B_T + C_T) is executed for each T, training all layers at or below it (t ≤ T) in two steps per iteration. First, layer by layer from the bottommost layer up to the current outer layer T, each layer samples the values of a portion of its matrices. Let the internal current layer (running from the bottommost layer to the T-th layer) be t. When t = 1, Gibbs sampling is first used to resample the topic assignment of each feature-class occurrence; after a certain number of sampling rounds, aggregation yields a stable (no longer changing with further sampling) topic-feature-class matrix Z_v^(1), where Z_v^(1)(k, i) is the frequency with which the i-th feature class is assigned to the k-th topic, and a topic-image matrix Z_D^(1), where Z_D^(1)(k, j) is the number of feature-class occurrences contained in the j-th image and assigned to the k-th topic.
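The patent does not spell out PGBN's layer-1 conditional distribution. The sketch below only shows the general shape of one Gibbs sweep, using a simple stand-in conditional proportional to Φ^(1)(i, k)·θ^(1)(k, j) — an assumption, not the actual PGBN sampler.

```python
import numpy as np

def gibbs_sweep_layer1(Z, Phi1, Theta1, rng=np.random.default_rng()):
    """One Gibbs sweep at layer 1: reassign every feature-class occurrence to
    a topic, then aggregate the topic-feature-class matrix Z_v1 and the
    topic-image matrix Z_D1 from the counts."""
    n_classes, n_images, K = Z.shape
    for i in range(n_classes):
        for j in range(n_images):
            n = Z[i, j].sum()
            if n == 0:
                continue
            p = Phi1[i] * Theta1[:, j]                 # stand-in conditional
            p = p / p.sum() if p.sum() > 0 else np.full(K, 1.0 / K)
            Z[i, j] = rng.multinomial(n, p)            # resample all occurrences
    Z_v1 = Z.sum(axis=1).T                             # (K, n_classes)
    Z_D1 = Z.sum(axis=0).T                             # (K, n_images)
    return Z, Z_v1, Z_D1
```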
When t ≥ 2, the number of topics K_t of the current layer is first initialized to the number of topics K_{t-1} of the previous layer. Then, following the Gibbs sampling principle, the stable frequencies with which the feature classes contained under each previous-layer topic in each microblog image are assigned to the current-layer topics are sampled (here a previous-layer topic can be regarded as a feature class under the current-layer topics), yielding an image-current-layer-topic matrix Z_D^(t), where Z_D^(t)(i, j) is the number of features of the j-th image assigned to the i-th topic of the current layer; and the proportion of each previous-layer topic's feature class under each current-layer topic is sampled, yielding Φ^(t), where Φ^(t)(i, k) is the proportion of the feature class of the i-th previous-layer topic under the k-th current-layer topic, considering all images.
Second, the probability parameters are sampled and computed layer by layer. From the current outer layer T down to the bottommost layer, each layer samples the values of its remaining matrices. First, Z_v^(T) is used to sample the weight vector r^(T) of the feature classes contained in each topic of the current outer layer T (the larger a topic's weight, the heavier its proportion). Then r^(T) (when t = T), or θ^(t+1) and Φ^(t+1) (when t < T), serve as the probability-generation parameters and sampling basis from which the lower layer's θ^(t) is sampled. Note that, from a certain layer upward, θ and the associated probability parameters of all layers become image-specific (θ is sampled per image, with the matrix obtained by splicing along the topic dimension serving as the probability-parameter sample; the associated common probability parameters are obtained by sampling from θ). Hence, when sampling these layers, the matrix obtained by multiplying the higher layers' θ and Φ and splicing the results is the common probability parameter used for sampling. When the number of iterations reaches the threshold B_T, the inactive topics of the current layer are removed (i.e., topics that no longer contain any feature class of the lower-layer topics), pruning the current layer's topic count K_t. When the iterative sampling of all layers is finished, training is complete, and the distribution of all microblog image feature classes in the training images under the topics of each layer is obtained.
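The pruning of inactive topics at threshold B_T can be sketched as follows; the shapes are assumptions carried over from above (Z_D_t is K_t x images, Phi_t is previous-layer-topics x K_t).

```python
import numpy as np

def prune_inactive_topics(Z_D_t: np.ndarray, Phi_t: np.ndarray):
    """Drop current-layer topics to which no lower-layer feature class is
    assigned, shrinking K_t (the rows of the topic-image count matrix)."""
    active = Z_D_t.sum(axis=1) > 0       # topic k is active if it has any count
    return Z_D_t[active], Phi_t[:, active], int(active.sum())
```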
Based on step S220, further: a first network parameter, a second network parameter, and a preset third network parameter of the Poisson gamma belief network are initialized; the image input matrix is taken as the input of the Poisson gamma belief network, sampling proceeds layer by layer from the bottommost layer to the topmost layer of the network, and the first, second, and third network parameters are iteratively updated until a preset number of iterations is reached, yielding the topic distribution in the image.
Further, and specifically: the first network parameter θ, the second network parameter r, and the preset third network parameter Φ of the Poisson gamma belief network are initialized, where the preset third network parameter Φ is the Φ obtained by training on the training images. Over a certain number of iterations (B_T + C_T), all layers are trained in two steps, each iteration executing the following process. Layer by layer from the bottommost layer to the topmost layer, the total input matrix X_v, the Φ^(t) generated from the training set, and the θ^(t) formed by splicing the training set and the test set along the user dimension are used to sample the topic-user matrix of the total data set. The associated probability parameters are generated by sampling from the second layer up to the topmost layer. At the top layer, the top-layer r obtained in training is used as the probability-generation parameter, and θ^(T) is sampled. The tail of θ^(T) (all columns after a certain column, i.e., the columns corresponding to the queried users) is the topic distribution of those users' microblog images. After the iterations finish, the distribution of each topic in the user's microblog images is obtained.
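Reading off the per-user topic distributions from the tail columns of the top-layer θ might look like the sketch below; `n_train_users` marking where the queried users' columns begin is an assumption, since the patent only says "all columns after a certain column".

```python
import numpy as np

def user_topic_distributions(Theta_T: np.ndarray, n_train_users: int):
    """The tail columns of the top-layer Theta (after the training users)
    hold the queried users' topic proportions, normalized per user."""
    tail = Theta_T[:, n_train_users:]
    return tail / (tail.sum(axis=0, keepdims=True) + 1e-12)
```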
According to the image processing method and apparatus provided by the embodiments of the present invention, an image is acquired from a user's microblog; an image input matrix corresponding to the image is obtained; a total input matrix is obtained based on the image input matrix and a preset training set input matrix; and the topic distribution in the image is obtained based on the total input matrix. By processing the image to obtain its topic distribution and then extracting the user's attributes, high efficiency, high accuracy, and strong practicability are achieved.
Referring to fig. 3, an embodiment of the present invention provides an image processing apparatus 300, which may include: a first acquisition unit 320, a second acquisition unit 330, and a third acquisition unit 340.
The first obtaining unit 320 is configured to obtain an image in the microblog of the user.
A second obtaining unit 330, configured to obtain an image input matrix corresponding to the image.
The second acquisition unit 330 may include a second acquisition sub-unit 331.
The second obtaining subunit 331 is configured to perform SIFT feature extraction on the image, obtain a first feature vector corresponding to the image, and obtain an image input matrix corresponding to the image based on the first feature vector.
A third obtaining unit 340, configured to obtain a total input matrix based on the image input matrix and a preset training set input matrix, and obtain the topic distribution in the image based on the total input matrix.
The third acquiring unit 340 may include a third acquiring subunit 341.
The third obtaining subunit 341 is configured to splice the image input matrix with a preset training set input matrix to obtain a total input matrix.
The third obtaining subunit 341 is further configured to initialize a first network parameter, a second network parameter, and a preset third network parameter of the Poisson gamma belief network; and to take the image input matrix as the input of the Poisson gamma belief network, sample layer by layer from the bottommost layer to the topmost layer of the network, and iteratively update the first, second, and third network parameters until a preset number of iterations is reached, obtaining the topic distribution in the image.
Referring to fig. 4, the apparatus 300 may further include: a training unit 310.
The training unit 310 is configured to obtain training images in multiple microblogs, and obtain the training set input matrix based on the training images.
The training unit 310 may comprise a training subunit 311.
A training subunit 311, configured to perform SIFT feature extraction on each training image to obtain a second feature vector corresponding to each training image; obtain, based on a preset clustering algorithm and the second feature vector corresponding to each training image, the cluster center of each class and the image features contained in each class; and count the number of image features of each class contained in each training image to obtain a training set input matrix corresponding to the plurality of training images.
The training unit 310 is further configured to set the maximum number of topics for the bottommost layer of the Poisson gamma belief network; randomly assign a topic to each training image based on the training set input matrix, obtain an initialization matrix and generate an initial value of each probability parameter; and iteratively train the Poisson gamma belief network based on the initialization matrix and the initial values of the probability parameters to obtain the topic distribution in the training images.
The above units may be implemented by software codes, and in this case, the above units may be stored in the memory 102. The above units may also be implemented by hardware, for example, an integrated circuit chip.
The image processing apparatus 300 according to the embodiment of the present invention has the same implementation principle and technical effect as the foregoing method embodiments, and for the sake of brief description, reference may be made to the corresponding contents in the foregoing method embodiments for parts of the embodiment without reference.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
The above description is only a preferred embodiment of the present invention and is not intended to limit it; various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in its protection scope.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (4)

1. An image processing method applied to an electronic device, the method comprising:
acquiring an image in a microblog of a user;
obtaining an image input matrix corresponding to the image;
obtaining a total input matrix based on the image input matrix and a preset training set input matrix, and obtaining a topic distribution in the image based on the total input matrix;
obtaining an image input matrix corresponding to the image, including:
performing SIFT feature extraction on the image to obtain a first feature vector corresponding to the image and obtain an image input matrix corresponding to the image based on the first feature vector;
obtaining a total input matrix based on the image input matrix and a preset training set input matrix, including:
splicing the image input matrix with a preset training set input matrix to obtain a total input matrix;
before the image input matrix is spliced with a preset training set input matrix to obtain a total input matrix, the method further comprises the following steps:
acquiring training images in a plurality of microblogs, and acquiring the training set input matrix based on the training images;
obtaining the training set input matrix based on the training image, including:
performing SIFT feature extraction on each training image to obtain a second feature vector corresponding to each training image;
acquiring the cluster center of each class and the image features contained in each class based on a preset clustering algorithm and the second feature vector corresponding to each training image;
and counting the number of image features of each class contained in each training image to obtain a training set input matrix corresponding to the plurality of training images.
2. The method of claim 1, wherein after obtaining the training set input matrix based on the training image, the method further comprises:
setting the maximum number of topics for the bottommost layer of the Poisson gamma belief network;
randomly assigning a topic to each training image based on the training set input matrix, obtaining an initialization matrix and generating an initial value of each probability parameter;
and iteratively training the Poisson gamma belief network based on the initialization matrix and the initial values of the probability parameters to obtain the topic distribution in the training images.
3. The method of claim 1, wherein obtaining the topic distribution in the image based on the image input matrix comprises:
initializing a first network parameter, a second network parameter and a preset third network parameter of the Poisson gamma belief network;
and taking the image input matrix as the input of the Poisson gamma belief network, sampling layer by layer from the bottommost layer to the topmost layer of the Poisson gamma belief network, and iteratively updating the first network parameter, the second network parameter and the third network parameter until a preset number of iterations is reached, to obtain the topic distribution in the image.
4. An image processing apparatus, characterized in that the apparatus comprises:
the first acquisition unit is used for acquiring an image in a microblog of a user;
the second acquisition unit is used for acquiring an image input matrix corresponding to the image;
a third obtaining unit, configured to obtain a total input matrix based on the image input matrix and a preset training set input matrix, and obtain a topic distribution in the image based on the total input matrix;
wherein the second acquisition unit includes:
the second obtaining subunit is used for performing SIFT feature extraction on the image, obtaining a first feature vector corresponding to the image and obtaining an image input matrix corresponding to the image based on the first feature vector;
the third acquisition unit includes:
the third acquisition subunit is used for splicing the image input matrix with a preset training set input matrix to obtain a total input matrix;
the device further comprises: the training unit is used for acquiring training images in a plurality of microblogs and acquiring the training set input matrix based on the training images;
the training unit includes: the training subunit is used for performing sift feature extraction on each training image to obtain a second feature vector corresponding to each training image; acquiring a clustering center of each type and image features contained in each type based on a preset clustering algorithm and a second feature vector corresponding to each training image; and counting the number of the image features contained in each training image to obtain a training set input matrix corresponding to a plurality of training images.
CN201710738880.6A 2017-08-24 2017-08-24 Image processing method and device Active CN107480684B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710738880.6A CN107480684B (en) 2017-08-24 2017-08-24 Image processing method and device


Publications (2)

Publication Number Publication Date
CN107480684A CN107480684A (en) 2017-12-15
CN107480684B true CN107480684B (en) 2020-06-05

Family

ID=60601689

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710738880.6A Active CN107480684B (en) 2017-08-24 2017-08-24 Image processing method and device

Country Status (1)

Country Link
CN (1) CN107480684B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102165454A (en) * 2008-09-29 2011-08-24 皇家飞利浦电子股份有限公司 Method for increasing the robustness of computer-aided diagnosis to image processing uncertainties
CN102360435A (en) * 2011-10-26 2012-02-22 西安电子科技大学 Undesirable image detecting method based on connotative theme analysis
CN106599128A (en) * 2016-12-02 2017-04-26 西安电子科技大学 Deep theme model-based large-scale text classification method


Also Published As

Publication number Publication date
CN107480684A (en) 2017-12-15


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant