CN114821216A - Method for modeling and using picture descreening neural network model and related equipment - Google Patents

Method for modeling and using picture descreening neural network model and related equipment

Info

Publication number
CN114821216A
Authority
CN
China
Prior art keywords
picture
descreening
neural network
network model
convolutions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110127054.4A
Other languages
Chinese (zh)
Inventor
曲华
黄平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd filed Critical China Telecom Corp Ltd
Priority to CN202110127054.4A priority Critical patent/CN114821216A/en
Publication of CN114821216A publication Critical patent/CN114821216A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/25 - Fusion techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

Embodiments of the disclosure provide a picture descreening neural network model modeling method, a using method, an apparatus, a computer-readable storage medium and an electronic device, belonging to the field of computer and communication technologies. The modeling method comprises: acquiring a training set for a picture descreening neural network model; and training the picture descreening neural network model with the training set. The picture descreening neural network model comprises a global branch and a local branch; the global branch comprises a hole space convolution pooling network that samples a given input in parallel with a plurality of hole convolutions having different expansion rates; the local branch comprises dense multi-scale fusion blocks that contain convolutions with different convolution kernels, and hierarchical features extracted from the different branch convolutions are combined and merged. The disclosed modeling method realizes modeling of the picture descreening neural network model.

Description

Method for modeling and using picture descreening neural network model and related equipment
Technical Field
The present disclosure relates to the field of computer and communication technologies, and in particular, to a method and an apparatus for modeling and using a picture descreening neural network model, a computer-readable storage medium, and an electronic device.
Background
A telecommunication operator's service system holds a large number of identification photos and field (live) photos, and under the national real-name management system for telecommunication products, real-name verification is required to confirm that an operator's field photo and identification photo belong to the same person.
However, for reasons of user information security, a certain proportion of the photos in the stored identification photo library are encrypted with random cross-hatched screen lines. The screen lines directly alter facial feature values, so the extracted face features differ: inter-class distances shrink and intra-class distances grow, and applying a face comparison algorithm directly leads to serious misjudgments. Only after the screen lines are removed from a photo and the user's facial information is restored can a face comparison algorithm be used for comparison.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
Embodiments of the disclosure provide a picture descreening neural network model modeling method, a using method, an apparatus, a computer-readable storage medium and an electronic device, which can realize modeling of the picture descreening neural network model.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to one aspect of the present disclosure, a method for modeling a picture descreening neural network model is provided, which includes:
acquiring a training set of a picture descreening neural network model;
training the picture descreening neural network model by using the training set;
wherein the picture descreening neural network model comprises a global branch and a local branch;
the global branch comprises a hole space convolution pooling network that samples a given input in parallel with a plurality of hole convolutions having different expansion rates; and
the local branch comprises dense multi-scale fusion blocks that contain convolutions with different convolution kernels, and hierarchical features extracted from the different branch convolutions are combined and merged.
In one embodiment, the method further comprises:
acquiring an unscreened picture; and
adding an artificial screen pattern to the unscreened picture to obtain a screened picture;
wherein the training set comprises the artificial screen pattern, the unscreened picture and the screened picture.
In one embodiment, sampling a given input in parallel with a plurality of hole convolutions having different expansion rates comprises:
sampling the given input in parallel with 4 hole convolutions whose expansion rates are 1, 6, 12 and 18, respectively.
In one embodiment, before the parallel sampling, the expansion rate of the convolution in the net block preceding the 4 hole convolutions is 2.
In one embodiment, the number of dense multi-scale fusion blocks in the local branch is 3, 4 or 5.
In one embodiment, the method further comprises:
optimizing the parameters of the picture descreening neural network model with a new loss function;
wherein the new loss function comprises a pixel-level loss function and a feature-level loss function;
wherein the pixel-level loss function is:

loss_pixel = Σ_l [1e3 / (C_l · N_{Φ_l(I_gt)})] · ||Mean_error ☉ (Φ_l(I_gt) − Φ_l(I_output))||_1

where Mean_error is the average pixel error; Φ_l(I_gt) is the activation feature map generated at layer l of the picture descreening neural network model by the unscreened picture I_gt; Φ_l(I_output) is the activation feature map generated at layer l after the screened picture passes through the model, I_output being the output picture; N_{Φ_l(I_gt)} is the number of elements of Φ_l(I_gt); ☉ is the element-wise product operator; C_l is the number of channels of the feature map output at layer l of the model; and 1e3 represents the number 1000;
wherein the feature-level loss function is:

loss_feature = −(1/m) Σ_{i=1..m} log( e^{s·cos(θ_{y_i}+n)} / (e^{s·cos(θ_{y_i}+n)} + Σ_{j≠y_i} e^{s·cos(θ_j)}) )

where m is the number of picture feature values; e is the natural constant; s is the feature value scaling ratio; θ is the angle between the feature vector and the model weight; y_i denotes the real label; n denotes the angular margin, typically 0.5; and i and j are indices.
According to an aspect of the present disclosure, there is provided a method for descreening a picture, comprising:
acquiring a screened picture;
removing the screen lines from the screened picture by using a picture descreening neural network model;
wherein the picture descreening neural network model comprises a global branch and a local branch;
the global branch comprises a hole space convolution pooling network that samples a given input in parallel with a plurality of hole convolutions having different expansion rates; and
the local branch comprises dense multi-scale fusion blocks that contain convolutions with different convolution kernels, and hierarchical features extracted from the different branch convolutions are combined and merged.
According to an aspect of the present disclosure, there is provided an apparatus for descreening a picture, comprising:
an acquisition module configured to acquire a screened picture;
a descreening module configured to remove the screen lines from the screened picture by using a picture descreening neural network model;
wherein the picture descreening neural network model comprises a global branch and a local branch;
the global branch comprises a hole space convolution pooling network that samples a given input in parallel with a plurality of hole convolutions having different expansion rates; and
the local branch comprises dense multi-scale fusion blocks that contain convolutions with different convolution kernels, and hierarchical features extracted from the different branch convolutions are combined and merged.
According to an aspect of the present disclosure, there is provided an electronic device including:
one or more processors;
a storage device configured to store one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method of any of the above embodiments.
According to an aspect of the present disclosure, there is provided a computer readable storage medium storing a computer program which, when executed by a processor, implements the method of any one of the above embodiments.
In the technical scheme provided by some embodiments of the disclosure, the modeling of the picture descreening neural network model can be realized.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The following figures depict certain illustrative embodiments of the invention in which like reference numerals refer to like elements. These described embodiments are to be considered as exemplary embodiments of the disclosure and not limiting in any way.
FIG. 1 illustrates a schematic diagram of an exemplary system architecture to which the picture descreening neural network model modeling method and method of use of embodiments of the present disclosure may be applied;
FIG. 2 illustrates a schematic structural diagram of a computer system suitable for use with the electronic device implementing embodiments of the present disclosure;
FIG. 3 schematically illustrates a flow diagram of a picture descreening neural network model modeling method, according to an embodiment of the present disclosure;
FIG. 4 illustrates a flow of a method for obtaining a training set of a picture descreening neural network model according to an embodiment of the present disclosure;
FIG. 5 illustrates a flow of a method for obtaining a screened picture used in the training set according to an embodiment of the present disclosure;
FIG. 6 is a diagram illustrating the structure and usage flow of a picture descreening neural network model according to an embodiment of the present disclosure;
FIG. 7 shows a schematic structural diagram of a dense multi-scale fusion block according to an embodiment of the present disclosure;
FIG. 8 illustrates the before-and-after effect of removing the screen lines from a screened picture with the picture descreening neural network model according to an embodiment of the present disclosure;
FIG. 9 illustrates a flow diagram for training and using a picture descreening neural network model according to an embodiment of the present disclosure;
FIG. 10 schematically illustrates a block diagram of a picture descreening device, according to an embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the disclosure.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
FIG. 1 illustrates a schematic diagram of an exemplary system architecture 100 to which the picture descreening neural network model modeling methods and methods of use of embodiments of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include one or more of terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. For example, server 105 may be a server cluster comprised of multiple servers, or the like.
The staff member may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may be various electronic devices having display screens including, but not limited to, smart phones, tablets, portable and desktop computers, digital cinema projectors, and the like.
The server 105 may be a server that provides various services. For example, a staff member sends a picture descreening neural network model modeling request to the server 105 using the terminal device 103 (or terminal device 101 or 102). The server 105 may obtain a training set for the picture descreening neural network model and train the model with the training set. The picture descreening neural network model comprises a global branch and a local branch; the global branch comprises a hole space convolution pooling network that samples a given input in parallel with a plurality of hole convolutions having different expansion rates; the local branch comprises dense multi-scale fusion blocks that contain convolutions with different convolution kernels, and hierarchical features extracted from the different branch convolutions are combined and merged. The server 105 may display the trained picture descreening neural network model on the terminal device 103, where the staff member can then view it.
Also, for example, the terminal device 103 (or terminal device 101 or 102) may be a smart TV, a VR (Virtual Reality)/AR (Augmented Reality) head-mounted display, or a mobile terminal such as a smartphone or tablet computer on which navigation, ride-hailing, instant messaging or video applications (APPs) are installed. A worker may send the picture descreening neural network model modeling request to the server 105 through any of these devices or applications. The server 105 can build the picture descreening neural network model based on the modeling request and return the model to the smart TV, the VR/AR head-mounted display, or the navigation, ride-hailing, instant messaging or video APP, where the model is then displayed.
FIG. 2 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present disclosure.
It should be noted that the computer system 200 of the electronic device shown in fig. 2 is only an example, and should not bring any limitation to the functions and the scope of the application of the embodiments of the present disclosure.
As shown in fig. 2, the computer system 200 includes a Central Processing Unit (CPU)201 that can perform various appropriate actions and processes in accordance with a program stored in a Read-Only Memory (ROM) 202 or a program loaded from a storage section 208 into a Random Access Memory (RAM) 203. In the RAM 203, various programs and data necessary for system operation are also stored. The CPU 201, ROM 202, and RAM 203 are connected to each other via a bus 204. An input/output (I/O) interface 205 is also connected to bus 204.
The following components are connected to the I/O interface 205: an input portion 206 including a keyboard, a mouse, and the like; an output section 207 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, a speaker, and the like; a storage section 208 including a hard disk and the like; and a communication section 209 including a Network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section 209 performs communication processing via a network such as the internet. A drive 210 is also connected to the I/O interface 205 as needed. A removable medium 211, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 210 as necessary, so that a computer program read out therefrom is installed into the storage section 208 as necessary.
In particular, the processes described below with reference to the flowcharts may be implemented as computer software programs, according to embodiments of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program containing program code for performing the method illustrated by the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via the communication section 209 and/or installed from the removable medium 211. The computer program, when executed by a Central Processing Unit (CPU)201, performs various functions defined in the methods and/or apparatus of the present application.
It should be noted that the computer readable storage medium shown in the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM) or flash Memory), an optical fiber, a portable compact disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable storage medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable storage medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF (Radio Frequency), etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods, apparatus, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules and/or units and/or sub-units described in the embodiments of the present disclosure may be implemented by software, or may be implemented by hardware, and the described modules and/or units and/or sub-units may also be provided in a processor. Wherein the names of such modules and/or units and/or sub-units in some cases do not constitute a limitation on the modules and/or units and/or sub-units themselves.
As another aspect, the present application also provides a computer-readable storage medium, which may be contained in the electronic device described in the above embodiment; or may exist separately without being assembled into the electronic device. The computer-readable storage medium carries one or more programs which, when executed by an electronic device, cause the electronic device to implement the method described in the embodiments below. For example, the electronic device may implement the steps as shown in fig. 3.
In the related art, modeling of a picture descreening neural network model may be performed with, for example, machine learning or deep learning methods; different methods have different ranges of application.
FIG. 3 schematically illustrates a flow chart of a picture descreening neural network model modeling method according to an embodiment of the present disclosure. The method steps of the embodiment of the present disclosure may be executed by the terminal device, the server, or the terminal device and the server interactively, for example, the server 105 in fig. 1 described above, but the present disclosure is not limited thereto.
In step S310, a training set of the picture descreening neural network model is obtained.
In this step, the terminal device or the server obtains a training set of the picture descreening neural network model.
FIG. 4 shows a flow of a method for acquiring a training set of a picture descreening neural network model according to an embodiment of the present disclosure.
Referring to fig. 4, the method for acquiring the training set of the picture descreening neural network model includes:
step S410: acquiring an unscreened picture;
step S420: adding an artificial screen pattern to the unscreened picture to obtain a screened picture;
wherein the training set comprises the artificial screen pattern, the unscreened picture and the screened picture.
Fig. 5 shows a flow of a method for acquiring the screened pictures used in the training set according to an embodiment of the present disclosure. Referring to fig. 5, the acquisition of the screened pictures in the training set can be followed intuitively.
In the embodiments of the present disclosure, the terminal device may be implemented in various forms. For example, the terminal described in the present disclosure may include mobile terminals such as a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a Personal Digital Assistant (PDA), a Portable Media Player (PMP), a picture descreening neural network model modeling device, a wearable device, a smart band, a pedometer, a robot, an unmanned vehicle, and the like, and fixed terminals such as a digital TV (television), a desktop computer, and the like.
In step S320, training the picture descreening neural network model by using the training set;
the picture descreening neural network model comprises a global branch and a local branch;
the global branch comprises a hole space convolution pooling network that samples a given input in parallel with a plurality of hole convolutions having different expansion rates;
the local branch comprises dense multi-scale fusion blocks that contain convolutions with different convolution kernels, and hierarchical features extracted from the different branch convolutions are combined and merged.
In this step, the terminal device or the server trains the picture descreening neural network model with the training set;
the picture descreening neural network model comprises a global branch and a local branch;
the global branch comprises a hole space convolution pooling network that samples a given input in parallel with a plurality of hole convolutions having different expansion rates;
the local branch comprises dense multi-scale fusion blocks that contain convolutions with different convolution kernels, and hierarchical features extracted from the different branch convolutions are combined and merged.
FIG. 6 shows a schematic diagram of a structure and a use flow of a picture descreening neural network model according to an embodiment of the present disclosure.
Referring to fig. 6, the picture descreening neural network model includes a global branch and a local branch.
The global branch includes a hole space convolution pooling network that samples a given input in parallel with a plurality of hole convolutions having different expansion rates. Referring to fig. 6, for example, after net block 4 a given input is sampled in parallel with 4 hole convolutions whose expansion rates are 1, 6, 12 and 18, respectively. In one embodiment, referring to fig. 6, before the parallel sampling the expansion rate of the convolution in the net block preceding the 4 hole convolutions (net block 4) is 2. In one embodiment, referring to fig. 6, a net block may be a 3 × 3 hole convolution.
The local branch comprises dense multi-scale fusion blocks (unit blocks) that contain convolutions with different convolution kernels, and hierarchical features extracted from the different branch convolutions are combined and merged.
In one embodiment, referring to fig. 6, the local branch contains a plurality of dense multi-scale fusion blocks, for example 3, 4 or 5. In fig. 6, any step size not shown in the local branch is 1, and the numbers 64, 128, etc. above the convolutions denote the numbers of convolution kernels.
Fig. 7 shows a schematic structural diagram of a dense multi-scale fusion block according to an embodiment of the present disclosure. Referring to fig. 7, the dense multi-scale fusion block contains convolutions with different convolution kernels, and hierarchical features extracted from the different branch convolutions are combined and merged.
In one embodiment, a new loss function is adopted to optimize the parameters of the picture descreening neural network model;
wherein the new loss function (loss_mix) comprises a pixel-level loss function and a feature-level loss function, as shown in formula (1):

loss_mix = 0.8 · loss_pixel + 0.2 · loss_feature    (1)
wherein the pixel-level loss function is:

loss_pixel = Σ_l [1e3 / (C_l · N_{Φ_l(I_gt)})] · ||Mean_error ☉ (Φ_l(I_gt) − Φ_l(I_output))||_1    (2)

where Mean_error is the average pixel error; Φ_l(I_gt) is the activation feature map generated at layer l of the picture descreening neural network model by the unscreened picture, I_gt denoting the ground-truth picture (gt is an abbreviation of ground_truth); Φ_l(I_output) is the activation feature map generated at layer l after the screened picture passes through the model, I_output denoting the output picture; N_{Φ_l(I_gt)} is the number of elements of Φ_l(I_gt); ☉ is the element-wise product operator; C_l is the number of channels of the feature map output at layer l of the model; and 1e3 represents the number 1000.

The average pixel error is:

Mean_error = (1/C) Σ_{c=1..C} |I_{output,c} − I_{gt,c}|    (3)

where C denotes the 3 color channels, I_{output,c} is the output picture in the c-th color channel, and I_{gt,c} is the ground-truth picture in the c-th color channel.
wherein the feature-level loss function is:

loss_feature = −(1/m) Σ_{i=1..m} log( e^{s·cos(θ_{y_i}+n)} / (e^{s·cos(θ_{y_i}+n)} + Σ_{j≠y_i} e^{s·cos(θ_j)}) )    (4)

where m is the number of picture feature values; e is the natural constant; s is the feature value scaling ratio; θ is the angle between the feature vector and the model weight; y_i denotes the real label; n denotes the angular margin, typically 0.5; and i and j are indices.
In one embodiment, referring to fig. 6, the picture descreening neural network model of the present disclosure further includes a facial geometric alignment constraint term designed to compensate the pixel-based distance between the predicted features and the real features.
In one embodiment, the present application provides a method of descreening a picture, comprising:
acquiring a screened picture;
removing the screen lines from the screened picture by using a picture descreening neural network model;
wherein the picture descreening neural network model comprises a global branch and a local branch;
the global branch comprises a hole space convolution pooling network that samples a given input in parallel with a plurality of hole convolutions having different expansion rates; and
the local branch comprises dense multi-scale fusion blocks that contain convolutions with different convolution kernels, and hierarchical features extracted from the different branch convolutions are combined and merged.
Referring to fig. 6, in the method for descreening a picture provided by the present application, the screened picture is input into the picture descreening neural network model built by the method of fig. 3, so that the screen lines in the picture can be effectively removed.
The picture descreening neural network model of the present disclosure adopts a two-branch, global-plus-local structure, which improves model accuracy and generalization capability and gives broad adaptive capability.
In the global branch, a hole space convolution pooling network serves as the backbone network, and the given input is sampled in parallel by hole convolutions with different expansion rates; hole convolutions with different expansion rates effectively capture multi-scale information.
In the local branch, dense multi-scale fusion blocks (DMFB) serve as the unit blocks of the branch; through dense combinations of multi-scale dilated convolutions, hierarchical features extracted from the different branch convolutions are combined and merged to obtain better multi-scale features.
The facial geometric alignment constraint term may further assist in handling fine-grained image repairs.
The method uses a new fused regression loss function to optimize the model parameters, concentrating on uncertain regions and enhancing semantic details. It balances pixel loss and facial feature loss, and can dynamically minimize the facial feature loss and pixel loss between the generated picture and the ground truth. This loss further assists fine-grained image repair.
The disclosed method improves the generalization capability of the picture descreening neural network model and maintains a good removal effect on identification photos of different sizes carrying different random screen lines. After descreening with the model, the facial features of the original picture are essentially preserved, and recognition results on descreened pictures are close to those on the originals. The recognition results are shown in Table 1 below.
FIG. 8 illustrates the before-and-after effect of removing the screen lines from a screened picture with the picture descreening neural network model according to an embodiment of the present disclosure. As can be seen from fig. 8, the picture descreening neural network model of the present disclosure has a good descreening capability.
Table 1 below shows the statistical comparison of the descreening effect.
As Table 1 shows, after the screen lines are removed with the picture descreening neural network model, the facial features of the original picture are essentially preserved, and the recognition result on the descreened picture is close to that on the original.
Here TPR is the true positive rate and FPR is the false positive rate;
TPR@FPR=1% denotes the correct recognition rate at a false positive rate of one percent.
TABLE 1

                                     TPR@FPR=1%   TPR@FPR=0.1%   TPR@FPR=0.01%
Unscreened picture recognition          95.3          77.2           52.3
Screened picture recognition            47.7          31.5           14.2
Descreened picture re-recognition       87.3          71.4           37.2
FIG. 9 illustrates a flow diagram for training and using a picture descreening neural network model according to an embodiment of the present disclosure.
Referring to fig. 9:
step A, picture preprocessing: acquire identification photos from the production system (e.g., a historical picture library), remove abnormal pictures with a face detection model, and de-duplicate the pictures to obtain an identification photo library, where each file name contains the user's unique ID, the picture type and the service product type;
step B: classify the de-duplicated identification photos with a screen-line discrimination network to obtain a screened identification photo library L1 and an unscreened identification photo library L2;
step C, construct the training data set of the picture descreening neural network model:
for the unscreened picture library L2, use a random screen-line (artificial screen) generation network and randomly draw screen generation parameters such as the screen pattern type, curvature, line width, affine factors and other influence factors; at the same time, randomly add noise, crop, and blur the screen lines, artificially generating screened pictures and a corresponding mask picture library L3 as the training set of the model.
The specific steps are as follows (a code sketch follows the list):
C1: randomly draw screen-curve parameters, including the screen pattern type, curvature, line width, affine factors (translation, shearing, scaling and rotation parameters) and noise information factors;
C2: generate normalized basic screen lines;
C3: perform longitudinal adjustment, i.e., 'scaling + conversion' of the screen lines;
C4: perform horizontal adjustment, i.e., 'rotation/affine' of the screen lines, selecting different affine matrices for the transformation according to the affine factor parameters, for example [[1, -0.5], [-0.15, 1]] or [[1.15, 1.1], [-0.45, 0.7]];
C5: blur the screen lines, using one of mean filtering, bilateral filtering, median filtering and Gaussian filtering to obtain different blurring effects;
C6: randomly crop the image or adjust its size;
C7: add noise: add random noise points according to the noise information factors;
C8: add the generated screen lines to the original image;
C9: store the screen mask image and the corresponding screened image;
step D: construct the multi-scale descreening neural network model (MDN):
D1, define the descreening neural network model structure: the model as a whole adopts a local branch and a global branch, and outputs the required feature map representing the original face image through repeated convolution, hole convolution, pooling, dense multi-scale fusion blocks, transposed convolution, batch normalization, upsampling and other operations; the detailed structure is shown in fig. 6;
D11, the global branch adopts a hole space convolution pooling network as the backbone network, improving hole space convolution pooling in the spatial dimension; the given input is sampled in parallel by hole convolutions with different expansion rates, which is equivalent to capturing the context of the image at multiple spatial scales. The top feature map uses four hole convolutions with different expansion rates, and hole convolutions with different expansion rates effectively capture multi-scale information.
D12, the local branch introduces dense multi-scale fusion blocks (DMFB) as the unit blocks of the model; a unit block contains convolutions with different convolution kernels, and hierarchical features extracted from the different branch convolutions are combined and merged to obtain better multi-scale features.
D2: define a new loss function:
a new fused regression loss function (see formulas (1)-(4)) is used to optimize the model parameters, concentrating on uncertain regions and enhancing semantic details while balancing pixel loss and facial feature loss, so that the model can dynamically minimize the facial feature loss and pixel loss between the generated picture and the ground truth. Such a loss function further assists fine-grained image repair.
step E: the training method for the multi-scale descreening neural network model (a training-loop sketch follows the list):
E1: initialize the neural network weights;
E2: forward propagation: for screened face pictures randomly drawn from the artificially generated training set of step C, compute the loss function value of the neural network model under its current weight parameters;
E3: determine the gradient vectors by back propagation;
E4: adjust each weight along the gradient vectors, so that the error tends to zero or converges;
E5: repeat the above process until the set number of iterations is reached or the average batch error loss no longer decreases;
step F, subsequent steps: remove the screen lines from the pictures in the screened identification photo library L1 with the trained picture descreening neural network model, and compare live photos against the resulting descreened face photo library D.
FIG. 10 schematically illustrates a block diagram of a picture descreening device, according to an embodiment of the present disclosure. The picture descreening apparatus 1000 provided in the embodiment of the present disclosure may be disposed on a terminal device, or may be disposed on a server side, or may be partially disposed on the terminal device and partially disposed on the server side, for example, may be disposed on the server 105 in fig. 1, but the present disclosure is not limited thereto.
The picture descreening apparatus 1000 provided by the embodiments of the present disclosure may include an acquisition module 1010 and a descreening module 1020.
The acquisition module is configured to acquire a screened picture;
the descreening module is configured to remove the screen lines from the screened picture by using a (fully trained) picture descreening neural network model;
the picture descreening neural network model comprises a global branch and a local branch;
the global branch comprises a hole space convolution pooling network that samples a given input in parallel with a plurality of hole convolutions having different expansion rates;
the local branch comprises dense multi-scale fusion blocks that contain convolutions with different convolution kernels, and hierarchical features extracted from the different branch convolutions are combined and merged.
According to an embodiment of the present disclosure, the above-mentioned picture descreening apparatus 1000 may be used in the picture descreening method described in the present disclosure.
It is understood that the acquisition module 1010 and the descreening module 1020 may be combined in one module, or either module may be split into multiple modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the acquisition module 1010 and the descreening module 1020 may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or in any other reasonable manner of integrating or packaging a circuit, in hardware or firmware, or in a suitable combination of software, hardware and firmware. Alternatively, at least one of the acquisition module 1010 and the descreening module 1020 may be implemented at least in part as a computer program module that, when executed by a computer, performs the functions of the respective module.
It should be noted that although several modules, units and sub-units of the apparatus for action execution are mentioned in the above detailed description, such division is not mandatory. Indeed, the features and functionality of two or more modules, units and sub-units described above may be embodied in one module, unit and sub-unit, in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module, unit and sub-unit described above may be further divided into embodiments by a plurality of modules, units and sub-units.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a touch terminal, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements that have been described above and shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method for modeling a picture descreening neural network model, characterized by comprising the following steps:
acquiring a training set of a picture descreening neural network model;
training the picture descreening neural network model by using the training set;
wherein the picture descreening neural network model comprises a global branch and a local branch;
the global branch comprises a hole space convolution pooling network that samples a given input in parallel with a plurality of hole convolutions having different expansion rates; and
the local branch comprises dense multi-scale fusion blocks that contain convolutions with different convolution kernels, and hierarchical features extracted from the different branch convolutions are combined and merged.
2. The method of claim 1, further comprising:
acquiring an unscreened picture; and
adding an artificial screen pattern to the unscreened picture to obtain a screened picture;
wherein the training set comprises the artificial screen pattern, the unscreened picture and the screened picture.
3. The method of claim 1, wherein sampling a given input in parallel with a plurality of hole convolutions having different expansion rates comprises:
sampling the given input in parallel with 4 hole convolutions whose expansion rates are 1, 6, 12 and 18, respectively.
4. The method of claim 3, wherein
before the parallel sampling, the expansion rate of the convolution in the net block preceding the 4 hole convolutions is 2.
5. The method of claim 1,
in the local branch, the number of the dense multi-scale fusion blocks is 3, 4 or 5.
6. The method of claim 1, further comprising:
optimizing the parameters of the picture descreening neural network model by adopting a new loss function;
wherein the new loss function comprises a pixel-level loss function and a feature-level loss function;
wherein the pixel-level loss function is:

loss_pixel = Σ_l [1e3 / (C_l · N_{Φ_l(I_gt)})] · ||Mean_error ☉ (Φ_l(I_gt) − Φ_l(I_output))||_1

where Mean_error is the average pixel error; Φ_l(I_gt) is the activation feature map generated at layer l of the picture descreening neural network model by the unscreened picture I_gt; Φ_l(I_output) is the activation feature map generated at layer l after the screened picture passes through the model, I_output being the output picture; N_{Φ_l(I_gt)} is the number of elements of Φ_l(I_gt); ☉ is the element-wise product operator; C_l is the number of channels of the feature map output at layer l of the model; and 1e3 represents the number 1000;
wherein the feature-level loss function is:

loss_feature = −(1/m) Σ_{i=1..m} log( e^{s·cos(θ_{y_i}+n)} / (e^{s·cos(θ_{y_i}+n)} + Σ_{j≠y_i} e^{s·cos(θ_j)}) )

where m is the number of picture feature values; e is the natural constant; s is the feature value scaling ratio; θ is the angle between the feature vector and the model weight; y_i denotes the real label; n denotes the angular margin, typically 0.5; and i and j are indices.
7. A method for descreening a picture, comprising:
acquiring a screened picture;
removing the screen lines from the screened picture by using a picture descreening neural network model;
wherein the picture descreening neural network model comprises a global branch and a local branch;
the global branch comprises a hole space convolution pooling network that samples a given input in parallel with a plurality of hole convolutions having different expansion rates; and
the local branch comprises dense multi-scale fusion blocks that contain convolutions with different convolution kernels, and hierarchical features extracted from the different branch convolutions are combined and merged.
8. A device for descreening a picture, comprising:
an acquisition module configured to acquire a screened picture;
a descreening module configured to remove the screen lines from the screened picture by using a picture descreening neural network model;
wherein the picture descreening neural network model comprises a global branch and a local branch;
the global branch comprises a hole space convolution pooling network that samples a given input in parallel with a plurality of hole convolutions having different expansion rates; and
the local branch comprises dense multi-scale fusion blocks that contain convolutions with different convolution kernels, and hierarchical features extracted from the different branch convolutions are combined and merged.
9. An electronic device, comprising:
one or more processors;
a storage device configured to store one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
CN202110127054.4A 2021-01-29 2021-01-29 Method for modeling and using picture descreening neural network model and related equipment Pending CN114821216A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110127054.4A CN114821216A (en) 2021-01-29 2021-01-29 Method for modeling and using picture descreening neural network model and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110127054.4A CN114821216A (en) 2021-01-29 2021-01-29 Method for modeling and using picture descreening neural network model and related equipment

Publications (1)

Publication Number Publication Date
CN114821216A (en) 2022-07-29

Family

ID=82525542

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110127054.4A Pending CN114821216A (en) 2021-01-29 2021-01-29 Method for modeling and using picture descreening neural network model and related equipment

Country Status (1)

Country Link
CN (1) CN114821216A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116664566A (en) * 2023-07-28 2023-08-29 成都数智创新精益科技有限公司 OLED panel screen printing quality control method, system and device and medium
CN116664566B (en) * 2023-07-28 2023-09-26 成都数智创新精益科技有限公司 OLED panel screen printing quality control method, system and device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination