CN117440172A - Picture compression method and device - Google Patents


Info

Publication number
CN117440172A
CN117440172A
Authority
CN
China
Prior art keywords
picture
processed
input information
image
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311754551.2A
Other languages
Chinese (zh)
Other versions
CN117440172B (en)
Inventor
韩思
王宗力
顾周彤
杜婧仪
侯大猷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Financial Leasing Co ltd
Original Assignee
Jiangsu Financial Leasing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Financial Leasing Co ltd filed Critical Jiangsu Financial Leasing Co ltd
Priority to CN202311754551.2A priority Critical patent/CN117440172B/en
Publication of CN117440172A publication Critical patent/CN117440172A/en
Application granted granted Critical
Publication of CN117440172B publication Critical patent/CN117440172B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The embodiments of this specification disclose a picture compression method and device. The method comprises: inputting a picture to be processed into a content recognition model and outputting content input information of the picture to be processed; inputting the picture to be processed into a type recognition model and outputting file type input information of the picture to be processed; inputting the file type input information and the content input information into a picture generation model and outputting a processed picture that conforms to a first image size threshold; and compressing the processed picture according to its file type to obtain a compressed picture that conforms to a second image size threshold. The device is implemented on the basis of the method.

Description

Picture compression method and device
Technical Field
One or more embodiments of the present disclosure relate to the field of image processing technologies, and in particular, to a method and an apparatus for compressing an image.
Background
With the development of internet technology and image processing technology, image processing is applied more and more widely. Picture attachment upload and viewing functions often involve the processing of images.
Existing image compression processing performs poorly: to meet the size requirements for uploaded files, the quality and definition of the compressed image are often not high enough. If a high-precision image compression algorithm is adopted instead, it places certain demands on the processing capacity of the terminal device. That is, when the processing performance of the terminal device is not high enough, performing image compression may affect the efficiency of the device's other processing tasks.
Disclosure of Invention
One or more embodiments of this specification describe a picture compression method and apparatus that can achieve higher-quality compression of pictures.
In a first aspect, an embodiment of the present disclosure provides a method for compressing a picture, including:
inputting the picture to be processed into a content recognition model, outputting content input information of the picture to be processed, inputting the picture to be processed into a type recognition model, and outputting file type input information of the picture to be processed;
inputting file type input information and content input information into a picture generation model, and outputting a processed picture conforming to a first image size threshold;
compressing according to the file type of the processed picture to obtain a compressed picture conforming to the second image size threshold;
the second image size threshold is smaller than the first image size threshold, which in turn is smaller than the original image size of the picture to be processed;
the content recognition model is obtained through the following training process:
training a content identification model by utilizing the historical original image and the marked content input information;
in the training process, a first loss function is used to calculate a first loss value between the historical content input information, which is output by the content recognition model based on the historical original image, and the labeling content input information;
The type recognition model is obtained through the following training process:
training the type recognition model by utilizing the historical original image and the labeling type input information;
in the training process, a second loss function is used to calculate a second loss value between the historical type input information, which is output by the type recognition model based on the historical original image, and the labeling type input information;
the picture generation model is obtained through the following training process:
training a picture generation model by utilizing the historical original image and marking input information; the annotation input information comprises annotation type input information and annotation content input information;
in the training process, a third loss function is used to calculate a third loss value between the historical picture, which is generated by the picture generation model based on the historical input information, and the historical original image;
and when the first loss value, the second loss value and the third loss value all meet convergence conditions, outputting the trained content recognition model, the trained type recognition model and the trained picture generation model.
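The joint training flow above (train all three models and output them only once every loss value meets the convergence condition) can be sketched in plain Python. The epsilon threshold, the `step_fn` interface and the toy decaying losses are assumptions of this sketch, not details from the specification.

```python
def all_converged(losses, epsilon=1e-3):
    """The models are output only when the first, second and third
    loss values all meet the convergence condition."""
    return all(loss < epsilon for loss in losses)

def train_until_converged(step_fn, epsilon=1e-3, max_iters=10_000):
    """step_fn performs one joint update and returns the current
    (first_loss, second_loss, third_loss) triple."""
    for i in range(max_iters):
        losses = step_fn()
        if all_converged(losses, epsilon):
            return i + 1, losses
    raise RuntimeError("did not converge")

# Toy stand-in for real training: three losses decaying geometrically.
state = [1.0, 0.8, 1.2]
def toy_step():
    for k in range(3):
        state[k] *= 0.5
    return tuple(state)

iters, final = train_until_converged(toy_step)
```

In a real implementation `step_fn` would run one optimization step for each of the three models and return their current loss values.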
In some embodiments, the content recognition model, the type recognition model, and the picture generation model are configured at a server side; the method is executed at a server side and compressed pictures are sent to terminal equipment; or the processing flow of the picture to be processed is executed at the server side, and the compression flow of the picture after processing is executed at the terminal equipment side.
In some embodiments, before the picture to be processed is input into the content recognition model, it is recognized whether the picture to be processed contains a graphic. If so, when the graphic is circular, elliptical, rectangular or square, and the graphic is red or blue, the region of the graphic is cut out from the picture to be processed to form an attachment picture of the picture to be processed; the attachment picture and the corresponding picture to be processed are labeled in association with each other. If not, the picture to be processed is input into the content recognition model.
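The cut-out rule just described (a circular, elliptical, rectangular or square graphic that is red or blue) reduces to a simple predicate. The shape and color labels below are assumed to come from an upstream detector, which the specification does not describe.

```python
QUALIFYING_SHAPES = {"circle", "ellipse", "rectangle", "square"}
QUALIFYING_COLORS = {"red", "blue"}

def is_attachment_graphic(shape: str, color: str) -> bool:
    """Return True when a detected graphic should be cut out of the
    picture to be processed and kept as an associated attachment picture."""
    return shape in QUALIFYING_SHAPES and color in QUALIFYING_COLORS
```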
In some embodiments, the method further comprises:
before the picture to be processed is input into the content recognition model, it is recognized whether the picture to be processed contains a portrait; if so, the picture to be processed is segmented into a portrait picture and a portrait-free picture to be processed, and the portrait-free picture to be processed is then input into the content recognition model; if not, the picture to be processed is input into the content recognition model;
after the file type input information and the content input information are input into the picture generation model and the processed picture conforming to the first image size threshold is output, when it is determined that the picture to be processed corresponding to the processed picture underwent segmentation, the segmented portrait picture is converted according to the pixel ratio between the portrait-free picture to be processed and the processed picture to obtain a compressed portrait picture; the processed picture corresponding to the portrait-free picture to be processed is then superposed with the compressed portrait picture to generate a new picture to be processed; and the new picture to be processed goes through the compression flow according to its file type.
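The pixel-ratio conversion used to compress the segmented portrait picture can be sketched as a pure function. All names are illustrative, and the per-axis ratio is an assumption of this sketch (the specification does not say whether the ratio is per-axis or uniform).

```python
def scale_by_pixel_ratio(portrait_size, original_size, processed_size):
    """Convert the segmented portrait picture's dimensions using the
    pixel ratio between the portrait-free picture to be processed and
    its processed counterpart, so the compressed portrait fits the
    processed picture when the two are superposed."""
    pw, ph = portrait_size
    ow, oh = original_size
    nw, nh = processed_size
    rx, ry = nw / ow, nh / oh  # per-axis pixel ratios
    return (round(pw * rx), round(ph * ry))
```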
In some embodiments, the method further comprises:
when the portrait is identified, background filling is performed on the region where the portrait is located, and the spatial position of that region on the whole picture to be processed is recorded;
when the picture superposition flow is performed, the spatial position of the portrait region on the whole picture to be processed is converted, according to the pixel ratio between the portrait-free picture to be processed and the processed picture, into the spatial position at which the portrait picture is superposed on the processed picture corresponding to the portrait-free picture to be processed.
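The spatial-position conversion described above follows the same pixel ratio as the portrait scaling. A minimal sketch, with hypothetical names:

```python
def map_position(pos, original_size, processed_size):
    """Convert the spatial position of the portrait region on the whole
    picture to be processed into the position at which the compressed
    portrait picture is superposed onto the processed picture."""
    x, y = pos
    ow, oh = original_size
    nw, nh = processed_size
    return (round(x * nw / ow), round(y * nh / oh))
```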
In some embodiments, the method further comprises:
when the portrait is identified, blank filling is performed on the region where the portrait is located;
and when the picture superposition flow is carried out, acquiring a blank area in the processed picture, and superposing the compressed portrait picture to the blank area.
In some embodiments, the compressing according to the file type of the processed picture to obtain the compressed picture meeting the second image size threshold includes:
based on the file type input information, acquiring a compression base number and a second image size threshold value of a corresponding file type from a file type library;
calculating a compression quality based on the compression base, the second image size threshold, and the first image size threshold;
And carrying out picture compression based on the compression quality and the compression base.
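The specification derives a compression quality from the compression base and the two size thresholds but the formula is not reproduced at this point, so the sketch below takes a different, generic route: search quality levels from high to low until the estimated output size fits under the second image size threshold. The `estimate_size` callback is a stand-in for actually encoding the picture (e.g. as JPEG) at a given quality; every name here is an assumption of this sketch.

```python
def highest_fitting_quality(estimate_size, size_limit, qualities=range(100, 0, -5)):
    """Walk quality levels from high to low and return the first whose
    estimated compressed size fits within the second image size threshold.
    estimate_size(q) stands in for encoding at quality q and measuring."""
    for q in qualities:
        if estimate_size(q) <= size_limit:
            return q
    raise ValueError("no quality level fits the size limit")

# Toy estimator: compressed size grows linearly with quality (illustrative only).
quality = highest_fitting_quality(lambda q: 2_000 * q, size_limit=120_000)
```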
In some embodiments, the method further comprises:
and after the compressed picture meeting the second image size threshold is obtained according to the file type of the processed picture, storing the compressed picture meeting the second image size threshold, the processed picture meeting the first image size threshold and the picture to be processed in a picture storage library in a one-to-one correspondence manner.
In a second aspect, embodiments of the present disclosure provide a picture compression apparatus, including:
the model identification module is used for inputting the picture to be processed into the content identification model, outputting the content input information of the picture to be processed, inputting the picture to be processed into the type identification model, and outputting the file type input information of the picture to be processed;
the first processing module is used for inputting file type input information and content input information into the picture generation model and outputting a processed picture which accords with a first image size threshold;
the second processing module is used for compressing according to the file type of the processed picture so as to obtain a compressed picture which accords with the second image size threshold;
the second image size threshold is smaller than the first image size threshold, which in turn is smaller than the original image size of the picture to be processed;
The model training module is used for training the content recognition model, the type recognition model and the picture generation model; the training process is as follows:
training the content recognition model by using the historical original images and the labeling content input information; in the training process, a first loss function is used to calculate a first loss value between the historical content input information, which is output by the content recognition model based on the historical original image, and the labeling content input information;
training the type recognition model by using the historical original images and the labeling type input information; in the training process, a second loss function is used to calculate a second loss value between the historical type input information, which is output by the type recognition model based on the historical original image, and the labeling type input information;
training the picture generation model by using the historical original images and the labeling input information, where the labeling input information comprises the labeling type input information and the labeling content input information; in the training process, a third loss function is used to calculate a third loss value between the historical picture, which is generated by the picture generation model based on the historical input information, and the historical original image;
and when the first loss value, the second loss value and the third loss value all meet convergence conditions, outputting the trained content recognition model, the trained type recognition model and the trained picture generation model.
In some embodiments, the apparatus further comprises:
the image recognition module is used for recognizing whether the image to be processed contains an image before the image to be processed is input into the content recognition model, if so, triggering the image segmentation module to work, and if not, sending the image to be processed to the content recognition model of the model recognition module;
the image segmentation module is used for segmenting the image to be processed into a portrait image and a to-be-processed image which does not contain a portrait image, and then sending the to-be-processed image which does not contain a portrait image to the content recognition model of the model recognition module;
the first processing module is further configured to: after the file type input information and the content input information are input into the picture generation model and the processed picture conforming to the first image size threshold is output, when it is determined that the picture to be processed corresponding to the processed picture underwent segmentation, convert the segmented portrait picture according to the pixel ratio between the portrait-free picture to be processed and the processed picture to obtain a compressed portrait picture; and then superpose the processed picture corresponding to the portrait-free picture to be processed with the compressed portrait picture, generate a new picture to be processed and send it to the second processing module.
In some embodiments, the apparatus further comprises:
and the image storage library is used for storing the compressed images meeting the second image size threshold value, the processed images meeting the first image size threshold value and the images to be processed in a one-to-one correspondence manner.
The technical scheme provided by some embodiments of the present specification has the following beneficial effects:
in one or more embodiments of this specification, recognition models are used to identify both the content information and the type information of the picture to be processed, and a picture generation model generates, from those two kinds of information, a picture conforming to the first image size threshold. In this process, an AI model extracts information from the picture to be processed at its original image size in order to generate a processed picture with a smaller image size whose content and type are consistent with the picture to be processed; the generated picture is reduced in size while restoring the information of the original image. The generated picture is then compressed with the second image size threshold as the target; no complex high-precision compression method is needed, and simple compression suffices. Meanwhile, this two-stage progressive compression improves the precision of picture compression. In addition, the compression method of the embodiments of this specification can be executed entirely at the server side, or split between the server side and the terminal device, which greatly relieves the processing load on the terminal device.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of this specification, the drawings required in the embodiments are briefly described below. Obviously, the drawings in the following description are only some embodiments of this specification; for a person skilled in the art, other drawings may be obtained from these drawings without inventive effort.
Fig. 1 is a schematic use diagram of a picture processing method provided in an embodiment of the present disclosure in an actual application scenario;
fig. 2 is a flowchart of a picture processing method according to an embodiment of the present disclosure;
fig. 3 is a block diagram of a picture processing apparatus according to an embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification.
The terms first, second, third and the like in the description and in the claims and in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
The embodiment of the specification provides a picture compression method, which collects picture information by using a model configured in a server and generates a picture with smaller size, and then performs simple compression to obtain a compressed picture with small size and high image quality without occupying a large amount of computing resources of terminal equipment.
Referring to fig. 2, fig. 2 is a flowchart illustrating a picture compression method according to an embodiment of the present disclosure.
As shown in fig. 2, the picture compression method specifically includes the following steps:
102, inputting a picture to be processed into a content recognition model, outputting content input information of the picture to be processed, inputting the picture to be processed into a type recognition model, and outputting file type input information of the picture to be processed;
104, inputting file type input information and content input information into a picture generation model, and outputting a processed picture conforming to a first image size threshold;
step 106, compressing according to the file type of the processed picture to obtain a compressed picture conforming to the second image size threshold;
the second image size threshold is smaller than the first image size threshold, which in turn is smaller than the original image size of the picture to be processed.
The method of the embodiment of the specification can be executed at a server side, the server side receives the picture to be processed uploaded from the terminal equipment side, performs small-size picture conversion on the picture to be processed by using a content identification model, a type identification model and a picture generation model which are configured at the server side, and then simply compresses the picture based on the file type. After the server completes the flow, the compressed picture is returned to the terminal equipment, and the terminal equipment stores the compressed picture and displays the compressed picture to the user.
The method of the embodiment of the present specification may be divided into two links, which are executed at a server side and a terminal device side, respectively. Specifically, step 102 to step 104 are executed at the server side, and step 106 is executed at the terminal device side. The server side receives the picture to be processed uploaded from the terminal equipment side and converts the picture to be processed into a small-size picture by utilizing a content identification model, a type identification model and a picture generation model which are configured at the server side. And then, the processed picture is sent to terminal equipment, the terminal equipment simply compresses the processed picture based on the file type, and the obtained compressed picture is stored and displayed to a user.
Fig. 1 shows a communication architecture between a server 100 and a terminal device 200, under which the method of the embodiment of the present specification completes the above-described flow.
Before describing the flow in detail, the models used will be described.
The content recognition model, the type recognition model and the picture generation model are configured at a server side. The model may be trained before being put into the compression method of the embodiments of the present disclosure, and may be trained at the server.
The training is performed based on training samples comprising historical artwork and annotated input information.
The historical originals include content information and type information. The uploaded pictures are various documents (such as certificates in a universal format) or certificates (such as identity cards, birth certificates, graduation certificates, real estate certificates, legal person certificates and the like). The content information of a picture mainly refers to the text information in the picture, and the type information mainly refers to which document or certificate the picture belongs to, i.e. the document name or certificate name.
The labeling input information comprises labeling content input information and labeling type input information. The labeling input information may be marked manually or by machine by way of key fields. For example, when the picture is an identity card, the labeling content input information is: name Zhang San; sex male; date of birth 2 October 1990; residential address Room X102, Unit 1, East Cell, Hangzhou Development District; identity card number 33XXXXXXXXXXXX0001; and the labeling type input information is "identity card". As another example, when the picture is a real estate certificate, the labeling content input information is: certificate number 000102X; property owner Zhang San; property address Room 102, Building X, Unit 1, East District, Hangzhou Development Area; building area 102.3 square meters; certificate date 2 January 2011; and the labeling type input information is "real estate certificate".
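The identity-card example above can be represented as a labeled training sample. The dictionary layout and the prompt format below are assumptions of this sketch, not a format given by the specification.

```python
sample = {
    "label_type_input": "identity card",
    "label_content_input": {
        "name": "Zhang San",
        "sex": "male",
        "date_of_birth": "1990-10-02",
        "address": "Room X102, Unit 1, East Cell, Hangzhou Development District",
        "id_number": "33XXXXXXXXXXXX0001",
    },
}

def build_prompt(s):
    """Join the type and content labels into a single text prompt for
    the picture generation model (format is an assumption of this sketch)."""
    fields = "; ".join(f"{k}: {v}" for k, v in s["label_content_input"].items())
    return f"type: {s['label_type_input']}; {fields}"
```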
The content recognition model is obtained through the following training process:
training a content identification model by utilizing the historical original image and the marked content input information;
in the training process, a first loss function is used to calculate a first loss value between the historical content input information, which is output by the content recognition model based on the historical original image, and the labeling content input information;
the type recognition model is obtained through the following training process:
training the type recognition model by utilizing the historical original image and the labeling type input information;
in the training process, a second loss function is used to calculate a second loss value between the historical type input information, which is output by the type recognition model based on the historical original image, and the labeling type input information;
the picture generation model is obtained through the following training process:
training a picture generation model by using the historical original pictures and marking input information; the annotation input information comprises annotation type input information and annotation content input information;
in the training process, a third loss function is used to calculate a third loss value between the historical picture, which is generated by the picture generation model based on the historical input information, and the historical original image;
and when the first loss value, the second loss value and the third loss value all meet convergence conditions, outputting the trained content recognition model, the trained type recognition model and the trained picture generation model.
The content recognition model and the type recognition model may adopt an inverse-inference model, such as an image captioning model. A historical original image is input to the content recognition model, and the content recognition model outputs historical content input information. A historical original image is input to the type recognition model, and the type recognition model outputs historical type input information. The picture generation model is an AIGC model that generates a picture conforming to a description based on input information (a prompt). The historical input information (including the historical content input information and the historical type input information) is input into the picture generation model, and the picture generation model generates a historical picture.
The first, second and third loss functions may all be distance loss functions. A loss value is calculated using the loss function; the loss value characterizes the degree of difference between the predicted value output by the model and the actual labeled value. When the loss value approaches zero, training has converged; otherwise, the weight parameters are continuously updated and training continues until convergence. The first and second loss functions are mainly used to calculate the loss value between predicted input information and labeled input information, while the third loss function is mainly used to calculate the loss value between the predicted picture and the original picture.
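A minimal instance of such a distance loss and the convergence test, in plain Python. The specification does not fix the exact distance function; mean squared distance is used here only as an example.

```python
def distance_loss(predicted, target):
    """Mean squared distance between predicted values and labeled values;
    a simple instance of the 'distance loss function' mentioned above."""
    assert len(predicted) == len(target)
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)

def converged(loss, epsilon=1e-6):
    # training has converged when the loss value approaches zero
    return loss < epsilon
```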
Next, each step will be described in detail.
In step 102, the picture to be processed is identified using the trained content identification model and the type identification model. The picture to be processed is respectively input into a content recognition model and a type recognition model, the content recognition model outputs content input information, and the type recognition model outputs file type input information.
In step 104, a picture is generated using the trained picture generation model. The content input information and the file type input information obtained in step 102 are input as a prompt to a picture generation model, which generates a processed picture smaller than the image size of the picture to be processed. The processed picture restores the content and the type information of the picture to be processed to a large extent.
Notably, the processed picture does not restore the information of the picture to be processed 1:1 (for example, a signature is not reproduced); that is, the processed picture is not the original. The processed picture is further compressed in step 106 and finally displayed at the terminal device. The purpose of this processing is threefold: the compressed picture is of higher quality; because it is not an original, it cannot be reused by lawbreakers at the terminal device; and the processing speed of the terminal device is not affected.
In order to facilitate tracing the authenticity of the picture, the method in the embodiment of the present disclosure further includes: and storing the compressed picture which accords with the second image size threshold, the processed picture which accords with the first image size threshold and the picture to be processed in a picture storage library in a one-to-one correspondence manner. The picture repository may be configured at the server side with access rights only limited to background administrators. Thus, on one hand, the authenticity of the picture can be traced, and on the other hand, the computing resource of the terminal equipment is not occupied.
In step 106, because the file types differ, compressing them all to the same standard may leave files with more content information illegible after compression. For this reason, compression is performed according to the file type, so that the compressed file meets its quality requirements.
Specifically, compressing according to the file type of the processed picture to obtain a compressed picture that meets the second image size threshold includes:
Step 1061: based on the file type input information, obtain the compression base and the second image size threshold corresponding to the file type from the file type library.
The file type input information comes from step 102. The file type library may be configured on the server: when step 106 is executed on the terminal device, the terminal may call the file type library on the server to execute step 1061, or send an acquisition instruction to the server, which retrieves the relevant information from the file type library based on the file type input information and feeds it back to the terminal device. The file type library may also be configured on the terminal device, which then performs step 1061 itself.
The file type library stores file types, compression bases, and second image size thresholds in one-to-one correspondence. File types are distinguished by file name; for example, a file type may be an identity card, a house property card, and so on. The file type input information is typically a file name. Therefore, the file type can be determined directly from the file type input information obtained in step 102, without performing field recognition on the picture to be processed. Each file type has its own compression base and second image size threshold.
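A minimal sketch of such a file type library, assuming a simple mapping keyed by file name; the concrete types, bases, and threshold values are hypothetical, chosen only for illustration:

```python
# Hypothetical file type library: each file type, distinguished by file
# name, maps to its own compression base and second image size threshold.
FILE_TYPE_LIBRARY = {
    "identity card":       {"base": 0.8, "threshold": (640, 400)},
    "house property card": {"base": 0.7, "threshold": (800, 600)},
}

def lookup(file_type_input):
    """Step 1061: resolve the compression base and second image size
    threshold directly from the file type input information (a file
    name), with no field recognition on the picture itself."""
    entry = FILE_TYPE_LIBRARY[file_type_input]
    return entry["base"], entry["threshold"]

base, (width, height) = lookup("identity card")
```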
Step 1063: calculate a compression quality based on the compression base and the second image size threshold.
First, the compression quality level is calculated using a predefined formula. The quality level is rated out of 100. In the formula, R_n is the compressed quality level, n is the number corresponding to the quality level, S is the size of the processed picture (i.e., the first image size threshold), W is the width pixel value in the second image size threshold for the file type, and H is the height pixel value in the second image size threshold for the file type.
Next, the compression quality q is calculated using a second predefined formula.
In step 1065, picture compression is performed based on the compression quality and the compression base.
The processed picture is compressed using an image processing tool class, such as a Thumbnails class.
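The source names an image processing tool class without further detail; a pure-Python stand-in showing only the resize-to-threshold part of step 1065 (the quality-based re-encoding is omitted, and pictures are simplified to 2-D lists of pixel values) might look like:

```python
def resize_nearest(pixels, new_w, new_h):
    """Nearest-neighbour downscale of a picture held as a 2-D list of
    pixel values -- a stand-in for the real image tool class."""
    old_h, old_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]

def compress_picture(pixels, threshold):
    """Step 1065 sketch: shrink the processed picture to fit the second
    image size threshold (W, H); quality-based encoding would follow."""
    w, h = threshold
    return resize_nearest(pixels, w, h)

picture = [[255] * 8 for _ in range(8)]     # 8x8 processed picture
small = compress_picture(picture, (4, 2))   # fit a 4x2 threshold
```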
In the above method, steps 102-104 apply a uniform conversion to the picture to be processed: after any file type is converted to the first image size threshold, the picture quality is good. The picture generation process can also remove unnecessary information and noise from the original picture. Then, in step 106, targeted compression is performed according to the file type.
Notably, the embodiments of the present specification concern the regeneration and compression of file pictures, with the aim of forming pictures that are of clear quality and do not occupy terminal resources, for review by users. The compressed picture is for reference only during actual handling; the real file can be obtained from the server.
It should be considered that some files contain graphical information in addition to textual information, such as portrait photos, official seals, and other background graphics. Since other background graphics contribute little to the presentation of file information, and such graphics are often specific to the issuing unit and cannot be shown uniformly in a standard file, the method of the embodiment of the present specification does not consider other background graphics.
For other graphic information, the method of the embodiment of the present specification performs specific processing.
When the graphic information is an official seal, the method in the embodiment of the present specification further includes:
before inputting the picture to be processed into the content recognition model, recognizing whether the picture to be processed contains a graphic; if so, when the graphic is round, oval, rectangular, or square and is red or blue, intercepting the region where the graphic is located from the picture to be processed to form an attachment picture of the picture to be processed, marking the attachment picture in association with the corresponding picture to be processed, and then inputting the picture to be processed into the content recognition model; if not, inputting the picture to be processed into the content recognition model directly.
The above graphic recognition process is performed on the server. In one alternative, after the server executes step 106, it crops and compresses the attachment picture to obtain a compressed picture containing only the official seal graphic; the compressed attachment picture is fed back to the terminal device together with the compressed picture and displayed there. In another alternative, after the server finishes step 104, it crops and compresses the attachment picture to obtain a compressed picture containing only the official seal graphic; the compressed attachment picture is fed back to the terminal device together with the processed picture, and the terminal device executes step 106. In a third alternative, after the server finishes step 104, the attachment picture is sent to the terminal device together with the processed picture; the terminal device executes step 106 and also crops and compresses the attachment picture to obtain a compressed picture containing only the official seal graphic. The attachment picture and the compressed attachment picture are stored in the picture repository under the entry of the corresponding picture to be processed. The compressed picture (without the official seal) and the compressed attachment picture obtained in step 106 can be stored and displayed on the terminal device as a main part plus an attachment, or the two can be superimposed and composited using a picture processing tool.
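The crop-and-superimpose handling of the seal attachment can be sketched as follows, again treating pictures as 2-D lists of pixel values (a deliberate simplification; a real implementation would use an image library):

```python
def extract_attachment(pixels, region):
    """Intercept the region where the seal graphic lies, forming the
    attachment picture of the picture to be processed."""
    x, y, w, h = region
    return [row[x:x + w] for row in pixels[y:y + h]]

def overlay(base, attachment, position):
    """Superimpose the compressed attachment picture back onto the
    compressed picture, as a picture processing tool would."""
    x, y = position
    out = [row[:] for row in base]          # leave the base untouched
    for dy, row in enumerate(attachment):
        out[y + dy][x:x + len(row)] = row
    return out

page = [[0] * 6 for _ in range(6)]                      # compressed picture
seal = extract_attachment([[9] * 6 for _ in range(6)],  # 2x2 seal at (1, 1)
                          (1, 1, 2, 2))
merged = overlay(page, seal, (3, 3))
```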
When the graphic information is a portrait photo, the method in the embodiment of the present disclosure further includes:
before the picture to be processed is input into the content recognition model, recognizing whether it contains a portrait; if so, segmenting the picture to be processed into a portrait picture and a picture to be processed that does not contain the portrait, and then inputting the latter into the content recognition model; if not, inputting the picture to be processed into the content recognition model directly.
The portrait picture obtained by segmentation and the picture to be processed that does not contain the portrait are stored in the picture repository in one-to-one correspondence.
In this example, after performing step 104 and before step 106, the method further includes: when it is judged that the picture to be processed corresponding to the processed picture has undergone segmentation, converting the segmented portrait picture according to the pixel ratio between the picture to be processed that does not contain the portrait and the processed picture, to obtain a compressed portrait picture; then superimposing the processed picture corresponding to the picture to be processed without the portrait and the compressed portrait picture to generate a new picture to be processed. Step 106 then compresses this new picture.
In one embodiment, the process of recognizing the picture to be processed includes: when the portrait is recognized, filling the region where the portrait is located with background, and recording the spatial position of that region on the whole picture to be processed.
During the picture superposition flow, according to the pixel ratio between the picture to be processed that does not contain the portrait and the processed picture, the spatial position of the portrait region on the whole picture to be processed is converted into the position at which the portrait picture is superimposed on the processed picture corresponding to the picture to be processed without the portrait.
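The pixel-ratio position conversion can be sketched as a small coordinate transform; the tuple layout and the rounding choice are assumptions, since the source does not specify them:

```python
def convert_position(box, original_size, processed_size):
    """Convert the portrait region's spatial position on the picture to
    be processed into its position on the processed picture, using the
    pixel ratio between the two pictures."""
    ow, oh = original_size       # picture to be processed (no portrait)
    pw, ph = processed_size      # processed picture
    x, y, w, h = box             # portrait region on the original
    rx, ry = pw / ow, ph / oh    # pixel ratio along each axis
    return (round(x * rx), round(y * ry), round(w * rx), round(h * ry))

# A portrait at (100, 50) sized 200x300 on a 1000x800 original lands at
# half scale on a 500x400 processed picture.
pos = convert_position((100, 50, 200, 300), (1000, 800), (500, 400))
```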
In another embodiment, the process of recognizing the picture to be processed includes: when the portrait is recognized, filling the region where the portrait is located with blank.
During the picture superposition flow, the blank region in the processed picture is located, and the compressed portrait picture is superimposed onto that blank region.
The judgment of whether segmentation has occurred can be made by querying the picture repository: when a portrait picture and a picture to be processed that does not contain the portrait are also stored under the entry of the picture to be processed, the picture is determined to have been segmented. Alternatively, pictures that have undergone segmentation may be marked in the picture repository, and a picture is determined to have been segmented when the segmentation mark is recognized.
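Both ways of judging segmentation can be sketched over a dictionary-backed repository; the entry structure and key names are hypothetical:

```python
def was_segmented(repository, picture_id):
    """Judge segmentation either by the presence of a stored portrait
    picture under the same entry, or by an explicit segmentation mark."""
    entry = repository.get(picture_id, {})
    return "portrait" in entry or entry.get("segmented", False)

repository = {
    "doc-1": {"portrait": b"face", "no_portrait": b"rest"},  # stored parts
    "doc-2": {"segmented": True},                            # explicit mark
    "doc-3": {},                                             # never segmented
}
```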
Please refer to fig. 3, which shows a schematic structural diagram of a picture compression apparatus according to an embodiment of the present disclosure.
As shown in fig. 3, the picture compression apparatus 1000 may at least include a model identification module 1001, a first processing module 1002, and a second processing module 1003, wherein:
the model identification module 1001 is configured to input the picture to be processed into the content recognition model and output content input information of the picture to be processed, and to input the picture to be processed into the type recognition model and output file type input information of the picture to be processed;
the first processing module 1002 is configured to input file type input information and content input information into a picture generation model, and output a processed picture that meets a first image size threshold;
the second processing module 1003 is configured to compress according to the file type of the processed picture, so as to obtain a compressed picture that meets a second image size threshold;
the second image size threshold value is smaller than the first image size threshold value and smaller than the original image size threshold value of the picture to be processed.
The apparatus of the embodiment of the present specification collects picture information using models configured on the server, generates a picture of smaller size, and then performs simple compression, so that a compressed picture of small size and high image quality is obtained without occupying a large amount of computing resources on the terminal device.
The apparatus of the embodiment of the present specification further comprises a model training module for training the content recognition model, the type recognition model, and the picture generation model; the training process is as follows:
train the content recognition model using historical original images and labeled content input information; in the training process, a first loss function is utilized to calculate a first loss value between the historical content input information output by the content recognition model based on the historical original image and the labeled content input information;
train the type recognition model using historical original images and labeled type input information; in the training process, a second loss function is utilized to calculate a second loss value between the type input information output by the type recognition model based on the historical original image and the labeled type input information;
train the picture generation model using historical original images and labeled input information, where the labeled input information comprises labeled type input information and labeled content input information; in the training process, a third loss function is utilized to calculate a third loss value between the historical picture generated by the picture generation model based on the historical input information and the historical original image;
when the first loss value, the second loss value, and the third loss value all meet the convergence condition, the trained content recognition model, type recognition model, and picture generation model are output.
The content recognition model, the type recognition model, and the picture generation model are configured on the server side. The models may be trained, for example on the server, before being put into use in the compression method of the embodiments of the present disclosure.
Training is performed on training samples comprising historical original images and labeled input information.
A historical original image includes content information and type information. The uploaded pictures are various files (such as general-format certificates) or certificates (such as identity cards, birth certificates, graduation certificates, house property cards, legal person certificates, and the like). The content information of a picture mainly refers to the text information in the picture, and the type information mainly refers to which file or certificate the picture belongs to, that is, a file name or certificate name.
The labeled input information comprises labeled content input information and labeled type input information. It may be marked manually, or by machine by means of key fields.
The content recognition model and the type recognition model may employ inverse (image-to-text) models, such as an image captioning model. A historical original image is input to the content recognition model, which outputs historical content input information; the historical original image is input to the type recognition model, which outputs historical type input information. The picture generation model is an AIGC model that generates a picture conforming to a description based on the input information (the prompt). The historical input information (including the historical content input information and the historical type input information) is input to the picture generation model, which generates a historical picture.
The first, second, and third loss functions may be distance loss functions. A loss value calculated with a loss function characterizes the degree of difference between the predicted value output by the model and the actual labeled value. When the loss value approaches zero, training has converged; otherwise, the weight parameters are updated and training continues until convergence. The first and second loss functions mainly calculate the loss between predicted input information and labeled input information, while the third loss function mainly calculates the loss between the generated picture and the original picture.
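As one concrete reading, a squared-distance loss and the all-losses-converged check could look like the following; the actual loss functions and the convergence threshold are not specified in the source, so both are assumptions:

```python
def distance_loss(predicted, actual):
    """Squared-distance loss between a model output and its label, both
    given here as plain numeric feature vectors."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))

def converged(losses, epsilon=1e-3):
    """Training stops only when every loss value approaches zero."""
    return all(loss < epsilon for loss in losses)

first = distance_loss([0.1, 0.2], [0.1, 0.2])   # content model, perfect fit
second = distance_loss([1.0], [1.0])            # type model, perfect fit
third = distance_loss([0.5, 0.4], [0.5, 0.4])   # generation model
done = converged([first, second, third])
```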
The apparatus of the embodiment of the present specification further comprises a storage module configured with a picture repository and a file type library. The picture repository stores pictures to be processed, processed pictures meeting the first image size threshold, and compressed pictures meeting the second image size threshold. The file type library stores file types, compression bases, and second image size thresholds.
The apparatus of the embodiment of the present specification further comprises a portrait recognition module and a picture segmentation module.
The portrait recognition module is configured to recognize, before the picture to be processed is input into the content recognition model, whether the picture to be processed contains a portrait; if so, it triggers the picture segmentation module; if not, it sends the picture to be processed to the content recognition model of the model identification module.
The picture segmentation module is configured to segment the picture to be processed into a portrait picture and a picture to be processed that does not contain the portrait, and then send the latter to the content recognition model of the model identification module.
The first processing module is further configured to: after the file type input information and the content input information are input into the picture generation model and the processed picture meeting the first image size threshold is output, when it is judged that the picture to be processed corresponding to the processed picture has undergone segmentation, convert the segmented portrait picture according to the pixel ratio between the picture to be processed that does not contain the portrait and the processed picture, to obtain a compressed portrait picture; then superimpose the processed picture corresponding to the picture to be processed without the portrait and the compressed portrait picture, generate a new picture to be processed, and send it to the second processing module.
The apparatus of the embodiment of the present specification further comprises a graphic recognition module configured to recognize, before the picture to be processed is input into the content recognition model, whether the picture to be processed contains a graphic; if so, when the graphic is round, oval, rectangular, or square and is red or blue, it intercepts the region where the graphic is located from the picture to be processed to form an attachment picture of the picture to be processed, marks the attachment picture in association with the corresponding picture to be processed, and then inputs the picture to be processed into the content recognition model; if not, it inputs the picture to be processed into the content recognition model directly.
The above-described embodiments merely illustrate preferred embodiments of the present invention and are not intended to limit its scope; various modifications and improvements made by those skilled in the art to the technical solution of the present invention, without departing from its design spirit, shall fall within the scope of protection defined by the claims.

Claims (10)

1. A picture compression method, comprising:
inputting the picture to be processed into a content recognition model, outputting content input information of the picture to be processed, inputting the picture to be processed into a type recognition model, and outputting file type input information of the picture to be processed;
inputting file type input information and content input information into a picture generation model, and outputting a processed picture conforming to a first image size threshold;
compressing according to the file type of the processed picture to obtain a compressed picture conforming to the second image size threshold;
the second image size threshold value is smaller than the first image size threshold value and smaller than the original image size threshold value of the picture to be processed;
the content recognition model is obtained through the following training process:
training the content recognition model using the historical original image and the labeled content input information;
in the training process, a first loss function is utilized to calculate a first loss value between the historical content input information output by the content recognition model based on the historical original image and the labeled content input information;
the type recognition model is obtained through the following training process:
training the type recognition model using the historical original image and the labeled type input information;
in the training process, a second loss function is utilized to calculate a second loss value between the type input information output by the type recognition model based on the historical original image and the labeled type input information;
the picture generation model is obtained through the following training process:
training the picture generation model using the historical original image and labeled input information; the labeled input information comprises labeled type input information and labeled content input information;
in the training process, a third loss function is utilized to calculate a third loss value between the historical picture generated by the picture generation model based on the historical input information and the historical original image;
and when the first loss value, the second loss value and the third loss value all meet convergence conditions, outputting the trained content recognition model, the trained type recognition model and the trained picture generation model.
2. The method according to claim 1, wherein the content recognition model, the type recognition model, and the picture generation model are configured at a server side; the method is executed at a server side and compressed pictures are sent to terminal equipment; or the processing flow of the picture to be processed is executed at the server side, and the compression flow of the picture after processing is executed at the terminal equipment side.
3. The method according to claim 1, wherein the method further comprises:
before the picture to be processed is input into the content recognition model, recognizing whether the picture to be processed contains the portrait, if so, dividing the picture to be processed into a portrait picture and a picture to be processed which does not contain the portrait, and then inputting the picture to be processed which does not contain the portrait into the content recognition model; if not, inputting the picture to be processed into a content identification model;
after the file type input information and the content input information are input into the picture generation model and the processed picture meeting the first image size threshold is output, when it is judged that the picture to be processed corresponding to the processed picture has undergone segmentation processing, converting the segmented portrait picture according to the pixel ratio between the picture to be processed that does not contain the portrait and the processed picture, to obtain a compressed portrait picture; then superimposing the processed picture corresponding to the picture to be processed without the portrait and the compressed portrait picture to generate a new picture to be processed; and the new picture to be processed goes through the compression flow according to the file type.
4. A method according to claim 3, wherein the method further comprises:
when the person image is identified, background filling is carried out on the area where the person image is located, and the space position of the area where the person image is identified and located on the whole picture to be processed is identified;
during the picture superposition flow, according to the pixel ratio between the picture to be processed that does not contain the portrait and the processed picture, the spatial position of the portrait region on the whole picture to be processed is converted into the position at which the portrait picture is superimposed on the processed picture corresponding to the picture to be processed without the portrait.
5. A method according to claim 3, wherein the method further comprises:
when the person image is identified and obtained, blank filling is carried out on the area where the person image is located;
and when the picture superposition flow is carried out, acquiring a blank area in the processed picture, and superposing the compressed portrait picture to the blank area.
6. A method according to claim 1 or 3, wherein said compressing according to the file type of the processed picture to obtain a compressed picture meeting the second image size threshold comprises:
based on the file type input information, acquiring a compression base number and a second image size threshold value of a corresponding file type from a file type library;
calculating a compression quality based on the compression base, the second image size threshold, and the first image size threshold;
and carrying out picture compression based on the compression quality and the compression base.
7. The method according to claim 1, wherein the method further comprises:
after the compressed picture meeting the second image size threshold is obtained according to the file type of the processed picture, storing the compressed picture meeting the second image size threshold, the processed picture meeting the first image size threshold, and the picture to be processed in a picture repository in one-to-one correspondence.
8. A picture compression apparatus, comprising:
the model identification module is used for inputting the picture to be processed into the content identification model, outputting the content input information of the picture to be processed, inputting the picture to be processed into the type identification model, and outputting the file type input information of the picture to be processed;
the first processing module is used for inputting file type input information and content input information into the picture generation model and outputting a processed picture which accords with a first image size threshold;
the second processing module is used for compressing according to the file type of the processed picture so as to obtain a compressed picture which accords with the second image size threshold;
The second image size threshold value is smaller than the first image size threshold value and smaller than the original image size threshold value of the picture to be processed;
the model training module is used for training the content recognition model, the type recognition model and the picture generation model; the training process is as follows:
training the content recognition model using the historical original image and the labeled content input information; in the training process, a first loss function is utilized to calculate a first loss value between the historical content input information output by the content recognition model based on the historical original image and the labeled content input information;
training the type recognition model using the historical original image and the labeled type input information; in the training process, a second loss function is utilized to calculate a second loss value between the type input information output by the type recognition model based on the historical original image and the labeled type input information;
training the picture generation model using the historical original image and labeled input information, the labeled input information comprising labeled type input information and labeled content input information; in the training process, a third loss function is utilized to calculate a third loss value between the historical picture generated by the picture generation model based on the historical input information and the historical original image;
And when the first loss value, the second loss value and the third loss value all meet convergence conditions, outputting the trained content recognition model, the trained type recognition model and the trained picture generation model.
9. The apparatus as recited in claim 8, further comprising:
the portrait recognition module is configured to recognize, before the picture to be processed is input into the content recognition model, whether the picture to be processed contains a portrait; if so, it triggers the picture segmentation module; if not, it sends the picture to be processed to the content recognition model of the model identification module;
the picture segmentation module is configured to segment the picture to be processed into a portrait picture and a picture to be processed that does not contain the portrait, and then send the latter to the content recognition model of the model identification module;
the first processing module is further configured to: after the file type input information and the content input information are input into the picture generation model and the processed picture meeting the first image size threshold is output, when it is judged that the picture to be processed corresponding to the processed picture has undergone segmentation, convert the segmented portrait picture according to the pixel ratio between the picture to be processed that does not contain the portrait and the processed picture, to obtain a compressed portrait picture; then superimpose the processed picture corresponding to the picture to be processed without the portrait and the compressed portrait picture, generate a new picture to be processed, and send it to the second processing module.
10. The apparatus as recited in claim 8, further comprising:
a picture repository for storing the compressed pictures meeting the second image size threshold, the processed pictures meeting the first image size threshold, and the pictures to be processed in one-to-one correspondence.
CN202311754551.2A 2023-12-20 2023-12-20 Picture compression method and device Active CN117440172B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311754551.2A CN117440172B (en) 2023-12-20 2023-12-20 Picture compression method and device


Publications (2)

Publication Number Publication Date
CN117440172A true CN117440172A (en) 2024-01-23
CN117440172B CN117440172B (en) 2024-03-19

Family

ID=89553850


Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109903350A (en) * 2017-12-07 2019-06-18 Shanghai Cambricon Information Technology Co., Ltd. Image compression method and related apparatus
CN110189385A (en) * 2019-06-05 2019-08-30 BOE Technology Group Co., Ltd. Model training, picture compression and decompression method, apparatus, medium and electronic device
US20190354811A1 (en) * 2017-12-07 2019-11-21 Shanghai Cambricon Information Technology Co., Ltd Image compression method and related device
CN111598190A (en) * 2020-07-21 2020-08-28 Tencent Technology (Shenzhen) Co., Ltd. Training method of image target recognition model, image recognition method and device
CN112241764A (en) * 2020-10-23 2021-01-19 Beijing Baidu Netcom Science and Technology Co., Ltd. Image recognition method and device, electronic device and storage medium
CN114219725A (en) * 2021-11-25 2022-03-22 Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences Image processing method, terminal device and computer-readable storage medium
US20220222532A1 (en) * 2021-01-13 2022-07-14 Adobe Inc. Compressing generative adversarial neural networks
JP2022132072A (en) * 2021-02-26 2022-09-07 Southern University of Science and Technology Image processing method and apparatus, electronic device and storage medium
CN117115287A (en) * 2023-08-25 2023-11-24 Vivo Mobile Communication Co., Ltd. Image generation method and device, electronic device and readable storage medium
CN117241092A (en) * 2023-09-28 2023-12-15 Beijing Zitiao Network Technology Co., Ltd. Video processing method and device, storage medium and electronic device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GAN, Rundong et al.: "BeiDou short message image compression technology for transmission tower monitoring", Journal of Chongqing University of Technology (Natural Science), 15 December 2023 (2023-12-15) *

Also Published As

Publication number Publication date
CN117440172B (en) 2024-03-19

Similar Documents

Publication Publication Date Title
TWI374651B (en) Method and apparatus for adding signature information to electronic documents
US20130238968A1 (en) Automatic Creation of a Table and Query Tools
WO2022134584A1 (en) Real estate picture verification method and apparatus, computer device and storage medium
EP4109332A1 (en) Certificate authenticity identification method and apparatus, computer-readable medium, and electronic device
CN110414502B (en) Image processing method and device, electronic equipment and computer readable medium
CN110942061A (en) Character recognition method, device, equipment and computer readable medium
CN111738280A (en) Image identification method, device, equipment and readable storage medium
US9298685B2 (en) Automatic creation of multiple rows in a table
CN112686015A (en) Chart generation method, device, equipment and storage medium
CN101689308A Providing three-dimensional characters in two-dimensional space
JP6046501B2 (en) Feature point output device, feature point output program, feature point output method, search device, search program, and search method
CN111290684A (en) Image display method, image display device and terminal equipment
CN117440172B (en) Picture compression method and device
CN113641776A (en) Method, system and storage medium for displaying space coding of service based on block chain
CN112699646A (en) Data processing method, device, equipment and medium
CN110689063B (en) Training method and device for certificate recognition based on neural network
CN112329757A (en) Method, device and system for desensitizing acquisition of bill information
CN111222517A (en) Test sample generation method, system, computer device and storage medium
CN110782390A (en) Image correction processing method and device and electronic equipment
CN113434912B (en) Material compliance verification method and device
CN106598983A (en) Information display method and device
CN114419637A (en) Cross-border supply chain form generation method based on AI and related device
CN112733518A (en) Table template generation method, device, equipment and storage medium
CN111582143A (en) Student classroom attendance method and device based on image recognition and storage medium
WO2015012820A1 (en) Method and system for data identification and extraction using pictorial representations in a source document

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant