CN117156152A - Model training method, encoding method, decoding method and equipment - Google Patents

Info

Publication number
CN117156152A
CN117156152A (application number CN202210550999.1A)
Authority
CN
China
Prior art keywords
image
information
target
sample carrier
loss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210550999.1A
Other languages
Chinese (zh)
Inventor
常勤伟
杨天舒
刘绍腾
刘华罗
黄磊超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202210550999.1A priority Critical patent/CN117156152A/en
Publication of CN117156152A publication Critical patent/CN117156152A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46Embedding additional information in the video signal during the compression process
    • H04N19/467Embedding additional information in the video signal during the compression process characterised by the embedded information being invisible, e.g. watermarking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a model training method, an encoding method, a decoding method, and equipment, which can be applied to scenes such as network media, images, data security, cloud technology, artificial intelligence, intelligent traffic, and assisted driving. The model training method comprises the following steps: acquiring a sample carrier image and sample watermark information; encoding the sample carrier image and the sample watermark information through an initial encoder model to obtain an encoded image; decoding the encoded image through an initial decoder model to obtain decoded watermark information; determining cyclic consistency loss information according to the encoded image and the sample carrier image; determining second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information; and training the initial encoder model based on the cyclic consistency loss information and the second loss information to obtain a target encoder model. The target encoder model thus reduces the impact on the image quality of an image to be encoded after target watermark information is embedded into it.

Description

Model training method, encoding method, decoding method and equipment
Technical Field
The application relates to the technical field of computers, in particular to a model training method, an encoding method, a decoding method and equipment.
Background
With the explosive growth of the mobile internet, massive amounts of data are transmitted over this open channel. Data can easily be converted, altered, copied, and distributed during network transmission. These various processing and propagation means make the true ownership and copyright of data difficult to trace, and others can easily claim ownership through secondary authoring such as modification or deletion. Therefore, watermarking, a technique for protecting copyright and maintaining content authentication, has been introduced into various copyright-protection scenarios.
Compared with a traditional, clearly visible identification watermark, a digital watermark is invisible, carries a large amount of information, has customizable content, and does not mar the image. It minimizes the impact on the viewer's experience while still marking copyright attribution and enabling leakage sources to be traced.
To protect the copyright of an image or video, a digital watermark is often added to the image. In the related art, however, an encoder trained with existing encoder training methods has a great influence on image quality when the digital watermark is embedded into the image.
Disclosure of Invention
The embodiments of the application provide a model training method, an encoding method, a decoding method, and equipment. After sample watermark information and a sample carrier image are encoded by an initial encoder model, the initial encoder model can be trained based on the cyclic consistency loss between the resulting encoded image and the sample carrier image. The target encoder model obtained through this training can effectively reduce the influence on the original image quality of an image to be encoded after target watermark information is embedded into it.
In one aspect, a model training method is provided, the method comprising: acquiring a sample carrier image and sample watermark information;
encoding the sample carrier image and the sample watermark information through an initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information;
decoding the encoded image through an initial decoder model to obtain decoded watermark information corresponding to the encoded image, wherein the initial decoder model corresponds to the initial encoder model;
determining first loss information according to the encoded image and the sample carrier image, wherein the first loss information is cyclic consistency loss information;
determining second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information;
training the initial encoder model based on the first loss information and the second loss information to obtain a target encoder model, wherein the target encoder model is used for encoding an image to be encoded and target watermark information to obtain a target encoded image embedded with the target watermark information.
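The patent does not pin down the concrete form of either loss term. As a hedged illustration only, the first (cyclic consistency) loss can be sketched as a pixel-wise L1 distance between the encoded image and the sample carrier image, and the watermark portion of the second loss as a binary cross-entropy between decoded and sample watermark bits; the weights `w1`, `w2` and the specific distance and cross-entropy choices are assumptions, not the claimed formulation:

```python
import numpy as np

def cycle_consistency_loss(encoded, carrier):
    # First loss information (illustrative): mean absolute (L1) difference
    # between the encoded image and the original sample carrier image.
    return float(np.mean(np.abs(encoded - carrier)))

def message_loss(decoded_bits, sample_bits, eps=1e-7):
    # Watermark term of the second loss information (illustrative): binary
    # cross-entropy between decoded bits and the sample watermark bits.
    p = np.clip(np.asarray(decoded_bits, dtype=np.float64), eps, 1 - eps)
    y = np.asarray(sample_bits, dtype=np.float64)
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))

def total_loss(encoded, carrier, decoded_bits, sample_bits, w1=1.0, w2=1.0):
    # Weighted combination used to update the initial encoder model;
    # the weights are hypothetical hyperparameters.
    return (w1 * cycle_consistency_loss(encoded, carrier)
            + w2 * message_loss(decoded_bits, sample_bits))
```

Driving the cyclic consistency term toward zero is what keeps the encoded image visually close to the carrier, while the message term keeps the watermark recoverable.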
In another aspect, there is provided an encoding method, including:
acquiring an image to be encoded and target watermark information;
encoding the image to be encoded and the target watermark information through the target encoder model to obtain a target encoded image embedded with the target watermark information;
the target encoder model is obtained by training an initial encoder model according to the model training method described in the first aspect.
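As a rough sketch of the contract the target encoder model fulfills (carrier image in, imperceptibly perturbed image out), the following toy additive embedder perturbs the carrier by one small signed pseudo-random pattern per watermark bit. This is not the patent's learned encoder; `alpha`, the random patterns, and the [0, 1] clipping range are illustrative assumptions:

```python
import numpy as np

def encode_image(carrier, watermark_bits, alpha=0.01, seed=0):
    # Toy stand-in for the target encoder model: add one small
    # +/- alpha pseudo-random pattern per watermark bit, so the
    # encoded image stays visually close to the carrier.
    rng = np.random.default_rng(seed)
    encoded = carrier.astype(np.float64).copy()
    for bit in watermark_bits:
        pattern = rng.standard_normal(carrier.shape)
        encoded += (1.0 if bit else -1.0) * alpha * pattern
    return np.clip(encoded, 0.0, 1.0)
```

With a small `alpha`, the mean absolute deviation from the carrier stays tiny, which is exactly the property the cyclic consistency loss enforces in the trained encoder.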
In another aspect, there is provided a decoding method, including:
acquiring a target coding image embedded with target watermark information;
decoding the target encoded image by a target decoder model to extract the target watermark information from the target encoded image;
the target decoder model is model training equipment for acquiring a sample carrier image and sample watermark information; encoding the sample carrier image and the sample watermark information through an initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information; decoding the coded image through an initial decoder model to obtain decoding watermark information corresponding to the coded image, wherein the initial decoder model corresponds to the initial encoder model; determining first loss information corresponding to the coded image and the sample carrier image according to the coded image and the sample carrier image, wherein the first loss information is cyclic consistency loss information; determining second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information; training the initial decoder model based on the first loss information and the second loss information.
In another aspect, there is provided a model training apparatus comprising:
the first acquisition unit is used for acquiring the sample carrier image and the sample watermark information;
the first coding unit is used for coding the sample carrier image and the sample watermark information through an initial coder model to obtain a coded image corresponding to the sample carrier image and the sample watermark information;
the first decoding unit is used for decoding the encoded image through an initial decoder model to obtain decoded watermark information corresponding to the encoded image, wherein the initial decoder model corresponds to the initial encoder model;
a first determining unit, configured to determine first loss information corresponding to the encoded image and the sample carrier image according to the encoded image and the sample carrier image, where the first loss information is cyclic consistency loss information;
a second determining unit configured to determine second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information;
the training unit is used for training the initial encoder model based on the first loss information and the second loss information to obtain a target encoder model, and the target encoder model is used for encoding an image to be encoded and target watermark information to obtain a target encoded image embedded with the target watermark information.
In another aspect, there is provided an encoding apparatus including:
the second acquisition unit is used for acquiring the image to be encoded and the target watermark information;
the second coding unit is used for coding the image to be coded and the target watermark information through the target coder model to obtain a target coded image embedded with the target watermark information;
the target encoder model is obtained by training an initial encoder model according to the model training method described in the first aspect.
In another aspect, there is provided a decoding apparatus including:
a third acquisition unit for acquiring a target encoded image in which target watermark information has been embedded;
a second decoding unit for decoding the target encoded image through a target decoder model to extract the target watermark information from the target encoded image;
the target decoder model is obtained by a model training device through: acquiring a sample carrier image and sample watermark information; encoding the sample carrier image and the sample watermark information through an initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information; decoding the encoded image through an initial decoder model to obtain decoded watermark information corresponding to the encoded image, wherein the initial decoder model corresponds to the initial encoder model; determining first loss information according to the encoded image and the sample carrier image, wherein the first loss information is cyclic consistency loss information; determining second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information; and training the initial decoder model based on the first loss information and the second loss information.
In another aspect, a computer readable storage medium is provided, the computer readable storage medium storing a computer program adapted to be loaded by a processor for performing the steps of the method according to any of the embodiments above.
In another aspect, a computer device is provided, the computer device comprising a processor and a memory, the memory having stored therein a computer program, the processor being configured to perform the steps of the method according to any of the embodiments above by calling the computer program stored in the memory.
In another aspect, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method according to any of the embodiments above.
In the embodiments of the application, a sample carrier image and sample watermark information are acquired; the sample carrier image and the sample watermark information are then encoded through an initial encoder model to obtain an encoded image corresponding to them; the encoded image is then decoded through an initial decoder model to obtain decoded watermark information corresponding to the encoded image, wherein the initial decoder model corresponds to the initial encoder model; first loss information is determined according to the encoded image and the sample carrier image, wherein the first loss information is cyclic consistency loss information; second loss information is then determined using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information; and the initial encoder model is trained based on the first loss information and the second loss information to obtain a target encoder model, which is used for encoding an image to be encoded and target watermark information to obtain a target encoded image embedded with the target watermark information. Thus, the initial encoder model can be trained based on the cyclic consistency loss between the sample carrier image and the encoded image obtained after the sample watermark information and the sample carrier image are encoded through the initial encoder model, and the target encoder model obtained through this training can effectively reduce the influence on the original image quality of an image to be encoded after target watermark information is embedded into it.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic structural diagram of a model training system according to an embodiment of the present application.
Fig. 2a is a schematic flow chart of a model training method according to an embodiment of the application.
Fig. 2b is a schematic diagram of a second flow chart of a model training method according to an embodiment of the present application.
Fig. 3 is a flowchart of an encoding method according to an embodiment of the present application.
Fig. 4 is a flow chart of a decoding method according to an embodiment of the present application.
Fig. 5 is a schematic structural diagram of a model training device according to an embodiment of the present application.
Fig. 6 is a schematic structural diagram of an encoding apparatus according to an embodiment of the present application.
Fig. 7 is a schematic structural diagram of a decoding device according to an embodiment of the present application.
Fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to fall within the scope of the application.
The embodiments of the application provide a model training method, an encoding method, a decoding method, and equipment. Specifically, the model training method of the embodiments can be executed by a computer device, where the computer device may be a terminal or a server. The terminal may be a smart phone, tablet computer, notebook computer, intelligent voice interaction device, smart household appliance, wearable smart device, aircraft, intelligent in-vehicle terminal, or other device, and may also run a client, such as a video client, browser client, or instant messaging client. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), big data, and artificial intelligence platforms.
The embodiment of the application can be applied to various scenes such as network media, images, data security, cloud technology, artificial intelligence, intelligent traffic, auxiliary driving and the like.
First, some of the terms appearing in the description of the embodiments of the application are explained as follows:
Artificial intelligence (Artificial Intelligence, AI): a theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
Machine Learning (ML): a multi-disciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specializes in studying how a computer can simulate or implement human learning behavior to acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied across all areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
Deep Learning (DL): a branch of machine learning; an algorithm that attempts to perform high-level abstraction of data using multiple processing layers that comprise complex structures or consist of multiple nonlinear transformations. Deep learning learns the inherent regularities and representation levels of training sample data, and the information obtained in this learning process is of great help in interpreting data such as text, images, and sound. Its ultimate goal is to enable machines to analyze and learn like a person, recognizing text, image, and sound data. Deep learning is a complex machine learning algorithm that has achieved results in speech and image recognition far exceeding those of earlier techniques.
Hyperparameters: in the context of machine learning, hyperparameters are parameters whose values are set before the learning process begins, rather than parameter data obtained through training. In general, the hyperparameters need to be optimized, selecting an optimal set of hyperparameters for the learning machine so as to improve learning performance and effect.
Neural Networks (NN): a deep learning model, studied in the fields of machine learning and cognitive science, that imitates the structure and function of biological neural networks.
Image steganography: the art and science of information hiding, i.e., preventing anyone other than the intended recipient from knowing that a message is being transferred or what its content is. Image steganography refers to the technique of hiding information in an image and recovering the original information from the image by additional means.
Watermark information: information to be embedded into an image, commonly a picture such as a station logo or platform logo, a character string such as a company or department name, or an identity string of the video or the image itself.
CNN (Convolutional Neural Network): a type of feedforward neural network that includes convolutional computation and has a deep structure. A convolutional neural network has representation-learning capability and can perform shift-invariant classification of input information according to its hierarchical structure.
GAN: generative Adversarial Nets, one of the deep learning models is generated for use in unsupervised learning over complex distributions. The model frame typically contains at least two modules: the generation module and the discrimination module are used for generating better output finally through mutual game learning of the two modules.
Encoder: a model that embeds watermark information into a carrier image while reducing the influence on the image quality of the carrier image as much as possible.
Decoder: a model that extracts watermark information from the embedded image.
Robustness: used to evaluate the degree to which watermark information is retained after the watermarked image is subjected to various attacks. High robustness means the watermark information can still be well preserved after the image undergoes attacks (noise, encoding, image compression, etc.).
Detection rate: the ratio of the number of images from which watermark information is successfully extracted to the total number of test images.
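The detection rate above is a plain ratio; a minimal sketch, assuming exact bit-for-bit recovery as the success criterion (the patent does not specify one):

```python
def detection_rate(extracted_list, expected_list):
    # Fraction of test images whose watermark bits were recovered
    # exactly; exact-match success is an illustrative assumption.
    hits = sum(1 for got, want in zip(extracted_list, expected_list)
               if got == want)
    return hits / len(expected_list)
```

For example, recovering the watermark from two of three test images gives a detection rate of 2/3.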
Mode collapse: in GAN training, the phenomenon in which the generated images have a single mode and lack diversity, caused by poor parameter initialization or flaws in the design of the loss function.
PSNR: peak signal-to-noise ratio, an engineering term denoting the ratio between the maximum possible power of a signal and the power of the destructive noise that affects the fidelity of its representation. Because many signals have a very wide dynamic range, the peak signal-to-noise ratio is usually expressed in logarithmic decibel units. PSNR is often used as a measure of signal reconstruction quality in fields such as image compression, and it is usually defined simply via the MSE (Mean Square Error).
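The MSE-based definition of PSNR mentioned above can be written out directly; this sketch assumes 8-bit images, so `max_val` defaults to 255:

```python
import math

import numpy as np

def psnr(original, distorted, max_val=255.0):
    # PSNR = 10 * log10(MAX^2 / MSE), expressed in decibels.
    diff = (np.asarray(original, dtype=np.float64)
            - np.asarray(distorted, dtype=np.float64))
    mse = float(np.mean(diff ** 2))
    if mse == 0.0:
        return math.inf  # identical images: infinite PSNR
    return 10.0 * math.log10(max_val ** 2 / mse)
```

Higher PSNR between the carrier image and the encoded image means the embedded watermark has degraded the image less.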
Intelligent transportation applies a new generation of information technologies, such as the Internet of Things, spatial perception, cloud computing, and the mobile internet, across the entire transportation field, and comprehensively uses theories and tools such as traffic science, systems methods, artificial intelligence, and knowledge mining. Aiming at comprehensive perception, deep fusion, active service, and scientific decision-making, it deeply mines transportation-related data by building real-time dynamic information service systems, forming problem-analysis models. This improves the industry's capabilities in resource allocation optimization, public decision-making, industry management, and public service; promotes safer, more efficient, more convenient, more economical, more environmentally friendly, and more comfortable operation and development of transportation; and drives the transformation and upgrading of transportation-related industries.
The intelligent transportation system (Intelligent Traffic System, ITS), also called the Intelligent Transportation System, effectively and comprehensively applies advanced science and technology (information technology, computer technology, data communication technology, sensor technology, electronic control technology, automatic control theory, operations research, artificial intelligence, etc.) to transportation, service control, and vehicle manufacturing, strengthening the connection among vehicles, roads, and users, thereby forming an integrated transportation system that ensures safety, improves efficiency, improves the environment, and saves energy.
Cloud technology (Cloud technology): a hosting technology that unifies a series of resources such as hardware, software, and networks in a wide area network or local area network to realize the computation, storage, processing, and sharing of data. Cloud technology is the general term for the network, information, integration, management-platform, and application technologies applied under the cloud computing business model; it can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support. The background services of technical network systems require large amounts of computing and storage resources, for example video websites, picture websites, and other portal websites. With the rapid development and application of the internet industry, every item may have its own identification mark in the future, which will need to be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data require strong back-end system support, which can only be realized through cloud computing.
Based on generative adversarial networks, the scheme develops a set of digital watermark embedding and detection algorithms with high concealment, strong attack resistance, and good stability, which can be applied to various copyright-protection scenarios such as images and videos, providing technical support for the copyright protection of many services.
Fig. 1 is a schematic structural diagram of a model training system according to an embodiment of the present application, where the model training system includes a terminal 10, a server 20, and the like; the terminal 10 and the server 20 are connected to each other through a network, for example, a wired or wireless network connection.
The terminal 10 may be used to display a graphical user interface. The terminal 10 interacts with the user through the graphical user interface; for example, the terminal 10 downloads, installs, and runs a corresponding client, invokes and runs a corresponding applet, or presents a corresponding graphical user interface through a login website. In the embodiments of the application, the terminal 10 may be used by related objects, such as related personnel, to upload the image to be encoded and the target watermark information to be embedded in it. In the model training stage, the related object may also upload, through the terminal 10, the sample carrier image and the sample watermark information to be embedded in it; the terminal 10 may then send the sample carrier image and the sample watermark information to the server 20, so that the server 20 trains the initial encoder model and feeds the trained target encoder model back to the terminal 10.
Optionally, the server 20 may also store the target encoder model locally; after obtaining the image to be encoded and the target watermark information to be embedded in it, the server encodes them to obtain an encoding result including the target encoded image. After determining the encoding result, the server 20 may store, distribute, or otherwise process the encoding result.
Alternatively, the terminal 10 may directly train the initial encoder model with the sample carrier image uploaded by the related object and the sample watermark information to be embedded in it, to obtain a target encoder model; then, after obtaining the image to be encoded and the target watermark information to be embedded in it, the terminal encodes them to obtain an encoding result including the target encoded image.
Optionally, the terminal 10 may further send the target encoder model obtained by training to the server 20, so that after the server 20 obtains the image to be encoded and the target watermark information to be embedded in the image to be encoded, the image to be encoded and the target watermark information are encoded to obtain an encoding result including the target encoded image.
The present embodiment is described below taking the server 20 training an initial encoder model as an example. Specifically, when performing model training, the server 20 may be configured to: acquire a sample carrier image and sample watermark information; encode the sample carrier image and the sample watermark information through an initial encoder model to obtain an encoded image corresponding to them; decode the encoded image through an initial decoder model to obtain decoded watermark information corresponding to the encoded image, wherein the initial decoder model corresponds to the initial encoder model; determine first loss information according to the encoded image and the sample carrier image, wherein the first loss information is cyclic consistency loss information; determine second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information; and train the initial encoder model based on the first loss information and the second loss information to obtain a target encoder model.
The following describes the specific embodiments in detail. It should be noted that the order in which the embodiments are described below does not limit their relative priority.
The embodiment of the application provides a model training method which can be executed by a terminal or a server or can be executed by the terminal and the server together; the embodiment of the application is described by taking a model training method executed by a server as an example. Fig. 2a is a schematic flow chart of a model training method according to an embodiment of the present application, where the method includes:
S201, acquiring a sample carrier image and sample watermark information.
The sample carrier image is any image in which sample watermark information is to be embedded; the sample watermark information may be text information, image information, and the like. The sample carrier image may be any image in a sample carrier image set, i.e., the set of sample carrier images used to train the target encoder model.
S202, encoding the sample carrier image and the sample watermark information through an initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information.
The initial encoder model may include a first network unit and a second network unit, and in S202, encoding the sample carrier image and the sample watermark information by the initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information, including:
extracting features of the sample carrier image through the first network unit to obtain a feature image corresponding to the sample carrier image;
and encoding the feature image and the sample watermark information through the second network unit to obtain an encoded image corresponding to the sample carrier image and the sample watermark information.
Optionally, the first network unit performs feature extraction on the sample carrier image to obtain image features of the sample carrier image and generates a corresponding feature image from those features. The image features mainly comprise the color, texture, shape, and spatial-relationship features of the image. The color feature is a global feature describing the surface properties of the scene corresponding to the image or an image region; the texture feature is likewise a global feature describing those surface properties. Shape features have two representations: contour features, which mainly concern the outer boundary of an object, and region features, which concern the entire shape region. Spatial-relationship features describe the mutual spatial positions or relative directions of the multiple targets segmented from the image; these relationships may be divided into connection/adjacency, overlap, inclusion/containment, and the like.
Alternatively, the number of the feature images may be one or more.
Optionally, in S202, the encoding of the sample carrier image and the sample watermark information by the initial encoder model may be achieved by the following formula:
Y = F2(F1(X) + m1);
wherein X is the sample carrier image; m1 is the sample watermark information; F1() is the function by which the first network unit determines the feature image corresponding to the sample carrier image; F2() is the function by which the second network unit determines the encoded image from the feature image and the sample watermark information; Y is the encoded image.
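As an illustrative sketch of this composition (not the patent's actual networks), the toy Python below substitutes placeholder element-wise functions for the two CNN units; the specific operations inside f1 and f2, the pixel values, and the clamping step are all assumptions:

```python
# Toy sketch of the encoder composition Y = F2(F1(X) + m1).
# f1 and f2 are placeholders for the first and second CNN units;
# they are simple element-wise functions so the data flow is checkable.

def f1(x):
    """Placeholder first network unit: 'feature extraction'."""
    return [0.5 * v for v in x]

def f2(z):
    """Placeholder second network unit: encode and clamp to pixel range."""
    return [min(1.0, max(0.0, v)) for v in z]

def encode(x, m1):
    """Y = F2(F1(X) + m1): superpose the watermark on the features."""
    features = f1(x)
    fused = [f + m for f, m in zip(features, m1)]
    return f2(fused)

x = [0.2, 0.8, 0.6, 0.4]    # flattened toy "sample carrier image"
m1 = [0.0, 0.1, 0.0, 0.1]   # toy watermark signal
y = encode(x, m1)           # toy "encoded image"
```

The point of the sketch is only the order of operations: the watermark is added to the feature image produced by F1, and the sum is then re-encoded by F2.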
Optionally, F1() may be the first network unit itself and F2() the second network unit itself; the first network unit and the second network unit may be two independent CNN networks.
After F1() extracts the image features of the sample carrier image, the regions and features of the sample carrier image that are easily damaged, or in which the superposed sample watermark information would be easily observed, can be removed, thereby screening out stable regions and features with little influence on image quality; such regions may include richly textured regions or smoother regions.
F2() superposes the feature image and the sample watermark information and encodes the superposition result to generate the encoded image.
S203, decoding the encoded image through an initial decoder model to obtain decoded watermark information corresponding to the encoded image, wherein the initial decoder model corresponds to the initial encoder model.
Wherein the initial decoder model corresponds to the initial encoder model. For example, the functions of the functional modules in the initial decoder model are the inverses of the functions of the corresponding modules in the initial encoder model, so as to implement the decoding process. For example, the initial encoder model is used to embed the sample watermark information into the sample carrier image to obtain the encoded image; the initial decoder model is used to extract the decoded watermark information from the encoded image, where the decoded watermark information is, ideally, the same as the sample watermark information.
Optionally, in S203, decoding the encoded image through the initial decoder model to obtain the decoded watermark information corresponding to the encoded image may be achieved by the following formula:
m̂1 = D(Y);
wherein Y is the encoded image, which may also be a noisy encoded image obtained after further embedding preset noise information; D() is the decoding function corresponding to the initial decoder model and may be a CNN network with a decoding function; m̂1 is the decoded watermark information, i.e., the watermark information obtained after the watermarked encoded image Y is decoded.
S204, according to the encoded image and the sample carrier image, determining first loss information corresponding to the encoded image and the sample carrier image, wherein the first loss information is cyclic-consistency loss information.
Specifically, in S204, first loss information corresponding to the encoded image and the sample carrier image is determined according to the encoded image and the sample carrier image, including S2041-S2043 (not shown):
S2041, processing the encoded image through a third network unit to obtain a first image;
S2042, processing the sample carrier image through the third network unit to obtain a second image;
S2043, determining first loss information corresponding to the encoded image and the sample carrier image using the first image, the second image, and the sample carrier image.
Optionally, in S2041, the processing, by the third network unit, of the encoded image to obtain a first image includes:
and taking the coded image as the input of a function corresponding to the third network element, and executing the function corresponding to the third network element to obtain the first image.
Alternatively, the third network element may be a CNN network.
Specifically, processing the encoded image through the third network unit to obtain the first image may be implemented by the following formula:
X̂ = FF(Y);
wherein Y is the encoded image; FF() is the function corresponding to the third network unit, which may specifically be a function for removing the watermark information in the encoded image; X̂ is the first image.
Optionally, in S2042, the sample carrier image is processed by a third network element to obtain a second image, including:
and taking the sample carrier image as the input of the function corresponding to the third network element, and executing the function corresponding to the third network element to obtain the second image.
Specifically, processing the sample carrier image through the third network unit to obtain the second image may be achieved by the following formula:
X̃ = FF(X);
wherein X is the sample carrier image; FF() is the function corresponding to the third network unit, which may specifically be a function for removing watermark information (the sample carrier image containing none, it should be left unchanged); X̃ is the second image.
The aforementioned third network unit is an auxiliary network: it may be used to remove the watermark information in an encoded image, and also to leave a sample carrier image as it is. For example, when the input is an encoded image, the output is the first image, obtained after the watermark information in the encoded image is removed; when the input is a sample carrier image, the output is the second image. In this way, when the input of the third network unit is the encoded image, the output first image is the image with the watermark information removed as far as possible; under the constraint of the loss function, this output must approach the sample carrier image as closely as possible. The data path from input image to encoded image to first image thus forms a closed loop, i.e., the cyclic consistency of the network, which reduces the influence of watermark embedding on the image quality of the sample carrier image. When the input is a sample carrier image, the output second image should remain as intact as possible, with no redundant operation applied to the sample carrier image, so that the output is the original image without watermark information rather than some other scrambled image. After the third network unit is added, the network can better constrain the encoder through the decoded image, preventing the encoder from neglecting image quality during watermark embedding and avoiding the generation of pictures containing noise points and mosaics.
Optionally, the first loss information includes: the first sub-loss information and the second sub-loss information; in S2043, first loss information corresponding to the encoded image and the sample carrier image is determined using the first image, the second image, and the sample carrier image, including S31-S33 (not shown):
S31, determining first sub-loss information corresponding to the encoded image using the first image and the sample carrier image;
S32, determining second sub-loss information corresponding to the sample carrier image using the second image and the sample carrier image;
S33, taking the sum of the first sub-loss information and the second sub-loss information as the first loss information.
In some optional embodiments of the application, determining first sub-loss information corresponding to the encoded image using the first image and the sample carrier image may comprise:
calculating the two-norm of the difference between the first image and the sample carrier image to obtain a first numerical value;
and taking the first value as first sub-loss information.
For example, determining the first sub-loss information corresponding to the encoded image using the first image and the sample carrier image may be achieved by the following formula:
LC1 = ||X̂ - X||2;
wherein X is the sample carrier image; X̂ is the first image; LC1 is the first sub-loss information.
In some alternative embodiments of the present application, determining second sub-loss information corresponding to the sample carrier image using a second image and the sample carrier image may include:
calculating the two-norm of the difference between the second image and the sample carrier image to obtain a second numerical value;
and taking the second value as second sub-loss information.
For example, determining the second sub-loss information corresponding to the sample carrier image using the second image and the sample carrier image may be achieved by the following formula:
LC2 = ||X̃ - X||2;
wherein X is the sample carrier image; X̃ is the second image; LC2 is the second sub-loss information.
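Numerically, the two sub-losses of S31-S33 and their sum can be sketched as below; the tiny flattened "images" are illustrative assumptions, not real network outputs:

```python
import math

def two_norm(a, b):
    """Two-norm of the element-wise difference between two vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

x     = [0.2, 0.8, 0.6, 0.4]  # sample carrier image X (flattened)
x_hat = [0.2, 0.7, 0.6, 0.4]  # first image: third unit applied to encoded image
x_tld = [0.2, 0.8, 0.5, 0.4]  # second image: third unit applied to X itself

lc1 = two_norm(x_hat, x)      # first sub-loss  LC1 = ||X_hat - X||2
lc2 = two_norm(x_tld, x)      # second sub-loss LC2 = ||X_tld - X||2
first_loss = lc1 + lc2        # first (cyclic-consistency) loss, per S33
```

Both sub-losses vanish exactly when the third network unit reproduces the sample carrier image, which is the cyclic-consistency target described above.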
S205, determining second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information.
Optionally, in the foregoing S205, determining the second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information includes S2051-S2054 (not shown in the figure):
S2051, determining image reconstruction loss information corresponding to the sample carrier image and the encoded image by using the sample carrier image and the encoded image.
In some optional embodiments of the present application, in S2051, determining the image reconstruction loss information corresponding to the sample carrier image and the encoded image using the sample carrier image and the encoded image may include:
calculating the difference between the sample carrier image and the coded image to obtain a first difference value;
calculating the two-norm of the first difference value to obtain a third numerical value;
calculating a peak signal-to-noise value for the sample carrier image and the encoded image;
summing the third numerical value and the peak signal-to-noise ratio value to obtain a summation result;
and taking the summation result as the image reconstruction loss information.
For example, determining image reconstruction loss information corresponding to the sample carrier image and the encoded image using the sample carrier image and the encoded image may be accomplished by the following equation:
LI = ||X - Y||2 + PSNR(X, Y);
wherein X is the sample carrier image; Y is the encoded image; PSNR() is the peak signal-to-noise ratio function; LI is the image reconstruction loss information.
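A numeric sketch of LI, reproduced exactly as the formula states; note that PSNR increases as the two images become more similar, so a practical implementation might negate or invert that term, and the pixel values below are assumptions:

```python
import math

def psnr(x, y, max_val=1.0):
    """Peak signal-to-noise ratio of two flattened images, in decibels."""
    mse = sum((u - v) ** 2 for u, v in zip(x, y)) / len(x)
    return 10 * math.log10(max_val ** 2 / mse)

def image_reconstruction_loss(x, y):
    """LI = ||X - Y||2 + PSNR(X, Y), per the formula above."""
    norm_term = math.sqrt(sum((u - v) ** 2 for u, v in zip(x, y)))
    return norm_term + psnr(x, y)

x = [0.5, 0.5, 0.5, 0.5]  # sample carrier image (flattened)
y = [0.4, 0.6, 0.5, 0.5]  # encoded image
li = image_reconstruction_loss(x, y)
```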
S2052, determining adversarial loss information corresponding to the sample carrier image and the encoded image by using the sample carrier image, the encoded image, and a preset discriminator.
Optionally, in S2052, determining, using the sample carrier image, the encoded image, and a preset discriminator, the adversarial loss information corresponding to the sample carrier image and the encoded image may include:
inputting the sample carrier image and the encoded image into the discriminator and executing the discriminator to obtain a discrimination result;
and determining the adversarial loss information corresponding to the sample carrier image and the encoded image according to the discrimination result.
Optionally, determining the adversarial loss information corresponding to the sample carrier image and the encoded image according to the discrimination result may be achieved by the following formula:
LG = E[log D_X] + E[log(1 - D_Y)];
wherein D_X is the output of the discriminator when its input is the sample carrier image; D_Y is the output of the discriminator when its input is the encoded image; E[] denotes the expectation; LG is the adversarial loss information.
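A minimal numeric sketch of this adversarial term, assuming the discriminator outputs probabilities in (0, 1) and taking the expectations as batch averages (the batch values themselves are assumptions):

```python
import math

def adversarial_loss(d_x, d_y):
    """LG = E[log D_X] + E[log(1 - D_Y)], with expectations taken as
    batch averages of discriminator outputs in (0, 1)."""
    e_real = sum(math.log(p) for p in d_x) / len(d_x)
    e_fake = sum(math.log(1.0 - p) for p in d_y) / len(d_y)
    return e_real + e_fake

# d_x: discriminator outputs on sample carrier images (ideally near 1);
# d_y: discriminator outputs on encoded images (ideally near 0).
lg = adversarial_loss([0.9, 0.8], [0.2, 0.1])
```

LG is largest (closest to zero) when the discriminator scores real images near 1 and encoded images near 0, which is the standard GAN objective for the discriminator.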
S2053, determining watermark reconstruction loss information corresponding to the sample watermark information and the decoded watermark information by using the sample watermark information and the decoded watermark information.
Optionally, in S2053, determining watermark reconstruction loss information corresponding to the sample watermark information and the decoded watermark information by using the sample watermark information and the decoded watermark information includes:
calculating the two-norm of a second difference value between the sample watermark information and the decoded watermark information to obtain a fourth numerical value;
and taking the fourth numerical value as watermark reconstruction loss information corresponding to the sample watermark information and the decoding watermark information.
For example, using the sample watermark information and the decoded watermark information, determining watermark reconstruction loss information corresponding to the sample watermark information and the decoded watermark information may be achieved by the following formula:
LM = ||m1 - m̂1||2;
wherein m1 is the sample watermark information; m̂1 is the decoded watermark information; LM is the watermark reconstruction loss information.
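The watermark reconstruction term is simply a two-norm; a minimal sketch with assumed binary watermark bits:

```python
import math

def watermark_reconstruction_loss(m1, m1_hat):
    """LM = ||m1 - m1_hat||2: two-norm of the difference between the
    embedded watermark and the decoded watermark."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(m1, m1_hat)))

# One of four assumed watermark bits is decoded incorrectly.
lm = watermark_reconstruction_loss([1, 0, 1, 1], [1, 0, 1, 0])
```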
S2054, determining second loss information based on the image reconstruction loss information, the adversarial loss information, and the watermark reconstruction loss information.
Optionally, in S2054, determining the second loss information based on the image reconstruction loss information, the adversarial loss information, and the watermark reconstruction loss information includes:
acquiring a first preset weight corresponding to the image reconstruction loss information;
acquiring a second preset weight corresponding to the adversarial loss information;
acquiring a third preset weight corresponding to the watermark reconstruction loss information;
determining second loss information according to the first preset weight, the second preset weight, the third preset weight, the image reconstruction loss information, the adversarial loss information, and the watermark reconstruction loss information.
Specifically, determining the second loss information according to the first preset weight, the second preset weight, the third preset weight, the image reconstruction loss information, the adversarial loss information, and the watermark reconstruction loss information includes:
calculating the product of the first preset weight and the image reconstruction loss information to obtain a first product result;
calculating the product of the second preset weight and the adversarial loss information to obtain a second product result;
calculating the product of the third preset weight and the watermark reconstruction loss information to obtain a third product result;
and taking the sum of the first product result, the second product result, and the third product result as the second loss information.
For example, determining the second loss information from the first preset weight, the second preset weight, the third preset weight, the image reconstruction loss information, the adversarial loss information, and the watermark reconstruction loss information may be implemented by the following formula:
LOSS = w1*LI + w2*LG + w3*LM;
wherein w1 is the first preset weight; w2 is the second preset weight; w3 is the third preset weight; LI is the image reconstruction loss information; LG is the adversarial loss information; LM is the watermark reconstruction loss information.
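The weighted combination can be sketched directly; the weight values below are illustrative assumptions, since w1, w2, and w3 are left as preset values here:

```python
def second_loss(li, lg, lm, w1=1.0, w2=0.1, w3=1.0):
    """LOSS = w1*LI + w2*LG + w3*LM with assumed preset weights."""
    return w1 * li + w2 * lg + w3 * lm

# Assumed component values for illustration only.
loss = second_loss(li=2.0, lg=-0.3, lm=1.0)
```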
S206, training the initial encoder model based on the first loss information and the second loss information to obtain a target encoder model, wherein the target encoder model is used for encoding an image to be encoded and target watermark information to obtain a target encoded image embedded with the target watermark information.
Optionally, in S206, training the initial encoder model based on the first loss information and the second loss information includes:
acquiring a fourth preset weight corresponding to the first sub-loss information;
obtaining a fifth preset weight corresponding to the second sub-loss information;
determining a total loss result according to the fourth preset weight, the fifth preset weight, the first sub-loss information, the second sub-loss information and the second loss information;
if the total loss result is smaller than a preset threshold value, training the initial encoder model is completed;
and if the total loss result is not smaller than the preset threshold value, adjusting the hyper-parameters related to the parameters in the initial encoder model and returning to the step of encoding the sample carrier image and the sample watermark information through the initial encoder model to obtain the corresponding encoded image, until the total loss result is smaller than the preset threshold value, at which point training of the initial encoder model is complete.
Wherein the purpose of adjusting the super-parameters related to the parameters in the initial encoder model is to adjust the parameters in the initial encoder model.
Optionally, the super-parameters related to the parameters in the initial encoder model may also be learning rates corresponding to the initial encoder model training process.
The learning rate is an important hyper-parameter in supervised learning and deep learning: it determines whether the objective function can converge to a local minimum and when it does so. An appropriate learning rate enables the objective function to converge to a local minimum in an appropriate time. The learning rate is the hyper-parameter that scales the gradient of the loss function when updating the network weights; the lower the learning rate, the slower the weights are updated and the slower the loss function changes.
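The iterate-until-below-threshold rule in the steps above can be sketched as a loop; the geometric loss decay, threshold, and iteration cap are all illustrative assumptions:

```python
def train_until_threshold(initial_loss, step, threshold=0.05, max_iters=1000):
    """Repeat the encode/evaluate/adjust round until the total loss
    result falls below the preset threshold (or an iteration cap)."""
    loss, iters = initial_loss, 0
    while loss >= threshold and iters < max_iters:
        loss = step(loss)  # one round: encode, compute losses, adjust hyper-parameters
        iters += 1
    return loss, iters

# Assume each round shrinks the total loss by 20%.
final_loss, n = train_until_threshold(1.0, lambda l: 0.8 * l)
```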
In some optional embodiments of the application, determining the total loss result according to the fourth preset weight, the fifth preset weight, the first sub-loss information, the second sub-loss information, and the second loss information comprises:
calculating the product of the fourth preset weight and the first sub-loss information to obtain a fourth product result;
calculating the product of the fifth preset weight and the second sub-loss information to obtain a fifth product result;
and summing the fourth product result, the fifth product result, and the second loss information to obtain the total loss result.
Wherein the total loss result may comprise a plurality of loss results.
Optionally, the first product result, the second product result, the third product result, the fourth product result, and the fifth product result are, respectively, a first loss result, a second loss result, a third loss result, a fourth loss result, and a fifth loss result in the total loss result. That is, the total loss result comprises multiple loss results: the first loss result, the second loss result, the third loss result, the fourth loss result, and the fifth loss result.
In some alternative embodiments of the application, adjusting the super-parameters related to the parameters in the initial encoder model includes:
acquiring the loss results respectively corresponding to the first sub-loss information, the second sub-loss information, the image reconstruction loss information, the adversarial loss information, and the watermark reconstruction loss information to obtain multiple loss results;
determining a target loss result with the largest numerical value from the multiple loss results;
Adjusting the target weight corresponding to the target loss result;
wherein the target weight corresponding to the target loss result is a hyper-parameter related to a parameter in the initial encoder model.
Specifically, adjusting the target weight corresponding to the target loss result includes controlling the target weight corresponding to the target loss result to increase.
Optionally, the target weight corresponding to the target loss result is a weight corresponding to loss information for determining the target loss result.
For example, when the target loss result is the fifth product result, the target weight is a fifth preset weight corresponding to the second sub-loss information.
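The adjust-the-largest-loss rule above can be sketched as follows; the dictionary keys, the loss values, and the multiplicative factor are illustrative assumptions, since the text only states that the target weight is increased:

```python
def adjust_target_weight(loss_results, weights, factor=1.1):
    """Pick the loss result with the largest value and increase its
    corresponding weight (a hyper-parameter related to the model
    parameters); the factor is an assumed choice."""
    target = max(loss_results, key=loss_results.get)
    new_weights = dict(weights)
    new_weights[target] *= factor
    return target, new_weights

losses  = {"lc1": 0.2, "lc2": 0.5, "li": 0.3, "lg": 0.1, "lm": 0.4}
weights = {"lc1": 1.0, "lc2": 1.0, "li": 1.0, "lg": 1.0, "lm": 1.0}
target, new_weights = adjust_target_weight(losses, weights)
```

Increasing the weight of the worst-performing term steers subsequent updates toward reducing that term first.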
Optionally, the method further comprises: training the initial decoder model based on the first loss information and the second loss information to obtain a target decoder model, wherein the target decoder model is used for decoding the target encoded image so as to extract the target watermark information from the target encoded image.
In some optional embodiments of the application, optionally, training the initial decoder model based on the first loss information and the second loss information comprises:
Acquiring a fourth preset weight corresponding to the first sub-loss information;
obtaining a fifth preset weight corresponding to the second sub-loss information;
determining a total loss result according to the fourth preset weight, the fifth preset weight, the first sub-loss information, the second sub-loss information and the second loss information;
if the total loss result is smaller than a preset threshold value, training the initial decoder model is completed;
and if the total loss result is not smaller than the preset threshold value, adjusting the hyper-parameters related to the parameters in the initial decoder model and returning to the step of encoding the sample carrier image and the sample watermark information through the initial encoder model to obtain the corresponding encoded image, until the total loss result is smaller than the preset threshold value, at which point training of the initial decoder model is complete.
Wherein the purpose of adjusting the super-parameters related to the parameters in the initial decoder model is to adjust the parameters in the initial decoder model.
Optionally, the super-parameters related to the parameters in the initial decoder model may also be learning rates corresponding to the initial decoder model training process.
In some optional embodiments of the application, determining the total loss result according to the fourth preset weight, the fifth preset weight, the first sub-loss information, the second sub-loss information, and the second loss information comprises:
calculating the product of the fourth preset weight and the first sub-loss information to obtain a fourth product result;
calculating the product of the fifth preset weight and the second sub-loss information to obtain a fifth product result;
and summing the fourth product result, the fifth product result, and the second loss information to obtain the total loss result.
Wherein the total loss result may comprise a plurality of loss results.
Optionally, the first product result, the second product result, the third product result, the fourth product result, and the fifth product result are, respectively, a first loss result, a second loss result, a third loss result, a fourth loss result, and a fifth loss result in the total loss result. That is, the total loss result comprises multiple loss results: the first loss result, the second loss result, the third loss result, the fourth loss result, and the fifth loss result.
In some alternative embodiments of the application, adjusting the super-parameters related to the parameters in the initial decoder model comprises:
acquiring the loss results respectively corresponding to the first sub-loss information, the second sub-loss information, the image reconstruction loss information, the adversarial loss information, and the watermark reconstruction loss information to obtain multiple loss results;
determining a target loss result with the largest numerical value from the multiple loss results;
adjusting the target weight corresponding to the target loss result;
wherein the target weight corresponding to the target loss result is a hyper-parameter related to a parameter in the initial decoder model.
Specifically, adjusting the target weight corresponding to the target loss result includes controlling the target weight corresponding to the target loss result to increase.
Optionally, the target weight corresponding to the target loss result is a weight corresponding to loss information for determining the target loss result.
For example, when the target loss result is the fifth product result, the target weight is a fifth preset weight corresponding to the second sub-loss information.
It should be noted that, in the present application, the parameters of the initial encoder model and the parameters of the initial decoder model are adjusted correspondingly, so that the adjustment of the parameters of the initial encoder model and the adjustment of the parameters of the initial decoder model may be performed not only in succession but also simultaneously.
Optionally, in determining the aforementioned LI, the PSNR represents the ratio between the maximum possible power of a signal and the power of the destructive noise that affects the fidelity of its representation. Because many signals have a very wide dynamic range, the peak signal-to-noise ratio is often expressed in logarithmic decibel units.
PSNR may be defined through the mean square error (MSE). For two m2 x n monochrome images X and Y (the sample carrier image and the encoded image), where one is a noise approximation of the other, their mean square error can be defined as:
MSE = (1/(m2*n)) * Σ_{i=0}^{m2-1} Σ_{j=0}^{n-1} [X(i, j) - Y(i, j)]^2;
wherein X is the sample carrier image; Y is the encoded image, the two images being the same size; m2 is the length of the image; n is the width of the image; (i, j) are the pixel coordinates in the image, i being the abscissa and j the ordinate.
For example, the specific calculation of PSNR is as follows:
PSNR = 10*log10(MAX^2/MSE) = 20*log10(MAX/√MSE);
wherein MAX is the maximum value an image point can take, which is 255 if each sample point is represented by 8 bits. For color images with three RGB values per point, the peak signal-to-noise ratio is defined similarly, except that the mean square error is the sum of the squared differences over all three channels divided by the image size and then by 3.
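A small monochrome example of the two definitions, assuming 8-bit pixels (MAX = 255) and a toy 2x2 image:

```python
import math

def mse(x, y):
    """Mean square error between two same-size monochrome images."""
    m2, n = len(x), len(x[0])
    return sum((x[i][j] - y[i][j]) ** 2
               for i in range(m2) for j in range(n)) / (m2 * n)

def psnr(x, y, max_val=255):
    """PSNR = 10 * log10(MAX^2 / MSE), in decibels."""
    return 10 * math.log10(max_val ** 2 / mse(x, y))

a = [[100, 100], [100, 100]]  # toy 2x2 "sample carrier image"
b = [[100, 100], [100, 110]]  # "encoded image": one pixel differs by 10
```

A single 10-level error in a 2x2 image gives an MSE of 25 and a PSNR of roughly 34 dB; identical images would give infinite PSNR, so the measure grows as distortion shrinks.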
By the mathematical nature of the mean square error, training the encoder with it tends to compare the error at each individual pixel, reducing the distortion of every pixel as much as possible. This overly local computation preserves specific image details, but large-area color blocks easily appear across the whole picture. The application therefore additionally introduces a PSNR loss term to ensure that image quality is preserved at the macroscopic level.
In the embodiment of the application, following the data flow path, the input image, i.e., the sample carrier image, is encoded to generate a watermark-embedded image, and the watermark-embedded image is passed through the third network unit to generate the decoded image. The decoded image must approach the input image as closely as possible under the constraint of the loss function. This process forms a closed loop, i.e., the cyclic consistency of the network. Furthermore, the third network unit accepts as input not only the encoded image but also the sample carrier image, constraining the decoded image to be as close as possible to the sample carrier image. The purpose is to regularize the effect of the third network unit so that its output image is the sample carrier image without watermark information rather than some other scrambled image. After the third network unit is added, the network can better constrain the encoder through the decoded image, preventing the encoder from neglecting image quality during watermark embedding and avoiding the generation of pictures containing noise points and mosaics.
Furthermore, in the cyclic-consistency loss, the present scheme employs a squared loss to constrain the network so that the encoded image produced by the encoder model is as close as possible to the sample carrier image.
Analyzing the loss functions of the application individually: the watermark reconstruction loss information compares the watermark information extracted by the decoder model with the original watermark information, helping the decoder model achieve a better watermark extraction effect. The image reconstruction loss information compares the difference between the encoded image generated by the encoder model and the sample carrier image, so that the generated image is as similar as possible to the sample carrier image and the image quality impact of watermark embedding is reduced. The adversarial loss information is the conventional loss function of a generative adversarial network; it guides the training trend of the network, helps the network converge, and helps the encoder model generate normal images rather than garbled output.
According to the scheme of the application, the details of the embedding and detection algorithms do not need to be designed manually; only the loss function needs to be designed and the training data specified, and a watermark embedding module and a watermark extraction module with the desired effect can be obtained through training. This end-to-end mode greatly improves research and development efficiency and saves research and development cost. Moreover, the parameters of the watermark algorithm do not need to be tuned by hand; they can be adjusted through deep learning, and a set of optimal parameters can be obtained quickly.
Moreover, introducing PSNR as an evaluation criterion in the loss function can effectively reduce the image quality loss. In addition, the third network unit provided by the scheme guides the data to flow back and form a cycle by comparing the output image of the encoder model with the original image, preventing the encoder model from focusing only on hiding information during training while ignoring the influence on image quality.
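For reference, PSNR as an image-quality criterion can be computed as follows (this is the standard definition, not code taken from the patent):

```python
import numpy as np

def psnr(original, distorted, max_val=255.0):
    # Peak signal-to-noise ratio in dB; higher means less quality loss.
    mse = np.mean((original.astype(np.float64)
                   - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return float(10.0 * np.log10((max_val ** 2) / mse))

orig = np.full((8, 8), 128.0)
noisy = orig + 4.0           # uniform offset of 4 gray levels, MSE = 16
score = psnr(orig, noisy)    # 10 * log10(255^2 / 16) ≈ 36.09 dB
```

To use PSNR inside a loss function, one would typically minimize a monotone decreasing transform of it (e.g. the MSE term itself), since higher PSNR is better.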
Optionally, a noise layer may be added between the encoder model and the decoder model, and may also be added at other locations. For example, before the sample carrier image is fed into the encoder model, different kinds of noise may be applied to the sample carrier image to expand the diversity of the data; or, before the encoded image or the sample carrier image is sent to the discriminator, a noise layer may be added to interfere with the discriminator's judgment of real versus generated images, so as to improve the quality of the finally generated image.
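One possible noise layer is sketched below, assuming additive Gaussian noise plus random pixel dropout; the patent does not fix the noise types, so both distortions and all parameter values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def noise_layer(img, rng, sigma=2.0, dropout_p=0.05):
    # Additive Gaussian noise plus random pixel dropout, applied e.g.
    # between encoder and decoder, or to the carrier image before
    # encoding, to diversify the training data.
    noisy = img + rng.normal(0.0, sigma, size=img.shape)
    mask = rng.random(img.shape) >= dropout_p   # keep ~95% of pixels
    return np.clip(noisy * mask, 0.0, 255.0)

img = np.full((16, 16), 100.0)
out = noise_layer(img, rng)
```

Training the decoder on such perturbed encoded images is what makes the extracted watermark robust to real-world distortions.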
Optionally, the first network unit, the second network unit, or the third network unit in the present application is not limited to a specific CNN structure, such as ResNet or MobileNet. Any network can be applied to the watermark embedding and detection scheme provided herein, and a network of a scale corresponding to the amount of training data can be selected.
It should be noted that, in the embodiment of the present application, the initial decoder model corresponds to the initial encoder model, and when the target encoder model is obtained through training, the target decoder model is also obtained correspondingly.
The following further describes the scheme with reference to fig. 2b, fig. 2b is a schematic diagram of a second flow of the model training method provided by the embodiment of the present application, and referring to fig. 2b, the model training device may obtain a sample carrier image and sample watermark information; extracting the characteristics of the sample carrier image through a first network unit in the encoder model to obtain a characteristic image corresponding to the sample carrier image; encoding the characteristic image and the sample watermark information through a second network unit in the encoder model to obtain an encoded image; decoding the coded image through a decoder model to obtain decoding watermark information corresponding to the coded image;
determining first loss information LC corresponding to the encoded image and the sample carrier image according to the encoded image and the sample carrier image, wherein the first loss information is cyclic consistency loss information; determining image reconstruction loss information LI corresponding to the sample carrier image and the encoded image using the sample carrier image and the encoded image; determining adversarial loss information LG corresponding to the sample carrier image and the encoded image by using the sample carrier image, the encoded image and a preset discriminator; determining watermark reconstruction loss information LM corresponding to the sample watermark information and the decoding watermark information by utilizing the sample watermark information and the decoding watermark information; and training the encoder model based on the image reconstruction loss information LI, the adversarial loss information LG, the watermark reconstruction loss information LM and the first loss information LC to obtain a target encoder model.
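The flow above combines the four loss terms LI, LG, LM and LC into one training objective. A minimal sketch of such a weighted combination follows; the weight values are hypothetical hyper-parameters, not values given in the text:

```python
def total_loss(li, lg, lm, lc, w_i=1.0, w_g=0.1, w_m=1.0, w_c=1.0):
    # Weighted sum of image reconstruction loss LI, adversarial loss LG,
    # watermark reconstruction loss LM and cyclic consistency loss LC.
    # The weights w_* are illustrative hyper-parameters.
    return w_i * li + w_g * lg + w_m * lm + w_c * lc

# Example with toy loss values from one hypothetical training step.
loss = total_loss(li=0.02, lg=0.7, lm=0.05, lc=0.01)
```

In practice these weights would be the preset weights described later in the text, and could themselves be adjusted during training.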
All the above technical solutions may be combined to form an optional embodiment of the present application, and will not be described in detail herein.
In the embodiment of the application, a sample carrier image and sample watermark information are obtained; the sample carrier image and the sample watermark information are encoded through an initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information; the encoded image is decoded through an initial decoder model to obtain decoding watermark information corresponding to the encoded image, wherein the initial decoder model corresponds to the initial encoder model; first loss information corresponding to the encoded image and the sample carrier image is determined according to the encoded image and the sample carrier image, wherein the first loss information is cyclic consistency loss information; second loss information is determined using the sample carrier image, the encoded image, the sample watermark information, and the decoding watermark information; and the initial encoder model is trained based on the first loss information and the second loss information to obtain the target encoder model. In this scheme, the initial encoder model can be trained based on the cyclic consistency loss between the sample carrier image and the encoded image obtained after the sample watermark information and the sample carrier image are encoded by the initial encoder model, so that the target encoder model obtained through training can effectively reduce the influence on the original image quality of an image to be encoded after the target watermark information is embedded into it.
The embodiments of the present application provide an encoding method, which may be executed by a terminal or a server, or may be executed by the terminal and the server together; the embodiment of the application is explained by taking the encoding method executed by the terminal as an example. Fig. 3 is a schematic flow chart of an encoding method according to an embodiment of the present application, where the encoding method includes S301-S302:
s301, obtaining an image to be encoded and target watermark information;
s302, encoding the image to be encoded and the target watermark information through the target encoder model to obtain a target encoded image embedded with the target watermark information;
the target encoder model is obtained by training an initial encoder model according to any model training method.
Optionally, the method further comprises: and transmitting the target coded image to target equipment.
The target device may be a terminal, or may be other devices.
By embedding the target watermark information into the image to be encoded through this encoding method, the influence of embedding the target watermark information on the original image quality of the image to be encoded can be effectively reduced by the target encoder model.
The foregoing may be referred to in the specific implementation manner corresponding to this embodiment, and will not be described herein.
The embodiments of the present application provide a decoding method, which may be executed by a terminal or a server, or may be executed by the terminal and the server together; the embodiment of the application is explained by taking the decoding method executed by the terminal as an example. Fig. 4 is a flowchart of a decoding method according to an embodiment of the present application, where the decoding method includes S401 to S402:
s401, acquiring a target coded image embedded with target watermark information;
s402, decoding the target coded image through a target decoder model to extract the target watermark information from the target coded image;
the target decoder model is model training equipment for acquiring a sample carrier image and sample watermark information; encoding the sample carrier image and the sample watermark information through an initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information; decoding the coded image through an initial decoder model to obtain decoding watermark information corresponding to the coded image, wherein the initial decoder model corresponds to the initial encoder model; determining first loss information corresponding to the coded image and the sample carrier image according to the coded image and the sample carrier image, wherein the first loss information is cyclic consistency loss information; determining second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information; training the initial decoder model based on the first loss information and the second loss information.
The target encoded image is obtained by encoding the image to be encoded and the target watermark information through a target encoder model. Training of the target encoder model and of the target decoder model is realized by means of the cyclic consistency loss determined according to the sample carrier image and the encoded image, wherein the encoded image is obtained by encoding the sample carrier image and the sample watermark information through the initial encoder model. By decoding the target encoded image with the target decoder model obtained through training with the cyclic consistency loss information, the authenticity of the target watermark information recovered after the target encoded image is decoded by the target decoder model is ensured.
The foregoing may be referred to in the specific implementation manner corresponding to this embodiment, and will not be described herein.
It should be noted that, when the embodiment of the present application is applied to a specific product or technology, user permission or consent needs to be obtained, and the collection, use and processing of the related data need to comply with related laws and regulations and standards of related countries and regions.
Fig. 5 is a schematic structural diagram of a model training apparatus according to an embodiment of the present application, where the model training apparatus 50 includes:
a first obtaining unit 51 for obtaining a sample carrier image and sample watermark information;
a first encoding unit 52, configured to encode the sample carrier image and the sample watermark information by using an initial encoder model, so as to obtain an encoded image corresponding to the sample carrier image and the sample watermark information;
a first decoding unit 53, configured to decode the encoded image through an initial decoder model, where the initial decoder model corresponds to the initial encoder model, to obtain decoded watermark information corresponding to the encoded image;
a first determining unit 54, configured to determine first loss information corresponding to the encoded image and the sample carrier image according to the encoded image and the sample carrier image, where the first loss information is cyclic consistency loss information;
a second determining unit 55 for determining second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information;
the training unit 56 is configured to train the initial encoder model based on the first loss information and the second loss information to obtain a target encoder model, where the target encoder model is used to encode an image to be encoded and target watermark information to obtain a target encoded image in which the target watermark information is embedded.
Optionally, the initial encoder model may include a first network unit and a second network unit, where the first encoding unit 52 is specifically configured to, when configured to encode the sample carrier image and the sample watermark information by using the initial encoder model, obtain an encoded image corresponding to the sample carrier image and the sample watermark information:
extracting the characteristics of the sample carrier image through the first network unit to obtain a characteristic image corresponding to the sample carrier image;
and encoding the characteristic image and the sample watermark information through the second network unit to obtain an encoded image corresponding to the sample carrier image and the sample watermark information.
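As a rough illustration of this two-stage encoder structure, the following numpy sketch stands in for the two network units; the single convolution kernel, the watermark-broadcast scheme, and all names are hypothetical simplifications of what would in practice be CNN layers:

```python
import numpy as np

def extract_features(carrier, kernel):
    # Stand-in for the first network unit: one valid 2-D convolution
    # producing a single feature map (a real model would use a CNN).
    h, w = carrier.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(carrier[i:i + kh, j:j + kw] * kernel)
    return out

def encode(features, watermark_bits):
    # Stand-in for the second network unit: tile the watermark bits
    # into a plane matching the feature map and stack the two channels.
    plane = np.resize(np.asarray(watermark_bits, dtype=float),
                      features.shape)
    return np.stack([features, plane])   # shape: (2, H, W)

carrier = np.arange(36, dtype=float).reshape(6, 6)
feats = extract_features(carrier, kernel=np.ones((3, 3)) / 9.0)
coded = encode(feats, watermark_bits=[1, 0, 1, 1])
```

A real second network unit would then fuse these stacked channels back into an image-shaped tensor, yielding the encoded image.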
Alternatively, the foregoing first determining unit 54 is specifically configured to, when configured to determine, from the encoded image and the sample carrier image, first loss information corresponding to the encoded image and the sample carrier image:
processing the coded image through a third network unit to obtain a first image;
processing the sample carrier image through the third network unit to obtain a second image;
first loss information corresponding to the encoded image and the sample carrier image is determined using the first image, the second image, and the sample carrier image.
Optionally, the first loss information includes: the first sub-loss information and the second sub-loss information; the first determining unit 54, when configured to determine the first loss information corresponding to the encoded image and the sample carrier image using the first image, the second image, and the sample carrier image, is specifically configured to:
determining first sub-loss information corresponding to the encoded image using the first image and the sample carrier image;
determining second sub-loss information corresponding to the sample carrier image using the second image and the sample carrier image;
and taking the sum of the first sub-loss information and the second sub-loss information as the first loss information.
Optionally, the aforementioned second determining unit 55, when configured to determine the second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information, is specifically configured to:
determining image reconstruction loss information corresponding to the sample carrier image and the encoded image using the sample carrier image and the encoded image;
determining adversarial loss information corresponding to the sample carrier image and the encoded image by using the sample carrier image, the encoded image and a preset discriminator;
Determining watermark reconstruction loss information corresponding to the sample watermark information and the decoding watermark information by using the sample watermark information and the decoding watermark information;
second loss information is determined based on the image reconstruction loss information, the adversarial loss information, and the watermark reconstruction loss information.
Optionally, the aforementioned second determining unit 55, when configured to determine the adversarial loss information corresponding to the sample carrier image and the encoded image using the sample carrier image, the encoded image, and a preset discriminator, is specifically configured to:
inputting the sample carrier image and the coded image into the discriminator, and executing the discriminator to obtain a discrimination result;
and determining adversarial loss information corresponding to the sample carrier image and the encoded image according to the discrimination result.
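A compact sketch of deriving the adversarial loss from the discriminator's result, using the standard binary cross-entropy formulation of a GAN (the patent does not fix a specific formulation, so this choice is an illustrative assumption):

```python
import numpy as np

def bce(pred, label):
    # Binary cross-entropy for a single discriminator score in (0, 1).
    eps = 1e-12
    return float(-(label * np.log(pred + eps)
                   + (1 - label) * np.log(1 - pred + eps)))

def adversarial_loss(d_real, d_fake):
    # d_real: discriminator score for the sample carrier image.
    # d_fake: discriminator score for the encoded image.
    # Discriminator side: push real -> 1, generated -> 0.
    d_loss = bce(d_real, 1.0) + bce(d_fake, 0.0)
    # Generator (encoder) side: fool the discriminator, generated -> 1.
    g_loss = bce(d_fake, 1.0)
    return d_loss, g_loss

d_loss, g_loss = adversarial_loss(d_real=0.9, d_fake=0.2)
```

Only `g_loss` enters the encoder's training objective; `d_loss` trains the discriminator itself.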
Optionally, the second determining unit 55, when configured to determine second loss information based on the image reconstruction loss information, the adversarial loss information, and the watermark reconstruction loss information, is specifically configured to:
acquiring a first preset weight corresponding to the image reconstruction loss information;
acquiring a second preset weight corresponding to the adversarial loss information;
Acquiring a third preset weight corresponding to the watermark reconstruction loss information;
determining second loss information according to the first preset weight, the second preset weight, the third preset weight, the image reconstruction loss information, the adversarial loss information and the watermark reconstruction loss information.
Optionally, the foregoing training unit 56 is specifically configured to, when used for training the initial encoder model based on the first loss information and the second loss information:
acquiring a fourth preset weight corresponding to the first sub-loss information;
obtaining a fifth preset weight corresponding to the second sub-loss information;
determining a total loss result according to the fourth preset weight, the fifth preset weight, the first sub-loss information, the second sub-loss information and the second loss information;
if the total loss result is smaller than a preset threshold value, training the initial encoder model is completed;
and if the total loss result is not smaller than the preset threshold value, adjusting the hyper-parameters related to the parameters in the initial encoder model, and returning to the step of encoding the sample carrier image and the sample watermark information through the initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information, until the loss result is smaller than the preset threshold value and training of the initial encoder model is completed.
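The threshold-controlled loop described above can be caricatured as follows; the fixed loss decrement is a placeholder for one real optimisation step (forward pass, loss computation, weight update):

```python
def train_until_threshold(initial_loss, step, threshold, max_iters=1000):
    # Keep "training" until the total loss result drops below the
    # preset threshold, mirroring the stopping rule in the text.
    loss = initial_loss
    iters = 0
    while loss >= threshold and iters < max_iters:
        loss -= step          # placeholder for one optimisation step
        iters += 1
    return loss, iters

final_loss, iters = train_until_threshold(initial_loss=1.0,
                                          step=0.1,
                                          threshold=0.35)
```

A `max_iters` cap is included because a real loss is not guaranteed to cross the threshold monotonically.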
Optionally, the foregoing training unit 56, when used for adjusting the hyper-parameters related to the parameters in the initial encoder model, is specifically configured to:
acquiring the loss results respectively corresponding to the first sub-loss information, the second sub-loss information, the image reconstruction loss information, the adversarial loss information and the watermark reconstruction loss information to obtain a plurality of loss results;
determining a target loss result with the largest numerical value from the multiple loss results;
adjusting the target weight corresponding to the target loss result;
wherein the target weight corresponding to the target loss result is a hyper-parameter related to a parameter in the initial encoder model.
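A sketch of this weight-adjustment heuristic: find the loss term with the largest value and adjust its weight. The scaling direction and factor below are hypothetical, since the text only states that the target weight is adjusted:

```python
def adjust_largest_weight(losses, weights, factor=1.5):
    # losses / weights: dicts keyed by loss-term name. Select the term
    # with the largest loss result and scale its weight (hypothetical
    # update rule; a real scheme might instead decrease it or use a
    # schedule).
    target = max(losses, key=losses.get)
    new_weights = dict(weights)
    new_weights[target] = weights[target] * factor
    return target, new_weights

losses = {"LC1": 0.02, "LC2": 0.01, "LI": 0.05, "LG": 0.30, "LM": 0.08}
weights = {"LC1": 1.0, "LC2": 1.0, "LI": 1.0, "LG": 0.1, "LM": 1.0}
target, new_weights = adjust_largest_weight(losses, weights)
```

After the adjustment, training resumes from the encoding step with the updated weight applied to the total loss.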
Optionally, the model training device 50 is further configured to:
training the initial decoder model based on the first loss information and the second loss information to obtain a target decoder model, wherein the target decoder model is used for decoding the target encoded image so as to extract the target watermark information from the target encoded image.
Optionally, the foregoing model training apparatus 50 is further configured to:
acquiring an image to be encoded and target watermark information;
Encoding the image to be encoded and the target watermark information through the target encoder model to obtain a target encoded image embedded with the target watermark information;
and transmitting the target coded image to target equipment.
The various elements of the model training apparatus 50 described above may be implemented in whole or in part by software, hardware, or a combination thereof. The above units may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor invokes and executes operations corresponding to the above units.
The model training apparatus 50 may be integrated in a terminal or a server having a memory and a processor installed to have an operation capability, or the model training apparatus 50 may be the terminal or the server.
The foregoing may be referred to in the specific implementation manner corresponding to this embodiment, and will not be described herein.
Fig. 6 is a schematic structural diagram of an encoding apparatus according to an embodiment of the present application, where the encoding apparatus 60 includes:
a second obtaining unit 61, configured to obtain an image to be encoded and target watermark information;
a second encoding unit 62, configured to encode the image to be encoded and the target watermark information through the target encoder model, so as to obtain a target encoded image in which the target watermark information is embedded;
The target encoder model is obtained by training an initial encoder model according to any model training method.
The respective units in the above-described encoding apparatus 60 may be implemented in whole or in part by software, hardware, and combinations thereof. The above units may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor invokes and executes operations corresponding to the above units.
The encoding apparatus 60 may be integrated in a terminal or a server having a memory and a processor mounted therein and having an operation capability, or the encoding apparatus 60 may be the terminal or the server.
The foregoing may be referred to in the specific implementation manner corresponding to this embodiment, and will not be described herein.
Fig. 7 is a schematic structural diagram of a decoding apparatus according to an embodiment of the present application, where the decoding apparatus 70 includes:
a third acquisition unit 71 for acquiring a target encoded image in which target watermark information has been embedded;
a second decoding unit 72 for decoding the target encoded image by a target decoder model to extract the target watermark information from the target encoded image;
The target decoder model is obtained by a model training device performing the following: acquiring a sample carrier image and sample watermark information; encoding the sample carrier image and the sample watermark information through an initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information; decoding the encoded image through an initial decoder model to obtain decoding watermark information corresponding to the encoded image, wherein the initial decoder model corresponds to the initial encoder model; determining first loss information corresponding to the encoded image and the sample carrier image according to the encoded image and the sample carrier image, wherein the first loss information is cyclic consistency loss information; determining second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoding watermark information; and training the initial decoder model based on the first loss information and the second loss information.
The various elements of the decoding device 70 described above may be implemented in whole or in part by software, hardware, and combinations thereof. The above units may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor invokes and executes operations corresponding to the above units.
The decoding device 70 may be integrated in a terminal or a server having a memory and a processor mounted therein and having an operation capability, or the decoding device 70 may be the terminal or the server.
The foregoing may be referred to in the specific implementation manner corresponding to this embodiment, and will not be described herein.
All the above technical solutions may be combined to form an optional embodiment of the present application, and will not be described in detail herein.
Optionally, the present application further provides a computer device, including a memory and a processor, where the memory stores a computer program, and the processor implements the steps in the above method embodiments when executing the computer program.
Fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present application, where the computer device may be a terminal or a server shown in fig. 1. As shown in fig. 8, the computer device 800 may include: a communication interface 801, a memory 802, a processor 803, and a communication bus 804. The communication interface 801, the memory 802, and the processor 803 communicate with each other via the communication bus 804. The communication interface 801 is used for data communication between the computer device 800 and an external device. The memory 802 may be used to store software programs and modules, and the processor 803 operates by running the software programs and modules stored in the memory 802, such as the software programs for the corresponding operations in the foregoing method embodiments.
Alternatively, the processor 803 may invoke a software program and modules stored in the memory 802 to perform the following operations: acquiring a sample carrier image and sample watermark information; encoding the sample carrier image and the sample watermark information through an initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information; decoding the coded image through an initial decoder model to obtain decoding watermark information corresponding to the coded image, wherein the initial decoder model corresponds to the initial encoder model; determining first loss information corresponding to the coded image and the sample carrier image according to the coded image and the sample carrier image, wherein the first loss information is cyclic consistency loss information; determining second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information; training the initial encoder model based on the first loss information and the second loss information to obtain a target encoder model, wherein the target encoder model is used for encoding an image to be encoded and target watermark information to obtain a target encoded image embedded with the target watermark information.
Optionally, the processor 803 may also invoke software programs and modules stored in the memory 802 to perform the following operations: acquiring an image to be encoded and target watermark information; encoding the image to be encoded and the target watermark information through a target encoder model to obtain a target encoded image embedded with the target watermark information; the target encoder model is obtained by training an initial encoder model according to the model training method of any one of the above.
Optionally, the processor 803 may also invoke software programs and modules stored in the memory 802 to perform the following operations: acquiring a target encoded image embedded with target watermark information; decoding the target encoded image by a target decoder model to extract the target watermark information from the target encoded image; wherein the target decoder model is obtained by a model training device performing the following: acquiring a sample carrier image and sample watermark information; encoding the sample carrier image and the sample watermark information through an initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information; decoding the encoded image through an initial decoder model to obtain decoding watermark information corresponding to the encoded image, wherein the initial decoder model corresponds to the initial encoder model; determining first loss information corresponding to the encoded image and the sample carrier image according to the encoded image and the sample carrier image, wherein the first loss information is cyclic consistency loss information; determining second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoding watermark information; and training the initial decoder model based on the first loss information and the second loss information.
The present application also provides a computer-readable storage medium storing a computer program. The computer readable storage medium may be applied to a computer device, and the computer program causes the computer device to execute a corresponding flow in the model training method in the embodiment of the present application, which is not described herein for brevity.
The present application also provides a computer program product comprising a computer program stored in a computer readable storage medium. The processor of the computer device reads the computer program from the computer readable storage medium, and the processor executes the computer program, so that the computer device executes a corresponding flow in the model training method in the embodiment of the present application, which is not described herein for brevity.
The present application also provides a computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions, so that the computer device executes the corresponding flow in the model training method in the embodiment of the present application, which is not described herein for brevity.
It should be appreciated that the processor of an embodiment of the present application may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method embodiments may be implemented by integrated logic circuits of hardware in a processor or instructions in software form. The processor may be a general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, or discrete hardware components. The disclosed methods, steps, and logic blocks in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly in the execution of a hardware decoding processor, or in the execution of a combination of hardware and software modules in a decoding processor. The software modules may be located in a random access memory, flash memory, read only memory, programmable read only memory, or electrically erasable programmable memory, registers, etc. as well known in the art. The storage medium is located in a memory, and the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method.
It will be appreciated that the memory in embodiments of the application may be volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile Memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable EPROM (EEPROM), or a flash Memory. The volatile memory may be random access memory (Random Access Memory, RAM), which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DR RAM). It should be noted that the memory of the systems and methods described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
It should be understood that the above memory is illustrative but not restrictive; for example, the memory in the embodiments of the present application may be Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Direct Rambus RAM (DR RAM), and the like. That is, the memory in embodiments of the present application is intended to comprise, without being limited to, these and any other suitable types of memory.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part thereof contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer or a server) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, an optical disk, and the like.
The foregoing is merely a specific implementation of the present application, and the protection scope of the present application is not limited thereto; any person skilled in the art can readily conceive of variations or substitutions within the technical scope disclosed by the present application, and such variations or substitutions shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (18)

1. A method of model training, comprising:
acquiring a sample carrier image and sample watermark information;
encoding the sample carrier image and the sample watermark information through an initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information;
decoding the encoded image through an initial decoder model to obtain decoded watermark information corresponding to the encoded image, wherein the initial decoder model corresponds to the initial encoder model;
determining first loss information corresponding to the encoded image and the sample carrier image according to the encoded image and the sample carrier image, wherein the first loss information is cycle consistency loss information;
determining second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information;
training the initial encoder model based on the first loss information and the second loss information to obtain a target encoder model, wherein the target encoder model is used for encoding an image to be encoded and target watermark information to obtain a target encoded image embedded with the target watermark information.
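The training pipeline of claim 1 can be sketched end to end in a few lines. The toy additive encoder, sign-based (non-blind) decoder, and L2 loss terms below are illustrative stand-ins — all function names and the embedding scheme are assumptions, not the claimed network architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder_model(carrier, watermark):
    # Toy encoder: embed each watermark bit as a +/-0.01 perturbation.
    return carrier + 0.01 * (2.0 * watermark.reshape(carrier.shape) - 1.0)

def decoder_model(encoded, carrier):
    # Toy (non-blind) decoder: read each bit back from the perturbation sign.
    return (encoded - carrier > 0).astype(float).ravel()

carrier = rng.random((4, 4))                      # sample carrier image
watermark = rng.integers(0, 2, 16).astype(float)  # sample watermark information

encoded = encoder_model(carrier, watermark)       # encoded image
decoded = decoder_model(encoded, carrier)         # decoded watermark information

# Illustrative L2 loss terms of the kind the claim combines during training.
image_loss = float(np.mean((encoded - carrier) ** 2))
watermark_loss = float(np.mean((decoded - watermark) ** 2))
```

In the claimed method, terms of this kind feed the first and second loss information that drive the encoder update; here they merely make the four tensors involved concrete.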
2. The method according to claim 1, wherein the initial encoder model includes a first network unit and a second network unit, and the encoding the sample carrier image and the sample watermark information through the initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information comprises:
extracting features of the sample carrier image through the first network unit to obtain a feature image corresponding to the sample carrier image;
encoding the feature image and the sample watermark information through the second network unit to obtain an encoded image corresponding to the sample carrier image and the sample watermark information.
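A minimal sketch of the two-stage encoder of claim 2, with 2x2 average pooling standing in for the first network unit's feature extraction and an upsample-and-add fusion standing in for the second network unit — both choices, and all names, are assumptions for illustration only:

```python
import numpy as np

def first_network_unit(carrier):
    # Stand-in feature extraction: 2x2 average pooling over the carrier image.
    h, w = carrier.shape
    return carrier.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def second_network_unit(feature_image, watermark):
    # Stand-in fusion: upsample the feature image and add scaled watermark bits.
    up = np.kron(feature_image, np.ones((2, 2)))
    return up + 0.01 * watermark.reshape(up.shape)

carrier = np.arange(16.0).reshape(4, 4)
watermark = np.ones(16)

feature_image = first_network_unit(carrier)              # output of the first unit
encoded = second_network_unit(feature_image, watermark)  # encoded image
```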
3. The method according to claim 1 or 2, wherein said determining first loss information corresponding to said encoded image and said sample carrier image from said encoded image and said sample carrier image comprises:
processing the encoded image through a third network unit to obtain a first image;
processing the sample carrier image through the third network unit to obtain a second image;
first loss information corresponding to the encoded image and the sample carrier image is determined using the first image, the second image, and the sample carrier image.
4. A method according to claim 3, wherein the first loss information comprises: the first sub-loss information and the second sub-loss information; the determining, using the first image, the second image, and the sample carrier image, first loss information corresponding to the encoded image and the sample carrier image, comprises:
determining first sub-loss information corresponding to the encoded image using the first image and the sample carrier image;
determining second sub-loss information corresponding to the sample carrier image using the second image and the sample carrier image;
and taking the sum of the first sub-loss information and the second sub-loss information as the first loss information.
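Claims 3 and 4 together define the cycle-consistency first loss: the third network unit maps both the encoded image and the carrier image, and the two discrepancies against the carrier are summed. A sketch in which a simple smoothing map stands in for the learned third network unit (the map, the names, and the L2 form are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

def third_network_unit(image):
    # Hypothetical reconstruction network: mild smoothing toward the image
    # mean, standing in for a learned mapping back to the carrier domain.
    return 0.9 * image + 0.1 * image.mean()

carrier = rng.random((4, 4))                 # sample carrier image
encoded = carrier + 0.01                     # stand-in encoded image

first_image = third_network_unit(encoded)    # processed from the encoded image
second_image = third_network_unit(carrier)   # processed from the carrier image

first_sub_loss = float(np.mean((first_image - carrier) ** 2))
second_sub_loss = float(np.mean((second_image - carrier) ** 2))
first_loss = first_sub_loss + second_sub_loss  # sum per claim 4
```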
5. The method of claim 4, wherein said determining second loss information using said sample carrier image, said encoded image, said sample watermark information, and said decoded watermark information comprises:
determining image reconstruction loss information corresponding to the sample carrier image and the encoded image using the sample carrier image and the encoded image;
determining adversarial loss information corresponding to the sample carrier image and the encoded image by using the sample carrier image, the encoded image, and a preset discriminator;
determining watermark reconstruction loss information corresponding to the sample watermark information and the decoded watermark information by using the sample watermark information and the decoded watermark information;
and determining second loss information based on the image reconstruction loss information, the adversarial loss information, and the watermark reconstruction loss information.
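The three components of the second loss in claim 5 can be sketched with toy stand-ins. The discriminator below is a hypothetical scalar scorer (its form, the non-saturating adversarial term, and all names are assumptions; the claim does not fix any of these):

```python
import numpy as np

rng = np.random.default_rng(2)

def discriminator(image):
    # Hypothetical preset discriminator: "probability the input is a real
    # carrier", here keyed (arbitrarily) to the image's standard deviation.
    return 1.0 / (1.0 + np.exp(4.0 * (image.std() - 0.25)))

carrier = rng.random((4, 4))
watermark = rng.integers(0, 2, 16).astype(float)
encoded = carrier + 0.01 * watermark.reshape(4, 4)
decoded = np.clip(watermark + 0.05, 0.0, 1.0)   # stand-in decoder output

image_reconstruction_loss = float(np.mean((encoded - carrier) ** 2))
watermark_reconstruction_loss = float(np.mean((decoded - watermark) ** 2))
# Adversarial term: push the discriminator to score the encoded image as real.
adversarial_loss = float(-np.log(discriminator(encoded) + 1e-12))

second_loss = (image_reconstruction_loss
               + adversarial_loss
               + watermark_reconstruction_loss)
```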
6. The method of claim 5, wherein the determining adversarial loss information corresponding to the sample carrier image and the encoded image by using the sample carrier image, the encoded image, and a preset discriminator comprises:
inputting the sample carrier image and the encoded image into the discriminator, and running the discriminator to obtain a discrimination result;
and determining adversarial loss information corresponding to the sample carrier image and the encoded image according to the discrimination result.
7. The method of claim 5, wherein the determining second loss information based on the image reconstruction loss information, the adversarial loss information, and the watermark reconstruction loss information comprises:
acquiring a first preset weight corresponding to the image reconstruction loss information;
acquiring a second preset weight corresponding to the adversarial loss information;
acquiring a third preset weight corresponding to the watermark reconstruction loss information;
and determining second loss information according to the first preset weight, the second preset weight, the third preset weight, the image reconstruction loss information, the adversarial loss information, and the watermark reconstruction loss information.
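The weighted combination of claim 7 reduces to one line. The numeric weight and loss values below are assumptions; the claim only requires that each of the three terms carries its own preset weight:

```python
# Preset weights for the three terms of the second loss (values are assumptions).
first_weight, second_weight, third_weight = 1.0, 0.1, 2.0

# Illustrative loss values for one training pass.
image_reconstruction_loss = 0.02
adversarial_loss = 0.50
watermark_reconstruction_loss = 0.10

# Second loss: weighted sum of the three components.
second_loss = (first_weight * image_reconstruction_loss
               + second_weight * adversarial_loss
               + third_weight * watermark_reconstruction_loss)
```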
8. The method of claim 5, wherein the training the initial encoder model based on the first loss information and the second loss information comprises:
acquiring a fourth preset weight corresponding to the first sub-loss information;
obtaining a fifth preset weight corresponding to the second sub-loss information;
determining a total loss result according to the fourth preset weight, the fifth preset weight, the first sub-loss information, the second sub-loss information and the second loss information;
if the total loss result is smaller than a preset threshold value, training the initial encoder model is completed;
and if the total loss result is not smaller than the preset threshold value, adjusting hyper-parameters related to the parameters in the initial encoder model, and returning to the step of encoding the sample carrier image and the sample watermark information through the initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information, until the total loss result is smaller than the preset threshold value, at which point training of the initial encoder model is completed.
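The threshold-driven loop of claim 8 can be sketched as follows. The single scalar hyper-parameter, the shrinking loss function, and the threshold value are all stand-ins chosen so the sketch terminates; none of them comes from the claim:

```python
def training_pass(weight):
    # Stand-in for one full encode/decode/loss evaluation; the total loss is
    # made to shrink as the hyper-parameter grows, purely for illustration.
    return 1.0 / (1.0 + weight)

threshold = 0.1   # preset threshold (value is an assumption)
weight = 0.0      # the hyper-parameter being adjusted
steps = 0

total_loss = training_pass(weight)
while total_loss >= threshold:          # "not smaller than the preset threshold"
    weight += 1.0                       # adjust the hyper-parameter
    total_loss = training_pass(weight)  # re-encode and re-evaluate the loss
    steps += 1
# Loop exit corresponds to "training of the initial encoder model is completed".
```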
9. The method of claim 8, wherein the adjusting hyper-parameters related to parameters in the initial encoder model comprises:
acquiring loss results respectively corresponding to the first sub-loss information, the second sub-loss information, the image reconstruction loss information, the adversarial loss information, and the watermark reconstruction loss information, to obtain a plurality of loss results;
determining a target loss result with the largest value from the plurality of loss results;
adjusting the target weight corresponding to the target loss result;
wherein the target weight corresponding to the target loss result is a hyper-parameter related to a parameter in the initial encoder model.
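The selection rule of claim 9 is a simple argmax over the per-term loss results. The numeric values and the 1.5x adjustment factor below are assumptions; the claim specifies only that the weight of the largest loss result is adjusted:

```python
# Loss results from one training pass for the five terms named in claim 9
# (the numeric values and the x1.5 adjustment rule are assumptions).
loss_results = {
    "first_sub": 0.12,
    "second_sub": 0.08,
    "image_reconstruction": 0.45,
    "adversarial": 0.30,
    "watermark_reconstruction": 0.21,
}
weights = {name: 1.0 for name in loss_results}

# Pick the target loss result with the largest value ...
target = max(loss_results, key=loss_results.get)
# ... and adjust only its corresponding target weight (a hyper-parameter).
weights[target] *= 1.5
```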
10. The method according to claim 1, wherein the method further comprises:
training the initial decoder model based on the first loss information and the second loss information to obtain a target decoder model, wherein the target decoder model is used for decoding the target encoded image so as to extract the target watermark information from the target encoded image.
11. A method of encoding, comprising:
acquiring an image to be encoded and target watermark information;
encoding the image to be encoded and the target watermark information through a target encoder model to obtain a target encoded image embedded with the target watermark information;
wherein the target encoder model is obtained by training an initial encoder model according to the model training method of any one of claims 1-9.
12. A decoding method, comprising:
acquiring a target encoded image embedded with target watermark information;
decoding the target encoded image by a target decoder model to extract the target watermark information from the target encoded image;
wherein the target decoder model is obtained by a model training device by: acquiring a sample carrier image and sample watermark information; encoding the sample carrier image and the sample watermark information through an initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information; decoding the encoded image through an initial decoder model to obtain decoded watermark information corresponding to the encoded image, wherein the initial decoder model corresponds to the initial encoder model; determining first loss information corresponding to the encoded image and the sample carrier image according to the encoded image and the sample carrier image, wherein the first loss information is cycle consistency loss information; determining second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information; and training the initial decoder model based on the first loss information and the second loss information.
13. A model training apparatus, comprising:
the first acquisition unit is used for acquiring the sample carrier image and the sample watermark information;
the first coding unit is used for coding the sample carrier image and the sample watermark information through an initial coder model to obtain a coded image corresponding to the sample carrier image and the sample watermark information;
the first decoding unit is used for decoding the encoded image through an initial decoder model to obtain decoded watermark information corresponding to the encoded image, wherein the initial decoder model corresponds to the initial encoder model;
a first determining unit, configured to determine first loss information corresponding to the encoded image and the sample carrier image according to the encoded image and the sample carrier image, where the first loss information is cycle consistency loss information;
a second determining unit configured to determine second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information;
the training unit is used for training the initial encoder model based on the first loss information and the second loss information to obtain a target encoder model, and the target encoder model is used for encoding an image to be encoded and target watermark information to obtain a target encoded image embedded with the target watermark information.
14. An encoding apparatus, comprising:
the second acquisition unit is used for acquiring the image to be encoded and the target watermark information;
the second coding unit is used for coding the image to be coded and the target watermark information through a target coder model to obtain a target coded image embedded with the target watermark information;
wherein the target encoder model is obtained by training an initial encoder model according to the model training method of any one of claims 1-9.
15. A decoding apparatus, comprising:
a third acquisition unit for acquiring a target encoded image in which target watermark information has been embedded;
a second decoding unit for decoding the target encoded image through a target decoder model to extract the target watermark information from the target encoded image;
wherein the target decoder model is obtained by a model training device by: acquiring a sample carrier image and sample watermark information; encoding the sample carrier image and the sample watermark information through an initial encoder model to obtain an encoded image corresponding to the sample carrier image and the sample watermark information; decoding the encoded image through an initial decoder model to obtain decoded watermark information corresponding to the encoded image, wherein the initial decoder model corresponds to the initial encoder model; determining first loss information corresponding to the encoded image and the sample carrier image according to the encoded image and the sample carrier image, wherein the first loss information is cycle consistency loss information; determining second loss information using the sample carrier image, the encoded image, the sample watermark information, and the decoded watermark information; and training the initial decoder model based on the first loss information and the second loss information.
16. A computer readable storage medium, characterized in that it stores a computer program adapted to be loaded by a processor for performing the steps of the method according to any one of claims 1-10, or claim 11, or claim 12.
17. A computer device, characterized in that it comprises a processor and a memory, in which a computer program is stored, the processor being adapted to perform the steps of the method of any of claims 1-10, or 11, or 12, by calling the computer program stored in the memory.
18. A computer program product comprising a computer program, characterized in that the computer program when executed by a processor realizes the steps in the method of any one of claims 1-10, or claim 11, or claim 12.
CN202210550999.1A 2022-05-18 2022-05-18 Model training method, encoding method, decoding method and equipment Pending CN117156152A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210550999.1A CN117156152A (en) 2022-05-18 2022-05-18 Model training method, encoding method, decoding method and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210550999.1A CN117156152A (en) 2022-05-18 2022-05-18 Model training method, encoding method, decoding method and equipment

Publications (1)

Publication Number Publication Date
CN117156152A true CN117156152A (en) 2023-12-01

Family

ID=88897357

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210550999.1A Pending CN117156152A (en) 2022-05-18 2022-05-18 Model training method, encoding method, decoding method and equipment

Country Status (1)

Country Link
CN (1) CN117156152A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117495649A (en) * 2024-01-02 2024-02-02 支付宝(杭州)信息技术有限公司 Image processing method, device and equipment


Similar Documents

Publication Publication Date Title
Wang et al. Data hiding with deep learning: A survey unifying digital watermarking and steganography
CN110276708B (en) Image digital watermark generation and identification system and method based on GAN network
CN111681155B (en) GIF dynamic image watermarking method based on deep learning
Wang et al. HidingGAN: High capacity information hiding with generative adversarial network
Wang et al. Joint multi-domain feature learning for image steganalysis based on CNN
CN115131188A (en) Robust image watermarking method based on generation countermeasure network
Hao et al. Robust image watermarking based on generative adversarial network
CN117156152A (en) Model training method, encoding method, decoding method and equipment
CN114445256A (en) Training method, device, equipment and storage medium for digital watermark
Meng et al. High-capacity steganography using object addition-based cover enhancement for secure communication in networks
CN115482142A (en) Dark watermark adding method, extracting method, system, storage medium and terminal
Zhou et al. Triangle mesh watermarking and steganography
CN114119330B (en) Robust digital watermark embedding and extracting method based on neural network
Sun et al. Large capacity generative image steganography via image style transfer and feature-wise deep fusion
CN114493972A (en) Confrontation type network copyright generation protection method
Li et al. Anti-pruning multi-watermarking for ownership proof of steganographic autoencoders
Wu et al. CEWformer: A Transformer-Based Collaborative Network for Simultaneous Underwater Image Enhancement and Watermarking
CN114727113B (en) Method and device for robust video watermarking in real-time scene
CN117437108B (en) Watermark embedding method for image data
CN116645257A (en) Image processing method, apparatus, device, storage medium, and program product
CN117473469B (en) Model watermark embedding method and device, electronic equipment and storage medium
CN117057969B (en) Cross-modal image-watermark joint generation and detection device and method
CN117974412A (en) Robust watermark embedding method and system based on information multidimensional embedding and texture guiding
CN118279119A (en) Image watermark information processing method, device and equipment
Zhang et al. A convolutional neural network-based blind robust image watermarking approach exploiting the frequency domain

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40098948

Country of ref document: HK