CN115271033B - Medical image processing model construction and processing method based on federal knowledge distillation - Google Patents

Medical image processing model construction and processing method based on federal knowledge distillation

Info

Publication number
CN115271033B
CN115271033B (application number CN202210783921.4A)
Authority
CN
China
Prior art keywords
pulse
tensor
training
distillation
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210783921.4A
Other languages
Chinese (zh)
Other versions
CN115271033A (en)
Inventor
刘贵松
刘哲通
解修蕊
黄鹂
蒋太翔
杨新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kashgar Electronic Information Industry Technology Research Institute
Southwestern University Of Finance And Economics
Original Assignee
Kashgar Electronic Information Industry Technology Research Institute
Southwestern University Of Finance And Economics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kashgar Electronic Information Industry Technology Research Institute and Southwestern University Of Finance And Economics
Priority to CN202210783921.4A
Publication of CN115271033A
Application granted
Publication of CN115271033B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/03 Recognition of patterns in medical or anatomical images

Abstract

The invention relates to the field of medical image processing and provides a method for constructing a medical image processing model based on federated knowledge distillation. The method comprises: training each child-node network on its private data set; propagating the trained child-node network forward on a public data set to obtain a first pulse tensor, which is uploaded to a central node; after receiving the tensors, the central node performs distillation training on the public data set to obtain a distillation product for each child node; the distillation products of all child nodes are aggregated into global parameters, the central-node network is updated with the global parameters and propagated forward on the public data set, and the resulting second pulse tensor is distributed to all child nodes; each child node receives the second pulse tensor, performs distillation training on the public data set, synchronously updates its own network parameters, and the training loop repeats until a preset number of rounds or a preset metric is reached. The invention also provides a processing method that applies the constructed model to medical images to be processed.

Description

Medical image processing model construction and processing method based on federal knowledge distillation
Technical Field
The invention belongs to the field of medical image processing, and particularly relates to a method for constructing a medical image processing model based on federated knowledge distillation and a corresponding processing method.
Background
With the development of medical imaging and deep learning technology, medical image processing based on deep neural networks has become an important technique in medical research and clinical diagnosis. In recent years, federated learning (FL) has drawn the attention of researchers in medical image processing: it enables aggregate learning over scattered medical image data while preserving privacy, making full use of diverse patient data. In medical research, researchers often need detailed information about internal tissues and organs, obtained through medical imaging, in order to make the most accurate possible treatment decisions when performing quantitative analysis, real-time monitoring, or treatment planning of those tissues and organs. Biomedical images therefore play an extremely important role in treatment, and medical images of patients accumulate steadily. However, because institutions such as hospitals are decentralized and medical images are sensitive, these data are extremely scattered and subject to strict privacy regulations, so it is very difficult to gather them at a central site for neural network training. Training in a scattered manner, on the other hand, faces problems such as insufficient data and insufficient labels. Studying how to train neural networks for medical image processing on scattered data while guaranteeing privacy therefore has great value and significance.
Federated learning (FL) is a distributed training paradigm that allows devices to participate in the joint training of a deep neural network model without exchanging local private data with other devices or a central node. In traditional distributed learning, the data used for network training typically must be transmitted to a central node or a cloud data node; such a procedure risks data leakage and is hard to apply in fields with strict privacy requirements such as medical image processing. Federated learning instead relies on communication that never involves the raw data, together with encrypted transmission, providing a solution to the privacy-disclosure problem.
The spiking neural network (SNN) is a new generation of neural network model, distinct from traditional artificial neural networks. By simulating the membrane-potential dynamics and nerve pulses of biological neurons, and replacing the real-valued outputs of traditional artificial neural networks with discrete binary sequences, it reduces training power consumption. The leaky integrate-and-fire (LIF) model is a classical spiking neuron model, widely used in SNN research. Researchers have proposed an explicit, discrete formulation of the LIF model that can be implemented on computing devices, namely:

u_t = τ·u_{t−1}·(1 − s_{t−1}) + x_t,  s_t = H(u_t − V_th),

where u_t is the membrane potential at time step t, τ is the leak constant, x_t is the synaptic input, H is the unit step function describing pulse emission, s_t is the output pulse, and V_th is the discharge threshold.
Approximating the gradient function. In a spiking neural network, the derivative of the unit step function that describes pulse emission tends to infinity at zero, so gradient descent cannot be applied directly as in a conventional neural network, and an approximating function must be found instead. Some SNN training schemes employ a rectangular function for this role. The rectangular function is defined as:

h(u) = (1/a) · sign(|u − V_th| < a/2),

where a is the rectangular shape parameter, sign denotes the truth (indicator) function, and V_th is the discharge threshold.
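By way of illustration, the discrete LIF update and the rectangular surrogate gradient can be written as the following minimal NumPy sketch; the leak constant TAU and shape parameter A take assumed illustrative values, while V_TH = 0.1 matches the embodiment described later.

    import numpy as np

    V_TH = 0.1   # discharge threshold V_th (value used in the embodiment below)
    TAU = 0.5    # leak constant tau; illustrative assumption, not fixed by the text
    A = 1.0      # rectangular shape parameter a; illustrative assumption

    def lif_step(u_prev, s_prev, x_t):
        # Leak the potential, reset positions that fired at the previous step,
        # integrate the new input, then pulse wherever the threshold is crossed.
        u = TAU * u_prev * (1.0 - s_prev) + x_t
        s = (u >= V_TH).astype(u.dtype)   # unit step function H(u - V_th)
        return u, s

    def rect_surrogate(u):
        # Rectangular surrogate gradient: (1/a) * 1(|u - V_th| < a/2); used in
        # back-propagation in place of the step function's infinite derivative.
        return (np.abs(u - V_TH) < A / 2).astype(u.dtype) / A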
The federated learning framework for spiking neural networks (Federated Learning on Spiking Neural Network, FLSNN) is a distributed training mode for SNNs that allows devices to participate in joint training of an SNN model without exchanging local private data with other devices or a central node. How to train a deep neural network at low power consumption on scattered private data is the problem that federated learning of spiking neural networks aims to solve.
Knowledge distillation (KD) is a network-knowledge extraction scheme that transfers the knowledge of a trained larger-scale neural network to a smaller-scale network, so that the small network performs very close to the large one. In knowledge distillation, the network that receives knowledge is called the student network, and the network that transfers knowledge is called the teacher network. Knowledge distillation reflects the fact that a network's knowledge resides not only in its parameters but can also be expressed through its outputs. According to the knowledge source used, distillation falls into three categories:
response-based distillation
Feature-based distillation
Relationship-based distillation
For traditional artificial neural networks, researchers have proposed a federated distillation (FD) framework based on output communication, optimizing the communication cost of federated learning.
Distillation loss function. In knowledge distillation, the teacher network and the student network exchange information through specific forms of knowledge to assist training. During training, the matching of knowledge is expressed through the loss function, so selecting and defining an appropriate loss function plays a decisive role in the effect of distillation. In general, the distillation loss is defined as:

L = L_hard + λ·L_soft,

where L is the final loss of the network; L_hard is the hard-label loss, determined by the student network's output and the original labels of the training data set; L_soft is the soft-label loss, determined by the student network's form of knowledge and the teacher network's knowledge, typically computed with a cross-entropy loss; and λ is a weight parameter coordinating how strongly the teacher network participates in training. When designing a knowledge distillation scheme, defining the soft-label loss function is often the core problem.
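As an illustration of this generic loss, the following minimal sketch combines a hard cross-entropy term against the data-set label with a soft cross-entropy term against the teacher's output distribution; the softmax normalization is an implementation convenience assumed here, and the SNN-specific losses of the invention appear later.

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def distillation_loss(student_out, teacher_out, onehot_label, lam=1.0):
        # L = L_hard + lambda * L_soft: hard loss against the data-set label,
        # soft (cross-entropy) loss against the teacher's output distribution.
        p_student = softmax(student_out)
        p_teacher = softmax(teacher_out)
        l_hard = -(onehot_label * np.log(p_student + 1e-12)).sum(axis=-1).mean()
        l_soft = -(p_teacher * np.log(p_student + 1e-12)).sum(axis=-1).mean()
        return l_hard + lam * l_soft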
Computed tomography (CT) is a cross-sectional scanning technique that uses precisely collimated X-ray beams, gamma rays, ultrasonic waves, and the like, together with highly sensitive detectors, to scan sections of the human body one after another. It features fast scan times and clear images, and can be used to examine a variety of diseases. In CT, an X-ray beam scans a body slice of given thickness; the detector receives the X-rays transmitted through the slice, converts them to visible light, then to an electrical signal by photoelectric conversion, and finally to digital form by an analog-to-digital converter for processing in a computer.
Disclosure of Invention
To address the strict privacy requirements and the reduced model reliability caused by distillation loss in federated distillation, and to balance cost and effectiveness in computed tomography medical image recognition, the invention provides a medical image processing model construction method based on federated knowledge distillation and a corresponding image processing method.
The invention solves these technical problems with the following technical scheme: a medical image processing model construction method based on federated knowledge distillation, comprising the following steps:

Step 1: collect training data and construct the training sets, including: a public data set required for distillation, obtained by preprocessing and organizing open medical image data; and, for each medical institution participating in training, a private data set obtained by coordinated preprocessing of that institution's private CT image data according to the public data set. The private data sets correspond one-to-one with the participating medical institutions and are mutually independent;

Step 2: construct the spiking neural networks of the child nodes and the central node, the child nodes corresponding one-to-one with the private data sets;

Step 3: train each child node's spiking neural network on its private data set; using each trained child-node network, obtain the child node's first pulse tensor by forward propagation on the public data set and upload it to the central node;

Step 4: according to the received first pulse tensor of each child node, the central node performs distillation training on the public data set to obtain a distillation product for each child node;

Step 5: aggregate all distillation products into global parameters, update the central node's spiking neural network with the global parameters, obtain a second pulse tensor by forward propagation of the updated central-node network on the public data set, and distribute it to all child nodes;

Step 6: each child node receives the second pulse tensor, performs distillation training on the public data set, and updates its spiking neural network parameters;

Step 7: judge whether the preset number of training rounds has been reached or the model has reached a preset metric; if so, stop training, and the trained spiking neural network of the central node is the medical image processing model based on federated knowledge distillation; otherwise, return to step 3.
Further, in step 3, training each child node's spiking neural network on its private data set comprises: the child-node network computes the hard-label loss function L_hard and its gradient in the forward pass on the corresponding private data set, then trains for a preset number of rounds with the back-propagation algorithm and updates its parameters;

wherein the hard-label loss function is:

L_hard = −Σ_i ŷ_i · log(v_i),

where L_hard is the hard-label loss, v is the frequency vector computed from the output pulses, v_i = (1/T)·Σ_{t=1}^{T} s_it, and ŷ is the true label vector, computed as:

ŷ = onehot(tar),

where tar is the category of the true label and onehot denotes one-hot encoding:

ŷ_i = 1 if i = tar, and ŷ_i = 0 otherwise,

where i is the element index of the label vector.
Further, in step 3, the first pulse tensor is binary-compressed before being uploaded to the central node; in step 4, after receiving a child node's compressed tensor, the central node decompresses it to recover the first pulse tensor.

Specifically, binary compression of the first pulse tensor comprises the following steps:

Step 31: zero-initialize a tensor sc for storing the compression result;

Step 32: for each element value sc of sc, iterate over the time windows in order and compute:

sc′ = 2·sc + s_t,

where s_t is the pulse value of the t-th time window at the corresponding position of the pulse tensor to be compressed, sc is the current element value, iterated over t, and sc′ is the element value of sc after the update;

Step 33: when all computations are complete, the binary compression is finished;

in step 4, decompressing the compressed tensor to obtain the first pulse tensor comprises the following steps:

Step 41: zero-initialize a decompression tensor for storing the decompression result;

Step 42: perform the restoration operation on each element value of the compressed tensor to obtain an ordered pulse sequence, and store the result at the corresponding position of the decompression tensor in inverted order;

Step 43: when all computations are complete, the decompression tensor is the first pulse tensor.
Further, in step 4, the central node performing distillation training on the public data set according to the received first pulse tensor of each child node, to obtain the distillation product for each child node, comprises the following steps:

define the distillation loss function, train the spiking neural network with the back-propagation algorithm for a preset number of rounds, then stop; the distillation loss function is:

L_soft = L_T + λ·L_F,

where L_soft is the distillation loss, L_T is the mean-square-error-like loss over pulses, L_F is the relaxed class cross-entropy loss, and λ is the relaxation variable, a preset parameter;

the mean-square-error-like loss over pulses is:

L_T = (1/(C·T)) · Σ_{c=1}^{C} Σ_{t=1}^{T} (s_ct − ŝ_ct)²,

and the relaxed class cross-entropy loss is:

L_F = −Σ_{c=1}^{C} p̂_c · log(p_c),

where C is the number of classes of the training data and T is the size of the SNN time window; s_ct and ŝ_ct are the element values of a batch's predicted pulse matrix and target pulse matrix, respectively; p_c and p̂_c are the predicted and target frequency vectors, computed from the pulse matrices as:

p_c = (1/T)·Σ_{t=1}^{T} s_ct,  p̂_c = (1/T)·Σ_{t=1}^{T} ŝ_ct.
Further, in step 5, aggregating all distillation products into the global parameters comprises:

the central node reserves an aggregation-network buffer and aggregates as follows:

Step 51: generate a copy of the current central-node spiking neural network and place it in the central node's aggregation buffer; select any first pulse tensor awaiting aggregation, perform distillation training of the copy on the public data set, and update it to obtain the first network parameters;

Step 52: judge whether all child nodes have been aggregated; if so, take the first network parameters as the global parameters; otherwise, go to step 53;

Step 53: update the central-node network copy with the first network parameters and take it as the first copy; duplicate the first copy in the aggregation buffer to create a new copy;

Step 54: select any first pulse tensor awaiting aggregation and perform distillation training of the new copy on the public data set, updating it to obtain the second copy and the second network parameters;

Step 55: randomly select labeled data from part of the public data set to generate a temporary test data set;

Step 56: test the first copy and the second copy on the temporary test data set to obtain their test accuracies, generate aggregation weights from the two accuracies, and use the aggregation weights to compute updated first network parameters;

Step 57: return to step 52.
Further, in step 56, generating the aggregation weights from the two test accuracies and using them to update the first network parameters comprises:

Step 561: generate the aggregation weights (α, α′) from the test accuracies a and a′ with a normalized exponential (softmax) function, where δ is a preset retention factor and τ is a preset difference factor;

Step 562: compute the updated first network parameters with the aggregation weights:

w̃ = α·w + α′·w′,

where w̃ is the new first network parameter, w is the first network parameter corresponding to the first copy, and w′ is the second network parameter corresponding to the second copy.
The invention also provides a medical image processing method based on federated knowledge distillation: a medical image processing model is constructed according to the construction method above, and the model is used to process the medical images to be processed.
The invention designs a new federated aggregation scheme in which the central node of the spiking-neural-network federated distillation integrates the information uploaded by the child nodes so as to aggregate features and outputs appropriately, addressing the reduced model reliability caused by distillation loss. With the designed distillation loss function, the central node and the child nodes can train their own models using the other party's spiking-network outputs, extracting the information contained in those outputs in a targeted way. When training the spiking neural networks, the invention updates the network parameters with the back-propagation algorithm, improving the classification accuracy and training speed of the SNN model. Meanwhile, when exchanging outputs, the invention losslessly compresses the pulse output tensors, reducing federated communication overhead and allowing the trade-off between model accuracy and communication cost to be tuned.
Drawings
Fig. 1 is a flow chart of model construction in embodiment 1 of the present invention.
Fig. 2 is a flowchart of training a impulse neural network using a back propagation algorithm in embodiment 1 of the present invention.
Detailed Description
All of the features disclosed in this specification, or all of the steps in a method or process disclosed, may be combined in any combination, except for mutually exclusive features and/or steps. For a better understanding of the present invention, reference is made to the following description of the invention, taken in conjunction with the accompanying drawings and the following examples.
Example 1
This example provides a medical image processing model construction method based on federated knowledge distillation, as shown in Fig. 1, comprising the following steps:
s101, training data are collected, and a training set is constructed.
In this example, the training data comprise two kinds of sets. The first is the public data set required for distillation, obtained by preprocessing and organizing open medical image data, including image flipping, cropping, translation, normalization, and so on, and fixing the image specification. The second: the private CT image data of each participating medical institution are coordinated and preprocessed according to the public data set to obtain that institution's private data set. Coordinated preprocessing means adjusting the institution's image data to match the format of the public data set, such as image specification and number of channels.

A private data set is set up for each medical institution participating in training; the private data sets are mutually independent and are never propagated to one another, guaranteeing the privacy requirement.
S102: construct the spiking neural networks of the child nodes and the central node, using the classical deep neural network model VGGNet; the central node reserves an aggregation-network buffer; each participating medical institution corresponds to one child node, in one-to-one correspondence with its private data set.
In the federated aggregation scheme of the invention, a child node is set up for each participating medical institution, so that each institution trains its own independent spiking neural network; the central node in the federated distillation only integrates the information uploaded by the child nodes and never participates in training on the private data sets held by the child nodes, satisfying the strict privacy requirement.
Constructing the spiking neural networks of the child nodes and the central node comprises initializing the parameters of each network layer and setting the training hyperparameters, as follows:

define the SNN structure: the total number of layers, the convolutional layers, and the fully connected layers; define the parameters of each convolutional layer, the activation function used, whether there is a pooling layer, and so on; define the parameters of the fully connected layers, whether a Dropout layer is used, and so on;

define the SNN hyperparameters, including the network discharge threshold V_th and the time window length T;

define the federated learning hyperparameters, including the global total round number epoch_g and the local training round number epoch_l;

define the training hyperparameters, including the learning rate, batch size, optimizer, and so on.

The scheme of the invention does not depend on a specific SNN model; any common spiking neural network model can be embedded here. For example, the network structure can adopt a VGG network from traditional artificial neural networks, with LIF nodes added to form a spiking neural network. The discharge threshold is set to V_th = 0.1 and the time window to T = 8; the global and local round numbers are set to epoch_g = 50 and epoch_l = 8; cross entropy is selected as the training loss function and Adam as the optimization algorithm, with hyperparameters: learning rate 0.001 and batch size 64.
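For reference, the hyperparameters enumerated in this example can be collected in a configuration sketch such as the following; the dictionary layout itself is illustrative, not prescribed by the patent.

    # Hyperparameters as enumerated above; the dictionary layout is illustrative.
    CONFIG = {
        "v_th": 0.1,          # discharge threshold V_th
        "time_window": 8,     # time window length T
        "epoch_g": 50,        # global federated rounds
        "epoch_l": 8,         # local training rounds per federated round
        "loss": "cross_entropy",
        "optimizer": "adam",
        "learning_rate": 0.001,
        "batch_size": 64,
    }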
S103: train each child node's SNN on its private data set; using each trained child-node SNN, obtain the child node's first pulse tensor by forward propagation on the public data set, binary-compress it, and upload it to the central node. The first pulse tensor is the output obtained by forward propagation of the trained child-node SNN over the public data set.
Training each child node's SNN on its private data set comprises: the child-node SNN computes the hard-label loss function L_hard and its gradient in the forward pass on the corresponding private data set, then trains for the preset number of rounds epoch_l with the back-propagation algorithm and updates its parameters.

The hard-label loss function is:

L_hard = −Σ_i ŷ_i · log(v_i),

where L_hard is the hard-label loss, v is the frequency vector computed from the output pulses, v_i = (1/T)·Σ_{t=1}^{T} s_it, and ŷ is the true label vector:

ŷ = onehot(tar),

where tar is the category of the true label and onehot denotes one-hot encoding, i.e.:

ŷ_i = 1 if i = tar, and ŷ_i = 0 otherwise,

where i is the element index of the label vector.
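A minimal sketch of this hard-label loss follows; the cross-entropy form mirrors the reconstruction above, and the softmax normalization of the frequency vector is an added assumption for numerical stability, not stated in the patent text.

    import numpy as np

    def onehot(tar, num_classes):
        # One-hot encoding of the true label category tar.
        y = np.zeros(num_classes)
        y[tar] = 1.0
        return y

    def hard_label_loss(spike_out, tar):
        # spike_out: binary pulse matrix of shape (num_classes, T).
        v = spike_out.mean(axis=1)            # firing-frequency vector v
        p = np.exp(v) / np.exp(v).sum()       # assumed softmax normalization
        y = onehot(tar, spike_out.shape[0])
        return -(y * np.log(p + 1e-12)).sum()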
Binary compression of the first pulse tensor comprises the following steps:

Step 1: zero-initialize a tensor sc for storing the compression result;

Step 2: iterate over sc in time-window order, computing sc′ = 2·sc + s_t, where s_t is the pulse value of the t-th time window at the corresponding position of the pulse tensor to be compressed, sc is the current element value in sc, each element value being computed by iterating over t, and sc′ is the updated element value;

Step 3: when all computations have been performed, sc is the required binary compression tensor.

This example thus gives a lossless compression algorithm for the pulse output tensor of a spiking neural network. The algorithm compresses the pulses without loss and reduces federated communication overhead.
The corresponding decompression method is as follows:

Step 1: zero-initialize a decompression tensor for storing the decompression result;

Step 2: perform the restoration operation on each element value of the compressed tensor to obtain an ordered pulse sequence, and store the result at the corresponding position of the decompression tensor in inverted order;

Step 3: when all computations are complete, the decompression tensor is the first pulse tensor.

The element restoration operation proceeds as follows:

copy the element value into a buffer variable sr and initialize the pulse sequence array [s_i], i = 1, 2, …, T;

initialize i = 1 and compute in a loop: s_i = sr mod 2, sr = sr div 2, i = i + 1, where mod is the remainder operation and div is integer division; loop until i = T + 1;

when the computation is complete, the sequence array [s_i] is the required pulse sequence.
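The compression and decompression steps above amount to reading each pulse train as a T-bit binary number. A minimal NumPy sketch of the round trip:

    import numpy as np

    def compress_pulses(s):
        # Losslessly pack a binary pulse tensor of shape (..., T) into integers:
        # sc <- 2 * sc + s_t over the time window.
        T = s.shape[-1]
        sc = np.zeros(s.shape[:-1], dtype=np.int64)
        for t in range(T):
            sc = 2 * sc + s[..., t].astype(np.int64)
        return sc

    def decompress_pulses(sc, T):
        # Invert the packing: repeated mod-2 / integer division recovers the
        # pulses in reverse order, so they are written back back-to-front.
        s = np.zeros(sc.shape + (T,), dtype=np.int64)
        sr = sc.copy()
        for i in range(T):
            s[..., T - 1 - i] = sr % 2   # store in inverted order
            sr = sr // 2
        return s

A quick check such as (decompress_pulses(compress_pulses(s), T) == s).all() confirms that the round trip is lossless.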
S104: the central node decompresses the received compressed tensors to obtain each child node's first pulse tensor, and performs distillation training on the public data set according to these tensors to obtain the distillation product for each child node.

That is, the central node receives the first pulse tensors output by all child nodes, computes the soft-label loss function L_soft and its gradient in order in the forward pass on the public data set, and trains the central-node SNN with the back-propagation algorithm for the preset number of rounds epoch_l, obtaining the distillation product for each child node.
Specifically, the method comprises the following steps.

First, define the distillation loss function, i.e. the soft-label loss function L_soft:

L_soft = L_T + λ·L_F,

where L_soft is the distillation loss, L_T is the mean-square-error-like loss over pulses, and L_F is the relaxed class cross-entropy loss. λ is the relaxation variable, a preset parameter, generally 0.1 to 10; the lower its value, the more strictly the output pulse tensor is fitted to the target pulse tensor, though overfitting may then occur.

The mean-square-error-like loss over pulses is:

L_T = (1/(C·T)) · Σ_{c=1}^{C} Σ_{t=1}^{T} (s_ct − ŝ_ct)²,

and the relaxed class cross-entropy loss is:

L_F = −Σ_{c=1}^{C} p̂_c · log(p_c),

where C is the number of classes of the training data and T is the size of the SNN time window; s_ct and ŝ_ct are the element values of a batch's predicted and target pulse matrices, respectively; p_c and p̂_c are the predicted and target frequency vectors, computed from the pulse matrices as:

p_c = (1/T)·Σ_{t=1}^{T} s_ct,  p̂_c = (1/T)·Σ_{t=1}^{T} ŝ_ct.
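A minimal sketch of this soft-label loss, under the reconstructed forms of L_T and L_F given above:

    import numpy as np

    def soft_label_loss(s_pred, s_tgt, lam=1.0):
        # s_pred, s_tgt: pulse matrices of shape (C, T), prediction and target.
        # L_T matches the two pulse matrices element-wise; L_F matches the
        # firing frequencies, relaxed by lambda.
        l_t = ((s_pred - s_tgt) ** 2).mean()       # pulse-wise MSE-like term
        p = s_pred.mean(axis=1)                    # predicted frequency vector
        p_hat = s_tgt.mean(axis=1)                 # target frequency vector
        l_f = -(p_hat * np.log(p + 1e-12)).sum()   # relaxed cross-entropy term
        return l_t + lam * l_f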
based on the defined loss function, the central node impulse neural network is iteratively trained by using a back propagation algorithm to achieve the local training round number epoch in the super-parameters l And stopping training when the training is stopped.
In steps S103 and S104 above, the back-propagation training procedure for the child-node and central-node SNNs is shown in Fig. 2 and comprises the following steps:

S201: set the current training round counter to 0;

S202: select a batch of training data from the data set as this round's training data;

S203: obtain the output prediction by forward propagation;

S204: compute the loss function value;

S205: optimize via the back-propagation algorithm and update the neural network parameters;

S206: add 1 to the training round counter;

S207: judge whether the preset number of local training rounds has been reached; if so, go to S208, otherwise go to S202;

S208: end training.
Based on the BP back-propagation algorithm, the network parameters can be updated during training, improving the classification accuracy and training speed of the SNN model.
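The loop S201-S208 can be sketched as follows; net, dataset, loss_fn and optimizer are placeholder interfaces assumed for illustration, standing in for the node's SNN, its data, the hard- or soft-label loss, and an Adam-style optimizer.

    def local_training(net, dataset, loss_fn, optimizer, epochs_local, batch_size=64):
        # Back-propagation training loop following S201-S208; the interfaces
        # below are assumptions for illustration, not a fixed API.
        for epoch in range(epochs_local):                  # S201/S206/S207: round counter
            for x, target in dataset.batches(batch_size):  # S202: pick a batch
                pred = net.forward(x)                      # S203: forward propagation
                loss = loss_fn(pred, target)               # S204: loss value
                grads = net.backward(loss)                 # S205: back-propagate
                optimizer.step(net, grads)                 # S205: update parameters
        return net                                         # S208: training ends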
Next, the distillation products of all child nodes are aggregated into the global parameters, as follows:

S301: divide the aggregation-network buffer reserved by the central node into a first aggregation buffer and a second aggregation buffer;

S302: generate a copy of the current central-node SNN and place it in the first aggregation buffer;

S303: select any first pulse tensor awaiting aggregation, substitute it into L_soft, perform distillation training of the copy on the public data set, and update the copy's network parameters as the first network parameters;

S304: judge whether all child nodes have been aggregated; if so, take the first network parameters as the global parameters; otherwise, go to S305;

S305: update the central-node SNN copy in the first aggregation buffer with the first network parameters and take it as the first copy; duplicate the first copy and place the duplicate in the second aggregation buffer;

S306: select any first pulse tensor awaiting aggregation and perform distillation training of the duplicate on the public data set, updating it to obtain the second copy and the second network parameters;

S307: randomly select labeled data from part of the public data set to generate a temporary test data set;

S308: test the first copy and the second copy on the temporary test data set to obtain their test accuracies a and a′, generate the aggregation weights from the two accuracies, and use them to compute updated first network parameters;

S309: return to step S304.
The aggregation weights (α, α′) are generated from the test accuracies a and a′ by a softmax (normalized exponential) function used for weight normalization, where δ is a preset retention factor, usually greater than 0.5, and τ is a preset difference factor.

The new network parameters are then computed with the aggregation weights as:

w̃ = α·w + α′·w′,

where w̃ is the new network parameter, w is the first network parameter corresponding to the first copy, and w′ is the second network parameter corresponding to the second copy.
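A sketch of the pairwise aggregation follows. The weighted update w̃ = α·w + α′·w′ follows the description above, while the exact softmax arguments, with a retention bonus δ on the kept copy and τ as a temperature, are an assumption, since the patent gives the weight formula only as an image.

    import numpy as np

    def softmax(z):
        e = np.exp(z - np.max(z))
        return e / e.sum()

    def aggregate_pair(w, w_prime, acc, acc_prime, delta=0.6, tau=1.0):
        # ASSUMED weight form: retention bonus delta on the kept copy, tau as a
        # temperature; only the weighted sum below is stated in the text.
        alpha, alpha_p = softmax(np.array([(acc + delta) / tau, acc_prime / tau]))
        # w_new = alpha * w + alpha' * w', applied parameter tensor by tensor.
        return {k: alpha * w[k] + alpha_p * w_prime[k] for k in w}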
Finally, the central-node SNN is updated with the global parameters and propagated forward on the public data set to obtain the second pulse tensor, which is distributed to all child nodes; each child node receives the second pulse tensor, performs distillation training on the public data set, synchronously updates its SNN parameters, and re-enters step S103 as a trained child-node SNN for cyclic training. Training loops until the global total round number epoch_g is reached or the model attains a preset metric, then stops; the spiking neural network of the central node is then the medical image processing model based on federated knowledge distillation.
Example 2
This example proposes a medical image processing method based on federated knowledge distillation, comprising constructing the medical image processing model of Example 1 and applying it to medical image processing.
Following the above procedure, after the algorithm finishes, the accuracy of medical image classification and recognition improves by about 5% over each institution training independently, while the federated-learning communication cost falls by 90%, a substantial optimization.

Claims (6)

1. A medical image processing model construction method based on federated knowledge distillation, characterized by comprising the following steps:

step 1: collect training data and construct the training sets, including: a public data set required for distillation, obtained by preprocessing and organizing open medical image data; and, for each medical institution participating in training, a private data set obtained by coordinated preprocessing of that institution's private CT image data according to the public data set;

the private data sets correspond one-to-one with the participating medical institutions and are mutually independent;

step 2: construct the spiking neural networks of the child nodes and the central node, the child nodes corresponding one-to-one with the private data sets;

step 3: train each child node's spiking neural network on its private data set; using each trained child-node network, obtain the child node's first pulse tensor by forward propagation on the public data set and upload it to the central node;

step 4: according to the received first pulse tensor of each child node, the central node performs distillation training on the public data set to obtain a distillation product for each child node;

step 5: aggregate all distillation products into global parameters, update the central node's spiking neural network with the global parameters, obtain a second pulse tensor by forward propagation of the updated central-node network on the public data set, and distribute it to all child nodes;

step 6: each child node receives the second pulse tensor, performs distillation training on the public data set, and updates its spiking neural network parameters;

step 7: judge whether the preset number of training rounds has been reached or the model has reached a preset metric; if so, stop training, the trained spiking neural network of the central node being the medical image processing model based on federated knowledge distillation; otherwise, return to step 3;
in step 5, aggregating all distillation products into the global parameters comprises:

the central node reserves an aggregation-network buffer and aggregates as follows:

step 51: generate a copy of the current central-node spiking neural network and place it in the central node's aggregation buffer; select any first pulse tensor awaiting aggregation, perform distillation training of the copy on the public data set, and update it to obtain the first network parameters;

step 52: judge whether all child nodes have been aggregated; if so, take the first network parameters as the global parameters; otherwise, go to step 53;

step 53: update the central-node network copy with the first network parameters and take it as the first copy; duplicate the first copy in the aggregation buffer to create a new copy;

step 54: select any first pulse tensor awaiting aggregation and perform distillation training of the new copy on the public data set, updating it to obtain the second copy and the second network parameters;

step 55: randomly select labeled data from part of the public data set to generate a temporary test data set;

step 56: test the first copy and the second copy on the temporary test data set to obtain their test accuracies, generate aggregation weights from the two accuracies, and use the aggregation weights to compute updated first network parameters;

step 57: return to step 52;
in step 56, generating the aggregation weights from the two test accuracies and using them to update the first network parameters comprises:

step 561: generate the aggregation weights (α, α′) from the test accuracies a and a′ with a normalized exponential (softmax) function, where δ is a preset retention factor and τ is a preset difference factor;

step 562: compute the updated first network parameters with the aggregation weights:

w̃ = α·w + α′·w′,

where w̃ is the new first network parameter, w is the first network parameter corresponding to the first copy, and w′ is the second network parameter corresponding to the second copy.
2. The medical image processing model construction method based on federated knowledge distillation according to claim 1, characterized in that in step 3, training each child node's spiking neural network on its private data set comprises: the child-node network computes the hard-label loss function L_hard and its gradient in the forward pass on the corresponding private data set, trains for a preset number of rounds with the back-propagation algorithm, and updates its parameters;

wherein the hard-label loss function is:

L_hard = −Σ_i ŷ_i · log(v_i),

where L_hard is the hard-label loss, v is the frequency vector computed from the output pulses, and ŷ is the true label vector, computed as:

ŷ = onehot(tar),

where tar is the category of the true label and onehot denotes one-hot encoding:

ŷ_i = 1 if i = tar, and ŷ_i = 0 otherwise,

where i is the element index of the label vector.
3. The medical image processing model construction method based on federated knowledge distillation according to claim 1, characterized in that in step 3 the first pulse tensor is binary-compressed before being uploaded to the central node, and in step 4 the central node, after receiving a child node's compressed tensor, decompresses it to obtain the first pulse tensor.
4. The medical image processing model construction method based on federated knowledge distillation according to claim 3, characterized in that

binary compression of the first pulse tensor comprises the following steps:

step 31: zero-initialize a tensor sc for storing the compression result;

step 32: for each element value sc of sc, iterate over the time windows in order and compute:

sc′ = 2·sc + s_t,

where s_t is the pulse value of the t-th time window at the corresponding position of the pulse tensor to be compressed, sc is the current element value, iterated over t, and sc′ is the element value of sc after the update;

step 33: when all computations are complete, the binary compression is finished;

in step 4, decompressing the compressed tensor to obtain the first pulse tensor comprises the following steps:

step 41: zero-initialize a decompression tensor for storing the decompression result;

step 42: perform the restoration operation on each element value of the compressed tensor to obtain an ordered pulse sequence, and store the result at the corresponding position of the decompression tensor in inverted order;

step 43: when all computations are complete, the decompression tensor is the first pulse tensor.
5. The medical image processing model construction method based on federated knowledge distillation according to any one of claims 1, 3 and 4, characterized in that in step 4, the central node performing distillation training on the public data set according to the received first pulse tensor of each child node, to obtain the distillation product for each child node, comprises the following steps:

define the distillation loss function, train the spiking neural network with the back-propagation algorithm for a preset number of rounds, then stop; the distillation loss function is:

L_soft = L_T + λ·L_F,

where L_soft is the distillation loss, L_T is the mean-square-error-like loss over pulses, L_F is the relaxed class cross-entropy loss, and λ is the relaxation variable, a preset parameter;

the mean-square-error-like loss over pulses is:

L_T = (1/(C·T)) · Σ_{c=1}^{C} Σ_{t=1}^{T} (s_ct − ŝ_ct)²,

and the relaxed class cross-entropy loss is:

L_F = −Σ_{c=1}^{C} p̂_c · log(p_c),

where C is the number of classes of the training data and T is the size of the spiking-neural-network time window; s_ct and ŝ_ct are the element values of a batch's predicted and target pulse matrices, respectively; p_c and p̂_c are the predicted and target frequency vectors, computed from the pulse matrices as:

p_c = (1/T)·Σ_{t=1}^{T} s_ct,  p̂_c = (1/T)·Σ_{t=1}^{T} ŝ_ct.
6. A medical image processing method based on federated knowledge distillation, characterized in that: a medical image processing model based on federated knowledge distillation is constructed according to the medical image processing model construction method of any one of claims 1 to 5, and the model is used to perform image processing on the medical image to be processed.
CN202210783921.4A 2022-07-05 2022-07-05 Medical image processing model construction and processing method based on federal knowledge distillation Active CN115271033B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210783921.4A CN115271033B (en) 2022-07-05 2022-07-05 Medical image processing model construction and processing method based on federal knowledge distillation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210783921.4A CN115271033B (en) 2022-07-05 2022-07-05 Medical image processing model construction and processing method based on federal knowledge distillation

Publications (2)

Publication Number Publication Date
CN115271033A CN115271033A (en) 2022-11-01
CN115271033B true CN115271033B (en) 2023-11-21

Family

ID=83762766

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210783921.4A Active CN115271033B (en) 2022-07-05 2022-07-05 Medical image processing model construction and processing method based on federal knowledge distillation

Country Status (1)

Country Link
CN (1) CN115271033B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116704296B (en) * 2023-08-04 2023-11-03 浪潮电子信息产业股份有限公司 Image processing method, device, system, equipment and computer storage medium


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4245310B2 (en) * 2001-08-30 2009-03-25 忠正 藤村 Diamond suspension aqueous solution excellent in dispersion stability, metal film containing this diamond, and product thereof
WO2019222401A2 (en) * 2018-05-17 2019-11-21 Magic Leap, Inc. Gradient adversarial training of neural networks
US11188799B2 (en) * 2018-11-12 2021-11-30 Sony Corporation Semantic segmentation with soft cross-entropy loss
US20210406782A1 (en) * 2020-06-30 2021-12-30 TieSet, Inc. System and method for decentralized federated learning

Patent Citations (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1342773A (en) * 1991-03-18 2002-04-03 佛罗里达大学研究基金会 Producing ethanol by recombination host
CN105243649A (en) * 2015-11-09 2016-01-13 天津大学 Image denoising method based on secondary noise point detection
CN112703457A (en) * 2018-05-07 2021-04-23 强力物联网投资组合2016有限公司 Method and system for data collection, learning and machine signal streaming for analysis and maintenance using industrial internet of things
CN113330292A (en) * 2018-07-31 2021-08-31 科罗拉多大学评议会法人团体 System and method for applying machine learning to analyze microscopic images in high throughput systems
CN114269344A (en) * 2019-06-25 2022-04-01 微生物公司 Compositions and methods for treating or preventing ocular infections with felodivir
WO2021223873A1 (en) * 2020-05-08 2021-11-11 Ecole Polytechnique Federale De Lausanne (Epfl) System and method for privacy-preserving distributed training of machine learning models on distributed datasets
CN113705823A (en) * 2020-05-22 2021-11-26 华为技术有限公司 Model training method based on federal learning and electronic equipment
CN111369576A (en) * 2020-05-28 2020-07-03 腾讯科技(深圳)有限公司 Training method of image segmentation model, image segmentation method, device and equipment
WO2021257893A1 (en) * 2020-06-19 2021-12-23 Cleerly, Inc. Systems, methods, and devices for medical image analysis, diagnosis, risk stratification, decision making and/or disease tracking
WO2022060264A1 (en) * 2020-09-18 2022-03-24 Telefonaktiebolaget Lm Ericsson (Publ) Methods and systems for updating machine learning models
WO2022126706A1 (en) * 2020-12-19 2022-06-23 中国科学院深圳先进技术研究院 Method and device for accelerating personalized federated learning
CN112947500A (en) * 2021-02-10 2021-06-11 复旦大学 Underwater vehicle water flow monitoring system
CN113205863A (en) * 2021-06-04 2021-08-03 广西师范大学 Training method of individualized model based on distillation semi-supervised federal learning
CN113361606A (en) * 2021-06-07 2021-09-07 齐鲁工业大学 Deep map attention confrontation variational automatic encoder training method and system
CN113408743A (en) * 2021-06-29 2021-09-17 北京百度网讯科技有限公司 Federal model generation method and device, electronic equipment and storage medium
CN113553918A (en) * 2021-06-30 2021-10-26 电子科技大学 Machine-made invoice character recognition method based on pulse active learning
CN113518007A (en) * 2021-07-06 2021-10-19 华东师范大学 Multi-internet-of-things equipment heterogeneous model efficient mutual learning method based on federal learning
CN113989561A (en) * 2021-10-29 2022-01-28 河海大学 Parameter aggregation updating method, equipment and system based on asynchronous federal learning
CN114154643A (en) * 2021-11-09 2022-03-08 浙江师范大学 Federal distillation-based federal learning model training method, system and medium
CN114429219A (en) * 2021-12-09 2022-05-03 之江实验室 Long-tail heterogeneous data-oriented federal learning method
CN114492745A (en) * 2022-01-18 2022-05-13 天津大学 Knowledge distillation mechanism-based incremental radiation source individual identification method
CN114692732A (en) * 2022-03-11 2022-07-01 华南理工大学 Method, system, device and storage medium for updating online label
CN114626550A (en) * 2022-03-18 2022-06-14 支付宝(杭州)信息技术有限公司 Distributed model collaborative training method and system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"A Federated Learning Aggregation Algorithm for Pervasive Computing:Evaluation and ComParision";Sannara EK等;《IEEE International Conference on Pervasive Computing and Communications》;第1-10页 *
"Decentralized Federated Learning:A Segmented Gossip Approach";Chenghao Hu等;《arxiv》;第1-7页 *
"Federal SNN Distillation: A Low-Communication-Cost Federated Learning Framework for Spiking Neural Networks";Zhetong Liu等;《Journal of Physics: Conference Series》;第2216卷(第1期);第1-8页摘要和第1-3节 *

Also Published As

Publication number Publication date
CN115271033A (en) 2022-11-01

Similar Documents

Publication Publication Date Title
WO2021120936A1 (en) Chronic disease prediction system based on multi-task learning model
JP7305656B2 (en) Systems and methods for modeling probability distributions
CN109036553A (en) A kind of disease forecasting method based on automatic extraction Medical Technologist's knowledge
WO2016192612A1 (en) Method for analysing medical treatment data based on deep learning, and intelligent analyser thereof
CN106778014A (en) A kind of risk Forecasting Methodology based on Recognition with Recurrent Neural Network
Cai et al. Improved deep convolutional neural networks using chimp optimization algorithm for Covid19 diagnosis from the X-ray images
CN115271033B (en) Medical image processing model construction and processing method based on federal knowledge distillation
CN112819831B (en) Segmentation model generation method and device based on convolution Lstm and multi-model fusion
Purnama et al. Disease classification based on dermoscopic skin images using convolutional neural network in teledermatology system
CN115471716A (en) Chest radiographic image disease classification model lightweight method based on knowledge distillation
Sarp et al. Simultaneous wound border segmentation and tissue classification using a conditional generative adversarial network
CN116110597A (en) Digital twinning-based intelligent analysis method and device for patient disease categories
Zhu et al. An automatic classification of the early osteonecrosis of femoral head with deep learning
CN111477337A (en) Infectious disease early warning method, system and medium based on individual self-adaptive transmission network
CN114820450A (en) CT angiography image classification method suitable for Li's artificial liver treatment
CN110335160A (en) A kind of medical treatment migratory behaviour prediction technique and system for improving Bi-GRU based on grouping and attention
JP7365747B1 (en) Disease treatment process abnormality identification system based on hierarchical neural network
CN117038096A (en) Chronic disease prediction method based on low-resource medical data and knowledge mining
Chen et al. Gingivitis identification via GLCM and artificial neural network
CN116011559A (en) Zero sample distillation system and method for case classification based on pseudo word sequence generation
CN116309754A (en) Brain medical image registration method and system based on local-global information collaboration
US20230087494A1 (en) Determining image similarity by analysing registrations
CN115171896A (en) System and method for predicting long-term death risk of critically ill patient
KR20220111215A (en) Apparatus and method for predicting drug-target interaction using deep neural network model based on self-attention
CN109119159B (en) Deep learning medical diagnosis system based on rapid weight mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant