CN116228255A

CN116228255A - Cereal origin tracing method based on multi-information ticket simulation mechanism and DAN algorithm

Info

Publication number: CN116228255A
Application number: CN202211551253.9A
Authority: CN
Inventors: 韩宇星; 林鹏; 郭阳; 白兵宜; 张梦杰
Original assignee: South China Agricultural University
Current assignee: South China Agricultural University
Priority date: 2022-12-05
Filing date: 2022-12-05
Publication date: 2023-06-06

Abstract

The disclosure relates to a cereal origin tracing method based on a multi-information ticket simulation mechanism and a DAN algorithm, which improves the recognition accuracy of a target domain grain origin recognition model while reducing the amount of target domain grain sample data required. The method comprises the following steps: acquiring source domain sample data and target domain sample data; training a neural network model to be trained according to the source domain sample data and a first loss function to obtain a source domain grain origin recognition model, wherein the first loss function comprises a first loss term, a second loss term and a third loss term, the first loss term represents the difference between the predicted origin of the sample data and the origin label of the sample data, the second loss term represents the original features of the multiple kinds of target spectrum information contained in the sample data, and the third loss term represents the domain distribution difference between the source domain sample data and the target domain sample data; and performing transfer learning on the source domain grain origin recognition model according to the target domain sample data to obtain the target domain grain origin recognition model.

Description

Cereal origin tracing method based on multi-information ticket simulation mechanism and DAN algorithm

Technical Field

The disclosure relates to the technical field of artificial intelligence, and in particular to a cereal origin tracing method based on a multi-information ticket simulation mechanism and a DAN algorithm.

Background

Because climate, soil, water resources and other natural conditions differ markedly between regions, the quality of grains such as rice, millet, barley and oats also differs markedly, so geographical indication certification is important for guaranteeing grain quality. However, for various reasons, the geographical labeling of grains is sometimes untrue. Methods for tracing or identifying the origin of cereals are therefore needed.

In the related art, deep learning is combined with spectral data to help trace the origin of cereals. However, such methods must be trained on a large amount of labeled sample data to achieve good recognition accuracy, and then only for specific cereal types.

Collecting a large amount of spectral data that meets the training quality requirements is time consuming and labor intensive, and the high cost of manual labeling makes building a large, high-quality database challenging as well. In addition, such expensive data sets easily become outdated and are difficult to reuse in new tasks. Because sample data is difficult to acquire, the cereal tracing methods in the related art limit the development of deep learning for cereal origin identification based on spectral data.

Disclosure of Invention

An object of the present disclosure is to provide a method, an apparatus, a storage medium and an electronic device for tracing a cereal origin, so as to at least partially solve the above-mentioned problems in the related art.

To achieve the above object, in a first aspect, the present disclosure provides a method for tracing a cereal origin, the method comprising:

acquiring a first number of source domain sample data and a second number of target domain sample data, wherein each source domain sample data comprises multiple kinds of target spectrum information of a source domain grain and an origin label corresponding to the source domain grain, and each target domain sample data comprises multiple kinds of target spectrum information of a target domain grain and an origin label corresponding to the target domain grain;

training the neural network model to be trained according to the first number of source domain sample data and a first loss function to obtain a source domain grain origin recognition model, wherein the first loss function comprises a first loss term, a second loss term and a third loss term, the first loss term represents the difference between the predicted origin of the sample data and the origin label of the sample data, the second loss term represents the original features of the multiple kinds of target spectrum information included in the sample data, and the third loss term represents the domain distribution difference between the source domain sample data and the target domain sample data;

and performing transfer learning on the source domain grain origin recognition model according to the second number of target domain sample data to obtain a target domain grain origin recognition model, wherein the target domain grain origin recognition model is used for identifying the origin of the target domain grain.

Optionally, the performing transfer learning on the source domain grain origin recognition model according to the second number of target domain sample data to obtain a target domain grain origin recognition model includes:

training the source domain grain origin recognition model according to the second number of target domain sample data and a second loss function to obtain a target domain grain origin recognition model, wherein the second loss function comprises the first loss term and the second loss term.

Optionally, the method further comprises:

acquiring multiple original spectrum information of each source domain grain and multiple original spectrum information of each target domain grain;

performing data preprocessing on each piece of original spectrum information by adopting a z-score standardization method to obtain intermediate spectrum information corresponding to each piece of original spectrum information;

and performing dimension reduction processing on each intermediate spectrum information to obtain each target spectrum information.

Optionally, the second loss term is expressed by the following formula:

wherein loss2 represents the second loss term, which is a regularization term; λ represents the regularization parameter; a, b and c are the ticket coefficients of the corresponding kinds of target spectrum information and are obtained through model training; the spectral term in the formula represents the p-th target spectral information of the sample data; ‖·‖ represents taking the l² norm; and the sample data is the source domain sample data or the target domain sample data.

Optionally, the third loss term is expressed by the following formula:

wherein, in MMD(X_S, X_T), X_kS represents the kth source domain sample data, X_iT represents the i-th target domain sample data, m represents the number of source domain sample data, n represents the number of target domain sample data, H represents the Hilbert space, and φ(·) represents the mapping from the sample data space χ to the Hilbert space H.
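The formula itself does not survive in this text. As a hedged reconstruction, the standard empirical maximum mean discrepancy (MMD) written with the symbols listed above would take the form below. The Hilbert space symbol, written here as \mathcal{H}, is an assumption, as is the choice between the norm and its square (DAN-style formulations commonly use the squared norm).

```latex
\mathrm{MMD}(X_S, X_T) =
  \left\| \frac{1}{m}\sum_{k=1}^{m}\phi\left(X_{kS}\right)
        - \frac{1}{n}\sum_{i=1}^{n}\phi\left(X_{iT}\right) \right\|_{\mathcal{H}}
```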

Optionally, the plurality of target spectral information of the source domain cereal comprises laser induced breakdown spectral information, near infrared spectral information and hyperspectral information, and the plurality of target spectral information of the target domain cereal comprises laser induced breakdown spectral information, near infrared spectral information and hyperspectral information.

Optionally, the neural network model to be trained is a one-dimensional ResNet-18 network.

To achieve the above object, in a second aspect, the present disclosure provides a cereal origin tracing apparatus, the apparatus comprising:

an acquisition module, configured to acquire a first number of source domain sample data and a second number of target domain sample data, wherein each source domain sample data comprises multiple kinds of target spectrum information of a source domain grain and an origin label corresponding to the source domain grain, and each target domain sample data comprises multiple kinds of target spectrum information of a target domain grain and an origin label corresponding to the target domain grain;

a first training module, configured to train the neural network model to be trained according to the first number of source domain sample data and a first loss function to obtain a source domain grain origin recognition model, wherein the first loss function comprises a first loss term, a second loss term and a third loss term, the first loss term represents the difference between the predicted origin of the sample data and the origin label of the sample data, the second loss term represents the original features of the multiple kinds of target spectrum information included in the sample data, and the third loss term represents the domain distribution difference between the source domain sample data and the target domain sample data;

and a second training module, configured to perform transfer learning on the source domain grain origin recognition model according to the second number of target domain sample data to obtain a target domain grain origin recognition model, wherein the target domain grain origin recognition model is used for identifying the origin of the target domain grain.

In a third aspect, the present disclosure provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of any of the first aspects.

In a fourth aspect, the present disclosure provides an electronic device comprising:

a memory having a computer program stored thereon;

a processor for executing the computer program in the memory to implement the steps of the method of any of the first aspects.

According to the above technical solution, a first number of source domain sample data and a second number of target domain sample data are acquired, wherein each source domain sample data comprises multiple kinds of target spectrum information of a source domain grain and an origin label corresponding to the source domain grain, and each target domain sample data comprises multiple kinds of target spectrum information of a target domain grain and an origin label corresponding to the target domain grain; the neural network model to be trained is trained according to the first number of source domain sample data and a first loss function to obtain a source domain grain origin recognition model, wherein the first loss function comprises a first loss term, a second loss term and a third loss term, the first loss term represents the difference between the predicted origin of the sample data and the origin label of the sample data, the second loss term represents the original features of the multiple kinds of target spectrum information included in the sample data, and the third loss term represents the domain distribution difference between the source domain sample data and the target domain sample data; and transfer learning is performed on the source domain grain origin recognition model according to the second number of target domain sample data to obtain a target domain grain origin recognition model, which is used for identifying the origin of the target domain grain.
In this way, the problem that sample data is difficult to acquire in the cereal origin identification task is addressed through transfer learning: a large amount of sample data needs to be acquired for only one kind of grain, and the target domain grain origin recognition model finally obtained through training can still perform well in origin identification of various target domain grains, which reduces the difficulty of acquiring target domain grain samples. In addition, by introducing the second loss term, detail information of the original spectrum that would otherwise be lost is taken into account, that is, the original spectrum information is preserved, which improves the recognition accuracy of the finally obtained target domain grain origin recognition model and thus the accuracy of cereal origin tracing. Furthermore, by introducing the third loss term, the spectral difference between different grains is reduced, so that the transfer learning model fits the target domain data more closely, which improves the recognition accuracy on the target domain data set and further improves the accuracy of cereal origin tracing.

Additional features and advantages of the present disclosure will be set forth in the detailed description which follows.

Drawings

The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate the disclosure and together with the description serve to explain, but do not limit the disclosure. In the drawings:

FIG. 1 is a flow chart illustrating a cereal origin tracing method according to an exemplary embodiment of the present disclosure;

FIG. 2 is a block diagram of a cereal origin tracing apparatus according to an exemplary embodiment of the present disclosure;

FIG. 3 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.

Detailed Description

Specific embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating and illustrating the disclosure, are not intended to limit the disclosure.

FIG. 1 is a flow chart illustrating a cereal origin tracing method according to an exemplary embodiment of the present disclosure. Referring to FIG. 1, the cereal origin tracing method includes:

S110, acquiring a first number of source domain sample data and a second number of target domain sample data, wherein each source domain sample data comprises multiple kinds of target spectrum information of a source domain grain and an origin label corresponding to the source domain grain, and each target domain sample data comprises multiple kinds of target spectrum information of a target domain grain and an origin label corresponding to the target domain grain.

In the embodiment of the disclosure, the source domain sample data may be sample data acquired in a large amount, and the target domain sample data may be sample data acquired in a small amount. Thus, it is not necessary to acquire a large amount of sample data carrying origin labels for every grain whose origin is to be predicted.

In an embodiment of the present disclosure, the first number is greater than the second number.

It should be noted that the method of the embodiment of the present disclosure is applicable to the field of cereal origin identification. When the origins of multiple kinds of grain need to be identified, it is only necessary to acquire a first number of sample data carrying origin labels for any one kind of grain and a second number of sample data carrying origin labels for each of the other kinds of grain, and a good result can still be obtained in predicting the origin of each kind of grain.

It should be noted that the spectrum types of the multiple kinds of target spectrum information of the source domain grain and of the target domain grain should be identical. Illustratively, in some embodiments, the multiple kinds of target spectral information of the source domain grain include laser-induced breakdown spectral information (Laser-Induced Breakdown Spectroscopy, LIBS), near-infrared spectral information (Near-Infrared Spectroscopy, NIRS) and hyperspectral information, and the multiple kinds of target spectral information of the target domain grain likewise include laser-induced breakdown spectral information, near-infrared spectral information and hyperspectral information.

In the embodiment of the disclosure, the grain origin recognition processing is performed by extracting various spectrum information of grains, and the characteristic expression advantages of different spectrum information can be comprehensively considered, so that the effect of information complementation and fusion is realized, and the recognition accuracy of a target domain grain origin recognition model obtained by subsequent transfer learning is improved.

S120, training a neural network model to be trained according to the first number of source domain sample data and a first loss function to obtain a source domain grain origin recognition model, wherein the first loss function comprises a first loss term, a second loss term and a third loss term, the first loss term represents the difference between the predicted origin of the sample data and the origin label of the sample data, the second loss term represents the original features of the multiple kinds of target spectrum information contained in the sample data, and the third loss term represents the domain distribution difference between the source domain sample data and the target domain sample data.

In an embodiment of the present disclosure, the first loss function includes a first loss term, a second loss term and a third loss term.

The first loss term characterizes the difference between the predicted origin of the sample data and the origin label of the sample data, that is, it is the classification error loss term included in a conventional loss function. In some embodiments, the first loss term may be expressed by the following formula:

where loss1 represents the first loss term, h_w(X) represents the origin predicted by the network model for the sample data, Y represents the origin label of the sample data, w represents the weight matrix, and z represents the number of sample data.

When the neural network model to be trained is trained according to the first number of source domain sample data, the sample data is the source domain sample data, in which case,

wherein h_w(X_kS) represents the origin predicted by the network model for the kth source domain sample data, Y_kS represents the origin label of the kth source domain sample data, and m represents the number of source domain sample data.

When transfer learning is performed on the source domain grain origin recognition model according to the second number of target domain sample data, the sample data is the target domain sample data, in which case,

wherein h_w(X_iT) represents the origin predicted by the network model for the i-th target domain sample data, Y_iT represents the origin label of the i-th target domain sample data, and n represents the number of target domain sample data.
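The formula for loss1 is not reproduced in this text. As a minimal sketch, assuming the classification error is a mean softmax cross-entropy over the z samples (a common choice, not confirmed by the source), the first loss term could be computed as follows. The function names and the use of raw class scores (`batch_logits`) are hypothetical.

```python
import math
from typing import Sequence


def softmax(logits: Sequence[float]) -> list:
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def first_loss_term(batch_logits: Sequence[Sequence[float]],
                    labels: Sequence[int]) -> float:
    """Mean cross-entropy between predicted origins h_w(X) and origin labels Y.

    Assumption: cross-entropy stands in for the formula that is not
    reproduced in the text; z is the number of sample data.
    """
    z = len(labels)
    total = 0.0
    for logits, label in zip(batch_logits, labels):
        total -= math.log(softmax(logits)[label])
    return total / z
```

The same function applies to either stage: pass the m source domain samples when training the source domain model, or the n target domain samples during transfer learning.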

The second loss term characterizes the original features of the multiple kinds of target spectral information included in the sample data. In some embodiments, the second loss term is expressed by the following formula:

wherein loss2 represents the second loss term; a, b and c are the ticket coefficients of the corresponding kinds of target spectrum information and are obtained through model training; and the spectral term in the formula represents the p-th target spectrum information of the sample data.

When the neural network model to be trained is trained according to the first number of source domain sample data, the sample data is the source domain sample data, and in this case:

wherein the spectral terms in the formula respectively represent the first target spectrum information, the second target spectrum information and the p-th target spectrum information of the kth source domain sample data.

When transfer learning is performed on the source domain grain origin recognition model according to the second number of target domain sample data, the sample data is the target domain sample data, and in this case:

wherein the spectral terms in the formula respectively represent the first target spectrum information, the second target spectrum information and the p-th target spectrum information of the i-th target domain sample data.

In the embodiment of the disclosure, when the source domain grain origin recognition model is trained with the source domain sample data, the features extracted by the feature extractors of the model, such as the convolution layers, are those that help identify the source domain sample data (for example, rice), and some features common to different grains are discarded. Because the second loss term records the original feature information of the source domain sample data, adding it allows the detail information of the original spectrum that would otherwise be lost to be taken into account, that is, the original spectrum information is also preserved, so that the model obtained after the subsequent transfer learning is more accurate.
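The formula for loss2 is not reproduced in this text. One reading consistent with the listed symbols (a regularization parameter λ, ticket coefficients a, b and c, and an l² norm applied to each kind of target spectrum information) is a coefficient-weighted sum of squared norms. The sketch below illustrates that reading only; the exact form, the function names and the use of the squared norm are all assumptions.

```python
from typing import Sequence


def squared_l2_norm(vector: Sequence[float]) -> float:
    """Sum of squared entries of one spectrum vector."""
    return sum(v * v for v in vector)


def second_loss_term(spectra: Sequence[Sequence[float]],
                     coefficients: Sequence[float],
                     lam: float) -> float:
    """Assumed form: lam * sum_j coefficient_j * ||spectrum_j||^2 for one sample.

    spectra      -- one vector per spectrum type (e.g. LIBS, NIRS, hyperspectral)
    coefficients -- ticket coefficients (a, b, c), learned during training
    lam          -- regularization parameter (lambda in the text)
    """
    if len(spectra) != len(coefficients):
        raise ValueError("one ticket coefficient is needed per spectrum type")
    weighted = sum(c * squared_l2_norm(s) for c, s in zip(coefficients, spectra))
    return lam * weighted
```

In an actual training loop the ticket coefficients would be trainable parameters updated together with the network weights, as the text states they are obtained through model training.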

The third loss term characterizes the domain distribution difference between the source domain sample data and the target domain sample data and serves as a penalty term constraining the model to be trained. In some embodiments, the third loss term is expressed by the following formula:

wherein loss3 represents the third loss term, X_kS represents the kth source domain sample data, X_iT represents the i-th target domain sample data, m represents the number of source domain sample data, n represents the number of target domain sample data,

and γ represents a trade-off parameter. In some embodiments, γ = 0.25.

The DAN (Deep Adaptation Networks) algorithm is an algorithm based on the calculation formula of MMD(X_S, X_T).

In the embodiment of the disclosure, one-dimensional spectral data is used and the spectral difference between different grains is large. Introducing the third loss term, which characterizes the domain distribution difference between the source domain sample data and the target domain sample data, reduces the spectral difference between different grains, so that the transfer learning model fits the target domain data more closely and the recognition accuracy on the target domain data set is improved.
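To make the domain distribution difference concrete, the following minimal sketch computes an empirical MMD between two sets of feature vectors. It assumes the simplest feature map, φ(x) = x (a linear kernel), whereas DAN-style methods typically use kernel mappings; it also assumes that the third loss term is γ multiplied by the MMD value, which the text suggests but the missing formula does not confirm. All function names are hypothetical.

```python
from typing import Sequence

GAMMA = 0.25  # trade-off parameter value given in the text


def mean_vector(samples: Sequence[Sequence[float]]) -> list:
    """Column-wise mean of a list of equal-length feature vectors."""
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]


def linear_mmd(source: Sequence[Sequence[float]],
               target: Sequence[Sequence[float]]) -> float:
    """Empirical MMD with phi(x) = x: distance between the two domain means."""
    mu_s = mean_vector(source)  # (1/m) * sum over source samples
    mu_t = mean_vector(target)  # (1/n) * sum over target samples
    return sum((s - t) ** 2 for s, t in zip(mu_s, mu_t)) ** 0.5


def third_loss_term(source, target, gamma: float = GAMMA) -> float:
    """Assumed form of loss3: the trade-off parameter times the MMD."""
    return gamma * linear_mmd(source, target)
```

When both domains have the same mean features the term vanishes, so minimizing it pulls the source and target feature distributions toward each other.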

In connection with the foregoing, it can be appreciated that in some embodiments, the first loss function can be expressed by the following equation:

In some embodiments, the neural network model to be trained may be a one-dimensional convolutional neural network (L1D-CNN).

Optionally, the one-dimensional CNN (L1D-CNN) includes four convolution layers, four fully connected layers and one output layer. Parameters of the L1D-CNN structure, such as the number of convolution layers and the number of fully connected layers, are determined by trial and error. The convolution layer is the core module that implements the feature extraction function of the CNN, and features of the input samples at different levels are obtained by stacking convolution layers.

This structure is chosen because, although the classification performance of a CNN model improves as its structure deepens, the number of parameters to be computed multiplies, and an overly deep network structure is prone to overfitting when the amount of data is insufficient. The one-dimensional CNN with the above structure is therefore designed so that all spectral variables participate in the convolution operation, which avoids losing too much spectral information. The first few convolution layers retain more structural information, while the higher convolution layers retain more semantic information. Likewise, a suitably chosen convolution kernel size preserves more information, and all features can participate in the operation when the stride is less than or equal to the convolution kernel size. In some embodiments, the convolution kernel size of the one-dimensional CNN with the above structure is therefore set to 3 and the stride to 2. Batch normalization (BN) and a rectified linear unit (ReLU) activation function are then applied after each convolution layer to reduce the risk of overfitting and speed up the fitting process.
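The effect of the kernel size and stride on the feature length can be checked with the standard output-length formula for a one-dimensional convolution. The sketch below traces four stacked convolution layers with kernel size 3 and stride 2. The zero padding and the input length of 500 (the PCA dimension mentioned later in the description) are assumptions made for illustration.

```python
def conv1d_output_length(input_length: int, kernel_size: int = 3,
                         stride: int = 2, padding: int = 0) -> int:
    """Standard formula: floor((L + 2*padding - kernel) / stride) + 1."""
    if input_length + 2 * padding < kernel_size:
        raise ValueError("input is shorter than the convolution kernel")
    return (input_length + 2 * padding - kernel_size) // stride + 1


def trace_conv_stack(input_length: int, num_layers: int = 4) -> list:
    """Feature length after each of the stacked convolution layers."""
    lengths = []
    length = input_length
    for _ in range(num_layers):
        length = conv1d_output_length(length)
        lengths.append(length)
    return lengths


print(trace_conv_stack(500))  # [249, 124, 61, 30]
```

With stride 2 each layer roughly halves the feature length, and because the stride does not exceed the kernel size no interior position is skipped between consecutive windows.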

In addition, as the structure of a CNN model deepens, the CNN can learn more feature information and fit more feature distributions, so appropriately increasing the depth of the model makes the CNN more expressive. Thus, in some embodiments, the neural network model to be trained may be a one-dimensional ResNet-18 network.

ResNet introduces a residual function, which makes the model easier to optimize and makes the feature map more sensitive to changes in the output, so that convergence is reached more quickly. ResNet connects the input and the output through shortcut connections that skip certain layers, which solves the problem of network performance degrading as network depth increases and avoids overfitting to some extent.

In some embodiments, the first part of the one-dimensional ResNet-18 network is a convolution layer with a convolution kernel size of 1×7, a stride of 1, a padding of 1 and 64 convolution filters, followed by a BN operation with ReLU as the activation function, and then a max pooling operation for feature dimension reduction. The second part consists of four residual modules, each consisting of two convolution layers with 1×3 convolution kernels, a BN operation and a ReLU activation function; the number of convolution filters is set to 64 in the first residual module, 128 in the second, 256 in the third and 512 in the last. An average pooling layer follows the last residual module to average the extracted features. Finally, a fully connected layer is attached.
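The layer layout described above can be traced as (stage, channels, length) triples without any deep learning library. In the sketch below, the stem convolution parameters and the filter counts come from the text. The max pooling parameters (kernel 3, stride 2, padding 1), the padding of 1 in the residual convolutions, and the stride of 2 in the first convolution of the second to fourth residual modules follow the usual ResNet convention and are assumptions, as is the input length of 500.

```python
def out_len(length: int, kernel: int, stride: int, padding: int) -> int:
    """Output length of a 1-D convolution or pooling window."""
    return (length + 2 * padding - kernel) // stride + 1


STEM = {"filters": 64, "kernel": 7, "stride": 1, "padding": 1}  # from the text
POOL = {"kernel": 3, "stride": 2, "padding": 1}                 # assumption
STAGE_FILTERS = (64, 128, 256, 512)                             # from the text


def trace_resnet18_1d(input_length: int) -> list:
    """Return (stage name, channels, feature length) for each stage."""
    shapes = []
    length = out_len(input_length, STEM["kernel"], STEM["stride"], STEM["padding"])
    shapes.append(("stem_conv", STEM["filters"], length))
    length = out_len(length, POOL["kernel"], POOL["stride"], POOL["padding"])
    shapes.append(("max_pool", STEM["filters"], length))
    for index, filters in enumerate(STAGE_FILTERS, start=1):
        first_stride = 1 if index == 1 else 2      # assumption: ResNet convention
        length = out_len(length, 3, first_stride, 1)  # first 1x3 convolution
        length = out_len(length, 3, 1, 1)             # second 1x3 convolution
        shapes.append((f"residual_{index}", filters, length))
    shapes.append(("avg_pool", STAGE_FILTERS[-1], 1))  # averages over length
    return shapes


for stage in trace_resnet18_1d(500):
    print(stage)
```

Under these assumptions a 500-point spectrum leaves the last residual module as 512 channels of length 31, which the average pooling layer reduces to a 512-dimensional vector for the fully connected layer.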

In the embodiment of the disclosure, after the first number of source domain sample data is obtained, the neural network model to be trained may be trained by using the first number of source domain sample data with the goal of minimizing the first loss function, so as to obtain the source domain grain origin identification model.

S130, performing transfer learning on the source domain grain origin recognition model according to the second number of target domain sample data to obtain a target domain grain origin recognition model, wherein the target domain grain origin recognition model is used for identifying the origin of the target domain grain.

In the embodiment of the disclosure, because similar patterns exist between the spectral vectors of different grains, identifying the origins of different grains constitutes different but similar tasks in the same field. A transfer learning method is therefore provided in the embodiment of the disclosure, so that a model with good recognition performance can be obtained even when little sample data is available in the target domain. Once the target domain grain origin recognition model is obtained, it can be used to identify the origin of the target domain grain.

In some embodiments, performing transfer learning on the source domain grain origin recognition model according to the second number of target domain sample data to obtain the target domain grain origin recognition model includes:

training the source domain grain origin recognition model according to the second number of target domain sample data and a second loss function to obtain the target domain grain origin recognition model, wherein the second loss function comprises the first loss term and the second loss term.

As can be seen from the foregoing, when transfer learning is performed on the source domain grain origin recognition model based on the second number of target domain sample data, the sample data is the target domain sample data, in which case,

wherein h_w(X_iT) represents the origin predicted by the network model for the i-th target domain sample data, Y_iT represents the origin label of the i-th target domain sample data, and n represents the number of target domain sample data.

wherein the spectral terms in the formula respectively represent the first target spectrum information, the second target spectrum information and the p-th target spectrum information of the i-th target domain sample data.

In connection with the foregoing, it can be appreciated that in some embodiments, the second loss function can be expressed by the following equation:

In the embodiment of the disclosure, after the second number of target domain sample data is obtained, the source domain grain origin recognition model may be trained with the second number of target domain sample data, with the goal of minimizing the second loss function, so as to obtain the target domain grain origin recognition model.

In some embodiments, when the source domain grain origin recognition model is trained with the second number of target domain sample data, the parameters of the fully connected layers and the target task layer may be fine-tuned while the convolution layer parameters are kept frozen, that is, excluded from parameter updates.
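The freezing strategy amounts to partitioning the model parameters into a frozen set and a trainable set before fine-tuning. The sketch below shows that partition on parameter names alone; the names (`conv1.weight`, `fc1.weight`, `output.weight`) and the prefix convention are hypothetical and do not come from the source.

```python
from typing import Iterable, Tuple


def split_parameters(parameter_names: Iterable[str],
                     frozen_prefixes: Tuple[str, ...] = ("conv",)):
    """Separate parameters to freeze from parameters to fine-tune.

    A parameter is frozen when its name starts with one of the prefixes.
    """
    frozen, trainable = [], []
    for name in parameter_names:
        if name.startswith(frozen_prefixes):
            frozen.append(name)      # convolution layers: kept as trained on source data
        else:
            trainable.append(name)   # fully connected and task layers: fine-tuned
    return frozen, trainable


# Hypothetical parameter names for illustration only.
names = ["conv1.weight", "conv2.weight", "fc1.weight", "fc1.bias", "output.weight"]
frozen, trainable = split_parameters(names)
print(frozen)     # ['conv1.weight', 'conv2.weight']
print(trainable)  # ['fc1.weight', 'fc1.bias', 'output.weight']
```

In a deep learning framework, only the trainable group would be handed to the optimizer, so the feature extractor learned on the source domain stays unchanged.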

In the embodiment of the disclosure, similarly, by introducing the second loss term, the detail information of the lost original spectrum can be taken into consideration, that is, the original spectrum information is also saved, so that the accuracy of the target domain grain origin identification model obtained through training is higher.

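In some implementations, the method of the disclosed embodiments may further include the following steps:

acquiring multiple kinds of original spectrum information of each source domain grain and multiple kinds of original spectrum information of each target domain grain;

performing data preprocessing on each piece of original spectrum information by a z-score standardization method to obtain intermediate spectrum information corresponding to each piece of original spectrum information;

and performing dimension reduction processing on each piece of intermediate spectrum information to obtain each piece of target spectrum information.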

In the embodiment of the disclosure, the original spectrum information can be understood as spectrum information which is directly acquired and is not subjected to data preprocessing and feature transformation.

In the embodiment of the disclosure, after obtaining multiple kinds of original spectrum information of each source domain grain and multiple kinds of original spectrum information of each target domain grain, a z-score standardization method may be adopted to perform data preprocessing on each kind of original spectrum information of each of the source domain grain and the target domain grain, so as to obtain intermediate spectrum information corresponding to each kind of original spectrum information.

In some embodiments, the z-score normalization method can be expressed by the following formula:

$$\hat{x}_{r}^{(p)} = \frac{x_{r}^{(p)} - \mu^{(p)}}{\sigma}$$

wherein $\hat{x}_{r}^{(p)}$ represents the intermediate spectral information obtained after preprocessing by the z-score normalization method, whose expected value is 0; $x_{r}^{(p)}$ represents the p-th original spectral feature information corresponding to the r-th sample; $\mu^{(p)}$ represents the mean calculated from the p-th original spectral information of each of the z samples; and $\sigma$ represents the standard deviation calculated from the p-th original spectral information of each of the z samples.

When the sample is a source domain grain, the above formula is expressed as:

$$\hat{x}_{kS}^{(p)} = \frac{x_{kS}^{(p)} - \mu_{S}^{(p)}}{\sigma}$$

wherein $\hat{x}_{kS}^{(p)}$ represents the intermediate spectral information, whose expected value is 0; $x_{kS}^{(p)}$ represents the p-th original spectral feature information corresponding to the kth source domain sample; $\mu_{S}^{(p)}$ represents the mean calculated from the p-th original spectral information of each of the m source domain samples; and $\sigma$ represents the standard deviation calculated from the p-th original spectral information of each of the m source domain samples.

When the sample is a target domain grain, the above formula is expressed as:

$$\hat{x}_{iT}^{(p)} = \frac{x_{iT}^{(p)} - \mu_{T}^{(p)}}{\sigma}$$

wherein $\hat{x}_{iT}^{(p)}$ represents the intermediate spectral information, whose expected value is 0; $x_{iT}^{(p)}$ represents the p-th original spectral feature information corresponding to the i-th target domain sample; $\mu_{T}^{(p)}$ represents the mean calculated from the p-th original spectral information of each of the n target domain samples; and $\sigma$ represents the standard deviation calculated from the p-th original spectral information of each of the n target domain samples.

Preprocessing the original spectrum information of the grains with the z-score standardization method has two benefits. On the one hand, it reduces the influence of fluctuations on the spectrum. On the other hand, the expected value of the preprocessed intermediate spectrum information is 0 and the expected value of the loss function during model training is also 0; because the two match, the preprocessing better preserves the information of the original spectrum.
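
As a minimal sketch of the standardization described above, the following Python snippet standardizes each spectral feature separately within the source domain samples and within the target domain samples. The function name `zscore_columns`, the toy data, the use of the population standard deviation for σ, and the handling of constant features are illustrative assumptions, not details taken from the disclosure.

```python
from statistics import fmean, pstdev


def zscore_columns(samples):
    """Standardize each spectral feature (column) across the samples (rows)."""
    columns = list(zip(*samples))
    means = [fmean(col) for col in columns]
    # Assumption: sigma is the population standard deviation. A constant
    # feature has sigma == 0, so 1.0 is substituted to avoid dividing by zero.
    sigmas = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - mu) / sigma for value, mu, sigma in zip(row, means, sigmas)]
        for row in samples
    ]


# Hypothetical toy data: 3 source domain samples and 2 target domain samples,
# each with 2 spectral features.
source_raw = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
target_raw = [[4.0, 7.0], [6.0, 7.0]]

source_mid = zscore_columns(source_raw)  # statistics over the m source samples
target_mid = zscore_columns(target_raw)  # statistics over the n target samples

print(source_mid)
print(target_mid)
```

Each standardized feature has zero mean over its own domain, which corresponds to the expected value of 0 stated for the intermediate spectrum information.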

In the embodiment of the disclosure, after each intermediate spectrum information of each source domain grain and each target domain grain is obtained, dimension reduction may then be performed on each intermediate spectrum information, so as to obtain each target spectrum information.

In some embodiments, the dimension reduction may be performed using principal component analysis (PCA).

Illustratively, PCA is used to reduce each preprocessed intermediate spectral information to 500 dimensions, retaining only the 500 most influential features, thereby obtaining the target spectral information.
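
The dimension reduction step can be sketched as follows, assuming a PCA computed from the singular value decomposition of the centered data. The array sizes, the random data, and the helper name `pca_reduce` are hypothetical; only the reduction to 500 dimensions comes from the text.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project the rows of `features` onto the top principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing
    # singular value, i.e. by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)
# Hypothetical intermediate spectra: 600 samples with 800 spectral channels.
intermediate = rng.normal(size=(600, 800))

target_spectral = pca_reduce(intermediate, n_components=500)
print(target_spectral.shape)
```

PCA cannot return more components than the smaller of the sample count and the feature count, so the toy matrix is chosen to exceed 500 in both dimensions.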

With the above method, a first number of source domain sample data and a second number of target domain sample data are acquired, wherein each source domain sample data comprises multiple kinds of target spectrum information of a source domain grain and an origin label corresponding to the source domain grain, and each target domain sample data comprises multiple kinds of target spectrum information of a target domain grain and an origin label corresponding to the target domain grain; the neural network model to be trained is trained according to the first number of source domain sample data and a first loss function to obtain a source domain grain origin identification model, wherein the first loss function comprises a first loss term, a second loss term and a third loss term, the first loss term characterizes the difference between the predicted origin of the sample data and the origin label of the sample data, the second loss term characterizes the original features of the multiple kinds of target spectrum information included in the sample data, and the third loss term characterizes the domain distribution difference between the source domain sample data and the target domain sample data; and transfer learning is performed on the source domain grain origin identification model according to the second number of target domain sample data to obtain a target domain grain origin identification model, which is used for identifying the origin of the target domain grain.

For the grain origin identification task, this method addresses the difficulty of acquiring sample data. Because transfer learning is used, a large amount of sample data needs to be collected for only one kind of grain, and the target domain grain origin identification model obtained by training can still perform well in identifying the origin of various target domain grains, which reduces the difficulty of collecting target domain grain samples. In addition, introducing the second loss term brings the detailed information of the original spectrum that would otherwise be lost into consideration, that is, the original spectrum information is also preserved, which improves the identification accuracy of the resulting target domain grain origin identification model and thus the accuracy of grain origin tracing. Furthermore, introducing the third loss term reduces the spectral difference between different grains, so that the transfer learning model fits the target domain data more closely; this improves the identification accuracy on the target domain data set and therefore further improves the accuracy of grain origin tracing.

Based on the same concept, the disclosure also provides a grain origin tracing apparatus, which may be implemented as part or all of an electronic device by software, hardware, or a combination of the two. Referring to fig. 2, the grain origin tracing apparatus 200 may include:

an obtaining module 210, configured to obtain a first number of source domain sample data and a second number of target domain sample data, where each source domain sample data includes multiple kinds of target spectrum information of a source domain grain and an origin label corresponding to the source domain grain, and each target domain sample data includes multiple kinds of target spectrum information of a target domain grain and an origin label corresponding to the target domain grain;

the first training module 220 is configured to train the neural network model to be trained according to the first number of source domain sample data and a first loss function, so as to obtain a source domain grain origin identification model, where the first loss function includes a first loss term, a second loss term and a third loss term, the first loss term characterizes differences between a predicted origin of the sample data and origin labels of the sample data, the second loss term characterizes original features of multiple kinds of target spectrum information included in the sample data, and the third loss term characterizes differences in domain distribution between the source domain sample data and the target domain sample data;

the second training module 230 is configured to perform transfer learning on the source domain cereal origin recognition model according to the second number of target domain sample data, so as to obtain a target domain cereal origin recognition model, where the target domain cereal origin recognition model is used for performing origin recognition on the target domain cereal.

Optionally, the second training module 230 is further configured to train the source domain grain origin recognition model according to the second number of target domain sample data and a second loss function, where the second loss function includes the first loss term and the second loss term, to obtain a target domain grain origin recognition model.
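
The relationship between the two training stages can be sketched as below. The disclosure states only which loss terms each loss function includes; combining them as an unweighted sum, as well as the function names and numeric values, are assumptions made for illustration.

```python
def first_loss(classification, spectral, domain):
    """Stage 1 (source domain training): first, second and third loss terms."""
    return classification + spectral + domain  # assumption: unweighted sum


def second_loss(classification, spectral):
    """Stage 2 (transfer learning on target data): first and second terms only."""
    return classification + spectral  # assumption: unweighted sum


# Hypothetical per-batch values of the three loss terms.
terms = {"classification": 0.5, "spectral": 0.25, "domain": 0.125}

stage1 = first_loss(terms["classification"], terms["spectral"], terms["domain"])
stage2 = second_loss(terms["classification"], terms["spectral"])
print(stage1, stage2)
```

The third (domain distribution) term appears only in the first stage, where source domain and target domain data are compared; the second stage trains on target domain data alone.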

Optionally, the apparatus 200 further comprises:

the original spectrum information acquisition module is used for acquiring various original spectrum information of each source domain grain and various original spectrum information of each target domain grain;

the preprocessing module is used for preprocessing data of each original spectrum information by adopting a z-score standardization method to obtain intermediate spectrum information corresponding to each original spectrum information;

and the dimension reduction processing module is used for carrying out dimension reduction processing on each intermediate spectrum information to obtain each target spectrum information.

Optionally, the second loss term is expressed by the following formula:

wherein one of the symbols in the formula represents the p-th target spectral information of the sample data.

Optionally, the third loss term is expressed by the following formula:

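
The claims name the third loss term MMD(X_S, X_T) and define m, n, X_kS and X_iT. A commonly used empirical form of the squared maximum mean discrepancy, which is assumed here and may differ from the formula of the disclosure, is

$$\mathrm{MMD}^2(X_S, X_T) = \left\| \frac{1}{m}\sum_{k=1}^{m}\phi(X_{kS}) - \frac{1}{n}\sum_{i=1}^{n}\phi(X_{iT}) \right\|_{\mathcal{H}}^{2}$$

where φ is a feature mapping into a reproducing kernel Hilbert space. The Python sketch below evaluates this quantity through the kernel trick with a single Gaussian (RBF) kernel. The kernel choice, the bandwidth `gamma`, and the toy features are assumptions; DAN-style methods typically combine several kernels.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    squared_distance = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-gamma * squared_distance)


def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source, target, gamma=1.0):
    """Biased empirical estimate of the squared MMD between two sample sets."""
    return (
        mean_kernel(source, source, gamma)
        + mean_kernel(target, target, gamma)
        - 2.0 * mean_kernel(source, target, gamma)
    )


# Hypothetical feature vectors: m = 3 source samples, n = 2 target samples.
source_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
near_target = [[0.1, 0.0], [0.9, 0.1]]
far_target = [[5.0, 5.0], [6.0, 5.0]]

print(mmd_squared(source_features, near_target))
print(mmd_squared(source_features, far_target))
```

A target set whose features lie close to the source features yields a smaller value than a distant one, which is the behavior a domain distribution difference term relies on.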

The specific manner in which the various modules of the apparatus in the above embodiments perform their operations has been described in detail in the embodiments of the method and will not be repeated here.

Based on the same inventive concept, the present disclosure also provides an electronic device, including:

a memory having a computer program stored thereon;

and a processor for executing the computer program in the memory to implement the steps of any of the grain origin tracing methods described above.

In one possible implementation, the block diagram of the electronic device may be as shown in fig. 3. Referring to fig. 3, the electronic device 300 may include a processor 301 and a memory 302. The electronic device 300 may also include one or more of a multimedia component 303, an input/output (I/O) interface 304, and a communication component 305.

The processor 301 is configured to control the overall operation of the electronic device 300 to complete all or part of the steps of the grain origin tracing method described above. The memory 302 is used to store various types of data to support operation of the electronic device 300; such data may include, for example, instructions for any application or method operating on the electronic device 300, as well as application-related data such as contact data, sent and received messages, pictures, audio, and video. The memory 302 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk. The multimedia component 303 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signals may be further stored in the memory 302 or transmitted through the communication component 305. The audio component further comprises at least one speaker for outputting audio signals. The I/O interface 304 provides an interface between the processor 301 and other interface modules, which may be a keyboard, a mouse, buttons, and the like. These buttons may be virtual buttons or physical buttons. The communication component 305 is used for wired or wireless communication between the electronic device 300 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, near field communication (NFC), 2G, 3G, 4G, NB-IoT, eMTC, 5G, or a combination of one or more of them, which is not limited herein. Accordingly, the communication component 305 may comprise a Wi-Fi module, a Bluetooth module, an NFC module, and the like.

In an exemplary embodiment, the electronic device 300 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the grain origin tracing method described above.

In another exemplary embodiment, a computer readable storage medium is also provided that includes program instructions which, when executed by a processor, implement the steps of the grain origin tracing method described above. For example, the computer readable storage medium may be the above memory 302 including program instructions, which are executable by the processor 301 of the electronic device 300 to perform the grain origin tracing method described above.

In another exemplary embodiment, a computer program product is also provided, comprising a computer program executable by a programmable apparatus, the computer program having code portions for performing the above-described grain origin tracing method when executed by the programmable apparatus.

The preferred embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings. However, the present disclosure is not limited to the specific details of the above embodiments; various simple modifications may be made to the technical solutions of the present disclosure within the scope of its technical concept, and all such simple modifications fall within the protection scope of the present disclosure.

In addition, the specific features described in the foregoing embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, the present disclosure does not further describe various possible combinations.

Moreover, the various embodiments of the present disclosure may be combined with one another in any manner, and such combinations should likewise be regarded as content disclosed herein as long as they do not depart from the spirit of the present disclosure.

Claims

1. A method for tracing a cereal origin, the method comprising:

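acquiring a first number of source domain sample data and a second number of target domain sample data, wherein each source domain sample data comprises multiple kinds of target spectrum information of a source domain grain and an origin label corresponding to the source domain grain, and each target domain sample data comprises multiple kinds of target spectrum information of a target domain grain and an origin label corresponding to the target domain grain;

training a neural network model to be trained according to the first number of source domain sample data and a first loss function to obtain a source domain grain origin recognition model, wherein the first loss function comprises a first loss term, a second loss term and a third loss term, the first loss term characterizes the difference between the predicted origin of the sample data and the origin label of the sample data, the second loss term characterizes original features of the multiple kinds of target spectrum information included in the sample data, and the third loss term characterizes the domain distribution difference between the source domain sample data and the target domain sample data;

and performing transfer learning on the source domain grain origin recognition model according to the second number of target domain sample data to obtain a target domain grain origin recognition model, wherein the target domain grain origin recognition model is used for identifying the origin of the target domain grain.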

2. The method of claim 1, wherein performing the transfer learning on the source domain grain origin recognition model according to the second number of target domain sample data to obtain a target domain grain origin recognition model comprises:

3. The method according to claim 1, wherein the method further comprises:

4. The method of claim 1, wherein the second loss term is represented by the following formula:

wherein one of the symbols in the formula represents the p-th target spectral information of the sample data.

5. The method of claim 1, wherein the third loss term is represented by the following formula:

wherein $\mathrm{MMD}(X_S, X_T)$ represents the third loss term, $X_{kS}$ represents the k-th source domain sample data, $X_{iT}$ represents the i-th target domain sample data, m represents the number of source domain sample data, and n represents the number of target domain sample data.

6. The method of claim 1, wherein the multiple kinds of target spectral information of the source domain grain comprise laser-induced breakdown spectroscopy information, near-infrared spectral information, and hyperspectral information, and wherein the multiple kinds of target spectral information of the target domain grain comprise laser-induced breakdown spectroscopy information, near-infrared spectral information, and hyperspectral information.

7. The method of any one of claims 1-6, wherein the neural network model to be trained is a one-dimensional ResNet-18 network.

8. A cereal origin traceability device, the device comprising:

9. A non-transitory computer readable storage medium having stored thereon a computer program, characterized in that the program when executed by a processor realizes the steps of the method according to any of claims 1-7.

10. An electronic device, comprising:

a memory having a computer program stored thereon;

a processor for executing the computer program in the memory to implement the steps of the method of any one of claims 1-7.