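WO2020191857A1 - Cloud platform-based system and method for automatic identification of seven types of mass spectrograms of the world's commonly used pesticides and chemical pollutants - Google Patents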

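Cloud platform-based system and method for automatic identification of seven types of mass spectrograms of the world's commonly used pesticides and chemical pollutants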

Info

Publication number
WO2020191857A1
Authority
WO
WIPO (PCT)
Prior art keywords
layer
mass spectrum
mass
spectrum
spectra
Prior art date
Application number
PCT/CN2019/085612
Other languages
English (en)
French (fr)
Inventor
庞国芳
常巧英
范春林
陈辉
吴兴强
白若镔
张紫娟
Original Assignee
中国检验检疫科学研究院
北京合众恒星检测科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国检验检疫科学研究院, 北京合众恒星检测科技有限公司
Priority to EP19921232.5A (EP3951653A4)
Priority to US16/475,348 (US11340201B2)
Priority to JP2021556378A (JP2022529207A)
Publication of WO2020191857A1
Priority to GB2113218.8A (GB2595625A)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/478 Contour-based spectral representations or scale-space representations, e.g. by Fourier analysis, wavelet analysis or curvature scale-space [CSS]
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N30/86 Signal analysis
    • G01N30/8675 Evaluation, i.e. decoding of the signal into analytical information
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N27/00 Investigating or analysing materials by the use of electric, electrochemical, or magnetic means
    • G01N27/62 Investigating or analysing materials by the use of electric, electrochemical, or magnetic means by investigating the ionisation of gases, e.g. aerosols; by investigating electric discharges, e.g. emission of cathode
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N30/62 Detectors specially adapted therefor
    • G01N30/72 Mass spectrometers
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N30/86 Signal analysis
    • G01N30/8624 Detection of slopes or peaks; baseline correction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/20 Identification of molecular entities, parts thereof or of chemical compositions
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/90 Programming languages; Computing architectures; Database systems; Data warehousing
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N30/86 Signal analysis
    • G01N30/8624 Detection of slopes or peaks; baseline correction
    • G01N2030/8648 Feature extraction not otherwise provided for
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01J ELECTRIC DISCHARGE TUBES OR DISCHARGE LAMPS
    • H01J49/00 Particle spectrometers or separator tubes
    • H01J49/26 Mass spectrometers or separator tubes

Definitions

  • The invention belongs to the field of pesticide detection and relates to a system and method for automatically comparing and identifying pesticide and chemical pollutant spectra, in particular to a cloud platform-based intelligent comparison and identification system and method for pesticide and chemical pollutant mass spectrograms.
  • Pesticide residue detection technology is a crucial research area for ensuring food safety.
  • Scientists around the world have long been committed to research on pesticide residue detection technology for food.
  • Early pesticide residue detection was based on gas chromatography.
  • The application of gas chromatography-mass spectrometry (GC-MS) technology increased the number of pesticide types that could be tested to more than ten.
  • With gas chromatography-tandem mass spectrometry (GC-MS/MS), about 200 types of pesticides can be tested.
  • Liquid chromatography-tandem mass spectrometry (LC-MS/MS) technology has found a wider range of applications than GC-MS and GC-MS/MS because of its advantages in detecting strongly polar and thermally unstable pesticides. Because the two are complementary, researchers often use both for the detection of pesticide residues. Since 2001, mass spectrometry detection technology coupled with GC and LC has become the leading technology for pesticide multi-residue detection.
  • The high-resolution mass spectrometry techniques involved in the present invention are liquid chromatography-quadrupole-time-of-flight mass spectrometry (LC-Q-TOFMS), gas chromatography-quadrupole-time-of-flight mass spectrometry (GC-Q-TOFMS), linear ion trap-electric field cyclotron resonance orbitrap mass spectrometry (LC-LTQ-Orbitrap), liquid chromatography-quadrupole-electrostatic field orbitrap mass spectrometry (LC-Q-Orbitrap) and gas chromatography-quadrupole-electrostatic field orbitrap mass spectrometry (GC-Q-Orbitrap). Their biggest advantage in pesticide multi-residue detection is that they provide sufficient sensitivity in full-scan mode, capture as much compound information as possible, and at the same time allow further confirmation of the compound.
  • The application of the above-mentioned mass spectrometry technologies faces two problems that need to be solved.
  • Resolving mass spectra is particularly important. In practice, standard mass spectra must either be collected in-house from reference standards or obtained as commercial mass spectra from instrument companies. Both routes require a great deal of manpower, materials, or funding, all of which are relatively limited. This is one of the problems that has long plagued analysts.
  • The present invention applies mainstream mass spectrometry instruments: liquid chromatography-tandem mass spectrometry (LC-MS/MS), gas chromatography-tandem mass spectrometry (GC-MS/MS), liquid chromatography-quadrupole-time-of-flight mass spectrometry (LC-Q-TOFMS), gas chromatography-quadrupole-time-of-flight mass spectrometry (GC-Q-TOFMS), linear ion trap-electric field cyclotron resonance orbitrap combined mass spectrometry (LTQ-Orbitrap), liquid chromatography-quadrupole-electrostatic field orbitrap mass spectrometry (LC-Q-Orbitrap), and gas chromatography-quadrupole-electrostatic field orbitrap mass spectrometry (GC-Q-Orbitrap). It establishes an electronic ID card in the database for each pesticide, uses image processing technology to obtain spectrum information, applies a deep convolutional neural network to classify and model the detected spectra, puts the image model system on the
  • The invention provides a cloud platform-based system and method for comparing and identifying pesticide and chemical pollutant spectra, which enables rapid and accurate comparison and identification of pesticides and chemical pollutants.
  • The system includes a cloud server platform end and a user platform end.
  • The cloud server platform end includes:
  • a spectrum acquisition part, used to acquire the mass spectra;
  • a spectrum parameter acquisition part, used to acquire the experimental environment, experimental conditions, and experimental parameter data corresponding to the mass spectra;
  • a spectrum equipment acquisition part, used to acquire the information of the spectrum detection equipment corresponding to the mass spectra;
  • a spectrum preprocessing part, used for vertical splicing and preprocessing of the acquired mass spectra and for extracting the features of the spectra;
  • a spectrum classification model part, used to obtain the fitting angle change value at the pixel where the highest peak of the mass spectrum is located and to establish a mass spectrum classification model;
  • a pesticide category classification model part, used to train a neural network model on the extracted spectrum features, spectrum detection equipment information, and experimental parameter data to obtain a classification model that can identify the types and/or names of pesticides and chemical pollutants.
  • The user platform end includes:
  • a spectral data uploading part, used to upload to the system the mass spectra to be detected, the spectrum description data, and the experimental parameter data;
  • a spectrum preprocessing part, used for vertical splicing and preprocessing of the mass spectra to be detected and for extracting the features of the mass spectra;
  • a spectrum type identification part, used to classify the mass spectrum according to the fitting angle change value at the pixel where the highest peak in the mass spectrum is located;
  • a spectrum identification part, used to input the extracted spectrum features, spectrum description data, and experimental parameter data into the pesticide category classification model and to identify the corresponding pesticide and chemical pollutant types and/or names.
  • The neural network model is a layer-by-layer refinement convolutional neural network model. It is designed and used as follows: the various preprocessed spectra are input into the layer-by-layer refinement convolutional neural network to train the spectrum classification model.
  • The spectrogram input to the layer-by-layer refinement convolutional neural network for training has size 1 × 1 × 1626 × 1626. In order, the parameters mean: one sample is selected from the training set for each weight update, the input image has 1 channel (binary image), and the input image size is 1626 × 1626 (height × width).
  • The first convolutional layer, Conv1, uses a convolution kernel of size 11 × 11 × 1 with a stride of 4, meaning that the kernel moves 4 pixels after each convolution operation. The edge padding p is 0, meaning that the edge of the image is not padded.
  • The layer outputs a feature map that reflects the edge contours of the spectrogram and other information.
  • The ReLU activation function is applied to the convolution result to control the range of the data.
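The Conv1 parameters given above (input 1626 × 1626, kernel 11 × 11, stride 4, padding 0) determine the size of the output feature map. The following is a minimal sketch of that arithmetic and of the ReLU mapping; the function names are hypothetical, and the floor-rounding convention is an assumption, since the source does not state how a non-integer result is rounded.

```python
def conv_output_size(n: int, k: int, s: int, p: int) -> int:
    """Spatial size after a convolution: floor((n + 2p - k) / s) + 1.

    Floor rounding is an assumed convention; the source does not specify it.
    """
    return (n + 2 * p - k) // s + 1


def relu(x: float) -> float:
    """ReLU keeps positive values and maps everything else to zero."""
    return x if x > 0.0 else 0.0


# Conv1 as described in the text: 1626-pixel input, 11-pixel kernel,
# stride 4, no padding.
conv1_size = conv_output_size(n=1626, k=11, s=4, p=0)
print(conv1_size)
```

Under these assumptions each side of the Conv1 feature map is 404 pixels, since (1626 - 11) / 4 is rounded down to 403 before adding 1.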
  • The local response normalization layer LRN1 normalizes the feature data output by the convolutional layer Conv1. It creates a competition mechanism among the activities of local neurons: relatively large responses become relatively larger, while neurons with smaller responses are suppressed, which enhances the generalization ability of the model.
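The source does not give the LRN formula or its hyperparameters. As a rough illustration only, the sketch below uses the across-channel form commonly associated with AlexNet-style networks, b_c = a_c / (k + alpha * sum of a_j^2 over neighboring channels)^beta. The window size n and the constants k, alpha, and beta are assumed defaults, not values from the patent.

```python
def lrn_across_channels(a, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Normalize the channel activations at one pixel position.

    a: list of activations, one per channel, at a single pixel.
    n, k, alpha, beta: assumed hyperparameters (not stated in the source).
    """
    half = n // 2
    out = []
    for c, value in enumerate(a):
        lo = max(0, c - half)
        hi = min(len(a) - 1, c + half)
        # Sum of squares over the neighboring channels, including c itself.
        energy = sum(v * v for v in a[lo:hi + 1])
        out.append(value / (k + alpha * energy) ** beta)
    return out


normalized = lrn_across_channels([1.0, 10.0, 1.0])
```

Each activation is divided by a term that grows with the squared activity of its neighbors, so a neuron surrounded by strong responses is damped more than one surrounded by weak responses.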
  • The pooling layer Pool1 applies max pooling with a kernel of size 3 × 3 × 64 to the feature maps output by the LRN1 layer, reducing the amount of computation and the number of parameters through downsampling.
  • Convolutional layers Conv2-Conv5 each perform a convolution operation on the feature maps output by the previous layer. The kernel size decreases layer by layer: 9 × 9 × 64, 7 × 7 × 128, 5 × 5 × 256, and 3 × 3 × 512, where 64, 128, 256, and 512 are the numbers of convolution kernels used by each layer. The more convolution kernels are used, the higher the dimension of the features obtained.
  • In this way, the low-level features are abstracted into higher-dimensional and more refined convolutional activation features.
  • The stride and edge padding size of each convolutional layer are shown in Figure 3.
  • The local response normalization layer LRN2 normalizes the feature data output by the convolutional layer Conv2. Pooling layers Pool2-Pool5 apply max pooling to the feature maps output by the previous layer, using kernels of size 3 × 3 × 128, 3 × 3 × 256, 3 × 3 × 512, and 3 × 3 × 512, respectively.
  • The fully connected layer Fc6 connects the local features output by Conv5.
  • During training, the three fully connected layers Fc6-Fc8 learn weights that select the features that perform well in the classification task, and pass these features to the Softmax-loss layer.
  • The Dropout layers Drop6 and Drop7 are applied to the outputs of Fc6 and Fc7, respectively, to randomly disable some hidden-layer nodes in order to speed up training and prevent overfitting.
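The random disabling of hidden-layer nodes described for Drop6 and Drop7 can be sketched as follows. This is a minimal illustration using the common "inverted dropout" formulation; the drop probability of 0.5 and the rescaling of surviving activations are assumptions, since the source states neither.

```python
import random


def dropout(activations, drop_prob=0.5, training=True, rng=None):
    """Randomly zero activations during training (inverted dropout).

    drop_prob is an assumed value. Surviving activations are scaled by
    1 / (1 - drop_prob) so their expected value is unchanged.
    """
    if not training or drop_prob == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - drop_prob
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


fc6_output = [1.0] * 1000  # hypothetical Fc6 activations
dropped = dropout(fc6_output, drop_prob=0.5, rng=random.Random(0))
```

At inference time (`training=False`) the activations pass through unchanged, so no nodes are disabled when a user's spectrum is being identified.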
  • The Softmax-loss layer acts as a classifier and computes the value of the loss function.
  • The stochastic gradient descent algorithm is used to update the weights, with the initial learning rate set to 0.0001.
  • By minimizing the loss function, the classification performance is gradually improved, yielding a layer-by-layer refinement convolutional neural network classification model with better classification performance.
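The training step described above (a softmax loss minimized by stochastic gradient descent with an initial learning rate of 0.0001) can be sketched in a few lines. This is a toy illustration, not the patent's training code: the function names are hypothetical, and the class scores are treated directly as the trainable parameters so that the update is easy to follow.

```python
import math

LEARNING_RATE = 0.0001  # initial learning rate stated in the text


def softmax(logits):
    """Convert class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss(logits, label):
    """Cross-entropy loss for one sample whose true class index is `label`."""
    return -math.log(softmax(logits)[label])


def softmax_loss_grad(logits, label):
    """Gradient of the loss with respect to the logits: probs - one_hot."""
    probs = softmax(logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]


def sgd_step(weights, grads, lr=LEARNING_RATE):
    """One gradient descent update: w <- w - lr * dL/dw."""
    return [w - lr * g for w, g in zip(weights, grads)]


logits = [0.0, 0.0, 0.0]
updated = sgd_step(logits, softmax_loss_grad(logits, label=0))
```

Repeating the update lowers the loss for the true class a little at each step, which is the gradual improvement the text refers to.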
  • The spectra include mass spectra and/or chromatograms.
  • The spectra include one or more of: liquid chromatography-tandem mass spectra, gas chromatography-tandem mass spectra, liquid chromatography-quadrupole-time-of-flight mass spectra, gas chromatography-quadrupole-time-of-flight mass spectra, linear ion trap-electric field cyclotron resonance orbitrap combined mass spectra, liquid chromatography-quadrupole-electrostatic field orbitrap mass spectra, and gas chromatography-quadrupole-electrostatic field orbitrap mass spectra.
  • The mass spectrum classification model part classifies according to the angle change value at the pixel where the highest peak in the spectrum is located: for liquid chromatography-tandem mass spectra, the fitting angle change value of the ion current chromatogram ranges from $x_{11}$ to $x_{12}$, and that of the ion mass spectra under the four collision energies ranges from $x_{13}$ to $x_{14}$; for liquid chromatography-quadrupole-time-of-flight mass spectra, the fitting angle change value of the ion current chromatogram ranges from $x_{21}$ to $x_{22}$, and the fitting angle values of the ion mass spectra under the four collision energies are all $x_{23}$; for linear ion trap-electric field cyclotron resonance orbitrap combined mass spectra, the fitting angle change value of the ion chromatogram ranges from $x_{31}$ to $x_{32}$, and the fitting angle value of the ionization-mode full-scan mass spectrum is $x_{33}$; the fitting angle change value of the gas chromatograph-tandem mass spectrum is x
  • The mass spectrogram classification model part converts the grayscale image of the mass spectrogram into a binary image and assigns the image values to a two-dimensional matrix. From the matrix values, it determines the position (that is, the row and column of the matrix) of the pixel at the highest point of the image (that is, the highest peak of the spectrum). With this pixel as the center, it traverses a certain area toward the lower left and lower right, obtains the rows and columns of the matrix entries whose value is 1, and stores them in order to fit the image angle at the peak.
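The peak-location step described above can be sketched as follows. This is a minimal illustration under several assumptions not stated in the source: the spectrum is drawn dark on a light background, the binarization threshold is 128, the highest peak is the topmost foreground pixel, and the traversed area is a small square window below the peak. All function names are hypothetical.

```python
def to_binary(gray, threshold=128):
    """Binarize a grayscale image: 1 marks a drawn (dark) pixel, 0 background.

    The threshold and the dark-on-light polarity are assumed.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def find_peak(binary):
    """Return (row, col) of the topmost foreground pixel, or None if empty."""
    for r, row in enumerate(binary):
        for c, value in enumerate(row):
            if value == 1:
                return r, c
    return None


def flank_points(binary, peak, radius=3):
    """Collect foreground pixels below the peak, split into left and right."""
    r0, c0 = peak
    left, right = [], []
    last_row = min(r0 + radius, len(binary) - 1)
    first_col = max(c0 - radius, 0)
    last_col = min(c0 + radius, len(binary[0]) - 1)
    for r in range(r0 + 1, last_row + 1):
        for c in range(first_col, last_col + 1):
            if binary[r][c] == 1 and c != c0:
                (left if c < c0 else right).append((r, c))
    return left, right


# A tiny 3 x 5 example with an inverted-V peak drawn in black (0) on white.
W, B = 255, 0
gray = [
    [W, W, B, W, W],
    [W, B, W, B, W],
    [B, W, W, W, B],
]
binary = to_binary(gray)
peak = find_peak(binary)
left, right = flank_points(binary, peak, radius=2)
```

The stored left-flank and right-flank coordinates are what a subsequent fitting step would use to estimate the angle at the peak.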
  • The angle change value at the pixel where the highest peak in the mass spectrum is located is calculated from the gradient vector.
  • In the vicinity of a straight line or curve, the gradient vector is perpendicular to that line or curve, and the angle can be calculated from the change in the orientation of the gradient vector.
  • The gradient vector at a point on the curve is the perpendicular to the curve segment passing through that point.
  • A small line segment near the point is used in place of the curve segment, and the perpendicular to that line segment is taken as the gradient vector.
  • The line segment near the point is determined by the length of the neighborhood chain, so the calculated gradient vector varies slightly with the chain length.
  • The orientation of the gradient vector is given by its angle.
  • Let $P_n = \{p_1, \ldots, p_n\}$ be an ordered set of points on a curve or straight line.
  • The value of m can be set to a value between 1 and 5.
  • $S_n = \{s_1, \ldots, s_n\}$ represents the set of slopes of the perpendiculars to the line segments $l_i$.
  • $A_n = \{a_1, \ldots, a_n\}$ represents the set of angles of the perpendiculars to the line segments $l_i$ near the points $p_i$, with $a_i$ in the range [0°, 360°].
  • The slope of the perpendicular to the line segment $l_i$ is $(-1/g_i)$, that is, $s_i = -1/g_i$.
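The perpendicular-angle calculation described above can be sketched as follows. The source does not define the line segment $l_i$ explicitly, so this sketch assumes that $l_i$ joins the points m positions before and after $p_i$ (clamped at the ends of the point set) and that $g_i$ is the slope of $l_i$. Using `atan2` on the perpendicular direction avoids dividing by zero for horizontal or vertical segments. The function names are hypothetical.

```python
import math


def perpendicular_angles(points, m=2):
    """Angle in [0, 360) of the perpendicular to the local segment at each point.

    points: ordered (x, y) tuples along the curve.
    m: neighborhood chain length, 1 to 5 according to the text.
    Assumption: segment l_i runs from p_(i-m) to p_(i+m), clamped at the ends.
    """
    n = len(points)
    angles = []
    for i in range(n):
        x0, y0 = points[max(i - m, 0)]
        x1, y1 = points[min(i + m, n - 1)]
        dx, dy = x1 - x0, y1 - y0
        # The segment direction is (dx, dy) with slope g_i = dy / dx.
        # Its perpendicular is (-dy, dx), whose slope is -1 / g_i.
        angles.append(math.degrees(math.atan2(dx, -dy)) % 360.0)
    return angles


def angle_change(a, b):
    """Smallest absolute difference between two angles given in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


flat = perpendicular_angles([(0, 0), (1, 0), (2, 0)], m=1)
diagonal = perpendicular_angles([(0, 0), (1, 1), (2, 2)], m=1)
```

Comparing the angles on the two flanks of a peak with `angle_change` gives one possible reading of the "angle change value" used for classification.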
  • Before the mass spectrum to be detected is input into the classification model, the spectrum identification part also screens the existing mass spectrum library, according to the description data of the mass spectrum, the experimental parameters, and the number of mass spectra, for mass spectrum data that may be of the same type as the mass spectrum to be detected. It extracts the Fc7-layer features of each mass spectrum to be detected and computes their cosine similarity with the Fc7-layer features of the preprocessed mass spectra of all types selected from the library. It then finds the spectrum with the highest similarity to the current mass spectrum to be detected and checks whether that similarity is higher than 50%. If it is, the mass spectrum entered by the user is successfully identified.
  • The cosine similarity is calculated as follows:
  • $\cos(A, B) = \dfrac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2}}$
  • where $a_i$ represents the i-th feature value of spectrum A,
  • $b_i$ represents the i-th feature value of spectrum B,
  • and $n$ represents the total number of feature dimensions.
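The library matching step can be sketched with the standard cosine similarity and the 50% threshold stated in the text. The library contents and names below are hypothetical, and short lists of floats stand in for the Fc7-layer feature vectors.

```python
import math

SIMILARITY_THRESHOLD = 0.5  # the 50% threshold stated in the text


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def best_match(query, library):
    """Return (name, score) for the most similar library spectrum.

    library: non-empty dict mapping a spectrum name to its feature vector.
    The name is None when the best score does not exceed the threshold.
    """
    name, score = max(
        ((n, cosine_similarity(query, f)) for n, f in library.items()),
        key=lambda item: item[1],
    )
    return (name, score) if score > SIMILARITY_THRESHOLD else (None, score)


# Hypothetical library entries and query features.
library = {"compound_A": [0.9, 0.1], "compound_B": [0.0, 1.0]}
match_name, match_score = best_match([1.0, 0.0], library)
```

A result of `None` corresponds to the case in which no library spectrum exceeds the threshold, so the uploaded spectrum is not identified.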
  • The present invention also proposes a cloud platform-based method for identifying seven types of mass spectrograms of pesticides and chemical pollutants, which includes:
  • inputting the extracted features of the mass spectrum, the description data of the mass spectrum, and the experimental parameter data into the pesticide category classification model, and identifying the corresponding pesticide and chemical pollutant types and/or names.
  • In the cloud platform-based method for comparing and identifying pesticide and chemical pollutant spectra proposed in the present invention, the spectrum classification model is established, spectral data features are extracted, and the convolutional neural network is trained and modeled on the cloud server platform. The user platform end is used by users to upload mass spectrograms together with experimental conditions and equipment data. It identifies the type of each mass spectrum according to the mass spectrum classification model on the cloud server platform, automatically determines the type and name of the pesticide using the neural network model trained on the cloud server platform, and returns the comparison results to the user.
  • The system removes the user's need to purchase reference standards, and its use is not restricted by location, so pesticides and chemical pollutants can be detected quickly and accurately.
  • The present invention covers seven main types of chromatography-mass spectrometry technologies: liquid chromatography-tandem mass spectrometry LC-MS/MS (605 types), gas chromatography-tandem mass spectrometry GC-MS/MS (619 types), liquid chromatography-quadrupole-time-of-flight mass spectrometry LC-Q-TOFMS (510 types), gas chromatography-quadrupole-time-of-flight mass spectrometry GC-Q-TOFMS (753 types), linear ion trap-electric field cyclotron resonance orbitrap combined mass spectrometry LC-LTQ-Orbitrap (378 types), liquid chromatography-quadrupole-electrostatic field orbitrap mass spectrometry LC-Q-Orbitrap (570 types), and gas chromatography-quadrupole-electrostatic field orbitrap mass spectrometry GC-Q-Orbitrap (664 types). It has established unique electronic ID information for more than 1200 pesticide chemical pollutants: mass
  • The present invention enables intelligent matching, comparison, and qualitative identification of more than 1200 kinds of pesticide chemical pollutants commonly used in the world.
  • Compounds can be searched by composition class, including organohalogen pesticides, organophosphorus pesticides, pyrethroid pesticides, carbamate pesticides, organonitrogen pesticides, organosulfur pesticides, and so on. They can be searched by functional class, including insecticides, fungicides, herbicides, acaricides, nematicides, insect growth regulators, plant growth regulators, and persistent environmental pollutants such as polychlorinated biphenyls and polycyclic aromatic hydrocarbons. They can also be searched by pesticide toxicity, including slightly toxic, low-toxicity, moderately toxic, highly toxic, extremely toxic, and prohibited pesticides.
  • Chromatography-mass spectrometry information, such as the molecular structure of the compound and its fragment ions under different conditions, can be obtained quickly from the chromatography-mass spectrometry atlas. On this basis, a detection and identification method can be established scientifically, reasonably, and quickly to ensure the accuracy and reliability of the target detection and identification results.
  • The present invention enables the identification of unknown compounds.
  • The unknown substance is measured under the specified chromatography-mass spectrometry conditions to obtain its accurate mass, total ion current chromatogram, secondary fragment ion mass spectrum, and other chromatography-mass spectrometry information. By comparing these with the system information, the unknown compound can be identified quickly and accurately.
  • The present invention enables confirmation of the same compound on different instruments and improves the ability to identify and confirm compounds.
  • The detection of pesticide chemical contaminants in complex matrices is often subject to interference from co-extracted matrix components, which easily produces false positive results and sometimes requires confirmation on different types of instruments.
  • The invention includes 7 types of chromatography-mass spectrometry spectra under various conditions, which complement one another and expand the range of application. It matches actual working practice and has strong reference value.
  • The high-resolution standard mass spectra of the present invention provide a basis for confirming pesticide multi-residue detection results. There is no need to purchase a large number of physical reference standards and collect mass spectra in-house. This makes spectrum search and comparison intelligent and automated, lowers the cost of pesticide residue analysis, and improves market-oriented rapid detection capabilities. It also brings great convenience to the analysis and detection of pesticides and chemical pollutants: analysts have a reference when establishing methods and a query tool when confirming results, which gives the invention significant practical value and high economic benefit.
  • The present invention makes spectrum data electronic and data retrieval automatic, and develops a relatively complete, world-leading pesticide information and pesticide residue detection database with fully independent intellectual property rights in China. It is a major contribution to chromatography-mass spectrometry worldwide, and it is of great scientific and social significance for China's pesticide residue analysis, food safety and environmental safety testing, and import and export inspection and quarantine.
  • The integration, development, and use of the chromatography-mass spectrometry information database of the present invention will quickly advance the construction of pesticide residue laboratories in China and raise the overall level and efficiency of pesticide identification and pesticide residue testing, which has high social significance.
  • The construction of the retrieval system will greatly improve the ability to analyze sample data, identify pesticides, and screen for and detect target pesticides. It has good prospects for promotion and application and good economic value.
  • The invention serves four major functions: a guide book for the research and development of new detection technologies for pesticide residues, a reference book for the identification of unknown compounds, a textbook for technical training, and a tool book for daily work.
  • Figure 1 is a system structure diagram of the mass spectrum comparison system of the present invention.
  • Figure 2 is a diagram of the hierarchical structure of the layer-by-layer refinement convolutional neural network of the present invention.
  • Figure 3 is a first-stage mass spectrum of an embodiment of the present invention.
  • Figure 5 is a total ion current chromatogram of an embodiment of the present invention.
  • Figure 6 is a product ion mass spectrum under the corresponding collision energy according to an embodiment of the present invention.
  • Figure 7 is an extracted ion chromatogram of an embodiment of the present invention.
  • Figure 8 is a product ion mass spectrum under the corresponding collision energy according to an embodiment of the present invention.
  • Figure 9 is a [M+H]+ extracted ion chromatogram of an embodiment of the present invention.
  • Figure 10 is a secondary mass spectrum of [M+H]+ according to an embodiment of the present invention.
  • Figure 11 is an extracted ion current chromatogram of an embodiment of the present invention.
  • Figure 12 is a typical MS spectrum of [M+H]+, [M+NH4]+ and [M+Na]+ according to an embodiment of the present invention.
  • Figure 1 shows a schematic diagram of the cloud platform-based pesticide and chemical pollutant spectrum comparison system of the present invention. The system includes a cloud server platform end and a user platform end. The user platform end includes a user registration module, a user login module, a user search module, a spectrum data upload module, a mass spectrum preprocessing module, a mass spectrum type identification module, and a mass spectrum identification module. The cloud server platform end includes a spectrum device information acquisition module, a spectrum parameter acquisition module, a mass spectrum acquisition module, a mass spectrum information library, a mass spectrum preprocessing module, a mass spectrum classification model module, and a pesticide category classification module.
  • the mass spectrum acquisition module receives the mass spectrogram uploaded by the user
  • the spectrum device acquisition module receives the spectrum device information uploaded by the user
  • the spectrum parameter acquisition module receives the experimental environment, experimental conditions, and experimental parameters uploaded by the user.
  • The spectra uploaded by the user can be mass spectra or extracted ion current chromatograms. Figures 3 to 12 show several examples of spectra that the present invention can process. Those skilled in the art should understand that these figures are only illustrative examples of the types of spectra the spectrum comparison system can process; the spectra that the present invention can process include but are not limited to these.
  • The original spectra in the embodiment of the present invention include 7 types of mass spectra: liquid chromatography-tandem mass spectra, gas chromatography-tandem mass spectra, liquid chromatography-quadrupole-time-of-flight mass spectra, gas chromatography-quadrupole-time-of-flight mass spectra, linear ion trap-electric field cyclotron resonance orbitrap combined mass spectra, liquid chromatography-quadrupole-electrostatic field orbitrap mass spectra, and gas chromatography-quadrupole-electrostatic field orbitrap mass spectra.
  • the mass spectrogram preprocessing module can preprocess the received mass spectra to meet the processing requirements.
  • Specifically, the spectrum preprocessing includes vertical splicing, logarithmic transformation, gamma correction, and histogram equalization of the mass spectra, geometric transformations such as rotation, translation, and scaling, and feature extraction from the preprocessed mass spectra.
  • The mass spectrum classification model module classifies spectra according to the angle change value at the pixel where the highest peak in the mass spectrum is located: for liquid chromatography-tandem mass spectra, the fitted angle change value of the ion current chromatogram ranges from x11 to x12, and that of the ion mass spectra at the four collision energies ranges from x13 to x14; for liquid chromatography-quadrupole-time-of-flight mass spectra, the fitted angle change value of the ion current chromatogram ranges from x21 to x22, and the fitted angle values of the ion mass spectra at the four collision energies are all x23; for linear ion trap-electric field cyclotron resonance orbitrap combined mass spectra, the fitted angle change value of the ion chromatogram is x31 to x32, and the fitted angle values of the ionization-mode full-scan mass spectra are all x33; for gas chromatography-tandem mass spectra, the fitted angle change value of the first-stage mass spectrum is x41, and the fitted angle value of the ion mass spectra at the four collision energies is x43; for liquid chromatography-quadrupole-electrostatic field orbitrap mass spectra, the fitted angle change value of the ion current chromatogram is x51, and the fitted angle value of the fragment ion mass spectrum is x53; for gas chromatography-quadrupole-time-of-flight mass spectra, the fitted angle value of the mass spectrum is x61; for gas chromatography-quadrupole-electrostatic field orbitrap mass spectra, the fitted angle change value of the total ion chromatogram is x71 to x72, and the fitted angle values of the ionization-mode full-scan mass spectra are all x73. The values x11 to x73 lie in the range 0° to 40°.
  • The mass spectrum classification model module converts the grayscale image of the mass spectrum into a binary image and assigns the image values to a two-dimensional matrix. From the matrix values it determines the position (the matrix row and column) of the pixel at the highest peak of the image (the highest peak of the spectrum), traverses a certain region to the lower left and lower right of that position, obtains the rows and columns of the matrix entries whose value is 1, stores them, and then fits the image angle at the peak.
  • The pesticide type classification model module trains a classification model on pesticide type, detection equipment type, experimental parameters, mass spectrum features, pesticide name, and so on, yielding a trained layer-by-layer refinement convolutional neural network model that the user platform end uses for spectrum comparison and detection of pesticides and chemical pollutants.
  • The cloud server platform end also includes a mass spectrum information library, which stores data such as spectrum type, pesticide name, pesticide type, and the corresponding spectra; the user platform end can query it for mass spectra by spectrum type, pesticide name, and/or pesticide type.
  • Users register and log in to the system through the user registration module and the user login module. Registration offers different permission levels: a user can register with permission to upload information (for example, training samples) or with query permission only. The system administrator reviews each registration, and the user can log in only after approval.
  • The spectrum data upload module is used to upload the mass spectra to be detected, the spectrum description data, and the experimental parameter data to the system. The spectrum description data include experimental equipment information, spectrum type, and so on; the experimental parameter data include the experimental environment, experimental conditions, and experimental parameters.
  • The user can upload a single spectrum or several spectra at once, and an uploaded spectrum can be in any spectrum format commonly used in the technical field.
  • The mass spectrum preprocessing module preprocesses the uploaded mass spectrum, including vertical splicing, logarithmic transformation, gamma correction, histogram equalization, and geometric transformations such as rotation, translation, and scaling, and extracts features from the preprocessed spectrum.
  • The mass spectrum type identification module inputs the spectrum produced by the mass spectrum preprocessing module into the mass spectrum classification model for matching and identification.
  • The mass spectrum identification module reads the trained layer-by-layer refinement convolutional neural network model stored on the cloud server platform end and inputs the spectrum features extracted by the mass spectrum preprocessing module, the spectrum description data, and the experimental parameter data into that model for matching and identification, thereby obtaining the pesticide type and pesticide name corresponding to the mass spectrum to be detected.
  • Before inputting the mass spectra to be detected into the classification model, the mass spectrum identification module also uses the spectrum description data, the experimental parameters, and the number of mass spectra to screen the existing mass spectrum library for spectra that may belong to the same category as the spectrum to be detected. This reduces the number of similarity comparisons and further reduces the computational load of the classification model.
  • Specifically, the Fc7-layer features of each mass spectrum to be detected are extracted, and their cosine similarity with the Fc7-layer features of the preprocessed mass spectra of all categories screened from the library is calculated. The library spectrum most similar to the spectrum to be detected is found, and its similarity is checked against 50%: if the similarity is higher than 50%, the category of the mass spectrum entered by the user is successfully identified.
  • The cosine similarity is calculated as follows: $$\cos(A,B)=\frac{\sum_{i=1}^{d_n} A_i B_i}{\sqrt{\sum_{i=1}^{d_n} A_i^{2}}\,\sqrt{\sum_{i=1}^{d_n} B_i^{2}}}$$
  • $A_i$ denotes the i-th feature value of spectrum A.
  • $B_i$ denotes the i-th feature value of spectrum B.
  • $d_n$ denotes the total number of feature dimensions.
  • The cloud-platform-based pesticide and chemical pollutant spectrum comparison system trains a layer-by-layer refinement convolutional neural network model on the cloud server platform end using sample data. The user platform end receives the mass spectra and experimental parameter information uploaded by the user and uses the trained model to identify the pesticide type and name corresponding to the uploaded mass spectrum.
  • The system identifies the mass spectra to be detected automatically, with no need to search and compare manually among a large number of standard spectra. It quickly obtains the types and names of the pesticides and chemical pollutants corresponding to the spectrum to be detected, which improves the efficiency and accuracy of pesticide residue detection.
  • Fig. 2 shows the network structure of the Layer-by-Layer Refinement Network (LbLReNet) of the present invention.
  • Pesticide mass spectra and ion mass spectra are both relatively sparse images. For sparse data, a small convolution kernel has a small local receptive field and the convolution cannot represent the features, whereas a large kernel greatly increases complexity.
  • The present invention therefore designs a "layer-by-layer refinement network" convolutional neural network structure.
  • Specifically, the structure includes 5 convolutional layers together with ReLU activation layers, local response normalization layers, Pool layers, and fully connected layers. The low-level convolutional layers focus on the contour and edge information of the spectrum; as depth increases, the kernel size decreases layer by layer, and the convolutional layers abstract low-level features into higher-dimensional, more refined convolutional activation features.
  • the local response normalization layer (Local Response Norm, LRN) normalizes the result after convolution. After normalization, the variance of the variables is the same, so it will accelerate the training of the model.
  • the Pool layer reduces the amount of calculation and the number of parameters through sampling, and changes the dimensionality of the output.
  • the fully connected (FC) layer connects the previous local features and sends these features to the softmax classifier for training the classifier. Dropout randomly disables some hidden layer nodes to speed up training and prevent overfitting.
  • Based on the characteristics of pesticide detection spectra, the network is designed with 5 convolutional layers and their corresponding ReLU activation functions, combined with LRN, Pool, and FC layers to accelerate training. The model trains quickly and with high accuracy, and can be used to identify pesticide residue types accurately and quickly.
  • Table 1 shows the parameters of the layer-by-layer refinement convolutional neural network of the present invention.
  • The preprocessed spectrum image is input into the layer-by-layer refinement convolutional neural network. The input size is 1×1×1626×1626, where the parameters mean, in order: one sample is selected from the training set at a time to update the weights; the input image has 1 channel (binary image); and the input image size is 1626×1626 (height × width).
  • The first convolutional layer Conv1 uses a convolution kernel of size 11×11×1 with a stride of 4, meaning the kernel moves 4 pixels after each convolution operation; the edge padding p is 0, meaning the image edges are not padded.
  • After the Conv1 operation, a feature map is output that reflects information such as the edge contour of the spectrum.
  • The ReLU activation function maps the convolution result to control the range of the data.
  • The local response normalization layer LRN1 normalizes the feature data output by the convolutional layer Conv1. It creates a competition mechanism among local neurons, making larger responses relatively larger and suppressing neurons with smaller responses, which improves the generalization ability of the model. The feature map size is unchanged by this layer.
  • The pooling layer Pool1 uses a kernel of size 3×3×64 to max-pool the feature maps output by LRN1, reducing the amount of computation and the number of parameters through sampling.
  • The convolutional layers Conv2 to Conv5 each convolve the feature maps output by the previous layer, with kernel sizes decreasing layer by layer: 9×9×64, 7×7×128, 5×5×256, and 3×3×512, where 64, 128, 256, and 512 are the numbers of convolution kernels used by the respective layers. The more convolution kernels used, the higher the dimension of the resulting features.
  • The local response normalization layer LRN2 normalizes the feature data output by the convolutional layer Conv2.
  • The pooling layers Pool2 to Pool5 use kernels of size 3×3×128, 3×3×256, 3×3×512, and 3×3×512, respectively, to max-pool the feature maps output by the previous layer.
  • the fully connected layer Fc6 connects the local features output by Conv5.
  • the three fully connected layers Fc6-Fc8 learn all the weights to filter the features that perform well in the classification task during the training process, and send the features to the Softmax-loss layer .
  • The Dropout layers Drop6 and Drop7 are applied to the outputs of Fc6 and Fc7, respectively, randomly disabling some hidden-layer nodes to speed up training and prevent overfitting.
  • the Softmax-loss layer is equivalent to a classifier, and the value of the loss function is calculated.
  • the stochastic gradient descent algorithm is used to update the weights and the initial learning rate is set to 0.0001.
  • the classification effect is gradually improved by minimizing the loss function, and a layer-by-layer refined convolutional neural network classification model with better classification effect is obtained.
  • spectrogram size, convolution kernel size and other parameters are only exemplary, and can be adaptively changed according to the actual needs of the system.
  • Each embodiment can be implemented in two parts, a front end and a back end.
  • In the above description, the front end includes only the spectrum comparison and identification software and the spectrum type identification method; the back end includes only training the spectrum identification model and establishing the spectrum type discrimination method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • Immunology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Library & Information Science (AREA)
  • Multimedia (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Biology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Electrochemistry (AREA)
  • Medical Informatics (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)

Abstract

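The present invention discloses a cloud-platform-based system and method for automatically identifying seven types of mass spectra of pesticides and chemical pollutants commonly used worldwide, comprising a cloud server platform end and a user platform end. The cloud server platform end builds the mass spectrum type classification model, extracts spectrum data features, and trains the convolutional neural network model. The user platform end is used to upload mass spectra, experimental conditions, and equipment data; it screens, compares, and identifies the category of a mass spectrum directly according to the mass spectrum type classification model or the mass spectrum information library, automatically compares and determines the pesticide type and name based on the neural network model trained on the cloud server platform end, and returns the comparison result to the user. The invention removes the need for users to purchase reference standards, can be used from any location, and identifies pesticide residues automatically, quickly, and accurately.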

Description

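Cloud-platform-based system and method for automatic identification of seven types of mass spectra of pesticides and chemical pollutants commonly used worldwide
Technical Field
The present invention belongs to the field of pesticide detection and relates to a system and method for automatic comparison and identification of spectra of pesticides and chemical pollutants, in particular to a cloud-platform-based intelligent comparison and identification system and method for mass spectra of pesticides and chemical pollutants.
Background Art
Pesticide residue detection technology is a vital area of research for ensuring food safety. Scientists around the world have long worked on techniques for detecting pesticide residues in food. Early residue detection was based on gas chromatography and covered relatively few pesticides, typically a handful to a dozen or so, whereas the application of gas chromatography-mass spectrometry (GC-MS) increased the number of detectable pesticides to several dozen or more. The application of gas chromatography-tandem mass spectrometry (GC-MS/MS) raised pesticide residue detection in food to a new level, with around 200 pesticides detectable. At the same time, liquid chromatography-tandem mass spectrometry (LC-MS/MS) has also been widely applied because it outperforms GC-MS and GC-MS/MS in detecting strongly polar and thermally unstable pesticides. As complementary techniques, researchers often use the two together for pesticide residue detection. Since 2001, mass spectrometric detection techniques coupled with GC and LC have become the dominant technology for multi-residue pesticide detection.
According to reports, more than 1,000 pesticides are now in common use worldwide, and the number keeps growing. Faced with so many pesticides of differing properties and with a variety of complex sample matrices, routine detection of target compounds by low-resolution mass spectrometry can no longer meet practical needs. High-resolution mass spectrometry resolves the problems encountered with low-resolution mass spectrometry; representative techniques are time-of-flight mass spectrometry (TOF) and orbital ion trap mass spectrometry (Orbitrap). The high-resolution techniques involved in the present invention are liquid chromatography-quadrupole-time-of-flight mass spectrometry (LC-Q-TOFMS), gas chromatography-quadrupole-time-of-flight mass spectrometry (GC-Q-TOFMS), linear ion trap-electric field cyclotron resonance orbitrap combined mass spectrometry (LC-LTQ-Orbitrap), liquid chromatography-quadrupole-electrostatic field orbitrap mass spectrometry (LC-Q-Orbitrap), and gas chromatography-quadrupole-electrostatic field orbitrap mass spectrometry (GC-Q-Orbitrap). Their greatest advantage in multi-residue pesticide detection is that they provide sufficient sensitivity in full-scan mode and obtain as much compound information as possible, while also allowing the compounds to be further confirmed.
Applying the above mass spectrometry techniques raises two problems: first, method development requires the mass spectral information of reference standards; second, confirming detection results requires comparison with standard mass spectra, which is especially important for high-resolution mass spectrometry. In practice, standard mass spectra must either be acquired in-house using reference standards or obtained as commercial spectra from instrument vendors, both of which require substantial manpower, material, or financial resources and are therefore quite limiting. This has long been one of the difficulties facing analysts.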
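Summary of the Invention
To solve the above problems, the present invention uses mainstream mass spectrometry instruments: liquid chromatography-tandem mass spectrometry (LC-MS/MS), gas chromatography-tandem mass spectrometry (GC-MS/MS), liquid chromatography-quadrupole-time-of-flight mass spectrometry (LC-Q-TOFMS), gas chromatography-quadrupole-time-of-flight mass spectrometry (GC-Q-TOFMS), linear ion trap-electric field cyclotron resonance orbitrap combined mass spectrometry (LTQ-Orbitrap), liquid chromatography-quadrupole-electrostatic field orbitrap mass spectrometry (LC-Q-Orbitrap), and gas chromatography-quadrupole-electrostatic field orbitrap mass spectrometry (GC-Q-Orbitrap). An electronic identity card is established in the database for each pesticide; image processing is used to obtain spectrum information; a deep convolutional neural network is used to classify and model the spectra already measured; and the image model system is deployed on a back-end cloud server. A user logs in to the system and uploads, through a browser, the spectra obtained from pesticide residue testing; the intelligent comparison system and method then make it easy to determine which pesticide the measured data correspond to.
The present invention provides a cloud-platform-based system and method for comparing and identifying spectra of pesticides and chemical pollutants, enabling fast and accurate comparison and identification of pesticides and chemical pollutants. The system comprises a cloud server platform end and a user platform end;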
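wherein the cloud server platform end comprises:
a spectrum acquisition unit, for acquiring mass spectra;
a spectrum parameter acquisition unit, for acquiring the experimental environment, experimental conditions, and experimental parameter data corresponding to the mass spectra;
a spectrum device acquisition unit, for acquiring information on the detection equipment corresponding to the mass spectra;
a spectrum preprocessing unit, for vertically splicing and preprocessing the acquired mass spectra and extracting spectrum features;
a spectrum classification model unit, for obtaining the fitted angle change value at the pixel where the highest peak in the mass spectrum is located and building the mass spectrum classification model;
a pesticide type classification model unit, for training a neural network model on the extracted spectrum features, detection equipment information, and experimental parameter data, to obtain a classification model able to identify the types and/or names of pesticides and chemical pollutants;
the user platform end comprises:
a spectrum data upload unit, for uploading to the system the mass spectra to be detected, the spectrum description data, and the experimental parameter data;
a spectrum preprocessing unit, for vertically splicing and preprocessing the mass spectra to be detected and extracting mass spectrum features;
a spectrum type identification unit, for classifying mass spectra according to the fitted angle change value at the pixel where the highest peak in the mass spectrum is located;
a spectrum identification unit, for inputting the extracted spectrum features, spectrum description data, and experimental parameter data into the pesticide type classification model and identifying the corresponding types and/or names of pesticides and chemical pollutants.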
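Preferably,
the neural network model is a layer-by-layer refinement convolutional neural network model, designed or used as follows: the preprocessed spectra of each type are input into the layer-by-layer refinement convolutional neural network to train the spectrum classification model. After preprocessing, the spectra input to the network for training have a size of 1×1×1626×1626, where the parameters mean, in order: one sample is selected from the training set at a time to update the weights; the input image has 1 channel (binary image); and the input image size is 1626×1626 (height × width).
The first convolutional layer Conv1 uses a convolution kernel of size 11×11×1 with a stride of 4, meaning the kernel moves 4 pixels after each convolution operation; the edge padding p is 0, meaning the image edges are not padded. After the Conv1 operation, a feature map is output that reflects information such as the edge contour of the spectrum. The ReLU activation function maps the convolution result to control the range of the data. Next, the local response normalization layer LRN1 normalizes the feature data output by Conv1, creating a competition mechanism among local neurons that makes larger responses relatively larger and suppresses neurons with smaller responses, which improves the generalization ability of the model; the feature map size is unchanged by this layer. The pooling layer Pool1 then uses a kernel of size 3×3×64 to max-pool the feature maps output by LRN1, reducing the amount of computation and the number of parameters through sampling.
The convolutional layers Conv2 to Conv5 each convolve the feature maps output by the previous layer, with kernel sizes decreasing layer by layer: 9×9×64, 7×7×128, 5×5×256, and 3×3×512, where 64, 128, 256, and 512 are the numbers of convolution kernels used by the respective layers. The more convolution kernels used, the higher the dimension of the resulting features. Through the successive convolutions, low-level features are abstracted into higher-dimensional, more refined convolutional activation features. The stride and edge padding of each convolutional layer are as shown in Figure 3. The local response normalization layer LRN2 normalizes the feature data output by Conv2. The pooling layers Pool2 to Pool5 use kernels of size 3×3×128, 3×3×256, 3×3×512, and 3×3×512, respectively, to max-pool the feature maps output by the previous layer.
The fully connected layer Fc6 connects the local features output by Conv5. During training, the three fully connected layers Fc6 to Fc8 learn all the weights to select the features that perform well in the classification task and send them to the Softmax-loss layer. The Dropout layers Drop6 and Drop7 are applied to the outputs of Fc6 and Fc7, respectively, randomly disabling some hidden-layer nodes to speed up training and prevent overfitting. The Softmax-loss layer acts as a classifier and computes the value of the loss function. During training, stochastic gradient descent is used to update the weights with an initial learning rate of 0.0001; the classification performance is improved step by step by minimizing the loss function, yielding a layer-by-layer refinement convolutional neural network classification model with good classification performance.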
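As a rough check on the layer sizes described above, the usual output-size formula for a convolution or pooling window can be applied to the stated Conv1 parameters. This is only a sketch: the floor rounding convention and the Pool1 stride of 2 are assumptions for illustration, since the text gives neither.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling window (floor convention)."""
    return (size + 2 * padding - kernel) // stride + 1


# Conv1 as described: 1626x1626 input, 11x11 kernel, stride 4, padding 0.
conv1 = conv_output_size(1626, kernel=11, stride=4, padding=0)

# LRN1 leaves the feature-map size unchanged, as stated in the text.
lrn1 = conv1

# Pool1 uses a 3x3 window; its stride is NOT given in the text.
# A stride of 2 is assumed here purely for illustration.
pool1 = conv_output_size(lrn1, kernel=3, stride=2)

print(conv1, pool1)
```

Under these assumptions Conv1 produces a 404×404 feature map per kernel. The strides of the later layers are listed only in the patent's parameter table, so they are not reproduced here.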
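Preferably,
the spectra include mass spectra and/or chromatograms.
Preferably,
the spectra include one or more of: liquid chromatography-tandem mass spectra, gas chromatography-tandem mass spectra, liquid chromatography-quadrupole-time-of-flight mass spectra, gas chromatography-quadrupole-time-of-flight mass spectra, linear ion trap-electric field cyclotron resonance orbitrap combined mass spectra, liquid chromatography-quadrupole-electrostatic field orbitrap mass spectra, and gas chromatography-quadrupole-electrostatic field orbitrap mass spectra.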
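Preferably,
the mass spectrum classification model unit classifies spectra according to the angle change value at the pixel where the highest peak in the spectrum is located: for liquid chromatography-tandem mass spectra, the fitted angle change value of the ion current chromatogram ranges from x11 to x12, and that of the ion mass spectra at the four collision energies ranges from x13 to x14; for liquid chromatography-quadrupole-time-of-flight mass spectra, the fitted angle change value of the ion current chromatogram ranges from x21 to x22, and the fitted angle values of the ion mass spectra at the four collision energies are all x23; for linear ion trap-electric field cyclotron resonance orbitrap combined mass spectra, the fitted angle change value of the ion chromatogram is x31 to x32, and the fitted angle values of the ionization-mode full-scan mass spectra are all x33; for gas chromatography-tandem mass spectra, the fitted angle change value of the first-stage mass spectrum is x41, and the fitted angle value of the ion mass spectra at the four collision energies is x43; for liquid chromatography-quadrupole-electrostatic field orbitrap mass spectra, the fitted angle change value of the ion current chromatogram is x51, and the fitted angle value of the fragment ion mass spectrum is x53; for gas chromatography-quadrupole-time-of-flight mass spectra, the fitted angle value of the mass spectrum is x61; for gas chromatography-quadrupole-electrostatic field orbitrap mass spectra, the fitted angle change value of the total ion chromatogram is x71 to x72, and the fitted angle values of the ionization-mode full-scan mass spectra are all x73. The values x11 to x73 lie in the range 0° to 40°.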
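Preferably,
the mass spectrum classification model unit converts the grayscale image of the mass spectrum into a binary image and assigns the image values to a two-dimensional matrix. From the matrix values it determines the position (the matrix row and column) of the pixel at the highest peak of the image (the highest peak of the spectrum), traverses a certain region to the lower left and lower right of that position, obtains the rows and columns of the matrix entries whose value is 1, stores them, and then fits the image angle at the peak.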
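The peak-location step above can be sketched in a few lines of plain Python. The sketch makes several assumptions that the text does not state: the trace is dark on a light background (so pixels below a threshold of 128 become 1), row 0 is the top of the image (so the highest peak is the first row containing a 1), and the search region is a square window of a hypothetical `radius`. The function names are illustrative only.

```python
def to_binary(gray, threshold=128):
    """Binarize a grayscale image; dark trace pixels become 1 (assumed convention)."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def find_peak_pixel(binary):
    """Return (row, col) of the topmost pixel equal to 1, or None if there is none."""
    for r, row in enumerate(binary):
        for c, value in enumerate(row):
            if value == 1:
                return r, c
    return None


def collect_flank_pixels(binary, peak, radius):
    """Collect pixels equal to 1 to the lower left and lower right of the peak."""
    pr, pc = peak
    last_row = min(pr + radius, len(binary) - 1)
    first_col = max(pc - radius, 0)
    last_col = min(pc + radius, len(binary[0]) - 1)
    left, right = [], []
    for r in range(pr + 1, last_row + 1):
        for c in range(first_col, last_col + 1):
            if binary[r][c] != 1 or c == pc:
                continue
            (left if c < pc else right).append((r, c))
    return left, right


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
]
peak = find_peak_pixel(image)
left, right = collect_flank_pixels(image, peak, radius=2)
```

The stored left and right coordinate lists are what a subsequent fitting step would use to estimate the angle at the peak.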
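Preferably,
the angle change value at the pixel where the highest peak in the mass spectrum is located is calculated from gradient vectors. Near a straight line or curve, the gradient vector is perpendicular to that line or curve, and the angle can be computed from the change in orientation of the gradient vector. The gradient vector at a point on a curve is the perpendicular to the curve fragment through that point; a short line segment near the point is substituted for the curve fragment, and the perpendicular to that segment is taken as the gradient vector. The segment near the point is determined by the neighborhood chain length; different chain lengths give slightly different gradient vectors. The orientation of the gradient vector is its angle.
Preferably,
let $P_n=\{p_1,\dots,p_n\}$ be the ordered set of points on a curve or straight line. $L_n=\{l_1,\dots,l_n\}$ is the set of short line segments near the ordered points, where $l_i$ ($i=1,\dots,n$) denotes the segment centered on $p_i$ with neighborhood chain length $m$, that is, the segment connecting $p_{i-m}$ and $p_{i+m}$. In this system, $m$ may be set to a value between 1 and 5. $S_n=\{s_1,\dots,s_n\}$ is the set of slopes of the perpendiculars to the segments $l_i$. $A_n=\{a_1,\dots,a_n\}$ is the set of angles of the perpendiculars to $l_i$ near the points $p_i$, with $a_i$ in the range [0, 360°].
The slope of the segment $l_i$ near the point $p_i(x_i, y_i)$ (connecting $p_{i-m}(x_{i-m}, y_{i-m})$ and $p_{i+m}(x_{i+m}, y_{i+m})$) is:
$g_i = (y_{i+m} - y_{i-m}) / (x_{i+m} - x_{i-m})$
The slope of the perpendicular to $l_i$ is $-1/g_i$, that is,
$s_i = -(x_{i+m} - x_{i-m}) / (y_{i+m} - y_{i-m})$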
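$a_i$ is calculated as shown in Table 2.
Table 2. Calculation of $a_i$, where $k_i$ denotes the slope of the perpendicular line
  slope does not exist: $a_i = \pi/2$
  slope equals 0: $a_i = \pi$
  slope greater than 0: $a_i = \arctan k_i$
  slope less than 0: $a_i = \pi + \arctan k_i$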
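The angle rule can be sketched as follows. The sketch takes the perpendicular slope as $-1/g_i$, as stated in the text, and applies the four cases of Table 2 as they are spelled out in claim 7 (slope does not exist: π/2; slope 0: π; positive slope: arctan; negative slope: π + arctan). The function names and the point format (tuples of x and y) are assumptions for illustration.

```python
import math


def perpendicular_angle(p_prev, p_next):
    """Angle of the perpendicular to the segment from p_prev to p_next (radians)."""
    dx = p_next[0] - p_prev[0]
    dy = p_next[1] - p_prev[1]
    if dx == 0 and dy == 0:
        raise ValueError("the two points must differ")
    if dy == 0:
        # Horizontal segment: the perpendicular is vertical, slope does not exist.
        return math.pi / 2
    if dx == 0:
        # Vertical segment: the perpendicular is horizontal, slope is 0.
        return math.pi
    k = -dx / dy  # slope of the perpendicular, i.e. -1/g_i
    return math.atan(k) if k > 0 else math.pi + math.atan(k)


def angles_along_curve(points, m=1):
    """Angles a_i for every point that has neighbours at chain length m on both sides."""
    return [
        perpendicular_angle(points[i - m], points[i + m])
        for i in range(m, len(points) - m)
    ]


# A straight rising line: the perpendicular has slope -1, so a_i = 3*pi/4.
angles = angles_along_curve([(0, 0), (1, 1), (2, 2)], m=1)
```

Points closer than `m` positions to either end of the ordered set have no complete neighborhood and are skipped here; how the original system treats them is not stated.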
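Preferably,
before inputting the mass spectrum to be detected into the classification model, the spectrum identification unit also uses the mass spectrum description data, the experimental parameters, and the number of mass spectra to screen the existing mass spectrum library for spectra that may belong to the same category as the spectrum to be detected. The Fc7-layer features of each mass spectrum to be detected are extracted, and their cosine similarity with the Fc7-layer features of the preprocessed mass spectra of all categories screened from the library is calculated. The library spectrum most similar to the spectrum to be detected is found, and its similarity is checked against 50%: if the similarity is higher than 50%, the mass spectrum entered by the user is successfully identified.
Preferably,
the cosine similarity is calculated as follows:
$$\cos(A,B)=\frac{\sum_{i=1}^{d_n} A_i B_i}{\sqrt{\sum_{i=1}^{d_n} A_i^{2}}\,\sqrt{\sum_{i=1}^{d_n} B_i^{2}}}$$
where $A_i$ denotes the i-th feature value of spectrum A, $B_i$ denotes the i-th feature value of spectrum B, and $d_n$ denotes the total number of feature dimensions.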
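A minimal sketch of the similarity check, using the standard cosine similarity over two feature vectors and the 50% threshold given in the text. The dictionary-based library and the function names are hypothetical; in the described system the vectors would be Fc7-layer features.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def best_match(query, library, threshold=0.5):
    """Return (name, score) of the most similar library entry above the threshold."""
    if not library:
        return None
    scored = ((name, cosine_similarity(query, vec)) for name, vec in library.items())
    name, score = max(scored, key=lambda item: item[1])
    return (name, score) if score > threshold else None


# Hypothetical two-dimensional features, for illustration only.
library = {"spectrum A": [1.0, 0.1], "spectrum B": [0.0, 1.0]}
match = best_match([1.0, 0.0], library)
```

Returning `None` when no entry exceeds the threshold corresponds to the case where the uploaded spectrum is not successfully identified.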
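Correspondingly, the present invention also proposes a cloud-platform-based method for identifying seven types of mass spectra of pesticides and chemical pollutants, comprising:
acquiring mass spectra on the cloud server platform end, together with the experimental environment, experimental conditions, and experimental parameter data corresponding to the mass spectra;
acquiring information on the detection equipment corresponding to the mass spectra;
vertically splicing and preprocessing the acquired mass spectra and extracting spectrum features;
obtaining the fitted angle change value at the pixel where the highest peak in the mass spectrum is located and building the mass spectrum classification model;
training a neural network model on the extracted mass spectrum features, detection equipment information, and experimental parameter data, to obtain a pesticide type classification model able to identify the types and/or names of pesticides and chemical pollutants;
uploading, on the user platform end, the mass spectra to be detected, the mass spectrum description data, and the experimental parameter data to the system;
vertically splicing and preprocessing the mass spectra to be detected and extracting mass spectrum features;
classifying the mass spectra according to the fitted angle change value at the pixel where the highest peak in the mass spectrum is located;
inputting the extracted mass spectrum features, mass spectrum description data, and experimental parameter data into the pesticide type classification model and identifying the corresponding types and/or names of pesticides and chemical pollutants.
In the cloud-platform-based method for comparing and identifying pesticide and chemical pollutant spectra proposed by the present invention, the cloud server platform end builds the spectrum classification model, extracts spectrum data features, and trains the convolutional neural network model. The user platform end is used by the user to upload mass spectra, experimental conditions, and equipment data; it identifies the mass spectrum type according to the classification model on the cloud server platform end, automatically compares and determines the pesticide type and name based on the neural network model trained there, and returns the comparison result to the user. The system removes the need for users to purchase reference standards, can be used regardless of location, and detects pesticides and chemical pollutants quickly and accurately.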
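Beneficial effects of the present invention:
1. The present invention covers seven mainstream chromatography-mass spectrometry techniques: liquid chromatography-tandem mass spectrometry LC-MS/MS (605 kinds), gas chromatography-tandem mass spectrometry GC-MS/MS (619 kinds), liquid chromatography-quadrupole-time-of-flight mass spectrometry LC-Q-TOFMS (510 kinds), gas chromatography-quadrupole-time-of-flight mass spectrometry GC-Q-TOFMS (753 kinds), linear ion trap-electric field cyclotron resonance orbitrap combined mass spectrometry LC-LTQ-Orbitrap (378 kinds), liquid chromatography-quadrupole-electrostatic field orbitrap mass spectrometry LC-Q-Orbitrap (570 kinds), and gas chromatography-quadrupole-electrostatic field orbitrap mass spectrometry GC-Q-Orbitrap (664 kinds). It establishes unique electronic identity card information for more than 1,200 pesticides and chemical pollutants: a mass spectral information database (accurate mass, isotope distribution, isotope abundance) and a characteristic mass spectrum database (total ion current chromatograms and fragment ion mass spectra at different collision energies), together with the other parameters necessary for chromatography-mass spectrometry analysis and identification. This lays the theoretical and methodological foundation for developing high-throughput multi-residue pesticide detection techniques. The invention is technically innovative; it is currently the most accurate, sensitive, and reliable detection technique, and the only precise detection technique capable of the largest single-run pesticide cluster detection.
2. The present invention enables intelligent matching, comparison, identification, and qualitative analysis of the mass spectra of more than 1,200 pesticides and chemical pollutants commonly used worldwide. Searches can be classified by chemical composition, including organohalogen, organophosphorus, pyrethroid, carbamate, organonitrogen, and organosulfur pesticides; by pesticide function, including insecticides, fungicides, herbicides, acaricides, nematicides, insect growth regulators, and plant growth regulators, as well as persistent environmental pollutants such as polychlorinated biphenyls and polycyclic aromatic hydrocarbons; or by pesticide toxicity, including slightly toxic, low-toxicity, moderately toxic, highly toxic, and extremely toxic pesticides, as well as banned pesticides. For the identification of known compounds, the chromatography-mass spectrometry atlas gives quick access to comprehensive information such as the compound's molecular structure and its fragment ions under different conditions. On this basis, detection and identification methods can be established scientifically, reasonably, and quickly, ensuring accurate and reliable results for the target analytes.
3. The present invention enables the identification of unknown compounds. The unknown is measured under the chromatography-mass spectrometry conditions specified by the invention to obtain its accurate mass, total ion current chromatogram, second-stage fragment ion mass spectrum, and other chromatographic and mass spectral information; comparison with the information in the system then allows the unknown compound to be characterized quickly and accurately.
4. The present invention enables the same compound to be confirmed on different instruments, improving identification and confirmation capability. The detection of pesticide and chemical pollutant residues in complex matrices is often disturbed by co-extracted matrix components and is prone to false positives, so confirmation on different types of instruments is sometimes required. The invention contains chromatograms and mass spectra from 7 different types of chromatography-mass spectrometry instruments under a variety of conditions, which complement one another, broaden the scope of application, match practical work, and serve as a strong reference.
5. The standard high-resolution mass spectra of the present invention provide a basis for confirming multi-residue pesticide detection results. There is no need to purchase large numbers of physical reference standards and acquire mass spectra in-house; spectrum search and comparison become intelligent and automated, reducing the cost of pesticide residue analysis and improving the capacity for rapid commercial testing. The invention also greatly facilitates the analysis and detection of pesticides and chemical pollutants, giving analysts a reference when developing methods and a lookup tool when confirming results, and therefore has significant practical value and high economic benefit.
6. The present invention digitizes spectrum data and automates data retrieval, and develops a fairly complete, world-leading database of pesticide information and pesticide residue detection with fully independent Chinese intellectual property rights. This is not only a major contribution to chromatography-mass spectrometry worldwide, but also of great scientific and social significance for pesticide residue analysis, food safety and environmental safety testing, and import and export inspection and quarantine in China.
7. By integrating, developing, and using the chromatography-mass spectrometry information library of the present invention, the construction of pesticide residue laboratories in China will advance rapidly, and the overall level and efficiency of the pesticide identification and residue detection industry will improve, which is of high social significance. The retrieval system will greatly improve sample data analysis and pesticide identification capability and the ability to screen for and detect target pesticides, with good prospects for wider application and good economic value.
8. The present invention has four main functions: a guide for developing new pesticide residue detection techniques, a reference for identifying unknown compounds, a textbook for technical training, and a handbook for daily work. Once this automatic identification system for seven types of chromatography-mass spectrometry spectra of pesticides and chemical pollutants commonly used worldwide is fully established, these four functions will play an even greater role.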
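Brief Description of the Drawings
Figure 1 is a system structure diagram of the mass spectrum comparison system of the present invention;
Figure 2 is a hierarchical structure diagram of the layer-by-layer refinement convolutional neural network of the present invention;
Figure 3 is a first-stage mass spectrum of an embodiment of the present invention;
Figure 4 is a product ion mass spectrum under the corresponding collision energy according to an embodiment of the present invention;
Figure 5 is a total ion current chromatogram of an embodiment of the present invention;
Figure 6 is a product ion mass spectrum under the corresponding collision energy according to an embodiment of the present invention;
Figure 7 is an extracted ion chromatogram of an embodiment of the present invention;
Figure 8 is a product ion mass spectrum under the corresponding collision energy according to an embodiment of the present invention;
Figure 9 is an [M+H]+ extracted ion chromatogram of an embodiment of the present invention;
Figure 10 is a second-stage mass spectrum of [M+H]+ according to an embodiment of the present invention;
Figure 11 is an extracted ion current chromatogram of an embodiment of the present invention;
Figure 12 is a typical first-stage mass spectrum of [M+H]+, [M+NH4]+ and [M+Na]+ according to an embodiment of the present invention.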
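Detailed Description
To make the purpose, technical solutions, and advantages of the present invention clearer, the technical solutions are described clearly and completely below with reference to specific embodiments and the corresponding drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Figure 1 shows a schematic diagram of the cloud-platform-based pesticide and chemical pollutant spectrum comparison system of the present invention. The system includes a cloud server platform end and a user platform end. The user platform end includes a user registration module, a user login module, a user search module, a spectrum data upload module, a mass spectrum preprocessing module, a mass spectrum type identification module, and a mass spectrum identification module. The cloud server platform end includes a spectrum device information acquisition module, a spectrum parameter acquisition module, a mass spectrum acquisition module, a mass spectrum information library, a mass spectrum preprocessing module, a mass spectrum classification model module, and a pesticide category classification module.
On the cloud server platform end, the mass spectrum acquisition module receives the mass spectra uploaded by the user, the spectrum device acquisition module receives the spectrum device information uploaded by the user, and the spectrum parameter acquisition module receives the experimental environment, experimental conditions, experimental parameters, and similar information uploaded by the user. The spectra uploaded by the user can be mass spectra or extracted ion current chromatograms. Figures 3 to 12 show several examples of spectra that the present invention can process; those skilled in the art should understand that these are only illustrative examples of the types of spectra the spectrum comparison system can process, and the spectra that the present invention can process include but are not limited to these.
Preferably, the original spectra in the embodiment of the present invention include 7 types of mass spectra: liquid chromatography-tandem mass spectra, gas chromatography-tandem mass spectra, liquid chromatography-quadrupole-time-of-flight mass spectra, gas chromatography-quadrupole-time-of-flight mass spectra, linear ion trap-electric field cyclotron resonance orbitrap combined mass spectra, liquid chromatography-quadrupole-electrostatic field orbitrap mass spectra, and gas chromatography-quadrupole-electrostatic field orbitrap mass spectra.
The mass spectrum preprocessing module preprocesses the received mass spectra to meet the processing requirements. Specifically, the preprocessing includes vertical splicing, logarithmic transformation, gamma correction, and histogram equalization of the mass spectra, geometric transformations such as rotation, translation, and scaling, and feature extraction from the preprocessed mass spectra.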
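Three of the preprocessing steps named above can be sketched with their textbook definitions on 8-bit grayscale images stored as nested lists. The scaling constants, the gamma value, and the function names are assumptions; the text does not give the parameters actually used, and histogram equalization and the geometric transformations are omitted for brevity.

```python
import math


def log_transform(gray):
    """Logarithmic transformation s = c * log(1 + r), scaled so 255 maps to 255."""
    c = 255 / math.log(256)
    return [[round(c * math.log(1 + v)) for v in row] for row in gray]


def gamma_correct(gray, gamma):
    """Gamma correction s = 255 * (r / 255) ** gamma."""
    return [[round(255 * (v / 255) ** gamma) for v in row] for row in gray]


def splice_vertically(top, bottom):
    """Stack two images of equal width on top of each other."""
    if top and bottom and len(top[0]) != len(bottom[0]):
        raise ValueError("images must have the same width")
    return top + bottom


chromatogram = [[0, 128, 255]]
spectrum = [[255, 128, 0]]
stacked = splice_vertically(chromatogram, spectrum)
brightened = log_transform(stacked)
```

Vertical splicing is what lets a chromatogram and its associated mass spectrum be presented to the network as a single image.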
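Preferably, the mass spectrum classification model module classifies spectra according to the angle change value at the pixel where the highest peak in the mass spectrum is located: for liquid chromatography-tandem mass spectra, the fitted angle change value of the ion current chromatogram ranges from x11 to x12, and that of the ion mass spectra at the four collision energies ranges from x13 to x14; for liquid chromatography-quadrupole-time-of-flight mass spectra, the fitted angle change value of the ion current chromatogram ranges from x21 to x22, and the fitted angle values of the ion mass spectra at the four collision energies are all x23; for linear ion trap-electric field cyclotron resonance orbitrap combined mass spectra, the fitted angle change value of the ion chromatogram is x31 to x32, and the fitted angle values of the ionization-mode full-scan mass spectra are all x33; for gas chromatography-tandem mass spectra, the fitted angle change value of the first-stage mass spectrum is x41, and the fitted angle value of the ion mass spectra at the four collision energies is x43; for liquid chromatography-quadrupole-electrostatic field orbitrap mass spectra, the fitted angle change value of the ion current chromatogram is x51, and the fitted angle value of the fragment ion mass spectrum is x53; for gas chromatography-quadrupole-time-of-flight mass spectra, the fitted angle value of the mass spectrum is x61; for gas chromatography-quadrupole-electrostatic field orbitrap mass spectra, the fitted angle change value of the total ion chromatogram is x71 to x72, and the fitted angle values of the ionization-mode full-scan mass spectra are all x73. The values x11 to x73 lie in the range 0° to 40°.
The mass spectrum classification model module converts the grayscale image of the mass spectrum into a binary image and assigns the image values to a two-dimensional matrix. From the matrix values it determines the position (the matrix row and column) of the pixel at the highest peak of the image (the highest peak of the spectrum), traverses a certain region to the lower left and lower right of that position, obtains the rows and columns of the matrix entries whose value is 1, stores them, and then fits the image angle at the peak.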
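The pesticide type classification model module trains a classification model on pesticide type, detection equipment type, experimental parameters, mass spectrum features, pesticide name, and so on, yielding a trained layer-by-layer refinement convolutional neural network model that the user platform end uses for spectrum comparison and detection of pesticides and chemical pollutants. The cloud server platform end also includes a mass spectrum information library, which stores data such as spectrum type, pesticide name, pesticide type, and the corresponding spectra; the user platform end can query it for mass spectra by spectrum type, pesticide name, and/or pesticide type.
On the user platform end, users register and log in to the system through the user registration module and the user login module. Registration offers different permission levels: a user can register with permission to upload information (for example, training samples) or with query permission only. After registration, the system administrator reviews the registration information, and the user can log in and use the system only after approval.
After registering and logging in, the user obtains information on the pesticide substance under test by using the spectrum data upload module to upload the mass spectra to be detected, the spectrum description data, and the experimental parameter data to the system. The spectrum description data include experimental equipment information, spectrum type, and so on; the experimental parameter data include the experimental environment, experimental conditions, and experimental parameters. The user can upload a single spectrum or several spectra at once, and an uploaded spectrum can be in any spectrum format commonly used in the technical field.
After the user uploads the mass spectrum to be detected, the mass spectrum preprocessing module preprocesses it, including vertical splicing, logarithmic transformation, gamma correction, histogram equalization, and geometric transformations such as rotation, translation, and scaling, and extracts features from the preprocessed spectrum.
The mass spectrum type identification module inputs the spectrum produced by the mass spectrum preprocessing module into the mass spectrum classification model for matching and identification.
The mass spectrum identification module reads the trained layer-by-layer refinement convolutional neural network model stored on the cloud server platform end and inputs the spectrum features extracted by the mass spectrum preprocessing module, the spectrum description data, and the experimental parameter data into that model for matching and identification, thereby obtaining the pesticide type and pesticide name corresponding to the mass spectrum to be detected.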
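According to a further preferred mode of the present invention, before inputting the mass spectrum to be detected into the classification model, the mass spectrum identification module also uses the mass spectrum description data, the experimental parameters, and the number of mass spectra to screen the existing mass spectrum library for spectra that may belong to the same category as the spectrum to be detected. This reduces the number of similarity comparisons and further reduces the computational load of the classification model. Specifically, the Fc7-layer features of each mass spectrum to be detected are extracted, and their cosine similarity with the Fc7-layer features of the preprocessed mass spectra of all categories screened from the library is calculated. The library spectrum most similar to the spectrum to be detected is found, and its similarity is checked against 50%: if the similarity is higher than 50%, the category of the mass spectrum entered by the user is successfully identified. The cosine similarity is calculated as follows:
$$\cos(A,B)=\frac{\sum_{i=1}^{d_n} A_i B_i}{\sqrt{\sum_{i=1}^{d_n} A_i^{2}}\,\sqrt{\sum_{i=1}^{d_n} B_i^{2}}}$$
where $A_i$ denotes the i-th feature value of spectrum A, $B_i$ denotes the i-th feature value of spectrum B, and $d_n$ denotes the total number of feature dimensions.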
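The pre-screening step can be pictured as a plain metadata filter applied before any feature comparison. The record fields, the example values, and the function name below are all hypothetical; the text says only that the description data, experimental parameters, and number of spectra are used to narrow the library.

```python
from dataclasses import dataclass


@dataclass
class SpectrumRecord:
    """One library entry; the field names are illustrative, not from the source."""
    name: str
    spectrum_type: str
    instrument: str


def screen_candidates(library, spectrum_type, instrument=None):
    """Keep only records whose metadata match the uploaded description data."""
    return [
        record
        for record in library
        if record.spectrum_type == spectrum_type
        and (instrument is None or record.instrument == instrument)
    ]


library = [
    SpectrumRecord("compound 1", "LC-MS/MS", "instrument X"),
    SpectrumRecord("compound 2", "GC-MS/MS", "instrument Y"),
    SpectrumRecord("compound 3", "LC-MS/MS", "instrument Y"),
]
candidates = screen_candidates(library, "LC-MS/MS")
```

Only the surviving candidates would then go through the Fc7 cosine-similarity comparison, which is where the reduction in similarity computations comes from.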
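The cloud-platform-based pesticide and chemical pollutant spectrum comparison system provided by the present invention trains a layer-by-layer refinement convolutional neural network model on the cloud server platform end using sample data. The user platform end receives the mass spectra and experimental parameter information uploaded by the user and uses the trained model to identify the pesticide type and name corresponding to the uploaded mass spectrum. The system identifies the mass spectra to be detected automatically, with no need to search and compare manually among a large number of standard spectra; it quickly obtains the types and names of the pesticides and chemical pollutants corresponding to the spectrum to be detected, which improves the efficiency and accuracy of pesticide residue detection.
Figure 2 shows the network structure of the Layer-by-Layer Refinement Network (LbLReNet) of the present invention. Pesticide mass spectra and ion mass spectra are both relatively sparse images. For sparse data, a small convolution kernel has a small local receptive field and the convolution cannot represent the features, whereas a large kernel greatly increases complexity. The present invention therefore designs a "layer-by-layer refinement network" convolutional neural network structure. Specifically, the structure includes 5 convolutional layers together with ReLU activation layers, local response normalization layers, Pool layers, and fully connected layers. The low-level convolutional layers focus on the contour and edge information of the spectrum; as depth increases, the kernel size decreases layer by layer, and the convolutional layers abstract low-level features into higher-dimensional, more refined convolutional activation features. In addition, the local response normalization layer (Local Response Norm, LRN) normalizes the convolution result; after normalization the variables have the same variance, which accelerates model training. The Pool layer reduces the amount of computation and the number of parameters through sampling and changes the output dimension. The fully connected (FC) layer connects the preceding local features and sends them to the softmax classifier for training the classifier. Dropout randomly disables some hidden-layer nodes to speed up training and prevent overfitting.
Based on the characteristics of pesticide detection spectra, the layer-by-layer refinement convolutional neural network structure proposed by the present invention is designed with 5 convolutional layers and their corresponding ReLU activation functions, combined with LRN, Pool, and FC layers to accelerate training. The model trains quickly and with high accuracy, and can be used to identify pesticide residue types accurately and quickly.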
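Table 1 shows the parameters of the layer-by-layer refinement convolutional neural network of the present invention. The preprocessed spectrum image is input into the network; the input size is 1×1×1626×1626, where the parameters mean, in order: one sample is selected from the training set at a time to update the weights; the input image has 1 channel (binary image); and the input image size is 1626×1626 (height × width). The first convolutional layer Conv1 uses a convolution kernel of size 11×11×1 with a stride of 4, meaning the kernel moves 4 pixels after each convolution operation; the edge padding p is 0, meaning the image edges are not padded. After the Conv1 operation, a feature map is output that reflects information such as the edge contour of the spectrum. The ReLU activation function maps the convolution result to control the range of the data. Next, the local response normalization layer LRN1 normalizes the feature data output by Conv1, creating a competition mechanism among local neurons that makes larger responses relatively larger and suppresses neurons with smaller responses, which improves the generalization ability of the model; the feature map size is unchanged by this layer. The pooling layer Pool1 then uses a kernel of size 3×3×64 to max-pool the feature maps output by LRN1, reducing the amount of computation and the number of parameters through sampling. The convolutional layers Conv2 to Conv5 each convolve the feature maps output by the previous layer, with kernel sizes decreasing layer by layer: 9×9×64, 7×7×128, 5×5×256, and 3×3×512, where 64, 128, 256, and 512 are the numbers of convolution kernels used by the respective layers. The more convolution kernels used, the higher the dimension of the resulting features. Through the successive convolutions, low-level features are abstracted into higher-dimensional, more refined convolutional activation features. The stride and edge padding of each convolutional layer are as shown in Figure 3. The local response normalization layer LRN2 normalizes the feature data output by Conv2. The pooling layers Pool2 to Pool5 use kernels of size 3×3×128, 3×3×256, 3×3×512, and 3×3×512, respectively, to max-pool the feature maps output by the previous layer. The fully connected layer Fc6 connects the local features output by Conv5. During training, the three fully connected layers Fc6 to Fc8 learn all the weights to select the features that perform well in the classification task and send them to the Softmax-loss layer. The Dropout layers Drop6 and Drop7 are applied to the outputs of Fc6 and Fc7, respectively, randomly disabling some hidden-layer nodes to speed up training and prevent overfitting. The Softmax-loss layer acts as a classifier and computes the value of the loss function. During training, stochastic gradient descent is used to update the weights with an initial learning rate of 0.0001; the classification performance is improved step by step by minimizing the loss function, yielding a layer-by-layer refinement convolutional neural network classification model with good classification performance. Those skilled in the art will understand that the above spectrum size, kernel sizes, and other parameters are only exemplary and can be adapted to the actual needs of the system.
Table 1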
Figure PCTCN2019085612-appb-000004
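The training rule described above (a softmax loss minimized by stochastic gradient descent with an initial learning rate of 0.0001) can be illustrated with the textbook definitions. This sketch works on plain lists and a hypothetical gradient; it is not the training code of the described system, and only the learning rate is taken from the text.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss(logits, label):
    """Cross-entropy loss of the softmax output for the true class index."""
    return -math.log(softmax(logits)[label])


def sgd_step(weights, grads, lr=0.0001):
    """One stochastic gradient descent update: w <- w - lr * grad."""
    return [w - lr * g for w, g in zip(weights, grads)]


probabilities = softmax([1.0, 2.0, 3.0])
loss = softmax_loss([0.0, 0.0], label=0)  # two equally likely classes
updated = sgd_step([1.0], [10.0])         # hypothetical weight and gradient
```

With two equally scored classes the loss equals ln 2, which is the value a two-class classifier starts from before it has learned anything.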
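The system and method embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.
From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented in two parts, a front end and a back end. In the above description, the front end includes only the spectrum comparison and identification software and the spectrum type identification method; the back end includes only training the spectrum identification model and establishing the spectrum type discrimination method. On this understanding, the above technical solution, in essence or in the part that contributes to the prior art, can be embodied as a software product. The computer software product can be stored in a computer-readable storage medium such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in parts of the embodiments. For the system to recognize more spectra, more spectrum types and larger numbers of spectra must be obtained for classification and modeling.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in those embodiments can still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not take the essence of the corresponding technical solutions outside the spirit and scope of the technical solutions of the embodiments of the present invention.
The detailed descriptions listed above are only specific descriptions of feasible embodiments of the present invention and are not intended to limit its scope of protection. All equivalent embodiments or changes that do not depart from the technical spirit of the present invention shall be included in its scope of protection.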

Claims (14)

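  1. A cloud-platform-based system for automatic identification of seven types of mass spectra of pesticides and chemical pollutants commonly used worldwide, characterized in that it comprises a cloud server platform end and a user platform end;
    the cloud server platform end comprises:
    a spectrum acquisition unit, for acquiring mass spectra;
    a spectrum parameter acquisition unit, for acquiring the experimental environment, experimental conditions, and experimental parameter data corresponding to the mass spectra;
    a spectrum device acquisition unit, for acquiring information on the detection equipment corresponding to the mass spectra;
    a spectrum preprocessing unit, for vertically splicing and preprocessing the acquired mass spectra and extracting spectrum features; a pesticide type classification model unit, which trains a neural network model on the extracted mass spectrum features, detection equipment information, and experimental parameter data, to obtain a classification model able to identify the types and/or names of pesticides and chemical pollutants;
    the user platform end comprises:
    a spectrum data upload unit, for uploading to the system the mass spectra to be detected, the spectrum description data, and the experimental parameter data;
    a spectrum preprocessing unit, for vertically splicing and preprocessing the mass spectra to be detected and extracting spectrum features; a spectrum identification unit, for uploading the extracted mass spectrum features, spectrum description data, and experimental parameter data as input to the pesticide type classification model and identifying the corresponding types and/or names of pesticides and chemical pollutants.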
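  2. The cloud-platform-based system for automatic identification of seven types of mass spectra of pesticides and chemical pollutants commonly used worldwide according to claim 1, characterized in that
    the cloud server platform end further comprises: a mass spectrum classification model unit, which builds the mass spectrum classification model according to the obtained fitted angle change value at the pixel where the highest peak in the mass spectrum is located;
    the user platform end further comprises: a mass spectrum type identification unit, which obtains the classification result for the mass spectrum according to the fitted angle change value, computed by the mass spectrum classification model, at the pixel where the highest peak in the mass spectrum is located.
  3. The cloud-platform-based system for automatic identification of seven types of mass spectra of pesticides and chemical pollutants commonly used worldwide according to claim 2, characterized in that the cloud server platform end further comprises a mass spectrum information library and the user platform end further comprises a user search unit; before inputting the spectrum to be detected into the mass spectrum classification model, the mass spectrum type identification unit uses the user search unit to screen the mass spectrum information library, according to the mass spectrum description data, the experimental parameters, and the number of mass spectra, for mass spectrum data that may belong to the same category as the mass spectrum to be detected; the Fc7-layer features of each mass spectrum to be detected are extracted, and their cosine similarity with the Fc7-layer features of the preprocessed mass spectra of all categories screened from the library is calculated; the mass spectrum most similar to the current mass spectrum to be detected is found, and it is determined whether its similarity is higher than 50%; if the similarity is higher than 50%, the mass spectrum entered by the user is successfully identified.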
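  4. The cloud-platform-based system for automatic identification of seven types of mass spectra of pesticides and chemical pollutants commonly used worldwide according to claim 1, characterized in that the neural network model used by the pesticide type classification model unit is a layer-by-layer refinement convolutional neural network model, designed or used as follows:
    after preprocessing, the mass spectra input to the layer-by-layer refinement convolutional neural network for training have a size of 1×1×1626×1626, where the parameters mean, in order: one sample is selected from the training set at a time to update the weights; the input image has 1 channel; and the input image size is 1626×1626;
    the first convolutional layer Conv1 uses a convolution kernel of size 11×11×1 with a stride of 4, meaning the kernel moves 4 pixels after each convolution operation; the edge padding p is 0, meaning the image edges are not padded; after the Conv1 operation, a feature map is output that reflects information such as the edge contour of the mass spectrum; the ReLU activation function maps the convolution result to control the range of the data; next, the local response normalization layer LRN1 normalizes the feature data output by Conv1, creating a competition mechanism among local neurons that makes larger responses relatively larger and suppresses neurons with smaller responses, which improves the generalization ability of the model, the feature map size being unchanged by this layer; the pooling layer Pool1 then uses a kernel of size 3×3×64 to max-pool the feature maps output by LRN1, reducing the amount of computation and the number of parameters through sampling;
    the convolutional layers Conv2 to Conv5 each convolve the feature maps output by the previous layer, with kernel sizes decreasing layer by layer: 9×9×64, 7×7×128, 5×5×256, and 3×3×512, where 64, 128, 256, and 512 are the numbers of convolution kernels used by the respective layers; through the successive convolutions, low-level features are abstracted into higher-dimensional, more refined convolutional activation features; the local response normalization layer LRN2 normalizes the feature data output by Conv2; the pooling layers Pool2 to Pool5 use kernels of size 3×3×128, 3×3×256, 3×3×512, and 3×3×512, respectively, to max-pool the feature maps output by the previous layer;
    the fully connected layer Fc6 connects the local features output by Conv5; during training, the three fully connected layers Fc6 to Fc8 learn all the weights to select the features that perform well in the classification task and send them to the Softmax-loss layer; the Dropout layers Drop6 and Drop7 are applied to the outputs of Fc6 and Fc7, respectively, randomly disabling some hidden-layer nodes to speed up training and prevent overfitting; the Softmax-loss layer computes the value of the loss function;
    during training, stochastic gradient descent is used to update the weights with an initial learning rate of 0.0001; the classification performance is improved step by step by minimizing the loss function, yielding a layer-by-layer refinement convolutional neural network classification model with good classification performance.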
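  5. The cloud-platform-based system for automatic identification of seven types of mass spectra of pesticides and chemical pollutants commonly used worldwide according to claim 2, characterized in that, when the mass spectrum classification model unit classifies according to the angle change value at the pixel where the highest peak in the mass spectrum is located: for liquid chromatography-tandem mass spectra, the fitted angle change value of the ion current chromatogram ranges from x11 to x12, and that of the ion mass spectra at the four collision energies ranges from x13 to x14; for liquid chromatography-quadrupole-time-of-flight mass spectra, the fitted angle change value of the ion current chromatogram ranges from x21 to x22, and the fitted angle values of the ion mass spectra at the four collision energies are all x23; for linear ion trap-electric field cyclotron resonance orbitrap combined mass spectra, the fitted angle change value of the ion chromatogram is x31 to x32, and the fitted angle values of the ionization-mode full-scan mass spectra are all x33; for gas chromatography-tandem mass spectra, the fitted angle change value of the first-stage mass spectrum is x41, and the fitted angle value of the ion mass spectra at the four collision energies is x43; for liquid chromatography-quadrupole-electrostatic field orbitrap mass spectra, the fitted angle change value of the ion current chromatogram is x51, and the fitted angle value of the fragment ion mass spectrum is x53; for gas chromatography-quadrupole-time-of-flight mass spectra, the fitted angle value of the mass spectrum is x61; for gas chromatography-quadrupole-electrostatic field orbitrap mass spectra, the fitted angle change value of the total ion chromatogram is x71 to x72, and the fitted angle values of the ionization-mode full-scan mass spectra are all x73; wherein the values x11 to x73 lie in the range 0° to 40°.
  6. The cloud-platform-based system for automatic identification of seven types of mass spectra of pesticides and chemical pollutants commonly used worldwide according to claim 2, characterized in that the fitted angle change value at the pixel where the highest peak in the mass spectrum is located is obtained by the mass spectrum classification model unit from gradient vectors: near a straight line or curve, the gradient vector is perpendicular to that line or curve, and the angle is computed from the change in orientation of the gradient vector; the gradient vector at a point on a curve is the perpendicular to the curve fragment through that point; a short line segment near the point is substituted for the curve fragment, and the perpendicular to that segment is taken as the gradient vector; the segment near the point is determined by the neighborhood chain length; and the orientation of the gradient vector is its angle.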
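  7. The cloud-platform-based system for automatic identification of seven types of mass spectra of pesticides and chemical pollutants commonly used worldwide according to claim 6, characterized in that the fitted angle change value at the pixel where the highest peak in the mass spectrum is located is calculated as follows:
    let $P_n=\{p_1,\dots,p_n\}$ be the ordered set of points on a curve or straight line; $L_n=\{l_1,\dots,l_n\}$ is the set of short line segments near the ordered points, where $l_i$ ($i=1,\dots,n$) denotes the segment centered on $p_i$ with neighborhood chain length $m$, that is, the segment connecting $p_{i-m}$ and $p_{i+m}$; $S_n=\{s_1,\dots,s_n\}$ is the set of slopes of the perpendiculars to the segments $l_i$; $A_n=\{a_1,\dots,a_n\}$ is the set of angles of the perpendiculars to $l_i$ near the points $p_i$, with $a_i$ in the range [0, 360°];
    the slope of the segment $l_i$ near the point $p_i(x_i, y_i)$ (connecting $p_{i-m}(x_{i-m}, y_{i-m})$ and $p_{i+m}(x_{i+m}, y_{i+m})$) is: $g_i = (y_{i+m} - y_{i-m}) / (x_{i+m} - x_{i-m})$
    the slope of the perpendicular to $l_i$ is $-1/g_i$, that is,
    $s_i = -(x_{i+m} - x_{i-m}) / (y_{i+m} - y_{i-m})$;
    $a_i$ is calculated as follows: when the slope does not exist, $a_i = \pi/2$; when the slope is 0, $a_i = \pi$; when the slope is greater than 0, $a_i = \arctan k_i$; when the slope is less than 0, $a_i = \pi + \arctan k_i$; where $k_i$ denotes the slope.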
  8. 根据权利要求1-7任一项所述的一种基于云平台的世界常用农药及化学污染物七类质谱谱图自动识别***,其特征在于,所述质谱图包括液相色谱-串联质谱图、气相色谱-串联质谱图、液相色谱-四极杆-飞行时间质谱图、气相色谱-四极杆-飞行时间质谱图、线性离子阱-电场回旋共振轨道阱组合质谱图、液相色谱-四极杆-静电场轨道阱质谱图、气相色谱-四极杆-静电场轨道阱质谱图的一种或多种。
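  9. The cloud-platform-based automatic identification system for seven types of mass spectrograms of pesticides and chemical pollutants commonly used in the world according to claim 8, characterized in that the mass spectrogram may also be an extracted ion chromatogram.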
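  10. A cloud-platform-based automatic identification method for seven types of mass spectrograms of pesticides and chemical pollutants commonly used in the world, characterized in that:
    on the cloud server platform side:
    mass spectrograms are acquired, together with the experimental environment, experimental conditions and experimental parameter data corresponding to the mass spectrograms;
    information on the spectrogram detection equipment corresponding to the mass spectrograms is acquired;
    the acquired mass spectrograms are vertically stitched and preprocessed, and spectrogram features are extracted;
    the fitted angle change value at the pixel of the highest peak within each mass spectrogram is obtained, and a mass spectrogram classification model is established;
    a neural network model is trained on the extracted mass spectrogram features, the spectrogram detection equipment information and the experimental parameter data, yielding a pesticide type classification model capable of identifying the types and/or names of pesticides and chemical pollutants;
    on the user platform side:
    the mass spectrograms to be tested, the spectrogram description data and the experimental parameter data are uploaded to the cloud server platform;
    the mass spectrograms to be tested are vertically stitched and preprocessed, and mass spectrogram features are extracted;
    the mass spectrogram classification result returned by the cloud server platform is received;
    the extracted spectrogram features, spectrogram description data and experimental parameter data are uploaded to the pesticide type classification model on the cloud server platform, and the identified types and/or names of the corresponding pesticides and chemical pollutants are received.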
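  11. The cloud-platform-based automatic identification method for seven types of mass spectrograms of pesticides and chemical pollutants commonly used in the world according to claim 10, characterized in that the neural network model is a layer-by-layer refined convolutional neural network model, which is designed and used as follows:
    After preprocessing, the spectrograms fed into the layer-by-layer refined convolutional neural network for training have size 1×1×1626×1626, where the parameters mean, in order: one sample is selected from the training set at a time to update the weights, the input image has 1 channel, and the input image measures 1626×1626;
    The first convolutional layer Conv1 uses a kernel of size 11×11×1 with a stride of 4, meaning that the kernel moves 4 pixels after each convolution operation; the edge padding p is 0, meaning that the image edges are not padded. After the Conv1 computation, a feature map is output that reflects information such as the edge contours of the spectrogram. The ReLU activation function maps the convolution result and controls the range of the data. Next, the local response normalization layer LRN1 normalizes the feature data output by Conv1, creating a competition mechanism among the activities of local neurons so that larger responses become relatively larger while neurons with smaller responses are suppressed, which enhances the generalization ability of the model; the size of the feature map is unchanged by this layer. Then, the pooling layer Pool1 applies max pooling to the feature map output by LRN1 using a kernel of size 3×3×64, reducing the amount of computation and the number of parameters through downsampling;
    Convolutional layers Conv2-Conv5 each convolve the feature map output by the preceding layer. The kernel size decreases layer by layer: 9×9×64, 7×7×128, 5×5×256 and 3×3×512, where 64, 128, 256 and 512 are the numbers of kernels used by the respective layers. Through the successive convolutions, low-level features are abstracted into higher-dimensional, more refined convolutional activation features. The local response normalization layer LRN2 normalizes the feature data output by Conv2. Pooling layers Pool2-Pool5 apply max pooling to the feature map output by the preceding layer, using kernels of size 3×3×128, 3×3×256, 3×3×512 and 3×3×512 respectively;
    The fully connected layer Fc6 connects the local features output by Conv5. During training, the three fully connected layers Fc6-Fc8 learn all of their weights so as to select the features that perform well in the classification task, and pass those features to the Softmax-loss layer. The dropout layers Drop6 and Drop7 are applied to the outputs of Fc6 and Fc7 respectively; they randomly disable a portion of the hidden-layer nodes, which speeds up training and prevents overfitting. The Softmax-loss layer computes the value of the loss function;
    During training, the weights are updated with the stochastic gradient descent algorithm, with the initial learning rate set to 0.0001; classification performance is improved step by step by minimizing the loss function, yielding a layer-by-layer refined convolutional neural network classification model with good classification performance.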
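As a worked check of the Conv1 geometry given in this claim (1626×1626 input, 11×11 kernel, stride 4, padding 0), the spatial size of the output feature map can be computed with the usual convolution-arithmetic formula. The helper name `conv_output_size` is hypothetical, and rounding down when the kernel does not tile the input exactly is an assumption, since the claim does not state a rounding rule.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Side length of the feature map produced by a square convolution."""
    padded = size + 2 * padding
    if kernel > padded:
        raise ValueError("kernel is larger than the padded input")
    # Floor division: positions where the kernel would overhang are dropped.
    return (padded - kernel) // stride + 1

conv1_side = conv_output_size(1626, kernel=11, stride=4, padding=0)
print(conv1_side)  # 404
```

The same helper could be applied to Conv2-Conv5 and the pooling layers, but the claim does not give their strides or padding, so those sizes are not computed here.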
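  12. The cloud-platform-based automatic identification method for seven types of mass spectrograms of pesticides and chemical pollutants commonly used in the world according to claim 10, characterized in that, when the mass spectrogram classification model is established, classification is performed according to the angle change value at the pixel of the highest peak within the mass spectrogram: for liquid chromatography-tandem mass spectrograms, the fitted angle change value of the ion current chromatogram ranges from $x_{11}$ to $x_{12}$, and the fitted angle change value of the ion mass spectra at the four collision energies ranges from $x_{13}$ to $x_{14}$; for liquid chromatography-quadrupole-time-of-flight mass spectrograms, the fitted angle change value of the ion current chromatogram ranges from $x_{21}$ to $x_{22}$, and the fitted angle values of the ion mass spectra at the four collision energies are all $x_{23}$; for linear ion trap-electric field cyclotron resonance orbitrap combined mass spectrograms, the fitted angle change value of the ion chromatogram ranges from $x_{31}$ to $x_{32}$, and the fitted angle values of the ionization-mode full-scan mass spectra are all $x_{33}$; for gas chromatography-tandem mass spectrograms, the fitted angle change value of the first-stage mass spectrum is $x_{41}$, and the fitted angle value of the ion mass spectra at the four collision energies is $x_{43}$; for liquid chromatography-quadrupole-electrostatic field orbitrap mass spectrograms, the fitted angle change value of the ion current chromatogram is $x_{51}$, and the fitted angle value of the fragment ion mass spectrum is $x_{53}$; for gas chromatography-quadrupole-time-of-flight mass spectrograms, the fitted angle value of the mass spectrum is $x_{61}$; for gas chromatography-quadrupole-electrostatic field orbitrap mass spectrograms, the fitted angle change value of the total ion chromatogram ranges from $x_{71}$ to $x_{72}$, and the fitted angle values of the ionization-mode full-scan mass spectra are all $x_{73}$; wherein $x_{11}$ to $x_{73}$ take values in the range 0° to 40°.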
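  13. The cloud-platform-based automatic identification method for seven types of mass spectrograms of pesticides and chemical pollutants commonly used in the world according to claim 10, characterized in that the fitted angle change value at the pixel of the highest peak within the spectrogram is obtained by gradient-vector calculation: near a straight line or curve, the gradient vector is perpendicular to that line or curve, and the angle is computed from the change in orientation of the gradient vector; the gradient vector at a point on a curve is the perpendicular to the curve segment passing through that point; a short line segment near the point is used in place of the curve segment, and the perpendicular to that line segment is computed as the gradient vector; the line segment near the point is determined by the neighborhood chain length; and the orientation of the gradient vector is its angle;
    The calculation is as follows:
    Let $P_n=\{p_1,\dots,p_n\}$ be the ordered set of points on the curve or straight line, and let $L_n=\{l_1,\dots,l_n\}$ be the short line segments near the ordered points, where $l_i$ ($i=1,\dots,n$) is the segment centered at point $p_i$ with neighborhood chain length $m$, that is, the segment connecting points $p_{i-m}$ and $p_{i+m}$; $S_n=\{s_1,\dots,s_n\}$ is the set of slopes of the perpendiculars to the segments $l_i$; and $A_n=\{a_1,\dots,a_n\}$ is the set of angles of the perpendiculars to $l_i$ near the points $p_i$, with $a_i$ in the range [0, 360°];
    The slope of the segment $l_i$ near point $p_i(x_i, y_i)$ (connecting point $p_{i-m}(x_{i-m}, y_{i-m})$ and point $p_{i+m}(x_{i+m}, y_{i+m})$) is: $g_i = (y_{i+m}-y_{i-m})/(x_{i+m}-x_{i-m})$
    The slope of the perpendicular to segment $l_i$ is $-1/g_i$, that is,
    $s_i = -(x_{i+m}-x_{i-m})/(y_{i+m}-y_{i-m})$;
    $a_i$ is calculated as follows: when the slope does not exist, $a_i = \pi/2$; when the slope is 0, $a_i = \pi$; when the slope is greater than 0, $a_i = \arctan k_i$; when the slope is less than 0, $a_i = \pi + \arctan k_i$; where $k_i$ denotes the slope.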
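  14. The cloud-platform-based automatic identification method for seven types of mass spectrograms of pesticides and chemical pollutants commonly used in the world according to claim 10, characterized in that the method by which the user receives the mass spectrogram classification result returned by the cloud server platform comprises: filtering the mass spectrogram information base on the cloud platform server directly according to the mass spectrogram description data, the experimental parameters and the number of mass spectrograms to obtain a classification result;
    extracting the Fc7-layer features of each mass spectrogram to be tested, computing their cosine similarity with the Fc7-layer features of the preprocessed mass spectrograms of all categories filtered from the information base, finding the mass spectrogram most similar to the current mass spectrogram to be tested, and determining whether the similarity is higher than 50%; if the similarity is higher than 50%, the type of the mass spectrogram input by the user is successfully identified.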
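The matching step in this claim (cosine similarity between Fc7-layer feature vectors, accepted only above 50%) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the `(label, feature_vector)` library layout are hypothetical, and the feature vectors are assumed to have been extracted already.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is treated as matching nothing
    return dot / (norm_a * norm_b)

def identify_spectrum_type(query, library, threshold=0.5):
    """Return (label, score) of the best match, or (None, score) if the
    best score does not exceed the threshold.

    library is a list of (label, feature_vector) pairs.
    """
    best_label, best_score = None, -1.0
    for label, features in library:
        score = cosine_similarity(query, features)
        if score > best_score:
            best_label, best_score = label, score
    if best_score > threshold:
        return best_label, best_score
    return None, best_score
```

With a library such as `[("LC-MS/MS", [1.0, 0.0, 0.0]), ("GC-MS/MS", [0.0, 1.0, 0.0])]`, the query `[0.9, 0.1, 0.0]` is labelled `"LC-MS/MS"`, while a query orthogonal to every entry is rejected because its best similarity does not exceed the 50% threshold.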
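PCT/CN2019/085612 2019-03-26 2019-05-06 Cloud-platform-based automatic identification system and method for seven types of mass spectrograms of pesticides and chemical pollutants commonly used in the world WO2020191857A1 (zh)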

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP19921232.5A EP3951653A4 (en) 2019-03-26 2019-05-06 CLOUD PLATFORM BASED AUTOMATIC IDENTIFICATION SYSTEM AND PROCESS FOR SEVEN TYPES OF MASS SPECTROGRAMS OF COMMONLY USED PESTICIDES AND CHEMICAL POLLUTANTS AROUND THE WORLD
US16/475,348 US11340201B2 (en) 2019-03-26 2019-05-06 Cloud-platform based automatic identification system and method of seven types of mass spectrums for pesticides and chemical pollutants commonly used in the world
JP2021556378A JP2022529207A (ja) 2019-03-26 2019-05-06 System and method for automatically identifying seven types of mass spectra of the world's common pesticides and chemical pollutants based on a cloud platform
GB2113218.8A GB2595625A (en) 2019-03-26 2021-05-06 Cloud platform-based automatic identification system and method for seven types of mass spectrograms of commonly used pesticides and chemical pollutants

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910234026.5 2019-03-26
CN201910234026.5A CN110110743B (zh) 2019-03-26 2019-03-26 Automatic identification system and method for seven types of mass spectrograms

Publications (1)

Publication Number Publication Date
WO2020191857A1 (zh) 2020-10-01

Family

ID=67484574

Family Applications (1)

Application Number Title Priority Date Filing Date
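PCT/CN2019/085612 WO2020191857A1 (zh) 2019-03-26 2019-05-06 Cloud-platform-based automatic identification system and method for seven types of mass spectrograms of pesticides and chemical pollutants commonly used in the world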

Country Status (6)

Country Link
US (1) US11340201B2 (zh)
EP (1) EP3951653A4 (zh)
JP (1) JP2022529207A (zh)
CN (1) CN110110743B (zh)
GB (1) GB2595625A (zh)
WO (1) WO2020191857A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
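CN112730275A (zh) * 2021-02-04 2021-04-30 华东理工大学 Microspectral imaging system, pesticide detection system and method thereof
CN113203850A (zh) * 2021-03-24 2021-08-03 柳州东风容泰化工股份有限公司 Method and device for detecting the biological activity of chlorophenol
CN113780430A (zh) * 2021-09-14 2021-12-10 天津国科医工科技发展有限公司 Spectrogram classification method for triple quadrupole mass spectrometers based on the Hopfield model
CN113971747A (zh) * 2021-12-24 2022-01-25 季华实验室 Raman spectrum data processing method, apparatus, device and readable storage medium
CN114420222A (zh) * 2022-03-29 2022-04-29 北京市疾病预防控制中心 Rapid confirmation method for fragment-ion compound structures based on distributed stream processing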

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
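CN111062411A (zh) * 2019-11-06 2020-04-24 北京大学 Method, apparatus and device for identifying multiple compounds from mass spectrometry data
CN115280143A (zh) * 2020-03-27 2022-11-01 文塔纳医疗系统公司 Computer-implemented method for identifying at least one peak in a mass spectrometry response curve
CN111814864A (zh) * 2020-07-03 2020-10-23 北京中计新科仪器有限公司 Artificial intelligence cloud platform system for mass spectrometry data and data analysis method
CN111610281B (zh) * 2020-07-14 2022-06-10 北京行健谱实科技有限公司 Operation method of a cloud platform architecture based on gas chromatography-mass spectrometry library identification
CN112505133B (zh) * 2020-12-28 2023-09-12 黑龙江莱恩检测有限公司 Mass spectrometry detection method based on deep learning
CN112924523A (zh) * 2021-01-28 2021-06-08 中国农业科学院农产品加工研究所 Mass spectrometry detection system and method with rapid extraction function for pesticide residue detection
CN114755357A (zh) * 2022-04-14 2022-07-15 武汉迈特维尔生物科技有限公司 Automatic integration method, system, device and medium for chromatography-mass spectrometry
CN115144457B (zh) * 2022-06-27 2023-03-24 中验科学仪器(福建)有限公司 Portable mass spectrometer, analysis method and terminal
CN115950993B (zh) * 2023-03-15 2023-07-25 福建德尔科技股份有限公司 Method for testing the fluorine content of a fluorine-nitrogen gas mixture
CN116597227A (zh) * 2023-05-29 2023-08-15 广东省麦思科学仪器创新研究院 Mass spectrogram analysis method, apparatus, device and storage medium
CN117077004B (zh) * 2023-08-18 2024-02-23 中国科学院华南植物园 Species identification method, system, device and storage medium
CN117169406A (zh) * 2023-11-02 2023-12-05 启东泓昱生物医药有限公司 Drug quality detection method and system based on component analysis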

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
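CN101855700A (zh) * 2007-10-10 2010-10-06 Mks仪器股份有限公司 Chemical ionization reaction or proton transfer reaction mass spectrometry using a quadrupole or time-of-flight mass spectrometer
CN108414610A (zh) * 2018-05-09 2018-08-17 南开大学 Method for constructing comprehensive pollution source component spectra based on a single-particle aerosol mass spectrometer and the ART-2a neural network method
JP2018141750A (ja) * 2017-02-28 2018-09-13 株式会社ハウス食品分析テクノサービス Method for determining the time of foreign matter contamination by imaging mass spectrometry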

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5218529A (en) * 1990-07-30 1993-06-08 University Of Georgia Research Foundation, Inc. Neural network system and methods for analysis of organic materials and structures using spectral data
CA2298181C (en) * 2000-02-02 2006-09-19 Dayan Burke Goodnough Non-targeted complex sample analysis
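JP5757270B2 (ja) * 2012-04-26 2015-07-29 株式会社島津製作所 Data processing device for chromatograph mass spectrometry
CN103823008B (zh) * 2014-03-14 2016-03-02 北京市疾病预防控制中心 Method for detecting unknown toxicants by constructing a liquid chromatography-mass spectrometry database
CN107077592B (zh) * 2014-03-28 2021-02-19 威斯康星校友研究基金会 High mass accuracy filtering for improved spectral matching of high-resolution gas chromatography-mass spectrometry data against unit-resolution reference databases
CN104764843A (zh) * 2015-02-27 2015-07-08 潍坊出入境检验检疫局综合技术服务中心 Method for detecting pesticides containing electronegative elements using a negative chemical ionization source mass spectrometry database
TWI613445B (zh) * 2016-04-01 2018-02-01 行政院農業委員會農業藥物毒物試驗所 Method and system for inspecting pesticide residues with mass spectrometry image analysis
CN108760909A (zh) * 2017-04-17 2018-11-06 中国检验检疫科学研究院 Electronic method for non-target, multi-index, rapid detection of pesticide residues in edible agricultural products
CN107103571B (zh) * 2017-04-17 2018-07-31 中国检验检疫科学研究院 Pesticide residue detection data platform based on high-resolution mass spectrometry, the Internet and data science, and method for automatic generation of detection reports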
US11457554B2 (en) * 2019-10-29 2022-10-04 Kyndryl, Inc. Multi-dimension artificial intelligence agriculture advisor

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
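CN101855700A (zh) * 2007-10-10 2010-10-06 Mks仪器股份有限公司 Chemical ionization reaction or proton transfer reaction mass spectrometry using a quadrupole or time-of-flight mass spectrometer
JP2018141750A (ja) * 2017-02-28 2018-09-13 株式会社ハウス食品分析テクノサービス Method for determining the time of foreign matter contamination by imaging mass spectrometry
CN108414610A (zh) * 2018-05-09 2018-08-17 南开大学 Method for constructing comprehensive pollution source component spectra based on a single-particle aerosol mass spectrometer and the ART-2a neural network method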

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
PANG GUOFANG, CHEN YI , FAN CHUNLIN, BAI RUOBIN, SUN YUEHONG, CHANG QIAOYING: "Tri-element Integrated Technology of High Resolution MS, Internet, and Digital Science Constitutes Technical Platform for Pesticide Residues", BULLETIN OF CHINESE ACADEMY OF SCIENCES, vol. 32, no. 12, 31 December 2017 (2017-12-31), pages 1384 - 1396, XP055854241, ISSN: 1000-3045, DOI: 10.16418/j.issn.1000-3045.2017.12.013 *
PANG GUOFANG, CHEN YI , FAN CHUNLIN, BAI RUOBIN, SUN YUEHONG, CHANG QIAOYING: "Tri-element Integrated Technology of High Resolution MS, Internet, and Digital Science Constitutes Technical Platform for Pesticide Residues", BULLETIN OF CHINESE ACADEMY OF SCIENCES, vol. 32, no. 12, 31 December 2017 (2017-12-31), pages 1384 - 1396, XP055854241, ISSN: 1000-3045, DOI: 10.16418/j.issn.1000-3045.2017.12.013 *
See also references of EP3951653A4 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
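CN112730275A (zh) * 2021-02-04 2021-04-30 华东理工大学 Microspectral imaging system, pesticide detection system and method thereof
CN113203850A (zh) * 2021-03-24 2021-08-03 柳州东风容泰化工股份有限公司 Method and device for detecting the biological activity of chlorophenol
CN113780430A (zh) * 2021-09-14 2021-12-10 天津国科医工科技发展有限公司 Spectrogram classification method for triple quadrupole mass spectrometers based on the Hopfield model
CN113780430B (zh) * 2021-09-14 2024-05-24 天津国科医疗科技发展有限公司 Spectrogram classification method for triple quadrupole mass spectrometers based on the Hopfield model
CN113971747A (zh) * 2021-12-24 2022-01-25 季华实验室 Raman spectrum data processing method, apparatus, device and readable storage medium
CN114420222A (zh) * 2022-03-29 2022-04-29 北京市疾病预防控制中心 Rapid confirmation method for fragment-ion compound structures based on distributed stream processing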

Also Published As

Publication number Publication date
CN110110743B (zh) 2019-12-31
US20220050092A1 (en) 2022-02-17
GB2595625A (en) 2021-12-01
EP3951653A1 (en) 2022-02-09
JP2022529207A (ja) 2022-06-20
CN110110743A (zh) 2019-08-09
EP3951653A4 (en) 2023-04-26
US11340201B2 (en) 2022-05-24

Similar Documents

Publication Publication Date Title
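WO2020191857A1 (zh) Cloud-platform-based automatic identification system and method for seven types of mass spectrograms of pesticides and chemical pollutants commonly used in the world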
Xie et al. A deep-learning-based real-time detector for grape leaf diseases using improved convolutional neural networks
US10636169B2 (en) Synthesizing training data for broad area geospatial object detection
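CN1690713B (zh) Method and system for analyzing a sample to provide characterization data
CN112347244B (zh) Method for detecting pornography-related and gambling-related websites based on hybrid feature analysis
CN114172748A (zh) Encrypted malicious traffic detection method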
Shi et al. Amur tiger stripes: Individual identification based on deep convolutional neural network
WO2019195971A1 (zh) Spectral analysis method, apparatus, electronic device and computer-readable storage medium
Deklerck et al. Comparison of species classification models of mass spectrometry data: Kernel Discriminant Analysis vs Random Forest; A case study of Afrormosia (Pericopsis elata (Harms) Meeuwen)
Frontera-Pons et al. Unsupervised feature-learning for galaxy SEDs with denoising autoencoders
Franceschi et al. Self‐organizing maps: A versatile tool for the automatic analysis of untargeted imaging datasets
WO2018222775A1 (en) Broad area geospatial object detection
Komsta Chemometrics in fingerprinting by means of thin layer chromatography
Shi et al. Individual automatic detection and identification of big cats with the combination of different body parts
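CN106776958A (zh) Critical-path-based illegal website identification system and method
CN111896609B (zh) Method for analyzing mass spectrometry data based on artificial intelligence
CN110887798B (zh) Nonlinear full-spectrum quantitative analysis method for water turbidity based on extremely randomized trees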
Theuerkauf et al. A trainable object finder, selector and identifier for pollen, spores and other things: A step towards automated pollen recognition in lake sediments
Torralvo et al. Effectiveness of Fourier transform near‐infrared spectroscopy spectra for species identification of anurans fixed in formaldehyde and conserved in alcohol: A new tool for integrative taxonomy
CN103870720A (zh) Method and device for predicting protein signal transduction subnetworks
French et al. Peak correlation classifier (PCC) applied to FTIR spectra: a novel means of identifying toxic substances in mixtures
Hu et al. The Bridge between Screening and Assessment: Establishment and Application of Online Screening Platform for Food Risk Substances
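CN117877622B (zh) Device, method and computer-readable storage medium for predicting compound structure from compound mass spectrum information
CN114550840A (zh) Siamese-network-based method and device for detecting fentanyl-class substances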
Kutzke et al. Potential of hyperspectral-based geochemical predictions with neural networks for strategic and regional exploration improvement

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19921232; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2021556378; Country of ref document: JP; Kind code of ref document: A) (Ref document number: 202113218; Country of ref document: GB; Kind code of ref document: A; Free format text: PCT FILING DATE = 20210506)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2019921232; Country of ref document: EP; Effective date: 20211026)