CN106248621A - Evaluation method and system - Google Patents
- Publication number
- CN106248621A (application number CN201610790067.9A)
- Authority
- CN
- China
- Prior art keywords
- near infrared
- infrared spectrum
- basic data
- spectrum
- similarity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/17—Systems in which incident light is modified in accordance with the properties of the material investigated
- G01N21/25—Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
- G01N21/31—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
- G01N21/35—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
- G01N21/359—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light using near infrared light
Landscapes
- Physics & Mathematics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Immunology (AREA)
- Pathology (AREA)
- Investigating Or Analysing Materials By Optical Means (AREA)
Abstract
The present invention provides an evaluation method and system for evaluating the quality of basic data. The method includes: obtaining a plurality of near-infrared spectra and their corresponding chemical values from the basic data; preprocessing the near-infrared spectra; computing the similarity distance and the local correlation coefficient between the near-infrared spectra; obtaining, for each near-infrared spectrum, the spectrum of maximum similarity and its chemical value; computing the absolute difference between the chemical value of each near-infrared spectrum and that of its maximum-similarity spectrum, and computing the mean of all absolute differences; and judging whether the mean absolute difference exceeds a preset error value. If it does not, the basic data are evaluated as qualified; if it does, the basic data are evaluated as unqualified. The invention can evaluate the quality of the basic data accurately and efficiently before near-infrared modeling, so that low-quality near-infrared data are excluded from modeling, the effectiveness of modeling analysis is improved, and waste of manpower and material resources is reduced.
Description
Technical field
The present invention relates to the field of data processing, and in particular to an evaluation method and system.
Background technology
Quantitative near-infrared modeling requires a large amount of sample spectra and basic data. Traditionally, the quality of a batch of modeling spectra and basic data is judged in two ways: before modeling, by checking the repeatability of the flow-analysis measurements; and after modeling, by checking the prediction error. However, repeated flow-analysis measurement of parallel samples is very difficult, and a model needs a large number of samples before it can be judged, so that judgment can only be made after the fact. What is missing is an early, overall evaluation of the quality of a batch of modeling data. If the spectra and basic data of a batch are inaccurate or do not correspond to each other, the resulting model often has low precision or poor applicability. The traditional response is either to resample and rebuild the model or to update and maintain the model; but if the underlying problem remains, rebuilding the model is bound to fail again, wasting considerable manpower, material and financial resources. Judging before modeling whether the modeling data can yield a qualified model is therefore both important and necessary.
Summary of the invention
In view of the above shortcomings of the prior art, the object of the present invention is to provide an evaluation method and system that solve the problem that, in the prior art, the validity of the basic data of near-infrared spectra to be modeled cannot be judged in advance, which leads to low efficiency and to waste of manpower and financial resources.
To achieve the above and other related objects, the present invention provides an evaluation method for evaluating the quality of basic data for modeling that contain near-infrared spectra. The method includes: obtaining a plurality of near-infrared spectra from the basic data, and obtaining the chemical value corresponding to each near-infrared spectrum; preprocessing the near-infrared spectra; computing the similarity distance and the local correlation coefficient between the near-infrared spectra; obtaining, from the similarity distances and local correlation coefficients, the near-infrared spectrum of maximum similarity for each near-infrared spectrum and its corresponding chemical value; computing the difference between the chemical value of each near-infrared spectrum and that of its maximum-similarity spectrum, taking the absolute value of every difference to obtain the absolute differences, and computing the mean of all absolute differences; and comparing the mean absolute difference with a preset error value. When the mean absolute difference is greater than the preset error value, the quality of the basic data is evaluated as unqualified; when it is less than or equal to the preset error value, the quality of the basic data is evaluated as qualified.
In one specific embodiment of the invention, the near-infrared spectra are preprocessed using the S-G (Savitzky-Golay) derivative method.
In one specific embodiment of the invention, the similarity distance between the near-infrared spectra is computed from the information content of the near-infrared spectra.
In one specific embodiment of the invention, the similarity between two near-infrared spectra is the ratio of their local correlation coefficient to their similarity distance.
In one specific embodiment of the invention, when the quality of the basic data is evaluated as unqualified, the sampling method of the near-infrared spectra in the basic data is adjusted and/or the basic flow-analysis data are maintained.
To achieve the above and other related objects, the present invention also provides a data evaluation system for evaluating the quality of basic data for modeling that contain near-infrared spectra. The system includes: a basic data acquisition module, which obtains a plurality of near-infrared spectra from the basic data and obtains the chemical value corresponding to each near-infrared spectrum; a preprocessing module, which preprocesses the near-infrared spectra; a maximum-similarity spectrum acquisition module, which computes the similarity distance and the local correlation coefficient between the near-infrared spectra and, from these, obtains the near-infrared spectrum of maximum similarity for each near-infrared spectrum and its corresponding chemical value; a mean difference module, which computes the difference between the chemical value of each near-infrared spectrum and that of its maximum-similarity spectrum, takes the absolute value of every difference to obtain the absolute differences, and computes the mean of all absolute differences; and a comparison module, which compares the mean absolute difference with a preset error value. When the mean absolute difference is greater than the preset error value, the quality of the basic data is evaluated as unqualified; when it is less than or equal to the preset error value, the quality of the basic data is evaluated as qualified.
In one specific embodiment of the invention, the near-infrared spectra are preprocessed using the S-G (Savitzky-Golay) derivative method.
In one specific embodiment of the invention, the maximum-similarity spectrum acquisition module computes the similarity distance between the near-infrared spectra from the information content of the near-infrared spectra.
In one specific embodiment of the invention, the similarity between two near-infrared spectra is the ratio of their local correlation coefficient to their similarity distance.
In one specific embodiment of the invention, the system further includes an adjustment module which, when the quality of the basic data is evaluated as unqualified, adjusts the sampling method of the near-infrared spectra in the basic data and/or maintains the basic flow-analysis data.
As described above, the evaluation method and system of the present invention evaluate the quality of basic data for modeling that contain near-infrared spectra. The method includes: obtaining a plurality of near-infrared spectra from the basic data, and obtaining the chemical value corresponding to each near-infrared spectrum; preprocessing the near-infrared spectra; computing the similarity distance and the local correlation coefficient between the near-infrared spectra; obtaining, from these, the near-infrared spectrum of maximum similarity for each near-infrared spectrum and its corresponding chemical value; computing the difference between the chemical value of each near-infrared spectrum and that of its maximum-similarity spectrum, taking the absolute value of every difference, and computing the mean of all absolute differences; and comparing the mean absolute difference with a preset error value. When the mean is greater than the preset error value, the quality of the basic data is evaluated as unqualified; when it is less than or equal to the preset error value, the quality is evaluated as qualified. Before near-infrared modeling, the invention uses a small number of samples to judge the quality of the near-infrared spectra and chemical values, and thereby evaluates the quality of the basic data and judges whether they can support a stable and accurate model. This gives an effective way to assess the quality of near-infrared data: it avoids large-scale sampling and modeling on low-quality basic data and, when the basic data are of high quality, gives guidance for extending and improving the chemometric methods applied to them. Unqualified near-infrared spectra are excluded from modeling analysis, the effectiveness and accuracy of modeling analysis are improved, and waste of manpower and material resources is reduced.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the evaluation method of the present invention in one specific embodiment.
Fig. 2 is a schematic diagram of the relationship between the number of samples and the correlation coefficient in one specific embodiment applying the evaluation method of the present invention.
Fig. 3 is a schematic diagram of the relationship between the number of samples and the similarity in one specific embodiment applying the evaluation method of the present invention.
Fig. 4 is a schematic diagram of the relationship between the chemical value and the sample number in one specific embodiment applying the evaluation method of the present invention.
Fig. 5 is a schematic diagram of the relationship between the sample number and the chemical value in one specific embodiment applying the evaluation method of the present invention.
Fig. 6 is a schematic diagram of the relationship between the sample number and the relative error in one specific embodiment applying the evaluation method of the present invention.
Fig. 7 is a schematic diagram of the relationship between the chemical value and the sample number in one specific embodiment applying the evaluation method of the present invention.
Fig. 8 is a schematic diagram of the relationship between the sample number and the correlation coefficient in one specific embodiment applying the evaluation method of the present invention.
Fig. 9 is a schematic diagram of the relationship between the sample number and the similarity in one specific embodiment applying the evaluation method of the present invention.
Fig. 10 is a schematic diagram of the relationship between the sample number and the relative error in one specific embodiment applying the evaluation method of the present invention.
Fig. 11 is a schematic module diagram of the evaluation system of the present invention in one specific embodiment.
Description of reference numerals
1 Evaluation system
11 Basic data acquisition module
12 Preprocessing module
13 Maximum-similarity spectrum acquisition module
14 Mean difference module
15 Comparison module
S11~S16 Steps
Detailed description of the invention
Embodiments of the present invention are described below through specific examples; those skilled in the art can readily understand other advantages and effects of the invention from the content disclosed in this specification. The invention can also be implemented or applied through other, different specific embodiments, and the details in this specification can be modified or changed from different viewpoints and for different applications without departing from the spirit of the invention. It should be noted that, where they do not conflict, the following embodiments and the features in them can be combined with one another.
It should also be noted that the drawings provided with the following embodiments illustrate the basic idea of the invention only schematically. They show only the components relevant to the invention and are not drawn according to the number, shape and size of the components in an actual implementation. In an actual implementation the form, quantity and proportion of each component may vary freely, and the component layout may be more complex.
Refer to Fig. 1, a schematic flow chart of the evaluation method of the present invention in one specific embodiment. The method evaluates the quality of basic data for modeling that contain near-infrared spectra, that is, the quality of the quantitative modeling data of the near-infrared spectra. Before modeling, the accuracy and correspondence of a small number of quantitative modeling spectra and their basic data are analysed, and the quality of all basic data of the batch is evaluated from the result: the higher the accuracy and correspondence of the spectra and basic data, the higher the quality. Before a large number of near-infrared samples are acquired, the invention uses a small number of samples to judge the quality of the near-infrared spectra and chemical values, and so judges whether the basic data can support a stable and accurate model. This gives an effective way to assess the quality of near-infrared data: it avoids large-scale sampling and modeling on low-quality basic data and, when the basic data are of high quality, gives guidance for extending and improving the chemometric methods applied to them.
The evaluation method shown in Fig. 1 includes:
S11: Obtain a plurality of near-infrared spectra from the basic data, and obtain the chemical value corresponding to each near-infrared spectrum. In one specific embodiment, a batch of modeling spectra and chemical values is obtained, containing M spectra and the basic chemical values T_m corresponding to their spectrum labels (T_m denotes the chemical value corresponding to the m-th spectrum); each spectrum consists of m wavelength points.
S12: Preprocess the near-infrared spectra. Although a near-infrared spectrum carries chemical, appearance and physical information about the raw material, it is easily affected by the external environment and by instability of the instrument's own moving parts. In one specific embodiment of the invention, therefore, after the plurality of near-infrared spectra has been obtained, the spectra are preprocessed with the S-G (Savitzky-Golay) derivative method, which eliminates or reduces these effects to some extent. In this embodiment the S-G derivative method is: first apply S-G smoothing to each spectrum with a window width of 2k+1, then take the first derivative of the smoothed spectrum with a differential width of w.
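The preprocessing in S12 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the text gives only the window width 2k+1 and the differential width w, so the quadratic fitting polynomial, the choice to leave the first and last k points unsmoothed, and the reading of "differential width w" as a difference over w points are all assumptions, and the function names are hypothetical.

```python
def sg_smooth(spectrum, k):
    """Savitzky-Golay smoothing with a window of 2k+1 points.

    Assumption: a quadratic (equivalently cubic) fitting polynomial, for which
    the convolution weights have the closed form used below.
    """
    n = len(spectrum)
    if k < 1 or n < 2 * k + 1:
        raise ValueError("need k >= 1 and at least 2k+1 points")
    denom = (2 * k + 3) * (2 * k + 1) * (2 * k - 1)
    offsets = range(-k, k + 1)
    weights = [3 * (3 * k * k + 3 * k - 1 - 5 * i * i) / denom for i in offsets]
    # Assumption: the k points at each end are copied through unsmoothed.
    smoothed = list(spectrum)
    for c in range(k, n - k):
        smoothed[c] = sum(wt * spectrum[c + i] for wt, i in zip(weights, offsets))
    return smoothed


def gap_derivative(spectrum, w):
    """First derivative as a difference over w points (assumed meaning of
    'differential width w'). The result is w points shorter than the input."""
    if w < 1 or w >= len(spectrum):
        raise ValueError("need 1 <= w < number of points")
    return [(spectrum[i + w] - spectrum[i]) / w for i in range(len(spectrum) - w)]


def preprocess(spectrum, k, w):
    """Smooth first, then differentiate, in the order described in S12."""
    return gap_derivative(sg_smooth(spectrum, k), w)
```

With the parameters of the worked example later in the description (window of 7 points, differential width 3), the call would be `preprocess(spectrum, k=3, w=3)`.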
S13: Compute the similarity distance and the local correlation coefficient between the near-infrared spectra.
S14: From the similarity distances and local correlation coefficients between the near-infrared spectra, obtain for each near-infrared spectrum the near-infrared spectrum of maximum similarity and its corresponding chemical value. Here the similarity between two spectra X_i and Y_j (i, j = 1, ..., n, i ≠ j) is D_ij. In a specific embodiment, computing the similarity D_ij includes the following steps:
1) Compute the correlation coefficient R_ij between spectra X_i and Y_j. Construct a moving window of k points. Spectra X_i and Y_j each have m wavelength points. Starting at the c-th wavelength point, the window covers points c to c+k-1, and the correlation coefficient R_cij of X_i and Y_j within this segment is computed; c runs from 1 to m-k+1. The mean correlation coefficient over all moving windows is the correlation coefficient R_ij between X_i and Y_j:
$R_{ij} = \frac{1}{m-k+1} \sum_{c=1}^{m-k+1} R_{cij}$
where R_cij is the correlation coefficient between X_{i,c} and Y_{j,c}; X_{i,c} denotes the c-th moving window of the i-th spectrum, and Y_{j,c} denotes the c-th moving window of the j-th spectrum.
2) Compute the information content of X_i and Y_j in the original spectra, where x_i is the information content of the i-th spectrum and y_j is the information content of the j-th spectrum, and from these compute the information content that the two spectra contain of each other, S_xy and S_yx (the formula is not reproduced in this text), where i, j = 1, ..., n and i ≠ j.
The similarity distance between two near-infrared spectra is S_xy + S_yx, and the similarity between two near-infrared spectra is the ratio of their local correlation coefficient to their similarity distance. That is, the similarity is
$D_{ij} = \frac{R_{ij}}{S_{xy} + S_{yx}}$
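Step 1) above, the moving-window correlation coefficient, can be sketched as follows. The sketch assumes the ordinary Pearson correlation within each window, which the text does not name explicitly, and assumes that a window in which either spectrum is constant contributes 0. Because the information-content formula behind the similarity distance is not reproduced in this text, the `similarity` helper simply takes the distance as an input. All function names are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a flat window carries no correlation
    return cov / sqrt(var_a * var_b)


def local_correlation(x, y, k):
    """Mean of the window-wise correlations R_cij over all m-k+1 windows."""
    m = len(x)
    if len(y) != m or not 2 <= k <= m:
        raise ValueError("spectra must have equal length m and 2 <= k <= m")
    windows = m - k + 1
    total = sum(pearson(x[c:c + k], y[c:c + k]) for c in range(windows))
    return total / windows


def similarity(r_ij, distance):
    """D_ij as the ratio of local correlation to similarity distance.
    The distance (S_xy + S_yx) must be supplied by the caller."""
    return r_ij / distance
```

For spectra of 256 wavelength points and k = 7 this evaluates 250 windows, matching the window range given in the worked example later in the description.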
S15: Compute the difference between the chemical value of each near-infrared spectrum and the chemical value of its maximum-similarity near-infrared spectrum, take the absolute value of every difference to obtain the absolute differences, and compute the mean of all absolute differences.
S16: Compare the mean absolute difference with a preset error value. When the mean absolute difference is greater than the preset error value, the quality of the basic data is evaluated as unqualified: the quality is not high, and subsequent samples would not suffice to build a good near-infrared model. When the mean absolute difference is less than or equal to the preset error value, the quality of the basic data is evaluated as qualified.
Given the similarity D_ij between the samples, each spectrum sample i selects, from the other M-1 samples, the spectrum v with which its similarity is greatest, and the chemical values T_i and T_v of the two samples are obtained. Once the maximum-similarity spectrum and its chemical value have been found for every spectrum, the mean error of the chemical values over the similar pairs is computed:
$\bar{E} = \frac{1}{M} \sum_{i=1}^{M} \left| T_i - T_v \right|$
where v is the maximum-similarity spectrum of spectrum i. When $\bar{E}$ is greater than a threshold H (H is determined by actual production requirements), this batch of modeling data is judged unsuitable; otherwise, this batch of modeling data can be used to build a stable and accurate model.
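Steps S14 to S16 can be sketched as follows, assuming the pairwise similarities D_ij have already been computed and are passed in as a square matrix. The function name and the toy numbers in the usage note are illustrative assumptions, not data from the patent.

```python
def evaluate_basic_data(similarity, chem_values, threshold):
    """Return (mean absolute difference, qualified?) for a batch.

    similarity[i][j] is D_ij; the diagonal is ignored.
    chem_values[i] is the chemical value T_i of spectrum i.
    threshold is the preset error value H.
    """
    n = len(chem_values)
    if n < 2 or len(similarity) != n or any(len(row) != n for row in similarity):
        raise ValueError("need an n x n similarity matrix with n >= 2")
    abs_diffs = []
    for i in range(n):
        # Most similar other spectrum v for spectrum i.
        v = max((j for j in range(n) if j != i), key=lambda j: similarity[i][j])
        abs_diffs.append(abs(chem_values[i] - chem_values[v]))
    mean_abs = sum(abs_diffs) / n
    # Qualified when the mean does not exceed the threshold (S16).
    return mean_abs, mean_abs <= threshold
```

For example, with three spectra, similarities `[[0, 5, 1], [5, 0, 2], [1, 2, 0]]` and chemical values `[2.0, 2.5, 4.0]`, the matched pairs are (0, 1), (1, 0) and (2, 1), so the absolute differences are 0.5, 0.5 and 1.5; the batch would fail a threshold of 0.35 and pass a threshold of 1.0.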
In one specific embodiment of the invention, when the quality of the basic data is evaluated as unqualified, the method further includes a step of adjusting the sampling method of the near-infrared spectra in the basic data and/or maintaining the basic flow-analysis data.
The reasonableness and applicability of the method can also be verified by comparing the model's external validation error with the quality evaluation.
The near-infrared quantitative modeling data evaluation method of the present invention can judge the quality of the modeling data in advance and so avoids large-scale sampling of near-infrared data whose quality is not high. For those building the model, this reduces unnecessary waste of samples and saves considerable material, manpower and financial resources; it also provides a reliable basis for adjusting the sampling method and for checking the accuracy of the basic data.
Where the evaluation method of the present invention judges the data to be of high quality but the modeling result is poor, the method gives a reliable assurance about the underlying modeling data, which promotes the improvement of near-infrared quantitative modeling methods and lays a solid foundation for concrete application in actual production.
The present invention is further described below with a concrete example of its application in actual production. The example uses as its experimental object the near-infrared spectra of raw tobacco leaf after threshing and redrying, together with the nicotine basic data measured for it by a flow analyzer, and describes in detail a new method of checking near-infrared quantitative modeling quality.
Step 1: obtain the near-infrared spectra and their corresponding chemical values. Specifically: after the tobacco leaf has been threshed and redried, the spectra of 268 samples are acquired with an online near-infrared instrument, each spectrum having 256 wavelength points, and the corresponding samples are assayed with a flow analyzer to obtain the corresponding chemical values. The samples are tobacco leaf provided by the Red River Redrying Factory.
Step 2: preprocess the near-infrared spectra. Specifically: each spectrum is converted into a row matrix, and S-G convolution smoothing and differentiation are applied to each spectrum with a window size of 7 and a differential width of 3.
Step 3: compute the similarity of the spectra. Specifically:
1. Construct a moving window with k = 7. The window starts at the first wavelength point of the spectrum and moves until it starts at the 250th wavelength point (256 - 7 + 1 = 250 window positions). The correlation coefficients computed between the spectra are shown in Fig. 2.
2. Compute the information content of each spectrum in the original spectra, where x_i is the information content of the i-th spectrum and y_j is the information content of the j-th spectrum; compute the information each pair of spectra contains of each other and substitute it into the formula for the information content between two spectra (the formula is not reproduced in this text).
3. Compute the similarity between the spectra (the formula is not reproduced in this text). The resulting similarity map is shown in Fig. 3.
Step 4: from the similarity D_ij, compute the mean error between the basic data to judge the quality of the near-infrared quantitative modeling data. Specifically: using the similarity D between each sample and the other samples obtained in step 3, start from sample No. 1, select the sample with the highest similarity to it as its match, and look up the corresponding chemical values; then compute the error in chemical value between each of the 268 samples and its highest-similarity matched sample. The distribution of the chemical values of the 268 samples is shown in Fig. 4, the distribution of the chemical values of the matched samples in Fig. 5, and the distribution of the relative error in Fig. 6. The mean relative error between the 268 samples and their similarity-matched samples is 11.24%, and the mean absolute error is 0.26, which is below the mean absolute error threshold H = 0.35 used in practical application. The quality of this batch of basic data is therefore judged sufficient to build a stable, applicable near-infrared quantitative model. A near-infrared quantitative model was built from the 268 spectra and basic data; its external validation parameters are shown in Table 1, the external validation parameters of the first-batch near-infrared quantitative model.
Table 1
As Table 1 shows, the model has a correlation coefficient of 0.82, a validation standard deviation of 0.33 and a mean relative error of 10.9%, below the mean relative error and mean absolute error allowed in practical application, so the model can be applied in actual production.
In another specific embodiment, a different batch of modeling spectra and corresponding chemical values is obtained, the similarity between the spectra is computed, and the error between matched samples is used to evaluate the quality of this batch of modeling data. Specifically: a further batch of 210 spectra, acquired with an online near-infrared analyzer after threshing and redrying, is obtained together with the corresponding chemical values; the distribution of the basic data is shown in Fig. 7. Following steps 2, 3 and 4 above, the correlation coefficients obtained are shown in Fig. 8, the similarities between the matched samples in Fig. 9, and the distribution of the relative error in Fig. 10. The mean absolute error between the 210 samples and their similarity-matched samples is 0.65 and the mean relative error is 27.42%. The mean absolute error (given here as 0.64) is greater than the mean absolute error threshold H = 0.35 used in practical application, so the quality of this batch of spectra and basic data is judged to be poor, and a stable model cannot be built from it. A near-infrared quantitative model was built from the 210 spectra and basic data; its external validation parameters are shown in Table 2, the external validation parameters of the second-batch near-infrared quantitative model.
Table 2
As can be seen from Table 2, the predicted values of the near-infrared quantitative model established from this batch of basic data correlate poorly with the basic data, and the validation standard deviation is 0.41. Because the error is too large, the established model cannot be applied in practice, or its predictions are inaccurate.
Tables 1 and 2 show that the method of the present invention can evaluate the quality of basic data for near-infrared modeling, so that the quality of a batch of basic data can be judged quickly before modeling.
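The acceptance decision reported for the two batches comes down to two averages and one comparison. The following Python sketch illustrates that arithmetic only. The function names `matched_error_stats` and `is_qualified`, the toy chemical values, and the choice to compute relative error against each sample's own chemical value are assumptions made for illustration; the only figures taken from the text are the second batch's mean absolute error of 0.64 and the threshold H = 0.35.

```python
from statistics import mean


def matched_error_stats(chem_values, match_index):
    """Return (mean absolute error, mean relative error in percent) between
    each sample's chemical value and that of its most similar sample.

    match_index[i] is the index of the sample matched to sample i.
    """
    abs_errors = []
    rel_errors = []
    for i, j in enumerate(match_index):
        diff = abs(chem_values[i] - chem_values[j])
        abs_errors.append(diff)
        # Assumption: relative error is taken against the sample's own value.
        rel_errors.append(diff / abs(chem_values[i]) * 100.0)
    return mean(abs_errors), mean(rel_errors)


def is_qualified(mean_abs_error, threshold):
    """Basic data is qualified when the mean absolute error does not exceed
    the preset error value."""
    return mean_abs_error <= threshold


# Hypothetical toy data: four samples and the index of each one's best match.
chem = [2.0, 2.5, 4.0, 1.0]
match = [1, 0, 1, 0]
mae, mre = matched_error_stats(chem, match)

# Figures quoted in the text for the second batch: 0.64 against H = 0.35.
second_batch_ok = is_qualified(0.64, 0.35)
```

With the figures quoted for the second batch, the comparison comes out unqualified, which agrees with the conclusion drawn in the text.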
Refer to Fig. 11, which shows a module diagram of a specific embodiment of the present invention. The evaluation system 1 is used to evaluate the quality of basic data for modeling that comprises near infrared spectra; that is, the higher the accuracy and correspondence of the spectra and basic data, the higher the quality of the spectra. The system 1 includes:
Basic data acquisition module 11, configured to obtain a plurality of near infrared spectra from the basic data and to obtain the chemical value corresponding to each near infrared spectrum;
Preprocessing module 12, configured to preprocess the near infrared spectra;
Maximum similarity spectrum acquisition module 13, configured to solve the similarity distance and the local correlation coefficient between the near infrared spectra, and, according to the similarity distance and the local correlation coefficient between the near infrared spectra, to obtain for each near infrared spectrum the near infrared spectrum of maximum similarity and its corresponding chemical value;
Difference mean solving module 14, configured to obtain, for each near infrared spectrum, the difference between its chemical value and the chemical value of its maximum-similarity near infrared spectrum, to take the absolute value of every difference to obtain the corresponding absolute differences, and to solve the mean of all the absolute differences;
Comparison module 15, configured to compare the mean of the absolute differences with a preset error value: when the mean of the absolute differences is greater than the preset error value, the quality of the basic data is evaluated as unqualified; when the mean of the absolute differences is less than or equal to the preset error value, the quality of the basic data is evaluated as qualified.
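The chain formed by modules 13 to 15 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `evaluate_basic_data`, the use of a precomputed similarity matrix as input, the exclusion of each spectrum from its own candidate matches, and the toy numbers are all assumptions.

```python
def evaluate_basic_data(similarity, chem_values, preset_error):
    """Return (mean absolute difference, qualified flag).

    similarity[i][j] is the similarity between spectra i and j (larger means
    more alike); chem_values[i] is the chemical value of spectrum i.
    """
    n = len(chem_values)
    if n < 2:
        raise ValueError("at least two spectra are needed to form matches")

    abs_diffs = []
    for i in range(n):
        # Module 13: pick the most similar other spectrum (assumption: a
        # spectrum is never matched with itself).
        best = max((j for j in range(n) if j != i),
                   key=lambda j: similarity[i][j])
        # Module 14: absolute difference of the two chemical values.
        abs_diffs.append(abs(chem_values[i] - chem_values[best]))

    mean_abs_diff = sum(abs_diffs) / n
    # Module 15: qualified when the mean does not exceed the preset error.
    return mean_abs_diff, mean_abs_diff <= preset_error


# Hypothetical toy data for three spectra.
sim = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.2],
    [0.1, 0.2, 0.0],
]
chem = [2.0, 2.2, 3.0]
mean_diff, qualified = evaluate_basic_data(sim, chem, preset_error=0.35)
```

Taking the similarity matrix as an input keeps the decision logic independent of how the similarity distance and local correlation coefficient are actually computed.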
In a specific embodiment of the present invention, the manner of preprocessing the near infrared spectra includes the S-G (Savitzky-Golay) derivative method.
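As an illustration of S-G derivative preprocessing, the sketch below computes a first derivative with a Savitzky-Golay filter in plain Python. The window size (five points by default), the linear/quadratic fit order, and the name `savgol_first_derivative` are assumptions; the text does not state which settings were used. For a symmetric window and a linear or quadratic fit, the fitted slope at the window centre is the sum of offset times value divided by the sum of squared offsets, which is what the code evaluates. In practice a library routine such as SciPy's `savgol_filter` would normally be used instead.

```python
def savgol_first_derivative(spectrum, half_width=2, spacing=1.0):
    """First derivative of a spectrum by Savitzky-Golay smoothing with a
    linear/quadratic fit over a window of 2 * half_width + 1 points.

    Only interior points are returned, so the result is shorter than the
    input by 2 * half_width values.
    """
    m = half_width
    n = len(spectrum)
    if m < 1 or n < 2 * m + 1:
        raise ValueError("spectrum is shorter than the filter window")

    # Sum of squared offsets, the normalising constant of the fitted slope.
    norm = sum(k * k for k in range(-m, m + 1))

    derivative = []
    for i in range(m, n - m):
        slope = sum(k * spectrum[i + k] for k in range(-m, m + 1)) / norm
        derivative.append(slope / spacing)
    return derivative


# Toy signals: a straight line and a parabola sampled at unit spacing.
line = [3 * x + 1 for x in range(10)]
parabola = [x * x for x in range(7)]
d_line = savgol_first_derivative(line)
d_parabola = savgol_first_derivative(parabola)
```

Because the fit is exact for polynomials up to second order, the line yields a constant slope and the parabola yields its analytic derivative at the interior points.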
In a specific embodiment of the present invention, the maximum similarity spectrum acquisition module 13 is configured to solve the similarity distance between the near infrared spectra according to the information content of the near infrared spectra.
In a specific embodiment of the present invention, the similarity between two near infrared spectra is the ratio of the local correlation coefficient between them to the similarity distance between them.
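The ratio definition can be made concrete with a short Python sketch. The definitions used here are assumptions made only to have something computable: the similarity distance is taken as the Euclidean distance over the whole spectrum, and the local correlation coefficient as the Pearson correlation over a chosen index window. The names `pearson`, `euclidean` and `similarity` are hypothetical, and the text's own definitions of both quantities take precedence.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length, non-constant
    sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def euclidean(a, b):
    """Euclidean distance, used here as a stand-in for the similarity distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity(a, b, window=None):
    """Similarity = local correlation coefficient / similarity distance.

    window is an optional (start, stop) index pair selecting the local
    region used for the correlation; the whole spectrum is used by default.
    """
    distance = euclidean(a, b)
    if distance == 0:
        return math.inf  # identical spectra: treat as maximally similar
    start, stop = window if window else (0, len(a))
    return pearson(a[start:stop], b[start:stop]) / distance


# Hypothetical toy spectra.
s1 = [1.0, 2.0, 3.0, 4.0]
s2 = [2.0, 3.0, 4.0, 5.0]   # same shape as s1, shifted
s3 = [4.0, 3.0, 2.0, 1.0]   # reversed shape
```

Under these assumptions a spectrum with the same shape and a small offset scores higher than one with the opposite trend, because the correlation enters the numerator and the distance the denominator.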
In a specific embodiment of the present invention, the system further includes an adjusting module configured, when the quality of the basic data is evaluated as unqualified, to adjust the sampling mode of the near infrared spectra in the basic data and/or to maintain the basic flow data.
The evaluation system 1 is the system counterpart of the evaluation method; the technical solutions of the two correspond one to one, and all descriptions of the evaluation method apply to this embodiment and are not repeated here.
In summary, the evaluation method and system of the present invention are used to evaluate the quality of basic data for modeling that comprises near infrared spectra. The method includes: obtaining a plurality of near infrared spectra from the basic data, and obtaining the chemical value corresponding to each near infrared spectrum; preprocessing the near infrared spectra; solving the similarity distance and the local correlation coefficient between the near infrared spectra; obtaining, according to the similarity distance and the local correlation coefficient between the near infrared spectra, the near infrared spectrum of maximum similarity corresponding to each near infrared spectrum and its corresponding chemical value; obtaining, for each near infrared spectrum, the difference between its chemical value and the chemical value of its maximum-similarity near infrared spectrum, taking the absolute value of all the differences to obtain the corresponding absolute differences, and solving the mean of all the absolute differences; and comparing the mean of the absolute differences with a preset error value, the quality of the basic data being evaluated as unqualified when the mean of the absolute differences is greater than the preset error value, and as qualified when the mean of the absolute differences is less than or equal to the preset error value. Before the near infrared spectra are modeled, the present invention can accurately and efficiently judge the quality of the near infrared spectra and chemical values using a small number of samples, so as to evaluate the quality of the basic data and judge whether a stable and accurate model can be established from it. It provides an effective discrimination method for evaluating the quality of near infrared spectral data, avoids large-scale sampling and modeling on basic data of poor quality, and, when the basic data is of high quality, provides guidance for expanding and improving the chemometric methods applied to the basic data. Unqualified near infrared spectra are thereby excluded from modeling analysis, which improves the validity and accuracy of modeling analysis and reduces the waste of manpower and material resources. The present invention therefore effectively overcomes various shortcomings of the prior art and has high industrial utilization value.
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone familiar with this technology may modify or change the above embodiments without departing from the spirit and scope of the present invention. Therefore, all equivalent modifications or changes made by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed by the present invention shall still be covered by the claims of the present invention.
Claims (10)
1. An evaluation method, characterized in that it is used to evaluate the quality of basic data for modeling that comprises near infrared spectra, the method comprising:
obtaining a plurality of near infrared spectra from the basic data, and obtaining the chemical value corresponding to each near infrared spectrum;
preprocessing the near infrared spectra;
solving the similarity distance and the local correlation coefficient between the near infrared spectra;
obtaining, according to the similarity distance and the local correlation coefficient between the near infrared spectra, the near infrared spectrum of maximum similarity corresponding to each near infrared spectrum and its corresponding chemical value;
obtaining, for each near infrared spectrum, the difference between its chemical value and the chemical value of its corresponding maximum-similarity near infrared spectrum, taking the absolute value of all the differences to obtain the corresponding absolute differences, and solving the mean of all the absolute differences;
comparing the mean of the absolute differences with a preset error value; when the mean of the absolute differences is greater than the preset error value, evaluating the quality of the basic data as unqualified; when the mean of the absolute differences is less than or equal to the preset error value, evaluating the quality of the basic data as qualified.
2. The evaluation method according to claim 1, characterized in that the manner of preprocessing the near infrared spectra includes the S-G derivative method.
3. The evaluation method according to claim 1, characterized in that the similarity distance between the near infrared spectra is solved according to the information content of the near infrared spectra.
4. The evaluation method according to claim 1, characterized in that the similarity between the near infrared spectra is the ratio of the local correlation coefficient between the near infrared spectra to the similarity distance between the near infrared spectra.
5. The evaluation method according to claim 1, characterized in that, when the quality of the basic data is evaluated as unqualified, the method further includes a step of adjusting the sampling mode of the near infrared spectra in the basic data and/or maintaining the basic flow data.
6. An evaluation system, characterized in that it is used to evaluate the quality of basic data for modeling that comprises near infrared spectra, the system comprising:
a basic data acquisition module, configured to obtain a plurality of near infrared spectra from the basic data and to obtain the chemical value corresponding to each near infrared spectrum;
a preprocessing module, configured to preprocess the near infrared spectra;
a maximum similarity spectrum acquisition module, configured to solve the similarity distance and the local correlation coefficient between the near infrared spectra, and, according to the similarity distance and the local correlation coefficient between the near infrared spectra, to obtain the near infrared spectrum of maximum similarity corresponding to each near infrared spectrum and its corresponding chemical value;
a difference mean solving module, configured to obtain, for each near infrared spectrum, the difference between its chemical value and the chemical value of its corresponding maximum-similarity near infrared spectrum, to take the absolute value of all the differences to obtain the corresponding absolute differences, and to solve the mean of all the absolute differences;
a comparison module, configured to compare the mean of the absolute differences with a preset error value; when the mean of the absolute differences is greater than the preset error value, the quality of the basic data is evaluated as unqualified; when the mean of the absolute differences is less than or equal to the preset error value, the quality of the basic data is evaluated as qualified.
7. The evaluation system according to claim 6, characterized in that the manner of preprocessing the near infrared spectra includes the S-G derivative method.
8. The evaluation system according to claim 6, characterized in that the maximum similarity spectrum acquisition module is configured to solve the similarity distance between the near infrared spectra according to the information content of the near infrared spectra.
9. The evaluation system according to claim 6, characterized in that the similarity between the near infrared spectra is the ratio of the local correlation coefficient between the near infrared spectra to the similarity distance between the near infrared spectra.
10. The evaluation system according to claim 6, characterized in that it further includes an adjusting module configured, when the quality of the basic data is evaluated as unqualified, to adjust the sampling mode of the near infrared spectra in the basic data and/or to maintain the basic flow data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610790067.9A CN106248621B (en) | 2016-08-31 | 2016-08-31 | A kind of evaluation method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106248621A (en) | 2016-12-21 |
CN106248621B (en) | 2019-04-02 |
Family
ID=58080988
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610790067.9A Active CN106248621B (en) | 2016-08-31 | 2016-08-31 | A kind of evaluation method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106248621B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN106248621B (en) | 2019-04-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||