CN112668717A - Data processing method and device oriented to neural network model optimization - Google Patents
- Publication number: CN112668717A (application CN202110002440.0A)
- Authority: CN (China)
- Legal status: Granted
Abstract
The invention discloses a data processing method and device for neural network model optimization. The method computes high-order Cartesian expansion terms to map the original data into a high-order Cartesian expansion space that has stronger expressive power and carries more information. The device comprises an input module, a Cartesian expansion calculation module, and an output module. The input module determines and receives the multi-dimensional data to be processed, including the dimensionality of the input data and the value of each dimension; the Cartesian expansion calculation module performs Cartesian expansion on the multi-dimensional input data determined by the input module; and the output module outputs high-dimensional data for subsequent processing according to the calculation result. The advantages of the invention are that, without degrading model performance, it reduces the difficulty of subsequent model learning, improves learning efficiency, and facilitates distributed parallel computing.
Description
Technical Field
The invention relates to the technical field of machine learning, in particular to a data processing method and device for neural network model optimization.
Background
Machine learning is the discipline that studies how computers simulate or implement human learning behaviors in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve performance. In this field, methods typified by deep neural networks have been applied with great success. The most widely used neural network models, such as CNNs and RNNs, achieve good recognition results in computer vision, natural language processing, speech recognition, and other fields. In these successful applications, data processing is indispensable; especially in neural network training, models place strict requirements on input data, and an efficient, well-designed data processing method is needed to improve model capability and training efficiency. Moreover, current artificial intelligence techniques built mainly on deep neural networks generally adopt deep structures and suffer from unavoidable defects: as network depth increases, vanishing or exploding gradients arise during training. At the same time, neural network training requires large amounts of data, so training efficiency is critical; yet the topology of a deep structure is ill-suited to distributed parallel computation and becomes an efficiency bottleneck.
Current machine learning algorithms mostly use multi-dimensional data composed of manually designed features directly for model training. Data preprocessing is mostly limited to normalizing the data format and meeting the requirements of the particular model, for example normalization, regularization, and the shifting, rotation, and similar operations used in neural networks to augment the training set.
Such processing methods only transform the representation of the data; they do not process the information in the data at a finer granularity, so the structure of the model (especially a neural network model) and its training process cannot be optimized at the level of data analysis and processing.
Other methods such as the cosine transform and principal component analysis (PCA) are mainly used for dimensionality reduction and feature screening of input data. By reducing the dimensionality of the input data while retaining the dimensions that play a key role in model performance, they lower model complexity and improve data processing efficiency and model accuracy.
However, such dimensionality-reduction methods only screen data dimensions. They can optimize the model to some degree, but they do not improve the expressive power of the data; they only reduce the data volume, cannot structurally optimize the model, and do not fundamentally improve training efficiency.
Disclosure of Invention
To address these shortcomings of the prior art, the invention provides a data processing method and device for neural network model optimization.
In order to realize the purpose, the technical scheme adopted by the invention is as follows:
a data processing method for neural network model optimization comprises the following steps:
1) receiving input data:
For input vector data X = (x_1, x_2, …, x_N), where x_1, x_2, …, x_N are the individual dimensions of the input vector, determine the data dimension N and the value x_i of the i-th dimension.
2) Full P-order cartesian dilation operation:
2.1 determining the highest order P
The value of P is determined according to the data dimension and the specific computing hardware. When the data dimension is large (greater than 1000), P is preferably 2 to 3; otherwise, P can be determined from the memory M of the actually deployed computing hardware and the numerical precision λ, taking the largest P whose corresponding expansion dimension can still be accommodated in memory.
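This choice of P can be sketched roughly as follows (an illustrative sketch, not the patented procedure: it assumes λ is the storage size of one value in bytes, and counts distinct expansion terms as multisets of dimensions — both assumptions, not stated in the original):

```python
from math import comb

def expansion_dim(n: int, p: int) -> int:
    """Number of distinct Cartesian expansion terms of orders 1..p over an
    n-dimensional input: each order-k term is a size-k multiset of dimensions."""
    return sum(comb(n + k - 1, k) for k in range(1, p + 1))

def choose_highest_order(n: int, mem_bytes: int, bytes_per_value: int) -> int:
    """Largest P whose full order-1..P expansion still fits in memory."""
    p = 1
    while expansion_dim(n, p + 1) * bytes_per_value <= mem_bytes:
        p += 1
    return p

# For a 10-dimensional input and an 8 KB budget at 8 bytes per value,
# the full expansion up to order 4 (1000 terms) just fits.
```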
2.2 calculation of the Cartesian expansion terms of the respective orders
Construct the order-k Cartesian expansion terms s = x_1^p_1 · x_2^p_2 · … · x_N^p_N according to the value of P, where p_j is the exponent of the j-th dimension x_j, each p_j is a non-negative integer, and p_1 + p_2 + … + p_N = k.
Compute all possible such product values for each k from 1 to P, i.e., all Cartesian expansion terms of orders 1 through P composed of the individual dimensions.
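The enumeration of all order-1 to order-P terms can be sketched as follows (a minimal illustration, not the patented implementation; `combinations_with_replacement` yields each multiset of dimension indices once, so products equal by commutativity are not repeated):

```python
import itertools
from math import prod

def cartesian_expansion(x, p):
    """All Cartesian expansion terms of orders 1..p of vector x.
    Each term is the product x[i1] * ... * x[ik] over a non-decreasing
    index tuple, so no duplicate terms occur."""
    terms = []
    for k in range(1, p + 1):
        for idx in itertools.combinations_with_replacement(range(len(x)), k):
            terms.append(prod(x[i] for i in idx))
    return terms

# Example: for X = (2, 3) and P = 2 the result vector is
# (x1, x2, x1^2, x1*x2, x2^2) = (2, 3, 4, 6, 9).
```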
3) And outputting a result:
After the calculation in step 2) is finished, arrange all the obtained order-1 to order-P Cartesian expansion terms in a fixed order to form the result vector S = (s_1, s_2, …, s_M), where each s_i is one computed Cartesian expansion term and M is the dimension of the result vector, and finally output it.
Preferably, the operation on the P-th cartesian expansion in step 2) may be implemented by multiplication of a matrix, which is as follows:
For an N-dimensional input vector X = (x_1, x_2, …, x_N), the order-2 Cartesian expansion result is the (N, N) matrix X^T X formed from the elements of X. For the order-3 result, each of the N column vectors of the order-2 matrix is multiplied by X to form an (N, N) matrix, and the N matrices together form an (N, N, N) 3-dimensional tensor. Higher-order terms are computed by analogy: each increase of one order adds one dimension to the Cartesian expansion result tensor, until the order-P result is obtained.
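The outer-product construction described above can be sketched with NumPy (an illustrative sketch under the reading that the order-2 result is the outer product of X with itself, and each further order multiplies the previous tensor by X along a new axis):

```python
import numpy as np

def tensor_expansion(x: np.ndarray, p: int) -> list:
    """Order-1..p expansion tensors: T1 = x (shape (N,)),
    T2 = outer(x, x) (shape (N, N)), and each higher order
    adds one axis, up to the order-p tensor of shape (N, ..., N)."""
    results = [x]
    t = x
    for _ in range(2, p + 1):
        t = np.multiply.outer(t, x)  # one extra dimension per order
        results.append(t)
    return results

# For a 2-dimensional x, the order-3 result has shape (2, 2, 2) and
# entry [i, j, k] equal to x[i] * x[j] * x[k].
```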
Preferably, before the result vector S in step 2) is output, dimensions of the result vector S are screened by a dimension reduction method, dimensions with relatively low importance are removed, and the screened vector is output as a result.
The invention also discloses a data processing device, comprising: the device comprises an input module, a Cartesian expansion calculation module and an output module;
the input module is used for determining and receiving multi-dimensional data for calculation, and comprises dimensions for determining input data and numerical values of the dimensions;
the Cartesian expansion calculation module is used for performing Cartesian expansion calculation from 1 order to P order on the multidimensional input data determined by the input module;
and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result of the Cartesian expansion calculation module.
Further, the cartesian dilation calculation module comprises: an expansion order unit and a multiplication unit;
the expansion order unit is used for setting the highest order for Cartesian expansion according to specific problems, computing equipment and other conditions;
and the multiplication operation unit is used for calculating 1-P-order Cartesian expansion terms among the dimensions of the multi-dimensional data provided by the input module.
Compared with the prior art, the invention has the advantages that:
the dimensionality transformation is carried out on input data by utilizing a multi-order Cartesian expansion algorithm, original input data are mapped into a Cartesian expansion space with higher order, so that the expression capability of the data and the discrimination between different classes are improved, the data change effect the same as that of a deep neural network is realized, further, the neural network becomes possible to become a wide structure from deep structure optimization, distributed parallel computing is supported more effectively, the training efficiency and the training effect of a machine learning model are improved under the condition that the training data volume and the computing capability are the same, and important engineering application significance and research value exist in the fields of artificial intelligence and pattern recognition analysis.
Drawings
FIG. 1 is a schematic representation of Cartesian expansion data processing according to an embodiment of the present invention;
FIG. 2 is a flow chart of a data processing apparatus according to an embodiment of the present invention;
FIG. 3 is a flow chart of a 3-order Cartesian expansion matrix multiplication implementation according to an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings by way of examples.
Example 1
As shown in fig. 1, a data processing method for neural network model optimization includes the following steps:
1) receiving input data:
For input vector data X = (x_1, x_2, …, x_N), where x_1, x_2, …, x_N are the individual dimensions of the input vector, determine the data dimension N and the value x_i of the i-th dimension.
2) Full P-order cartesian dilation operation:
2.1 determining the highest order P
The value of P is determined according to the data dimension and the specific computing hardware. When the data dimension is large (greater than 1000), P is preferably 2 to 3; otherwise, P can be determined from the memory M of the actually deployed computing hardware and the numerical precision λ, taking the largest P whose corresponding expansion dimension can still be accommodated in memory.
2.2 calculation of the Cartesian expansion terms of the respective orders
Construct the order-k Cartesian expansion terms s = x_1^p_1 · x_2^p_2 · … · x_N^p_N according to the value of P, where p_j is the exponent of the j-th dimension x_j, each p_j is a non-negative integer, and p_1 + p_2 + … + p_N = k.
Compute all possible such product values for each k from 1 to P, i.e., all Cartesian expansion terms of orders 1 through P composed of the individual dimensions.
It should be understood that all possible Cartesian expansion terms are computed by loop-based enumeration over ordered index combinations, so that duplicate terms caused by the commutativity of multiplication are avoided.
3) And outputting a result:
After the calculation in step 2) is finished, arrange all the obtained order-1 to order-P Cartesian expansion terms in a fixed order to form the result vector S = (s_1, s_2, …, s_M), where each s_i is one computed Cartesian expansion term and M is the dimension of the result vector, and finally output it.
In the result output step, a dimension-screening operation may be performed on the result vector S before it is output. Depending on the value of P, the method used to compute the P-order Cartesian expansion, and the specific problem, not every dimension of the obtained result vector S plays an important role. Therefore, before the result vector is output, its dimensions can be screened with a dimensionality-reduction method such as the cosine transform or principal component analysis (PCA); dimensions of relatively low importance are removed, and the screened vector is output as the result.
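A dimension-screening step along these lines can be sketched with a plain NumPy PCA (an illustrative sketch; the text names the cosine transform and PCA but fixes no specific procedure, and the 95% variance threshold here is an assumed parameter):

```python
import numpy as np

def pca_screen(S: np.ndarray, var_ratio: float = 0.95) -> np.ndarray:
    """Project the rows of S (one expanded result vector per row) onto
    the leading principal components that together explain var_ratio
    of the variance, discarding less important directions."""
    Sc = S - S.mean(axis=0)                       # center each dimension
    U, sig, Vt = np.linalg.svd(Sc, full_matrices=False)
    explained = sig**2 / np.sum(sig**2)           # per-component variance ratio
    k = int(np.searchsorted(np.cumsum(explained), var_ratio)) + 1
    return Sc @ Vt[:k].T                          # keep the top-k components
```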
As shown in fig. 2, the data processing apparatus includes: the device comprises an input module, a Cartesian expansion calculation module and an output module;
the input module is used for determining and receiving multi-dimensional data for calculation, and comprises dimensions for determining input data and numerical values of the dimensions;
the Cartesian expansion calculation module is used for performing Cartesian expansion calculation from 1 order to P order on the multidimensional input data determined by the input module;
and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result of the Cartesian expansion calculation module.
The Cartesian dilation calculation module includes: an expansion order unit and a multiplication unit;
the expansion order unit is used for setting the highest order for Cartesian expansion according to specific problems, computing equipment and other conditions;
and the multiplication operation unit is used for calculating 1-P-order Cartesian expansion terms among the dimensions of the multi-dimensional data provided by the input module.
In the processing flow, raw data first enters the input module and is then sent to the Cartesian expansion calculation module, where the Cartesian expansion results of orders 1 through P are computed; the per-order expansion results are then sent to the output module, where they are fused into high-dimensional data to form the final output.
Example 2
This example only describes the differences from Example 1.
The operation for the P-order Cartesian expansion can be implemented by matrix (tensor) multiplication, as shown in FIG. 3, specifically as follows:
For an N-dimensional input vector X = (x_1, x_2, …, x_N), the order-2 Cartesian expansion result is the (N, N) matrix X^T X formed from the elements of X. For the order-3 result, each of the N column vectors of the order-2 matrix is multiplied by X to form an (N, N) matrix, and the N matrices together form an (N, N, N) 3-dimensional tensor. Higher-order terms are computed by analogy: each increase of one order adds one dimension to the Cartesian expansion result tensor, until the order-P result is obtained.
The cross terms obtained in this way contain a large number of duplicates, but this approach effectively avoids the low computational efficiency caused by a large number of explicit loops.
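The size of that redundancy is easy to quantify (a side calculation for illustration, not from the original text): at order k over N dimensions there are C(N+k−1, k) distinct terms but N^k tensor entries.

```python
from math import comb

def unique_terms(n: int, k: int) -> int:
    """Distinct order-k Cartesian expansion terms (size-k multisets)."""
    return comb(n + k - 1, k)

def tensor_entries(n: int, k: int) -> int:
    """Entries of the order-k outer-product tensor, duplicates included."""
    return n ** k

# For a 4-dimensional input at order 3: 20 distinct terms, but the
# tensor route computes 64 entries in vectorized form.
```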
It will be appreciated by those of ordinary skill in the art that the examples described herein are intended to assist the reader in understanding the manner in which the invention is practiced, and it is to be understood that the scope of the invention is not limited to such specifically recited statements and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these changes and combinations are within the scope of the invention.
Claims (5)
1. A data processing method for neural network model optimization is characterized by comprising the following steps:
1) receiving input data:
For input vector data X = (x_1, x_2, …, x_N), where x_1, x_2, …, x_N are the individual dimensions of the input vector, determining the data dimension N and the value x_i of the i-th dimension;
2) Full P-order cartesian dilation operation:
2.1 determining the highest order P
Determining the value of P according to the data dimension and the specific computing hardware; when the data dimension is large (greater than 1000), P is preferably 2 to 3; otherwise, the value of P can be determined from the memory M of the actually deployed computing hardware and the numerical precision λ, taking the largest P whose corresponding expansion dimension can still be accommodated in memory;
2.2 calculation of the Cartesian expansion terms of the respective orders
Constructing the order-k Cartesian expansion terms s = x_1^p_1 · x_2^p_2 · … · x_N^p_N according to the value of P, wherein p_j is the exponent of the j-th dimension x_j, each p_j is a non-negative integer, and p_1 + p_2 + … + p_N = k;
Respectively computing all possible such product values for each k from 1 to P according to the Cartesian expansion terms, i.e., all Cartesian expansion terms of orders 1 through P composed of the individual dimensions;
3) and outputting a result:
after the calculation in step 2) is finished, arranging all the obtained order-1 to order-P Cartesian expansion terms in a fixed order to form the result vector S = (s_1, s_2, …, s_M), wherein each s_i is one computed Cartesian expansion term and M is the dimension of the result vector; and finally outputting it.
2. The data processing method of claim 1, wherein: the operation on the P-th cartesian expansion in step 2) may be implemented by multiplication of a matrix, which is specifically as follows:
For an N-dimensional input vector X = (x_1, x_2, …, x_N), the order-2 Cartesian expansion result is the (N, N) matrix X^T X formed from the elements of X. For the order-3 result, each of the N column vectors of the order-2 matrix is multiplied by X to form an (N, N) matrix, and the N matrices together form an (N, N, N) 3-dimensional tensor. Higher-order terms are computed by analogy: each increase of one order adds one dimension to the Cartesian expansion result tensor, until the order-P result is obtained.
3. The data processing method of claim 1, wherein: before the result vector S in the step 2) is output, the dimensionality is screened by a dimensionality reduction method, the dimensionality with relatively low importance degree is removed, and the screened vector is used as a result to be output.
4. A data processing apparatus for neural network model optimization, comprising: the device comprises an input module, a Cartesian expansion calculation module and an output module;
the input module is used for determining and receiving multi-dimensional data for calculation, and comprises dimensions for determining input data and numerical values of the dimensions;
the Cartesian expansion calculation module is used for performing Cartesian expansion calculation from 1 order to P order on the multidimensional input data determined by the input module;
and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result of the Cartesian expansion calculation module.
5. The data processing apparatus of claim 4, wherein: the Cartesian dilation calculation module includes: an expansion order unit and a multiplication unit;
the expansion order unit is used for setting the highest order for Cartesian expansion according to specific problems, computing equipment and other conditions;
and the multiplication operation unit is used for calculating 1-P-order Cartesian expansion terms among the dimensions of the multi-dimensional data provided by the input module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110002440.0A CN112668717B (en) | 2021-01-04 | 2021-01-04 | Data processing method and device oriented to neural network model optimization |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112668717A true CN112668717A (en) | 2021-04-16 |
CN112668717B CN112668717B (en) | 2023-06-02 |
Family
ID=75412620
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106250810A (en) * | 2015-06-15 | 2016-12-21 | 摩福公司 | By iris identification, individuality is identified and/or the method for certification |
CN106887000A (en) * | 2017-01-23 | 2017-06-23 | 上海联影医疗科技有限公司 | The gridding processing method and its system of medical image |
CN107729994A (en) * | 2017-11-28 | 2018-02-23 | 北京地平线信息技术有限公司 | The method and apparatus for performing the computing of the convolutional layer in convolutional neural networks |
CN107832842A (en) * | 2017-11-28 | 2018-03-23 | 北京地平线信息技术有限公司 | The method and apparatus that convolution algorithm is performed for fold characteristics data |
CN107944556A (en) * | 2017-12-12 | 2018-04-20 | 电子科技大学 | Deep neural network compression method based on block item tensor resolution |
WO2018224690A1 (en) * | 2017-06-09 | 2018-12-13 | Deepmind Technologies Limited | Generating discrete latent representations of input data items |
Non-Patent Citations (3)
Title |
---|
Frank A. Russo et al.: "Predicting musically induced emotions from physiological inputs: linear and neural network models", Frontiers in Psychology |
Haifeng Li et al.: "MODENN: A Shallow Broad Neural Network Model Based on Multi-Order Descartes Expansion", IEEE Transactions on Pattern Analysis and Machine Intelligence |
Shu Guansheng: "Research on Application Performance Optimization Based on Computation Offloading in Mobile Cloud", China Doctoral Dissertations Full-text Database, Information Science and Technology |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |