CN112668717A - Data processing method and device oriented to neural network model optimization - Google Patents

Data processing method and device oriented to neural network model optimization

Info

Publication number
CN112668717A
Authority
CN
China
Prior art keywords
cartesian
expansion
order
data
calculation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110002440.0A
Other languages
Chinese (zh)
Other versions
CN112668717B (en)
Inventor
***
徐聪
马琳
丰上
薄洪健
陈婧
王子豪
李洪伟
孙聪珊
徐忠亮
朱泓嘉
张子卿
熊文静
丁施航
姜文浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN202110002440.0A priority Critical patent/CN112668717B/en
Publication of CN112668717A publication Critical patent/CN112668717A/en
Application granted granted Critical
Publication of CN112668717B publication Critical patent/CN112668717B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Complex Calculations (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a data processing method and device for neural network model optimization. The method comprises the following step: by calculating high-order Cartesian expansion terms, the original data is mapped into a high-order Cartesian expansion space that has stronger expressive capacity and contains more information. The device comprises an input module, a Cartesian expansion calculation module and an output module. The input module is used for determining and receiving the multi-dimensional data for calculation, including determining the dimension of the input data and the value of each dimension; the Cartesian expansion calculation module is used for carrying out the Cartesian expansion calculation on the multi-dimensional input data determined by the input module; and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result. The invention has the advantages that, without affecting the model effect, the difficulty of subsequent model learning is reduced, the learning efficiency is improved, and distributed parallel computing becomes more convenient.

Description

Data processing method and device oriented to neural network model optimization
Technical Field
The invention relates to the technical field of machine learning, in particular to a data processing method and device for neural network model optimization.
Background
Machine Learning is the discipline that studies how computers simulate or implement human learning behaviors to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve their own performance. In the field of machine learning, a variety of methods typified by deep neural networks have been applied with great success. The most widely applied neural network models, such as CNNs and RNNs, achieve good recognition results in computer vision, natural language processing, speech recognition and other fields. In these successful machine learning applications, the data processing method is indispensable; especially in the training of a neural network model, where the model places strict requirements on the input data, an efficient and reasonable data processing method is all the more necessary for improving the capability and training efficiency of the model. In particular, current artificial intelligence technology, which takes the deep neural network as its main tool, generally adopts a deep structure and therefore has some unavoidable defects. When the depth of the network structure increases, vanishing or exploding gradients can occur during training. Meanwhile, training a neural network requires a large amount of training data, so training timeliness is a critical problem; in terms of topology, the deep structure is unfavorable for distributed parallel computation, which creates a timeliness bottleneck.
In current machine learning algorithms, multi-dimensional data composed of manually designed features is mostly used directly for model training. As for preprocessing, data is mostly preprocessed according to format normalization and the requirements of the relevant model, e.g., normalization, regularization, and the shifting, rotation and similar operations used in neural networks to augment the training set.
Such processing methods only transform the form in which the data is expressed; they do not process the information in the data more finely, so neither the structure of the model (especially the structure of a neural network model) nor the training process can be optimized at the level of data analysis and processing.
Other methods, such as the cosine transform and Principal Component Analysis (PCA), are mainly used for dimension reduction and feature screening of the input data. By reducing the dimensionality of the input data while retaining the dimensions that play a key role in the model effect, the model complexity is reduced, and data processing efficiency and model accuracy are improved.
Such dimension-reduction methods only screen the data dimensions. They can optimize the model to a certain extent, but they do not improve the expressive capacity of the data; they only reduce the amount of data, cannot optimize the model structurally, and do not fundamentally improve the training efficiency of the model.
Disclosure of Invention
In view of the defects of the prior art, the invention provides a data processing method and device for neural network model optimization that overcome these defects.
In order to achieve this purpose, the technical scheme adopted by the invention is as follows:
a data processing method for neural network model optimization comprises the following steps:
1) receiving input data:
for input vector data X = (x_1, x_2, …, x_N), where x_1, x_2, …, x_N are the dimensions of the input vector, determining the data dimension N and the value x_i of the i-th dimension;
2) Full P-order Cartesian expansion operation:
2.1 determining the highest order P
The value of P is determined according to the data dimension and the specific computing hardware. When the data dimension is large (greater than 1000), P is preferably 2 to 3; otherwise, the value of P can be determined from the memory M of the actually deployed computing hardware and the computation precision λ, taking the P corresponding to the largest expansion dimension that the memory can accommodate, i.e.
(Two formula images in the original give the expansion dimension as a function of N and P, together with the constraint that it fit within the memory M at precision λ.)
2.2 calculation of the Cartesian expansion terms of the respective orders
Construct the k-th-order Cartesian expansion terms according to the value of P:
s = x_1^(p_1) · x_2^(p_2) · … · x_N^(p_N),
where p_j is the power of the j-th dimension x_j, each p_j is a non-negative integer, and the powers satisfy
p_1 + p_2 + … + p_N = k.
For each value of k from 1 to P, all possible product values are then calculated, i.e., all 1st- to P-th-order Cartesian expansion terms composed of the dimensions.
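As a brief illustration: for N = 2 and P = 2, the full expansion consists of the five terms x_1, x_2, x_1^2, x_1·x_2, x_2^2, so the result vector has dimension M = 5.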
3) Outputting the result:
After the calculation in step 2) is finished, all the obtained 1st- to P-th-order Cartesian expansion results are arranged together in a fixed order to form a result vector S = (s_1, s_2, …, s_M), where each s_i is a calculated Cartesian expansion term and M is the dimension of the result vector; the vector is then output.
Preferably, the operation of the P-th-order Cartesian expansion in step 2) may be implemented by matrix multiplication, as follows:
For an input vector X = (x_1, x_2, …, x_N) of dimension N, the 2nd-order Cartesian expansion result is the (N, N) matrix X^T · X formed from the elements of X. For the 3rd-order result, each of the N column vectors of the 2nd-order result is multiplied by X to form an (N, N) matrix, and the N resulting matrices form an (N, N, N) 3-dimensional matrix. Higher-order terms are calculated by analogy, the Cartesian expansion result matrix gaining one dimension each time the order rises by one, until the P-th-order Cartesian expansion result is obtained.
Preferably, before the result vector S in step 2) is output, its dimensions are screened by a dimension-reduction method, the dimensions of relatively low importance are removed, and the screened vector is output as the result.
The invention also discloses a data processing device, comprising an input module, a Cartesian expansion calculation module and an output module;
the input module is used for determining and receiving the multi-dimensional data for calculation, including determining the dimension of the input data and the value of each dimension;
the Cartesian expansion calculation module is used for performing the Cartesian expansion calculation from order 1 to order P on the multi-dimensional input data determined by the input module;
and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result of the Cartesian expansion calculation module.
Further, the Cartesian expansion calculation module comprises an expansion order unit and a multiplication unit;
the expansion order unit is used for setting the highest order of the Cartesian expansion according to the specific problem, the computing equipment and other conditions;
and the multiplication unit is used for calculating the 1st- to P-th-order Cartesian expansion terms among the dimensions of the multi-dimensional data provided by the input module.
Compared with the prior art, the invention has the advantages that:
the dimensionality transformation is carried out on input data by utilizing a multi-order Cartesian expansion algorithm, original input data are mapped into a Cartesian expansion space with higher order, so that the expression capability of the data and the discrimination between different classes are improved, the data change effect the same as that of a deep neural network is realized, further, the neural network becomes possible to become a wide structure from deep structure optimization, distributed parallel computing is supported more effectively, the training efficiency and the training effect of a machine learning model are improved under the condition that the training data volume and the computing capability are the same, and important engineering application significance and research value exist in the fields of artificial intelligence and pattern recognition analysis.
Drawings
FIG. 1 is a schematic representation of Cartesian expansion data processing according to an embodiment of the present invention;
FIG. 2 is a flow chart of a data processing apparatus according to an embodiment of the present invention;
FIG. 3 is a flow chart of a 3-order Cartesian expansion matrix multiplication implementation according to an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings by way of examples.
Example 1
As shown in FIG. 1, a data processing method for neural network model optimization comprises the following steps:
1) receiving input data:
For input vector data X = (x_1, x_2, …, x_N), where x_1, x_2, …, x_N are the dimensions of the input vector, the data dimension N and the value x_i of the i-th dimension are determined.
2) Full P-order Cartesian expansion operation:
2.1 determining the highest order P
The value of P is determined according to the data dimension and the specific computing hardware. When the data dimension is large (greater than 1000), P is preferably 2 to 3; otherwise, the value of P can be determined from the memory M of the actually deployed computing hardware and the computation precision λ, taking the P corresponding to the largest expansion dimension that the memory can accommodate, i.e.
(Two formula images in the original give the expansion dimension as a function of N and P, together with the constraint that it fit within the memory M at precision λ.)
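As a concrete illustration of this rule, the following Python sketch picks the largest admissible P. Since the exact formulas appear only as images in the original, the sketch rests on stated assumptions: that the number of unique terms in the full 1-to-P-order expansion of an N-dimensional vector is the sum over k of C(N+k-1, k) (combinations with replacement), and that λ is expressed as bytes per stored value; the function names are illustrative only.

```python
from math import comb

def expansion_dim(n: int, p: int) -> int:
    # Unique Cartesian expansion terms of orders 1..p for an
    # n-dimensional input: combinations with replacement per order.
    return sum(comb(n + k - 1, k) for k in range(1, p + 1))

def choose_highest_order(n: int, memory_bytes: int, bytes_per_value: int = 8) -> int:
    # Largest P whose full expansion result still fits in memory.
    p = 1
    while expansion_dim(n, p + 1) * bytes_per_value <= memory_bytes:
        p += 1
    return p

# For a 1000-dimensional input and 8 GiB of memory this yields P = 3,
# within the preferred range of 2 to 3 for high-dimensional data.
print(choose_highest_order(1000, 8 * 2**30))
```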
2.2 calculation of the Cartesian expansion terms of the respective orders
Construct the k-th-order Cartesian expansion terms according to the value of P:
s = x_1^(p_1) · x_2^(p_2) · … · x_N^(p_N),
where p_j is the power of the j-th dimension x_j, each p_j is a non-negative integer, and the powers satisfy
p_1 + p_2 + … + p_N = k.
For each value of k from 1 to P, all possible product values are then calculated, i.e., all 1st- to P-th-order Cartesian expansion terms composed of the dimensions.
It should be understood that all the possible Cartesian expansion terms are calculated by a loop-based combinatorial enumeration, so that the repeated terms that the commutativity of multiplication would otherwise produce are avoided.
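A minimal Python sketch of such an enumeration is given below, assuming the standard combinations-with-replacement traversal; the function name is illustrative. Because each multiset of indices is visited exactly once, duplicates such as x_1·x_2 and x_2·x_1 never arise.

```python
from itertools import combinations_with_replacement
from math import prod

def cartesian_expansion(x, p_max):
    # All 1st- to P-th-order Cartesian expansion terms of x.
    # Each index multiset is enumerated once, so commutative
    # duplicates (x1*x2 vs. x2*x1) are avoided by construction.
    terms = []
    for k in range(1, p_max + 1):
        for idx in combinations_with_replacement(range(len(x)), k):
            terms.append(prod(x[i] for i in idx))
    return terms

# For x = (x1, x2) and P = 2: [x1, x2, x1*x1, x1*x2, x2*x2].
print(cartesian_expansion((2.0, 3.0), 2))  # [2.0, 3.0, 4.0, 6.0, 9.0]
```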
3) Outputting the result:
After the calculation in step 2) is finished, all the obtained 1st- to P-th-order Cartesian expansion results are arranged together in a fixed order to form a result vector S = (s_1, s_2, …, s_M), where each s_i is a calculated Cartesian expansion term and M is the dimension of the result vector; the vector is then output.
In the output step, before the result vector S is output, a dimension-screening operation may be performed on it. Depending on the value of P, the method used to calculate the P-th-order Cartesian expansion, and the specific problem, not every dimension of the result vector S plays an important role. Therefore, before the result vector is output, its dimensions can be screened by a dimension-reduction method such as the cosine transform or Principal Component Analysis (PCA); the dimensions of relatively low importance are removed, and the screened vector is output as the result.
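As one possible realization of this screening step, the sketch below applies scikit-learn's PCA to a batch of expanded vectors; the variance threshold and the data are illustrative stand-ins, not values prescribed by the method.

```python
import numpy as np
from sklearn.decomposition import PCA

# One expanded result vector per row: shape (num_samples, M).
rng = np.random.default_rng(0)
S_batch = rng.random((256, 120))  # stand-in data for illustration

# Keep the principal components explaining 99% of the variance,
# discarding the dimensions of relatively low importance.
pca = PCA(n_components=0.99)
S_screened = pca.fit_transform(S_batch)
print(S_batch.shape, "->", S_screened.shape)
```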
As shown in FIG. 2, the data processing apparatus comprises an input module, a Cartesian expansion calculation module and an output module;
the input module is used for determining and receiving the multi-dimensional data for calculation, including determining the dimension of the input data and the value of each dimension;
the Cartesian expansion calculation module is used for performing the Cartesian expansion calculation from order 1 to order P on the multi-dimensional input data determined by the input module;
and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result of the Cartesian expansion calculation module.
The Cartesian expansion calculation module includes an expansion order unit and a multiplication unit;
the expansion order unit is used for setting the highest order of the Cartesian expansion according to the specific problem, the computing equipment and other conditions;
and the multiplication unit is used for calculating the 1st- to P-th-order Cartesian expansion terms among the dimensions of the multi-dimensional data provided by the input module.
In the processing flow, the original data first enters the input module; it is then sent to the Cartesian expansion calculation module, which calculates the Cartesian expansion results of orders 1 to P; the expansion results of each order are then sent to the output module, which completes the high-dimensional fusion and forms the final output data.
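The module structure can be pictured with the following minimal Python sketch; the class and method names are illustrative only and do not reflect any particular implementation of the device.

```python
from itertools import combinations_with_replacement
from math import prod

class InputModule:
    def receive(self, data):
        # Determine the dimension N and the value of each dimension.
        return tuple(float(v) for v in data)

class CartesianExpansionModule:
    def __init__(self, p_max):
        self.p_max = p_max  # highest order, set by the expansion order unit

    def expand(self, x):
        # Multiplication unit: 1st- to P-th-order expansion terms,
        # using the same duplicate-free enumeration as the earlier sketch.
        return [prod(x[i] for i in idx)
                for k in range(1, self.p_max + 1)
                for idx in combinations_with_replacement(range(len(x)), k)]

class OutputModule:
    def emit(self, terms):
        # High-dimensional fusion: one result vector for later processing.
        return tuple(terms)

x = InputModule().receive([1.0, 2.0, 3.0])
s = OutputModule().emit(CartesianExpansionModule(p_max=3).expand(x))
print(len(s))  # 3 + 6 + 10 = 19 expansion terms
```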
Example 2
This example only describes the differences from Example 1.
the operation for P-th cartesian expansion can be implemented by multiplication of the matrix (tensor), as shown in fig. 3, specifically as follows:
for an input vector of dimension n, X ═ X1,x2,…,xN) The 2-order Cartesian expansion results in a matrix X of the form (N, N)TThe 3-order Cartesian expansion result of the elements in X is each column vector of N column vectors in the 2-order result, and then X is multiplied to form a matrix of (N, N), so that the N matrices form a 3-dimensional matrix of (N, N, N). And by analogy, the higher order term is calculated, and the Cartesian expansion result matrix is increased by one dimension when each liter is higher by one step until the P-order Cartesian expansion result is calculated.
The cross terms obtained in this way contain many repetitions, but this approach effectively avoids the low computational efficiency caused by a large number of loops.
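A possible NumPy rendering of this tensor construction is sketched below, under the assumption that each order is obtained as an outer product with X; as the text notes, the cross terms are duplicated, but the vectorized products avoid slow explicit loops.

```python
import numpy as np

def expansion_tensors(x: np.ndarray, p_max: int):
    # Order-k result: rank-k tensor of shape (N, ..., N) whose entries
    # are the products x_i1 * x_i2 * ... * x_ik; each additional order
    # adds one dimension, mirroring the construction described above.
    results = [x]
    for _ in range(2, p_max + 1):
        results.append(np.multiply.outer(results[-1], x))
    return results

x = np.array([1.0, 2.0, 3.0])
r1, r2, r3 = expansion_tensors(x, 3)
print(r2.shape, r3.shape)   # (3, 3) (3, 3, 3)
# Duplicated cross terms, e.g. r2[0, 1] == r2[1, 0] == x1 * x2.
```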
It will be appreciated by those of ordinary skill in the art that the examples described herein are intended to assist the reader in understanding the manner in which the invention is practiced, and it is to be understood that the scope of the invention is not limited to such specifically recited statements and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these changes and combinations are within the scope of the invention.

Claims (5)

1. A data processing method for neural network model optimization is characterized by comprising the following steps:
1) receiving input data:
for input vector data X = (x_1, x_2, …, x_N), where x_1, x_2, …, x_N are the dimensions of the input vector, determining the data dimension N and the value x_i of the i-th dimension;
2) Full P-order Cartesian expansion operation:
2.1 determining the highest order P
determining the value of P according to the data dimension and the specific computing hardware: when the data dimension is large (greater than 1000), P is preferably 2 to 3; otherwise, the value of P can be determined from the memory M of the actually deployed computing hardware and the computation precision λ, taking the P corresponding to the largest expansion dimension that the memory can accommodate, i.e.
(Two formula images in the original give the expansion dimension as a function of N and P, together with the constraint that it fit within the memory M at precision λ.)
2.2 calculation of the Cartesian expansion terms of the respective orders
constructing the k-th-order Cartesian expansion terms according to the value of P:
s = x_1^(p_1) · x_2^(p_2) · … · x_N^(p_N),
where p_j is the power of the j-th dimension x_j, each p_j is a non-negative integer, and the powers satisfy
p_1 + p_2 + … + p_N = k;
calculating, for each value of k from 1 to P, all possible product values according to the Cartesian expansion terms, namely all 1st- to P-th-order Cartesian expansion terms composed of the dimensions;
3) outputting a result:
after the calculation in step 2) is finished, arranging all the obtained 1st- to P-th-order Cartesian expansion results together in a fixed order to form a result vector S = (s_1, s_2, …, s_M), where each s_i is a calculated Cartesian expansion term and M is the dimension of the result vector, and finally outputting it.
2. The data processing method of claim 1, wherein the operation of the P-th-order Cartesian expansion in step 2) may be implemented by matrix multiplication, specifically as follows:
for an input vector X = (x_1, x_2, …, x_N) of dimension N, the 2nd-order Cartesian expansion result is the (N, N) matrix X^T · X formed from the elements of X; for the 3rd-order result, each of the N column vectors of the 2nd-order result is multiplied by X to form an (N, N) matrix, and the N resulting matrices form an (N, N, N) 3-dimensional matrix; higher-order terms are calculated by analogy, the Cartesian expansion result matrix gaining one dimension each time the order rises by one, until the P-th-order Cartesian expansion result is obtained.
3. The data processing method of claim 1, wherein before the result vector S in step 2) is output, its dimensions are screened by a dimension-reduction method, the dimensions of relatively low importance are removed, and the screened vector is output as the result.
4. A data processing apparatus for neural network model optimization, comprising an input module, a Cartesian expansion calculation module and an output module, wherein:
the input module is used for determining and receiving the multi-dimensional data for calculation, including determining the dimension of the input data and the value of each dimension;
the Cartesian expansion calculation module is used for performing the Cartesian expansion calculation from order 1 to order P on the multi-dimensional input data determined by the input module;
and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result of the Cartesian expansion calculation module.
5. The data processing apparatus of claim 4, wherein the Cartesian expansion calculation module includes an expansion order unit and a multiplication unit;
the expansion order unit is used for setting the highest order of the Cartesian expansion according to the specific problem, the computing equipment and other conditions;
and the multiplication unit is used for calculating the 1st- to P-th-order Cartesian expansion terms among the dimensions of the multi-dimensional data provided by the input module.
CN202110002440.0A 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization Active CN112668717B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110002440.0A CN112668717B (en) 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110002440.0A CN112668717B (en) 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization

Publications (2)

Publication Number Publication Date
CN112668717A true CN112668717A (en) 2021-04-16
CN112668717B CN112668717B (en) 2023-06-02

Family

ID=75412620

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110002440.0A Active CN112668717B (en) 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization

Country Status (1)

Country Link
CN (1) CN112668717B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250810A * 2015-06-15 2016-12-21 摩福公司 Method for identifying and/or authenticating an individual by iris recognition
CN106887000A * 2017-01-23 2017-06-23 上海联影医疗科技有限公司 Gridding processing method and system for medical images
WO2018224690A1 (en) * 2017-06-09 2018-12-13 Deepmind Technologies Limited Generating discrete latent representations of input data items
CN107729994A * 2017-11-28 2018-02-23 北京地平线信息技术有限公司 Method and apparatus for performing convolutional layer computation in a convolutional neural network
CN107832842A * 2017-11-28 2018-03-23 北京地平线信息技术有限公司 Method and apparatus for performing convolution operations on folded feature data
CN107944556A * 2017-12-12 2018-04-20 电子科技大学 Deep neural network compression method based on block-term tensor decomposition

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
FRANK A. RUSSO et al.: "Predicting musically induced emotions from physiological inputs: linear and neural network models", Frontiers in Psychology *
HAIFENG LI et al.: "MODENN: A Shallow Broad Neural Network Model Based on Multi-Order Descartes Expansion", IEEE Transactions on Pattern Analysis and Machine Intelligence *
SHU Guansheng: "Research on Application Performance Optimization Based on Computation Offloading in Mobile Cloud", China Doctoral Dissertations Full-text Database, Information Science and Technology *

Also Published As

Publication number Publication date
CN112668717B (en) 2023-06-02

Similar Documents

Publication Publication Date Title
Cheng et al. Model compression and acceleration for deep neural networks: The principles, progress, and challenges
Zhou et al. Rethinking bottleneck structure for efficient mobile network design
Das et al. A group incremental feature selection for classification using rough set theory based genetic algorithm
Howard et al. Mobilenets: Efficient convolutional neural networks for mobile vision applications
Cheng et al. A survey of model compression and acceleration for deep neural networks
EP4036803A1 (en) Neural network model processing method and apparatus, computer device, and storage medium
US11531902B2 (en) Generating and managing deep tensor neural networks
CN109657780A (en) A kind of model compression method based on beta pruning sequence Active Learning
CN110175628A (en) A kind of compression algorithm based on automatic search with the neural networks pruning of knowledge distillation
Guo et al. Sparse deep nonnegative matrix factorization
Wang et al. Tensor networks meet neural networks: A survey and future perspectives
Qi et al. Learning low resource consumption cnn through pruning and quantization
Hu et al. A dynamic pruning method on multiple sparse structures in deep neural networks
Gould et al. Exploiting problem structure in deep declarative networks: Two case studies
Kawase et al. Parametric t-stochastic neighbor embedding with quantum neural network
CN111209530A (en) Tensor decomposition-based heterogeneous big data factor feature extraction method and system
CN112668717A (en) Data processing method and device oriented to neural network model optimization
Li et al. Towards optimal filter pruning with balanced performance and pruning speed
Xia et al. Efficient synthesis of compact deep neural networks
Sun et al. Computation on sparse neural networks and its implications for future hardware
Qian Performance comparison among VGG16, InceptionV3, and resnet on galaxy morphology classification
Liawatimena et al. Performance optimization of maxpool calculation using 4d rank tensor
Girdhar et al. Deep Learning in Image Classification: Its Evolution, Methods, Challenges and Architectures
CN111967243A (en) Text comparison method and equipment
CN113449817B (en) Image classification implicit model acceleration training method based on phantom gradient

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant