CN112668717A - Data processing method and device oriented to neural network model optimization - Google Patents

Data processing method and device oriented to neural network model optimization

Info

Publication number
CN112668717A
Authority
CN
China
Prior art keywords
cartesian
expansion
order
data
calculation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110002440.0A
Other languages
Chinese (zh)
Other versions
CN112668717B (en)
Inventor
***
徐聪
马琳
丰上
薄洪健
陈婧
王子豪
李洪伟
孙聪珊
徐忠亮
朱泓嘉
张子卿
熊文静
丁施航
姜文浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN202110002440.0A priority Critical patent/CN112668717B/en
Publication of CN112668717A publication Critical patent/CN112668717A/en
Application granted granted Critical
Publication of CN112668717B publication Critical patent/CN112668717B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Complex Calculations (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a data processing method and device for neural network model optimization. The method comprises the following step: by calculating high-order Cartesian expansion terms, the original data is mapped into a high-order Cartesian expansion space that has stronger expressive capacity and contains more information. The device comprises an input module, a Cartesian expansion calculation module and an output module. The input module is used for determining and receiving the multi-dimensional data for calculation, including determining the dimension of the input data and the value of each dimension; the Cartesian expansion calculation module is used for carrying out the Cartesian expansion calculation on the multi-dimensional input data determined by the input module; and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result. The invention has the advantages that, without affecting the model effect, the difficulty of subsequent model learning is reduced, the learning efficiency is improved, and distributed parallel computing becomes more convenient.

Description

Data processing method and device oriented to neural network model optimization
Technical Field
The invention relates to the technical field of machine learning, in particular to a data processing method and device for neural network model optimization.
Background
Machine Learning is the discipline that studies how computers simulate or implement human learning behaviors to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve their own performance. In the field of machine learning, a variety of methods typified by deep neural networks have been applied with great success. The most widely applied neural network models, such as CNNs and RNNs, achieve good recognition results in computer vision, natural language processing, speech recognition and other fields. In these successful machine learning applications, the data processing method is indispensable; especially in the training of a neural network model, where the model places strict requirements on the input data, an efficient and reasonable data processing method is all the more necessary for improving the capability and training efficiency of the model. In particular, current artificial intelligence technology, which takes the deep neural network as its main tool, generally adopts a deep structure and therefore has some unavoidable defects. When the depth of the network structure increases, vanishing or exploding gradients can occur during training. Meanwhile, training a neural network requires a large amount of training data, so training timeliness is a critical problem; in terms of topology, the deep structure is unfavorable for distributed parallel computation, which creates a timeliness bottleneck.
In current machine learning algorithms, multi-dimensional data composed of manually designed features is mostly used directly for model training. As for preprocessing, data is mostly preprocessed according to format normalization and the requirements of the relevant model, e.g., normalization, regularization, and the shifting, rotation and similar operations used in neural networks to augment the training set.
Such processing methods only transform the form in which the data is expressed; they do not process the information in the data more finely, so neither the structure of the model (especially the structure of a neural network model) nor the training process can be optimized at the level of data analysis and processing.
Other methods, such as the cosine transform and Principal Component Analysis (PCA), are mainly used for dimension reduction and feature screening of the input data. By reducing the dimensionality of the input data while retaining the dimensions that play a key role in the model effect, the model complexity is reduced, and data processing efficiency and model accuracy are improved.
Such dimension-reduction methods only screen the data dimensions. They can optimize the model to a certain extent, but they do not improve the expressive capacity of the data; they only reduce the amount of data, cannot optimize the model structurally, and do not fundamentally improve the training efficiency of the model.
Disclosure of Invention
In view of the defects of the prior art, the invention provides a data processing method and device for neural network model optimization that overcome these defects.
In order to achieve this purpose, the technical scheme adopted by the invention is as follows:
a data processing method for neural network model optimization comprises the following steps:
1) receiving input data:
for input vector data X = (x_1, x_2, …, x_N), where x_1, x_2, …, x_N are the dimensions of the input vector, determining the data dimension N and the value x_i of the i-th dimension;
2) Full P-order Cartesian expansion operation:
2.1 determining the highest order P
The value of P is determined according to the data dimension and the specific computing hardware. When the data dimension is large (greater than 1000), P is preferably 2 to 3; otherwise, the value of P can be determined from the memory M of the actually deployed computing hardware and the computation precision λ, taking the P corresponding to the largest expansion dimension that the memory can accommodate, i.e.
(Two formula images in the original give the expansion dimension as a function of N and P, together with the constraint that it fit within the memory M at precision λ.)
2.2 calculation of the Cartesian expansion terms of the respective orders
Construct the k-th-order Cartesian expansion terms according to the value of P:
s = x_1^(p_1) · x_2^(p_2) · … · x_N^(p_N),
where p_j is the power of the j-th dimension x_j, each p_j is a non-negative integer, and the powers satisfy
p_1 + p_2 + … + p_N = k.
For each value of k from 1 to P, all possible product values are then calculated, i.e., all 1st- to P-th-order Cartesian expansion terms composed of the dimensions.
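As a brief illustration: for N = 2 and P = 2, the full expansion consists of the five terms x_1, x_2, x_1^2, x_1·x_2, x_2^2, so the result vector has dimension M = 5.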
3) Outputting the result:
After the calculation in step 2) is finished, all the obtained 1st- to P-th-order Cartesian expansion results are arranged together in a fixed order to form a result vector S = (s_1, s_2, …, s_M), where each s_i is a calculated Cartesian expansion term and M is the dimension of the result vector; the vector is then output.
Preferably, the operation of the P-th-order Cartesian expansion in step 2) may be implemented by matrix multiplication, as follows:
For an input vector X = (x_1, x_2, …, x_N) of dimension N, the 2nd-order Cartesian expansion result is the (N, N) matrix X^T · X formed from the elements of X. For the 3rd-order result, each of the N column vectors of the 2nd-order result is multiplied by X to form an (N, N) matrix, and the N resulting matrices form an (N, N, N) 3-dimensional matrix. Higher-order terms are calculated by analogy, the Cartesian expansion result matrix gaining one dimension each time the order rises by one, until the P-th-order Cartesian expansion result is obtained.
Preferably, before the result vector S in step 2) is output, its dimensions are screened by a dimension-reduction method, the dimensions of relatively low importance are removed, and the screened vector is output as the result.
The invention also discloses a data processing device, comprising an input module, a Cartesian expansion calculation module and an output module;
the input module is used for determining and receiving the multi-dimensional data for calculation, including determining the dimension of the input data and the value of each dimension;
the Cartesian expansion calculation module is used for performing the Cartesian expansion calculation from order 1 to order P on the multi-dimensional input data determined by the input module;
and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result of the Cartesian expansion calculation module.
Further, the Cartesian expansion calculation module comprises an expansion order unit and a multiplication unit;
the expansion order unit is used for setting the highest order of the Cartesian expansion according to the specific problem, the computing equipment and other conditions;
and the multiplication unit is used for calculating the 1st- to P-th-order Cartesian expansion terms among the dimensions of the multi-dimensional data provided by the input module.
Compared with the prior art, the invention has the advantages that:
the dimensionality transformation is carried out on input data by utilizing a multi-order Cartesian expansion algorithm, original input data are mapped into a Cartesian expansion space with higher order, so that the expression capability of the data and the discrimination between different classes are improved, the data change effect the same as that of a deep neural network is realized, further, the neural network becomes possible to become a wide structure from deep structure optimization, distributed parallel computing is supported more effectively, the training efficiency and the training effect of a machine learning model are improved under the condition that the training data volume and the computing capability are the same, and important engineering application significance and research value exist in the fields of artificial intelligence and pattern recognition analysis.
Drawings
FIG. 1 is a schematic representation of Cartesian expansion data processing according to an embodiment of the present invention;
FIG. 2 is a flow chart of a data processing apparatus according to an embodiment of the present invention;
FIG. 3 is a flow chart of a 3-order Cartesian expansion matrix multiplication implementation according to an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings by way of examples.
Example 1
As shown in FIG. 1, a data processing method for neural network model optimization comprises the following steps:
1) receiving input data:
For input vector data X = (x_1, x_2, …, x_N), where x_1, x_2, …, x_N are the dimensions of the input vector, the data dimension N and the value x_i of the i-th dimension are determined.
2) Full P-order Cartesian expansion operation:
2.1 determining the highest order P
The value of P is determined according to the data dimension and the specific computing hardware. When the data dimension is large (greater than 1000), P is preferably 2 to 3; otherwise, the value of P can be determined from the memory M of the actually deployed computing hardware and the computation precision λ, taking the P corresponding to the largest expansion dimension that the memory can accommodate, i.e.
(Two formula images in the original give the expansion dimension as a function of N and P, together with the constraint that it fit within the memory M at precision λ.)
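As a concrete illustration of this rule, the following Python sketch picks the largest admissible P. Since the exact formulas appear only as images in the original, the sketch rests on stated assumptions: that the number of unique terms in the full 1-to-P-order expansion of an N-dimensional vector is the sum over k of C(N+k-1, k) (combinations with replacement), and that λ is expressed as bytes per stored value; the function names are illustrative only.

```python
from math import comb

def expansion_dim(n: int, p: int) -> int:
    # Unique Cartesian expansion terms of orders 1..p for an
    # n-dimensional input: combinations with replacement per order.
    return sum(comb(n + k - 1, k) for k in range(1, p + 1))

def choose_highest_order(n: int, memory_bytes: int, bytes_per_value: int = 8) -> int:
    # Largest P whose full expansion result still fits in memory.
    p = 1
    while expansion_dim(n, p + 1) * bytes_per_value <= memory_bytes:
        p += 1
    return p

# For a 1000-dimensional input and 8 GiB of memory this yields P = 3,
# within the preferred range of 2 to 3 for high-dimensional data.
print(choose_highest_order(1000, 8 * 2**30))
```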
2.2 calculation of the Cartesian expansion terms of the respective orders
Construct the k-th-order Cartesian expansion terms according to the value of P:
s = x_1^(p_1) · x_2^(p_2) · … · x_N^(p_N),
where p_j is the power of the j-th dimension x_j, each p_j is a non-negative integer, and the powers satisfy
p_1 + p_2 + … + p_N = k.
For each value of k from 1 to P, all possible product values are then calculated, i.e., all 1st- to P-th-order Cartesian expansion terms composed of the dimensions.
It should be understood that all the possible Cartesian expansion terms are calculated by a loop-based combinatorial enumeration, so that the repeated terms that the commutativity of multiplication would otherwise produce are avoided.
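A minimal Python sketch of such an enumeration is given below, assuming the standard combinations-with-replacement traversal; the function name is illustrative. Because each multiset of indices is visited exactly once, duplicates such as x_1·x_2 and x_2·x_1 never arise.

```python
from itertools import combinations_with_replacement
from math import prod

def cartesian_expansion(x, p_max):
    # All 1st- to P-th-order Cartesian expansion terms of x.
    # Each index multiset is enumerated once, so commutative
    # duplicates (x1*x2 vs. x2*x1) are avoided by construction.
    terms = []
    for k in range(1, p_max + 1):
        for idx in combinations_with_replacement(range(len(x)), k):
            terms.append(prod(x[i] for i in idx))
    return terms

# For x = (x1, x2) and P = 2: [x1, x2, x1*x1, x1*x2, x2*x2].
print(cartesian_expansion((2.0, 3.0), 2))  # [2.0, 3.0, 4.0, 6.0, 9.0]
```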
3) Outputting the result:
After the calculation in step 2) is finished, all the obtained 1st- to P-th-order Cartesian expansion results are arranged together in a fixed order to form a result vector S = (s_1, s_2, …, s_M), where each s_i is a calculated Cartesian expansion term and M is the dimension of the result vector; the vector is then output.
In the output step, before the result vector S is output, a dimension-screening operation may be performed on it. Depending on the value of P, the method used to calculate the P-th-order Cartesian expansion, and the specific problem, not every dimension of the result vector S plays an important role. Therefore, before the result vector is output, its dimensions can be screened by a dimension-reduction method such as the cosine transform or Principal Component Analysis (PCA); the dimensions of relatively low importance are removed, and the screened vector is output as the result.
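As one possible realization of this screening step, the sketch below applies scikit-learn's PCA to a batch of expanded vectors; the variance threshold and the data are illustrative stand-ins, not values prescribed by the method.

```python
import numpy as np
from sklearn.decomposition import PCA

# One expanded result vector per row: shape (num_samples, M).
rng = np.random.default_rng(0)
S_batch = rng.random((256, 120))  # stand-in data for illustration

# Keep the principal components explaining 99% of the variance,
# discarding the dimensions of relatively low importance.
pca = PCA(n_components=0.99)
S_screened = pca.fit_transform(S_batch)
print(S_batch.shape, "->", S_screened.shape)
```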
As shown in FIG. 2, the data processing apparatus comprises an input module, a Cartesian expansion calculation module and an output module;
the input module is used for determining and receiving the multi-dimensional data for calculation, including determining the dimension of the input data and the value of each dimension;
the Cartesian expansion calculation module is used for performing the Cartesian expansion calculation from order 1 to order P on the multi-dimensional input data determined by the input module;
and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result of the Cartesian expansion calculation module.
The Cartesian expansion calculation module includes an expansion order unit and a multiplication unit;
the expansion order unit is used for setting the highest order of the Cartesian expansion according to the specific problem, the computing equipment and other conditions;
and the multiplication unit is used for calculating the 1st- to P-th-order Cartesian expansion terms among the dimensions of the multi-dimensional data provided by the input module.
In the processing flow, the original data first enters the input module; it is then sent to the Cartesian expansion calculation module, which calculates the Cartesian expansion results of orders 1 to P; the expansion results of each order are then sent to the output module, which completes the high-dimensional fusion and forms the final output data.
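The module structure can be pictured with the following minimal Python sketch; the class and method names are illustrative only and do not reflect any particular implementation of the device.

```python
from itertools import combinations_with_replacement
from math import prod

class InputModule:
    def receive(self, data):
        # Determine the dimension N and the value of each dimension.
        return tuple(float(v) for v in data)

class CartesianExpansionModule:
    def __init__(self, p_max):
        self.p_max = p_max  # highest order, set by the expansion order unit

    def expand(self, x):
        # Multiplication unit: 1st- to P-th-order expansion terms,
        # using the same duplicate-free enumeration as the earlier sketch.
        return [prod(x[i] for i in idx)
                for k in range(1, self.p_max + 1)
                for idx in combinations_with_replacement(range(len(x)), k)]

class OutputModule:
    def emit(self, terms):
        # High-dimensional fusion: one result vector for later processing.
        return tuple(terms)

x = InputModule().receive([1.0, 2.0, 3.0])
s = OutputModule().emit(CartesianExpansionModule(p_max=3).expand(x))
print(len(s))  # 3 + 6 + 10 = 19 expansion terms
```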
Example 2
This example only describes the differences from Example 1.
the operation for P-th cartesian expansion can be implemented by multiplication of the matrix (tensor), as shown in fig. 3, specifically as follows:
for an input vector of dimension n, X ═ X1,x2,…,xN) The 2-order Cartesian expansion results in a matrix X of the form (N, N)TThe 3-order Cartesian expansion result of the elements in X is each column vector of N column vectors in the 2-order result, and then X is multiplied to form a matrix of (N, N), so that the N matrices form a 3-dimensional matrix of (N, N, N). And by analogy, the higher order term is calculated, and the Cartesian expansion result matrix is increased by one dimension when each liter is higher by one step until the P-order Cartesian expansion result is calculated.
The cross terms obtained in this way contain many repetitions, but this approach effectively avoids the low computational efficiency caused by a large number of loops.
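A possible NumPy rendering of this tensor construction is sketched below, under the assumption that each order is obtained as an outer product with X; as the text notes, the cross terms are duplicated, but the vectorized products avoid slow explicit loops.

```python
import numpy as np

def expansion_tensors(x: np.ndarray, p_max: int):
    # Order-k result: rank-k tensor of shape (N, ..., N) whose entries
    # are the products x_i1 * x_i2 * ... * x_ik; each additional order
    # adds one dimension, mirroring the construction described above.
    results = [x]
    for _ in range(2, p_max + 1):
        results.append(np.multiply.outer(results[-1], x))
    return results

x = np.array([1.0, 2.0, 3.0])
r1, r2, r3 = expansion_tensors(x, 3)
print(r2.shape, r3.shape)   # (3, 3) (3, 3, 3)
# Duplicated cross terms, e.g. r2[0, 1] == r2[1, 0] == x1 * x2.
```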
It will be appreciated by those of ordinary skill in the art that the examples described herein are intended to assist the reader in understanding the manner in which the invention is practiced, and it is to be understood that the scope of the invention is not limited to such specifically recited statements and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these changes and combinations are within the scope of the invention.

Claims (5)

1. A data processing method for neural network model optimization is characterized by comprising the following steps:
1) receiving input data:
for input vector data X = (x_1, x_2, …, x_N), where x_1, x_2, …, x_N are the dimensions of the input vector, determining the data dimension N and the value x_i of the i-th dimension;
2) Full P-order Cartesian expansion operation:
2.1 determining the highest order P
determining the value of P according to the data dimension and the specific computing hardware: when the data dimension is large (greater than 1000), P is preferably 2 to 3; otherwise, the value of P can be determined from the memory M of the actually deployed computing hardware and the computation precision λ, taking the P corresponding to the largest expansion dimension that the memory can accommodate, i.e.
(Two formula images in the original give the expansion dimension as a function of N and P, together with the constraint that it fit within the memory M at precision λ.)
2.2 calculation of the Cartesian expansion terms of the respective orders
constructing the k-th-order Cartesian expansion terms according to the value of P:
s = x_1^(p_1) · x_2^(p_2) · … · x_N^(p_N),
where p_j is the power of the j-th dimension x_j, each p_j is a non-negative integer, and the powers satisfy
p_1 + p_2 + … + p_N = k;
calculating, for each value of k from 1 to P, all possible product values according to the Cartesian expansion terms, namely all 1st- to P-th-order Cartesian expansion terms composed of the dimensions;
3) outputting a result:
after the calculation in step 2) is finished, arranging all the obtained 1st- to P-th-order Cartesian expansion results together in a fixed order to form a result vector S = (s_1, s_2, …, s_M), where each s_i is a calculated Cartesian expansion term and M is the dimension of the result vector, and finally outputting it.
2. The data processing method of claim 1, wherein the operation of the P-th-order Cartesian expansion in step 2) may be implemented by matrix multiplication, specifically as follows:
for an input vector X = (x_1, x_2, …, x_N) of dimension N, the 2nd-order Cartesian expansion result is the (N, N) matrix X^T · X formed from the elements of X; for the 3rd-order result, each of the N column vectors of the 2nd-order result is multiplied by X to form an (N, N) matrix, and the N resulting matrices form an (N, N, N) 3-dimensional matrix; higher-order terms are calculated by analogy, the Cartesian expansion result matrix gaining one dimension each time the order rises by one, until the P-th-order Cartesian expansion result is obtained.
3. The data processing method of claim 1, wherein before the result vector S in step 2) is output, its dimensions are screened by a dimension-reduction method, the dimensions of relatively low importance are removed, and the screened vector is output as the result.
4. A data processing apparatus for neural network model optimization, comprising an input module, a Cartesian expansion calculation module and an output module, wherein:
the input module is used for determining and receiving the multi-dimensional data for calculation, including determining the dimension of the input data and the value of each dimension;
the Cartesian expansion calculation module is used for performing the Cartesian expansion calculation from order 1 to order P on the multi-dimensional input data determined by the input module;
and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result of the Cartesian expansion calculation module.
5. The data processing apparatus of claim 4, wherein the Cartesian expansion calculation module includes an expansion order unit and a multiplication unit;
the expansion order unit is used for setting the highest order of the Cartesian expansion according to the specific problem, the computing equipment and other conditions;
and the multiplication unit is used for calculating the 1st- to P-th-order Cartesian expansion terms among the dimensions of the multi-dimensional data provided by the input module.
CN202110002440.0A 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization Active CN112668717B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110002440.0A CN112668717B (en) 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110002440.0A CN112668717B (en) 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization

Publications (2)

Publication Number Publication Date
CN112668717A true CN112668717A (en) 2021-04-16
CN112668717B CN112668717B (en) 2023-06-02

Family

ID=75412620

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110002440.0A Active CN112668717B (en) 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization

Country Status (1)

Country Link
CN (1) CN112668717B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250810A * 2015-06-15 2016-12-21 摩福公司 Method for identifying and/or authenticating an individual by iris recognition
CN106887000A * 2017-01-23 2017-06-23 上海联影医疗科技有限公司 Gridding processing method and system for medical images
WO2018224690A1 (en) * 2017-06-09 2018-12-13 Deepmind Technologies Limited Generating discrete latent representations of input data items
CN107729994A * 2017-11-28 2018-02-23 北京地平线信息技术有限公司 Method and apparatus for performing convolutional layer computation in a convolutional neural network
CN107832842A * 2017-11-28 2018-03-23 北京地平线信息技术有限公司 Method and apparatus for performing convolution operations on folded feature data
CN107944556A * 2017-12-12 2018-04-20 电子科技大学 Deep neural network compression method based on block-term tensor decomposition

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
FRANK A. RUSSO et al.: "Predicting musically induced emotions from physiological inputs: linear and neural network models", Frontiers in Psychology *
HAIFENG LI et al.: "MODENN: A Shallow Broad Neural Network Model Based on Multi-Order Descartes Expansion", IEEE Transactions on Pattern Analysis and Machine Intelligence *
SHU Guansheng: "Research on Application Performance Optimization Based on Computation Offloading in Mobile Cloud", China Doctoral Dissertations Full-text Database, Information Science and Technology *

Also Published As

Publication number Publication date
CN112668717B (en) 2023-06-02

Similar Documents

Publication Publication Date Title
Cheng et al. Model compression and acceleration for deep neural networks: The principles, progress, and challenges
Zhou et al. Rethinking bottleneck structure for efficient mobile network design
Das et al. A group incremental feature selection for classification using rough set theory based genetic algorithm
Howard et al. Mobilenets: Efficient convolutional neural networks for mobile vision applications
Cheng et al. A survey of model compression and acceleration for deep neural networks
EP4036803A1 (en) Neural network model processing method and apparatus, computer device, and storage medium
US11531902B2 (en) Generating and managing deep tensor neural networks
CN109657780A (en) A kind of model compression method based on beta pruning sequence Active Learning
CN110175628A (en) A kind of compression algorithm based on automatic search with the neural networks pruning of knowledge distillation
Guo et al. Sparse deep nonnegative matrix factorization
Wang et al. Tensor networks meet neural networks: A survey and future perspectives
Qi et al. Learning low resource consumption cnn through pruning and quantization
Hu et al. A dynamic pruning method on multiple sparse structures in deep neural networks
Gould et al. Exploiting problem structure in deep declarative networks: Two case studies
Kawase et al. Parametric t-stochastic neighbor embedding with quantum neural network
CN111209530A (en) Tensor decomposition-based heterogeneous big data factor feature extraction method and system
CN112668717A (en) Data processing method and device oriented to neural network model optimization
Li et al. Towards optimal filter pruning with balanced performance and pruning speed
Xia et al. Efficient synthesis of compact deep neural networks
Sun et al. Computation on sparse neural networks and its implications for future hardware
Qian Performance comparison among VGG16, InceptionV3, and resnet on galaxy morphology classification
Liawatimena et al. Performance optimization of maxpool calculation using 4d rank tensor
Girdhar et al. Deep Learning in Image Classification: Its Evolution, Methods, Challenges and Architectures
CN111967243A (en) Text comparison method and equipment
CN113449817B (en) Image classification implicit model acceleration training method based on phantom gradient

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant