CN111914987A - Data processing method and device based on neural network, equipment and readable medium - Google Patents

Data processing method and device based on neural network, equipment and readable medium

Info

Publication number
CN111914987A
CN111914987A (application CN201910390684.3A)
Authority
CN
China
Prior art keywords
input data
processing
binary
channel
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910390684.3A
Other languages
Chinese (zh)
Inventor
张建浩
朱睿
梅涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN201910390684.3A
Publication of CN111914987A
Legal status: Pending (current)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present disclosure provides a data processing method, apparatus, electronic device, and computer-readable medium based on a neural network, relating to the field of data processing. The method includes: performing channel splitting processing on input data of the neural network, and performing channel shuffling processing on the channel-split input data; performing binarization processing on the channel-shuffled input data to obtain binary input data; performing group convolution processing on the binary input data, where the weights in the convolution kernels corresponding to the group convolution processing are binary data; and accumulating, in channel order, the channel-shuffled input data and the binary data produced by the group convolution processing to obtain an output result. The technical solution provided by the embodiments of the present disclosure can improve data processing speed and reduce the loss of data information.

Description

Data processing method and device based on neural network, equipment and readable medium
Technical Field
The present disclosure relates to the field of data processing, and in particular, to a data processing method and apparatus based on a neural network, an electronic device, and a computer-readable medium.
Background
In the related art, the neural networks used in deep learning generally process input data with floating-point computation, but floating-point computation requires a large amount of storage and generates a large amount of calculation, which seriously hinders the application of neural networks on mobile terminals. Owing to potential advantages such as a high model compression rate and fast computation, the binary neural network has become a popular research direction in deep learning in recent years.
However, in the related art, binary neural networks suffer from severe information loss. Finding a method for reducing the information loss of the binary neural network is therefore extremely meaningful for its application.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
In view of the above, the present disclosure provides a data processing method and apparatus based on a neural network, an electronic device, and a computer-readable medium, which can reduce the information loss generated in a binary neural network and reduce memory consumption to improve computation speed.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to a first aspect of the embodiments of the present disclosure, a data processing method based on a neural network is provided, the method including: performing channel splitting processing on input data of the neural network, and performing channel shuffling processing on the channel-split input data; performing binarization processing on the channel-shuffled input data to obtain binary input data; performing group convolution processing on the binary input data, where the weights in the convolution kernels corresponding to the group convolution processing are binary data; and accumulating, in channel order, the channel-shuffled input data and the binary data produced by the group convolution processing to obtain an output result.
In some embodiments, performing the channel shuffling processing on the channel-split input data includes: randomly permuting and recombining the channels of the channel-split input data.
In some embodiments, performing the binarization processing on the channel-shuffled input data includes: if an element of the channel-shuffled input data is greater than a preset threshold, converting it into a first value; and if an element of the channel-shuffled input data is less than or equal to the preset threshold, converting it into a second value.
In some embodiments, the first value is 1 and the second value is -1.
In some embodiments, performing the group convolution processing on the binary input data includes: grouping the binary input data and the convolution kernels in the neural network correspondingly by channel; and performing binary operation processing on each group of binary input data and its corresponding convolution kernels.
In some embodiments, the binary operation processing includes: an exclusive-NOR (XNOR) operation or an exclusive-OR (XOR) operation.
In some embodiments, the convolution kernel is a 3 x 3 convolution kernel.
According to a second aspect of the embodiments of the present disclosure, a data processing apparatus based on a neural network is provided, the apparatus including: a channel shuffling module configured to perform channel splitting processing on input data of the neural network and perform channel shuffling processing on the channel-split input data; a binary processing module configured to perform binarization processing on the channel-shuffled input data to obtain binary input data; a group convolution module configured to perform group convolution processing on the binary input data, where the weights in the convolution kernels corresponding to the group convolution processing are binary data; and an accumulation module configured to accumulate, in channel order, the channel-shuffled input data and the binary data produced by the group convolution processing to obtain an output result.
According to a third aspect of the embodiments of the present disclosure, an electronic device is provided, which includes: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement any of the neural network-based data processing methods described above.
According to a fourth aspect of the embodiments of the present disclosure, a computer-readable medium is proposed, on which a computer program is stored, wherein the program, when executed by a processor, implements the neural network-based data processing method according to any one of the above.
According to the data processing method and apparatus based on a neural network, the electronic device, and the computer-readable medium provided by the embodiments of the present disclosure, on one hand, grouped convolution of the input data is realized by performing operations such as channel splitting and channel shuffling on the input data, which reduces the convolution parameters and improves the data processing speed; on the other hand, the input data is binarized to compress it, which further improves the data processing speed of the neural network; finally, the technical solution provided by the embodiments of the present disclosure uses only a single binarization step, which reduces the information loss of the input data and improves the accuracy of the output result.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. The drawings described below are merely some embodiments of the present disclosure, and other drawings may be derived from those drawings by those of ordinary skill in the art without inventive effort.
Fig. 1 shows a schematic diagram of an exemplary system architecture of a neural network-based data processing method or a neural network-based data processing apparatus applied to an embodiment of the present disclosure.
Fig. 2 is a schematic diagram showing a ShuffleNet architecture according to the related art.
Fig. 3 is a schematic diagram illustrating a ResNet architecture according to the related art.
Fig. 4 is a schematic diagram showing a Bi-RealNet architecture according to the related art.
FIG. 5 is a flow diagram illustrating a neural network-based data processing method in accordance with an exemplary embodiment.
FIG. 6 is a flow chart illustrating another neural network-based data processing method in accordance with an exemplary embodiment.
FIG. 7 is a flow chart illustrating yet another neural network-based data processing method in accordance with an exemplary embodiment.
FIG. 8 is a schematic diagram illustrating a binary neural network, according to an example embodiment.
FIG. 9 is a block diagram illustrating a neural network-based data processing apparatus in accordance with an exemplary embodiment.
FIG. 10 is a block diagram illustrating another neural network-based data processing apparatus in accordance with an exemplary embodiment.
FIG. 11 is a block diagram illustrating yet another neural network-based data processing apparatus in accordance with an exemplary embodiment.
Fig. 12 is a schematic structural diagram of a computer system applied to a data processing apparatus based on a neural network according to an exemplary embodiment.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The same reference numerals denote the same or similar parts in the drawings, and thus, a repetitive description thereof will be omitted.
The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and the like. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the disclosure.
The drawings are merely schematic illustrations of the present disclosure, in which the same reference numerals denote the same or similar parts, and thus, a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and steps, nor do they necessarily have to be performed in the order described. For example, some steps may be decomposed, and some steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
In this specification, the terms "a", "an", "the", "said" and "at least one" are used to indicate the presence of one or more elements/components/etc.; the terms "comprising," "including," and "having" are intended to be inclusive and mean that there may be additional elements/components/etc. other than the listed elements/components/etc.; the terms "first," "second," and "third," etc. are used merely as labels, and are not limiting on the number of their objects.
The following detailed description of exemplary embodiments of the disclosure refers to the accompanying drawings.
Fig. 1 shows a schematic diagram of an exemplary system architecture of a neural network-based data processing method or a neural network-based data processing apparatus to which an embodiment of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may be various electronic devices having display screens and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server that provides various services, for example a background management server that provides support for operations performed by users with the terminal devices 101, 102, 103. The background management server can analyze and process received data such as requests and feed the processing results back to the terminal devices.
The server 105 may, for example, perform channel splitting processing on input data of the neural network, and perform channel shuffling processing on the channel-split input data; perform binarization processing on the channel-shuffled input data to obtain binary input data; perform group convolution processing on the binary input data, where the weights in the convolution kernels corresponding to the group convolution processing are binary data; and accumulate, in channel order, the channel-shuffled input data and the binary data produced by the group convolution processing to obtain an output result.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is only illustrative, and the server 105 may be a physical server or may be composed of a plurality of servers, and there may be any number of terminal devices, networks and servers according to actual needs.
In the related art, the floating-point (real-valued) neural network suffers from high memory consumption and low computation speed. Because of its large memory consumption, it cannot be used on embedded or mobile devices with limited memory. To solve this problem, a binary neural network is usually used in place of the floating-point neural network, reducing the memory consumption of the network and increasing its computation speed.
In the related art, the ShuffleNet architecture shown in fig. 2 and the ResNet architecture shown in fig. 3 are both CNN (convolutional neural network) architectures.
Fig. 2 is a schematic diagram showing a ShuffleNet architecture according to the related art.
As shown in fig. 2, the ShuffleNet architecture includes a first binarization network 202, a first binary group convolution network 203 (employing convolution kernels of 1 × 1 size), a channel shuffle network 204, a second binarization network 205, a binary depthwise separable convolution network 206 (employing convolution kernels of 3 × 3 size), a third binarization network 207, and a binary group convolution network 208 (employing convolution kernels of 1 × 1 size).
Fig. 3 is a schematic diagram illustrating a ResNet architecture according to the related art.
As shown in fig. 3, the ResNet network includes a fourth binarization network 302, a first binary convolution network 303 (employing convolution kernels of 1 × 1 size), a fifth binarization network 304, a second binary group convolution network 305 (employing convolution kernels of 3 × 3 size), a sixth binarization network 307, and a second binary convolution network 308 (employing convolution kernels of 1 × 1 size).
As shown in fig. 2 and fig. 3, both the ShuffleNet network and the ResNet network can be regarded as decomposing one ordinary convolution into several small convolutions, with the data binarized once before each convolution; each binarization causes the input data to lose a great deal of information. In this way, much of the information contained in the data is erased, and the fitting ability of the network is greatly reduced.
In the embodiments of the present disclosure, in order to reduce this loss of information, the number of convolutions in the binary neural network (that is, the number of times the data is binarized) may be reduced, thereby reducing information loss and improving the accuracy of data processing.
In the related art, a binary neural network architecture, namely the Bi-RealNet (binary-real network) architecture, has also been proposed.
Fig. 4 is a schematic diagram showing a Bi-RealNet architecture according to the related art.
As shown in fig. 4, the Bi-RealNet architecture includes a binarization network 402 and a binary convolution network 403 (with convolution kernels of 3 × 3 size). The binary convolution network 403 in the Bi-RealNet architecture shown in fig. 4 uses an ordinary convolution, i.e., the convolution operation is performed on the whole input data, and an ordinary convolution runs more slowly than a group convolution.
In the embodiments of the present disclosure, to overcome the slow operation speed of the ordinary convolution, group convolution may be used to reduce the amount of computation and thus increase the operation speed.
FIG. 5 is a flow diagram illustrating a neural network-based data processing method in accordance with an exemplary embodiment. The method provided by the embodiments of the present disclosure may be performed by any electronic device with computing capability, for example, the server 105 and/or the terminal devices 102 and 103 in the embodiment of fig. 1 described above; in the following embodiments, the server 105 is taken as the execution subject by way of example, but the present disclosure is not limited thereto.
Referring to fig. 5, the neural network-based data processing method provided by the embodiments of the present disclosure may include the following steps.
Step S501, channel splitting processing is carried out on the input data of the neural network, and channel shuffling processing is carried out on the input data after channel splitting.
In some embodiments, the input data of the neural network may include multiple channels; for example, an RGB color image may include three channels: R, G, and B.
In some embodiments, the multiple channels of the input data may be separated; for example, RGB color image data may be split into its R, G, and B channels.
In some embodiments, in order to enhance the expressive power of the neural network, the channel-split input data may be subjected to channel shuffling processing, that is, the channels of the channel-split input data are randomly permuted and recombined. For example, the three channels R, G, B of the channel-split RGB color image data are randomly rearranged to obtain a random channel combination, e.g., G, R, B.
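As a minimal illustrative sketch (not part of the patent text; the array shapes and names are our own assumptions), the channel splitting and channel shuffling of step S501 can be written as follows:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4, 4))          # C x H x W input, e.g. R, G, B planes

channels = np.split(x, x.shape[0], axis=0)  # channel splitting: [R, G, B]
perm = rng.permutation(len(channels))       # random permutation, e.g. G, R, B
shuffled = np.concatenate([channels[i] for i in perm], axis=0)

assert shuffled.shape == x.shape            # shuffling reorders channels, never drops them
```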
Step S502, binarization processing is performed on the channel-shuffled input data to obtain binary input data.
In some embodiments, the binarization processing performed on the channel-shuffled input data may include: if an element of the channel-shuffled input data is greater than a preset threshold, converting it into a first value; and if an element of the channel-shuffled input data is less than or equal to the preset threshold, converting it into a second value.
In some embodiments, the first value is 1 and the second value is -1. Those skilled in the art will appreciate that the two values of the binary data are not limited to 1 and -1, and may instead be 1 and 0, or other binary values. In the embodiments of the present disclosure, the binary values 1 and -1 are used as an example, but the present disclosure is not limited thereto.
In some embodiments, the preset threshold may be set to 10. If an element of the channel-shuffled input data is greater than the preset threshold, it is converted into 1; if it is less than or equal to the preset threshold, it is converted into -1.
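The following is a minimal sketch of this thresholding rule (our own illustration; the threshold of 10 and the values 1/-1 are the example values from the text):

```python
import numpy as np

def binarize(x, threshold=10.0, first=1.0, second=-1.0):
    """Map entries greater than `threshold` to `first` and the rest to `second`."""
    return np.where(x > threshold, first, second)

x = np.array([[12.0, 3.5],
              [10.0, 25.0]])
print(binarize(x))   # [[ 1. -1.]
                     #  [-1.  1.]]
```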
Step S503, group convolution processing is performed on the binary input data, where the weights in the convolution kernels corresponding to the group convolution processing are binary data.
In some embodiments, performing the group convolution processing on the binary input data may include: grouping the binary input data and the convolution kernels in the neural network correspondingly by channel; and performing binary operation processing on each group of binary input data and its corresponding convolution kernels.
Step S504, the channel-shuffled input data and the binary data produced by the group convolution processing are accumulated in channel order to obtain an output result.
According to the technical solution provided by the embodiments of the present disclosure, on one hand, grouped convolution of the input data is realized by performing operations such as channel splitting and channel shuffling on the input data, which greatly reduces the convolution parameters and improves the data processing speed; on the other hand, the input data is binarized to compress it, which further improves the data processing speed of the neural network; finally, the technical solution uses binarization processing only once, which reduces the information loss of the input data and improves the accuracy of the output result.
In some embodiments, the binarization processing of the channel-shuffled input data in the embodiment shown in fig. 5 may include the steps shown in fig. 6.
Step S5021, if an element of the channel-shuffled input data is greater than a preset threshold, it is converted into a first value.
Step S5022, if an element of the channel-shuffled input data is less than or equal to the preset threshold, it is converted into a second value.
In some embodiments, the first value is 1 and the second value is -1. Those skilled in the art will appreciate that the two values of the binary data are not limited to 1 and -1, and may instead be 1 and 0, or other binary values. In the embodiments of the present disclosure, the binary values 1 and -1 are used as an example, but the present disclosure is not limited thereto.
In some embodiments, a binary network is a network whose weights and activation values (feature values) are binarized, whereas a floating-point network is one in which both the weights and the activation values are floating-point data.
In some embodiments, one floating-point value occupies at least 32 bits (16 or 64 bits may also be used), while one binary value occupies only a single bit, so the memory consumption of a binary network model can in theory be many times lower than that of a floating-point network model; the binary network therefore has great advantages in model compression.
In addition, after the input data, activation values, and weights of the neural network are binarized simultaneously, the original 32-bit floating-point multiply-accumulate operations can be replaced by a single exclusive-NOR (XNOR) operation and a single accumulation operation, which gives great potential for model acceleration.
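A minimal sketch (our own illustration, not from the patent) of why this works: encode +1 as bit 1 and -1 as bit 0; the dot product of two {-1, +1} vectors then equals 2 * popcount(XNOR(a, b)) - n:

```python
# Two 5-element binary vectors, +1 encoded as bit 1 and -1 as bit 0.
a_bits, b_bits, n = 0b10110, 0b10011, 5

agree = ~(a_bits ^ b_bits) & ((1 << n) - 1)   # one XNOR, masked to n bits
dot = 2 * bin(agree).count("1") - n           # one popcount plus a subtraction

# Cross-check against the floating-point multiply-accumulate it replaces.
to_pm1 = lambda bits: [1 if (bits >> i) & 1 else -1 for i in range(n)]
assert dot == sum(p * q for p, q in zip(to_pm1(a_bits), to_pm1(b_bits)))
print(dot)   # 1
```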
According to the technical scheme provided by the embodiment, the input data is binarized, so that the memory consumption of the neural network is greatly reduced, and the running speed of the neural network is increased.
In some embodiments, performing the group convolution processing on the binary input data in the embodiment shown in fig. 5 may include the steps shown in fig. 7.
Referring to fig. 7, performing the group convolution processing on the binary input data may include the following steps.
Step S5031, the binary input data and the convolution kernels in the neural network are correspondingly grouped by channel.
In the related art, an ordinary convolution performs the convolution operation on the whole input data: the input data is H1 × W1 × C1; the convolution kernels are of size h1 × w1, and there are C2 of them; the output data obtained by the convolution is then H2 × W2 × C2 (here the spatial resolution is assumed to be unchanged). This convolution is carried out as a single process, which places higher demands on memory capacity.
In the related art, a group convolution divides the input data into g groups (for example, g = 2). It should be noted that the grouping is done along the channel dimension, i.e., certain channels are gathered into one group, and each group contains C1/g channels (C1 denotes the number of channels of the input data and may be, for example, 256). Because the input data changes, the convolution kernels must change accordingly: the number of channels of each kernel in a group becomes C1/g (the spatial size of the kernels does not need to change), and the number of kernels in each group becomes C2/g instead of the original C2. After the kernels of each group are convolved with the input data of the corresponding group, the resulting outputs are combined by concatenation, so the final output data still has C2 channels. That is, once the number of groups g is determined, g identical convolution processes run in parallel; in each process (group), the input data is H1 × W1 × (C1/g), the kernel size is h1 × w1 × (C1/g), there are C2/g kernels, and the output data is H2 × W2 × (C2/g).
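The shape bookkeeping above can be checked with a short sketch (our own, using PyTorch's `groups` argument; the sizes C1 = C2 = 256 and g = 2 follow the example values in the text):

```python
import torch
import torch.nn.functional as F

H1, W1, C1, C2, g = 8, 8, 256, 256, 2
x = torch.randn(1, C1, H1, W1)            # input data: H1 x W1 x C1
w = torch.randn(C2, C1 // g, 3, 3)        # C2 kernels, each with C1/g channels
y = F.conv2d(x, w, padding=1, groups=g)   # g convolution processes in parallel

print(y.shape)   # torch.Size([1, 256, 8, 8]): the output still has C2 channels
```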
In some embodiments, the input data and the convolution kernels in the neural network are grouped along the depth (channel) dimension according to a preset number of groups; that is, the input data of certain channels form one group, and the convolution kernels of the corresponding channels are placed in the corresponding group.
In step S5032, binary operation processing is performed on the grouped binary input data and the corresponding convolution kernel.
In some embodiments, multiple convolution processes may be run in parallel (each group of input data is convolved with the corresponding kernels of that group).
In some embodiments, the floating-point convolution process mainly involves multiply-add operations, whereas the binary convolution process may, in addition to accumulation, include logic operations such as exclusive-NOR (XNOR) operations, exclusive-OR (XOR) operations, and the like.
In some embodiments, the binary operation processing performed on each group of binary input data and its corresponding convolution kernels may include XNOR operations, XOR operations, and the like; the present disclosure does not limit the binary operations, which may be chosen according to the actual needs of those skilled in the art.
According to the technical solution provided by this embodiment, grouping the input data and the convolution kernels in the neural network enables the network to process multiple convolutions in parallel, which greatly reduces memory consumption and increases operation speed.
FIG. 8 is a schematic diagram illustrating a neural network architecture, according to an exemplary embodiment.
The neural network shown in fig. 8 may include: a data input network 801, a channel shuffle network 802, a binary processing network 803, a binary group convolution processing network 804, an accumulation network 805, and a result output network 806.
The data input network 801 may be configured to obtain the input data of the neural network. In some embodiments, the acquired input data of the neural network includes multiple channels. For example, the input data may be an RGB color image, which includes three channels of data: R, G, and B.
The channel shuffling network 802 may be configured to perform channel splitting processing on input data of the neural network, and perform channel shuffling processing on the input data after channel splitting.
In some embodiments, the multiple channels of the input data may be separated; for example, RGB color image data may be split into its R, G, and B channels.
In some embodiments, in order to enhance the expressive power of the neural network, the channel-split input data may be subjected to channel shuffling processing, that is, the channels of the channel-split input data are randomly permuted and recombined. For example, the three channels R, G, B of the channel-split RGB color image data are randomly rearranged to obtain a random channel combination, e.g., G, R, B.
The binary processing network 803 may be configured to perform binarization processing on the channel-shuffled input data to generate binary input data.
In some embodiments, the binarization processing performed on the channel-shuffled input data includes: if an element of the channel-shuffled input data is greater than a preset threshold, converting it into a first value; and if an element of the channel-shuffled input data is less than or equal to the preset threshold, converting it into a second value.
In some embodiments, the first value is 1 and the second value is -1. Those skilled in the art will appreciate that the two values of the binary data are not limited to 1 and -1, and may instead be 1 and 0, or other binary values. In the embodiments of the present disclosure, the binary values 1 and -1 are used as an example, but the present disclosure is not limited thereto.
The binary group convolution processing network 804 may be configured to perform binary group convolution processing on the binary input data, where the weights in the convolution kernels corresponding to the group convolution processing are binary data.
In some embodiments, performing the group convolution processing on the binary input data includes: grouping the binary input data and the convolution kernels in the neural network correspondingly by channel; and performing binary operation processing on each group of binary input data and its corresponding convolution kernels.
In some embodiments, the convolution kernel in the neural network may be a 3 × 3 convolution kernel.
The accumulation network 805 may be configured to accumulate, in channel order, the channel-shuffled input data and the binary data produced by the binary group convolution processing.
A result output network 806 may be configured to output the accumulated processing result.
In the neural network provided by this embodiment, binarizing the input data reduces the memory consumption of the network, the binary group convolution increases its operation speed, and, on this basis, the channel shuffling operation enables information exchange among the channels of different groups. The technical solution provided by the present disclosure can thus reduce the memory consumption of the neural network and reduce information loss.
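Putting the pieces together, the block of fig. 8 can be sketched end to end as follows (our own reconstruction for illustration, not the patent's reference implementation; the threshold, group count, and tensor sizes are assumed values, and the binary convolution is emulated in floating point rather than with bitwise XNOR):

```python
import torch
import torch.nn.functional as F

def block(x, weight, g=2, threshold=0.0):
    # channel splitting + shuffling: randomly permute the channel order
    perm = torch.randperm(x.shape[1])
    shuffled = x[:, perm]

    # single binarization of activations and weights (values 1 / -1)
    one = torch.tensor(1.0)
    xb = torch.where(shuffled > threshold, one, -one)
    wb = torch.where(weight > threshold, one, -one)

    # binary group convolution with 3 x 3 kernels
    y = F.conv2d(xb, wb, padding=1, groups=g)

    # accumulate with the shuffled input in channel order (shortcut addition)
    return shuffled + y

x = torch.randn(1, 8, 16, 16)
w = torch.randn(8, 8 // 2, 3, 3)     # 8 kernels with 8/g = 4 channels each
print(block(x, w).shape)             # torch.Size([1, 8, 16, 16])
```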
FIG. 9 is a block diagram illustrating a neural network-based data processing apparatus in accordance with an exemplary embodiment. Referring to fig. 9, the apparatus 900 includes a channel shuffling module 901, a binary processing module 902, a group convolution module 903, and an accumulation module 904.
The channel shuffling module 901 may be configured to perform channel splitting processing on input data of the neural network and perform channel shuffling processing on the channel-split input data. The binary processing module 902 may be configured to perform binarization processing on the channel-shuffled input data to obtain binary input data. The group convolution module 903 may be configured to perform group convolution processing on the binary input data, where the weights in the convolution kernels corresponding to the group convolution processing are binary data. The accumulation module 904 may be configured to accumulate, in channel order, the channel-shuffled input data and the binary data produced by the group convolution processing to obtain an output result.
In some embodiments, the channel shuffling module 901 may be further configured to randomly permute and recombine the channels of the channel-split input data.
In some embodiments, the binary processing module 902 in the embodiment shown in fig. 9 may include a first value determination unit 9021 and a second value determination unit 9022 as shown in fig. 10.
The first value determination unit 9021 may be configured to convert an element of the channel-shuffled input data into a first value if it is greater than a preset threshold, and the second value determination unit 9022 may be configured to convert an element of the channel-shuffled input data into a second value if it is less than or equal to the preset threshold.
In some embodiments, the first value is 1 and the second value is -1.
In some embodiments, the group convolution module 903 illustrated in the embodiment of fig. 9 may include a grouping unit 9031 and a binary operation unit 9032 illustrated in fig. 11.
The grouping unit 9031 may be configured to correspondingly group the binary input data and the convolution kernels in the neural network by channel; the binary operation unit 9032 may be configured to perform binary operation processing on each group of binary input data and its corresponding convolution kernels.
In some embodiments, the binary operation processing includes: an exclusive-NOR (XNOR) operation or an exclusive-OR (XOR) operation.
Since each functional module of the data processing apparatus 900 based on the neural network according to the exemplary embodiment of the present disclosure corresponds to the step of the above-described exemplary embodiment of the data processing method based on the neural network, it is not described herein again.
Referring now to FIG. 12, shown is a block diagram of a computer system 1200 suitable for use in implementing a terminal device of an embodiment of the present application. The terminal device shown in fig. 12 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 12, the computer system 1200 includes a central processing unit (CPU) 1201, which can perform various appropriate actions and processes in accordance with a program stored in a read-only memory (ROM) 1202 or a program loaded from a storage section 1208 into a random access memory (RAM) 1203. The RAM 1203 also stores various programs and data necessary for the operation of the system 1200. The CPU 1201, the ROM 1202, and the RAM 1203 are connected to one another by a bus 1204. An input/output (I/O) interface 1205 is also connected to the bus 1204.
The following components are connected to the I/O interface 1205: an input section 1206 including a keyboard, a mouse, and the like; an output portion 1207 including a display device such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 1208 including a hard disk and the like; and a communication section 1209 including a network interface card such as a LAN card, a modem, or the like. The communication section 1209 performs communication processing via a network such as the internet. A driver 1210 is also connected to the I/O interface 1205 as needed. A removable medium 1211, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 1210 as necessary, so that a computer program read out therefrom is mounted into the storage section 1208 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1209, and/or installed from the removable medium 1211. The computer program performs the above-described functions defined in the system of the present application when executed by the Central Processing Unit (CPU) 1201.
It should be noted that the computer readable medium shown in the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor and may, for example, be described as: a processor including a transmitting unit, an obtaining unit, a determining unit, and a first processing unit. In some cases, the names of these units do not constitute a limitation on the units themselves.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to: perform channel splitting processing on input data of the neural network, and perform channel shuffling processing on the channel-split input data; perform binarization processing on the channel-shuffled input data to obtain binary input data; perform group convolution processing on the binary input data, where the weights in the convolution kernels corresponding to the group convolution processing are binary data; and accumulate, in channel order, the channel-shuffled input data and the binary data produced by the group convolution processing to obtain an output result.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution of the embodiment of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.), and includes several instructions for enabling a computing device (which may be a personal computer, a server, a mobile terminal, or a smart device, etc.) to execute the method according to the embodiment of the present disclosure, such as one or more of the steps shown in fig. 5.
Furthermore, the above-described figures are merely schematic illustrations of processes included in methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the disclosure is not limited to the details of construction, the arrangements of the drawings, or the manner of implementation that have been set forth herein, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. A data processing method based on a neural network, comprising:
performing channel splitting processing on input data of the neural network, and performing channel shuffling processing on the channel-split input data;
performing binarization processing on the channel-shuffled input data to obtain binary input data;
performing group convolution processing on the binary input data, wherein the weights in the convolution kernels corresponding to the group convolution processing are binary data; and
accumulating, in channel order, the channel-shuffled input data and the binary data produced by the group convolution processing to obtain an output result.
2. The method of claim 1, wherein performing the channel shuffling processing on the channel-split input data comprises:
randomly permuting and recombining the channels of the channel-split input data.
3. The method according to claim 1, wherein the binarization processing performed on the channel-shuffled input data comprises:
if an element of the channel-shuffled input data is greater than a preset threshold, converting it into a first value; and
if an element of the channel-shuffled input data is less than or equal to the preset threshold, converting it into a second value.
4. The method of claim 3, wherein the first value is 1 and the second value is -1.
5. The method of claim 1, wherein performing the group convolution processing on the binary input data comprises:
grouping the binary input data and the convolution kernels in the neural network correspondingly by channel; and
performing binary operation processing on each group of binary input data and its corresponding convolution kernels.
6. The method according to claim 5, wherein the binary operation processing comprises: an exclusive-NOR (XNOR) operation or an exclusive-OR (XOR) operation.
7. The method of claim 5, wherein the convolution kernel is a 3 x 3 convolution kernel.
8. A data processing apparatus based on a neural network, comprising:
a channel shuffling module configured to perform channel splitting processing on input data of the neural network and perform channel shuffling processing on the channel-split input data;
a binary processing module configured to perform binarization processing on the channel-shuffled input data to obtain binary input data;
a group convolution module configured to perform group convolution processing on the binary input data, wherein the weights in the convolution kernels corresponding to the group convolution processing are binary data; and
an accumulation module configured to accumulate, in channel order, the channel-shuffled input data and the binary data produced by the group convolution processing to obtain an output result.
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
10. A computer-readable medium, on which a computer program is stored, which program, when being executed by a processor, is adapted to carry out the method of any one of claims 1-7.
CN201910390684.3A 2019-05-10 2019-05-10 Data processing method and device based on neural network, equipment and readable medium Pending CN111914987A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910390684.3A CN111914987A (en) 2019-05-10 2019-05-10 Data processing method and device based on neural network, equipment and readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910390684.3A CN111914987A (en) 2019-05-10 2019-05-10 Data processing method and device based on neural network, equipment and readable medium

Publications (1)

Publication Number Publication Date
CN111914987A

Family

ID=73242939

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910390684.3A Pending CN111914987A (en) 2019-05-10 2019-05-10 Data processing method and device based on neural network, equipment and readable medium

Country Status (1)

Country Link
CN (1) CN111914987A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113177638A (en) * 2020-12-11 2021-07-27 联合微电子中心(香港)有限公司 Processor and method for generating binarization weights for neural networks
CN112907600A (en) * 2021-03-10 2021-06-04 江苏禹空间科技有限公司 Optimization method and system of target detection model
CN112907600B (en) * 2021-03-10 2024-05-24 无锡禹空间智能科技有限公司 Optimization method and system of target detection model
CN113743582A (en) * 2021-08-06 2021-12-03 北京邮电大学 Novel channel shuffling method and device based on stack shuffling
CN113743582B (en) * 2021-08-06 2023-11-17 北京邮电大学 Novel channel shuffling method and device based on stack shuffling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination