CN107423817B - Method and device for realizing deep learning - Google Patents


Info

Publication number
CN107423817B
CN107423817B (application CN201710250317.4A)
Authority
CN
China
Prior art keywords
neural network
deep learning
parameters
layer
learning model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710250317.4A
Other languages
Chinese (zh)
Other versions
CN107423817A (en)
Inventor
周潇
杨俊�
陆天明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Transwarp Technology Shanghai Co Ltd
Original Assignee
Transwarp Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Transwarp Technology Shanghai Co Ltd filed Critical Transwarp Technology Shanghai Co Ltd
Priority to CN201710250317.4A priority Critical patent/CN107423817B/en
Publication of CN107423817A publication Critical patent/CN107423817A/en
Application granted granted Critical
Publication of CN107423817B publication Critical patent/CN107423817B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods


Abstract

A method and a device for implementing deep learning are provided. An intermediate language layer is defined; the data format in the intermediate language layer is acquired through an adapter, converted into the format required for training, and a neural network graph is constructed; a training data set containing labels is selected to train the neural network graph, yielding a deep learning model; the deep learning model is then tested with a labeled test data set, and deep learning is realized according to the test result. With the defined intermediate language layer and the adapters of the deep learning frameworks, the user only needs to define the deep learning network topology in the intermediate language layer, and an adapter translates it into the programming language supported by the target framework. The user does not need to additionally learn the programming-language interface and the corresponding structure of each framework, which greatly reduces the learning cost and eliminates most repeated labor.

Description

Method and device for realizing deep learning
Technical Field
The present application relates to the field of computers, and in particular, to a method and an apparatus for implementing deep learning.
Background
In the fields of data mining and deep learning, the available deep learning frameworks are built on different principles, support different models, and use different batch gradient update modes (synchronous or asynchronous), which leads to certain limitations: first, the framework environment must be built on a cluster; second, in order to reproduce a deep learning model the user must invest time in learning the framework, which requires additional effort and greatly reduces efficiency.
Summary of the application
An object of the present application is to provide a method and a device for implementing deep learning, which address the problems in the prior art that an environment for the deep learning framework must be built on a cluster and that the user must additionally learn the programming-language interface provided by each framework.
According to one aspect of the application, a method for implementing deep learning is provided, and the method comprises the following steps:
defining an intermediate language layer, wherein the intermediate language layer describes, in a uniform data format, the dependencies among the layers of a neural network, the parameters of each neural network layer, and the parameters of the neural network;
acquiring the data format in the intermediate language layer through an adapter, converting it into the format required for training, and constructing a neural network graph;
selecting a training data set containing labels to train the neural network graph, so as to obtain a deep learning model;
and testing the deep learning model with a test data set containing labels, and realizing deep learning according to the test result.
Further, the dependency relationship includes:
each node in each neural network layer in the neural network depends on all nodes of the neural network layer above.
Further, constructing the neural network graph comprises:
constructing the neural network graph according to the inter-layer dependencies, the parameters of each neural network layer, and the parameters of the neural network after they have been converted into the training format.
Further, simultaneously with or after constructing the neural network graph, the method comprises:
constructing a back-propagation graph for gradient computation according to the inter-layer dependencies, the parameters of each neural network layer, and the parameters of the neural network after they have been converted into the training format.
Further, the method comprises:
iteratively querying, according to the constructed back-propagation graph, the dependencies among the layers in the neural network graph, the parameters of each neural network layer, and the parameters of the neural network, and optimizing the neural network graph according to the result of the iterative query.
Further, testing the deep learning model with a test data set containing labels comprises:
making predictions on the deep learning model with the obtained test data set containing actual label values, to generate predicted label values;
and obtaining the accuracy of the deep learning model test by comparing the predicted label values with the actual label values.
Further, realizing deep learning according to the test result comprises:
predicting a data set without labels through the deep learning model according to the accuracy of the deep learning model test, to generate a predicted data set with labels.
Further, after selecting a training data set containing labels to train the neural network graph and obtaining a deep learning model, the method comprises:
converting the data format of the deep learning model into a preset uniform data format.
According to another aspect of the present application, there is also provided an apparatus for implementing deep learning, the apparatus including:
the defining device is used for defining an intermediate language layer, wherein the intermediate language layer describes, in a uniform data format, the dependencies among the layers of a neural network, the parameters of each neural network layer, and the parameters of the neural network;
the constructing device is used for acquiring the data format in the intermediate language layer through an adapter, converting it into the format required for training, and constructing a neural network graph;
the training device is used for selecting a training data set containing labels to train the neural network graph, so as to obtain a deep learning model;
and the implementing device is used for testing the deep learning model with a test data set containing labels, and realizing deep learning according to the test result.
Further, the dependency relationship includes:
each node in each neural network layer in the neural network depends on all nodes of the neural network layer above.
Further, the constructing device is configured to:
construct the neural network graph according to the inter-layer dependencies, the parameters of each neural network layer, and the parameters of the neural network after they have been converted into the training format.
Further, the constructing device is configured to:
construct a back-propagation graph for gradient computation according to the inter-layer dependencies, the parameters of each neural network layer, and the parameters of the neural network after they have been converted into the training format.
Further, the apparatus comprises:
an optimizing device, configured to iteratively query, according to the constructed back-propagation graph, the dependencies among the layers in the neural network graph, the parameters of each neural network layer, and the parameters of the neural network, and to optimize the neural network graph according to the result of the iterative query.
Further, the implementing device is configured to:
make predictions on the deep learning model with the obtained test data set containing actual label values, to generate predicted label values;
and obtain the accuracy of the deep learning model test by comparing the predicted label values with the actual label values.
Further, the implementing device is configured to:
predict a data set without labels through the deep learning model according to the accuracy of the deep learning model test, to generate a predicted data set with labels.
Further, the training device is configured to:
convert the data format of the deep learning model into a preset uniform data format.
Compared with the prior art, the present application defines an intermediate language layer which describes, in a uniform data format, the dependencies among the layers of a neural network, the parameters of each neural network layer, and the parameters of the neural network; acquires the data format in the intermediate language layer through an adapter, converts it into the format required for training, and constructs a neural network graph; selects a training data set containing labels to train the neural network graph, obtaining a deep learning model; and tests the deep learning model with a labeled test data set, realizing deep learning according to the test result. With the defined intermediate language layer and the adapters of the deep learning frameworks, no matter which programming language a given framework supports, the user only needs to define the deep learning network topology in the intermediate language layer, and an adapter translates it into the programming language supported by the target framework. The user does not need to learn the programming-language interface and the corresponding structure of each framework, which greatly reduces the learning cost and eliminates most repeated labor.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 illustrates a flowchart of a method for implementing deep learning according to an aspect of the present application;
FIG. 2 illustrates a general deep learning framework design in an embodiment of the present application;
FIG. 3 illustrates a schematic structural diagram of a device for implementing deep learning according to another aspect of the present application.
The same or similar reference numbers in the drawings identify the same or similar elements.
Detailed Description
The present application is described in further detail below with reference to the attached figures.
In a typical configuration of the present application, the terminal, the device serving the network, and the trusted party each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media does not include transitory media, such as modulated data signals and carrier waves.
Fig. 1 shows a flowchart of a method for implementing deep learning according to an aspect of the present application. The method comprises steps S11 to S14. In step S11, an intermediate language layer is defined, wherein the intermediate language layer describes, in a uniform data format, the dependencies among the layers of a neural network, the parameters of each neural network layer, and the parameters of the neural network. In step S12, the data format in the intermediate language layer is acquired through an adapter, converted into the format required for training, and a neural network graph is constructed. In step S13, a training data set containing labels is selected to train the neural network graph, so as to obtain a deep learning model. In step S14, the deep learning model is tested with a test data set containing labels, and deep learning is realized according to the test result. With the defined intermediate language layer and the adapters of the deep learning frameworks, no matter which programming language a given framework supports, the user only needs to define the deep learning network topology in the intermediate language layer, and an adapter translates it into the programming language supported by the target framework; the user does not need to learn the programming-language interface and the corresponding structure of each framework, which greatly reduces the learning cost and eliminates most repeated labor.
Specifically, in step S11, an intermediate language layer is defined, wherein the intermediate language layer describes, in a uniform data format, the dependencies among the layers of a neural network, the parameters of each neural network layer, and the parameters of the neural network. The dependencies may be as follows: each node in each neural network layer depends on all nodes of the layer above it. In one embodiment of the application, a universal intermediate language layer is defined to express the network topologies of various deep learning frameworks; such a topology expresses the dependencies between the neural network layers, namely that each node in each layer depends on all nodes of the layer above. The defined intermediate language layer captures these dependencies by defining each node of the neural network, and contains parameters of the neural network as a whole, such as the learning rate, the number of iterations and the decay coefficient, as well as parameters of each neural network layer, such as the number of nodes and the activation function, which need not be identical across layers. By defining a universal intermediate language layer, a user who has defined a single deep learning network topology can run tests on every framework supported by the intermediate language layer, which gives the approach good extensibility.
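As a minimal sketch of what such an intermediate-language definition might look like, the topology below describes network-level parameters, per-layer parameters, and the rule that every layer depends on all nodes of the layer above. All field names here are illustrative assumptions, not the patent's actual schema.

```python
# Hypothetical intermediate-language definition: network-level parameters
# (learning rate, iterations, decay) plus per-layer parameters (node count,
# activation), listed in dependency order from input to output.
topology = {
    "network_params": {
        "learning_rate": 0.01,
        "iterations": 100,
        "decay": 0.9,
    },
    "layers": [
        {"name": "input",   "nodes": 4},
        {"name": "hidden1", "nodes": 8, "activation": "relu"},
        {"name": "output",  "nodes": 2, "activation": "softmax"},
    ],
}

def dependencies(topology):
    """Each layer depends on the layer above it (all of its nodes)."""
    deps = {}
    layers = topology["layers"]
    for lower, upper in zip(layers[1:], layers[:-1]):
        deps[lower["name"]] = upper["name"]
    return deps
```

Because the dependency rule is fixed (each node depends on all nodes of the layer above), the topology only needs to record layer order; the full node-to-node edges can be derived.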
Specifically, in step S12, the data format in the intermediate language layer is acquired through an adapter, converted into the format required for training, and a neural network graph is constructed. In an embodiment of the present application, the uniform data format of the intermediate language layer is converted into different code by the adapter of each deep learning framework. Specifically, the adapter reads the intermediate language layer containing the network topology, obtains the dependencies in the uniform data format together with the parameters of the neural network and of each neural network layer, and converts this uniform data format into the format required for training, that is, into the data format of the target deep learning framework. For example, if the target deep learning framework is TensorFlow, the adapter translates the intermediate language layer into code against the Python API (application programming interface) provided by TensorFlow, and converts the data into the format required for TensorFlow training; the model parameters are then initialized in the language of the target deep learning framework, and the corresponding neural network graph is constructed layer by layer. As will be appreciated by those skilled in the art, Python is an object-oriented, interpreted programming language.
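The translation step can be sketched as follows: a toy adapter walks the layer list of the intermediate representation and emits one target-framework call per layer, wiring each layer to the one above. The emitted Keras-style `Input`/`Dense` strings and the function name are illustrative assumptions, not the patent's actual adapter API.

```python
# Hypothetical adapter: translate an intermediate-language layer list into
# target-framework code, one statement per layer, in dependency order.
def tensorflow_adapter(layers):
    lines = []
    for i, layer in enumerate(layers):
        if i == 0:
            # first layer has no layer above it: it becomes the input
            lines.append(f"x = Input(shape=({layer['nodes']},))")
        else:
            # every other layer consumes the output of the layer above
            act = layer.get("activation", "linear")
            lines.append(f"x = Dense({layer['nodes']}, activation='{act}')(x)")
    return "\n".join(lines)

code = tensorflow_adapter([
    {"name": "input",   "nodes": 4},
    {"name": "hidden1", "nodes": 8, "activation": "relu"},
    {"name": "output",  "nodes": 2, "activation": "softmax"},
])
```

A real adapter would emit code for whichever framework is targeted; only this translation layer changes per framework, while the intermediate-language topology stays the same.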
Specifically, in step S13, a training data set containing labels is selected to train the neural network graph, so as to obtain a deep learning model. In an embodiment of the application, a batch of labeled input data is received and randomly divided into a training data set and a test data set; the neural network is trained under the target deep learning framework, and a deep learning model is output. For example, suppose there is a set of historical user features and behavior data, such as age, gender, time of last login to the website, and login period, and each user carries a label: whether the user has committed fraud. A deep learning algorithm trains on this data to obtain a deep learning model that can, to a certain degree, identify the label from the features.
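The random division into training and test sets can be sketched as below; the 80/20 split ratio and the `(features, label)` sample shape are assumptions for illustration, not values from the patent.

```python
import random

# Randomly divide a labeled data set into a training set and a test set.
def split_dataset(samples, train_fraction=0.8, seed=42):
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)   # deterministic shuffle for the sketch
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Each sample: (features, label) — e.g. user behaviour features and a fraud flag.
data = [([age, gender], age % 2) for age, gender in zip(range(100), [0, 1] * 50)]
train, test = split_dataset(data)
```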
Next, in step S14, the deep learning model is tested with a test data set containing labels, and deep learning is realized according to the test result. In this embodiment, the labeled test data set is predicted on the trained model to obtain the test accuracy of the model; according to that accuracy, label values are then predicted for data sets without labels, or used for subsequent data mining, machine learning, and the like. This greatly reduces the user's learning cost: the network topology does not need to be reimplemented under each deep learning framework, which eliminates most repeated labor, and only the topology definition and parameter tuning are needed to reach the optimal result.
In an embodiment of the present application, in step S12, the neural network graph is constructed according to the inter-layer dependencies, the parameters of each neural network layer, and the parameters of the neural network after they have been converted into the training format. The inter-layer dependencies and the parameters are initialized first; this initialization converts their uniform data format, via the adapter, into the data format required for training, and each layer of the neural network is then constructed from the parameter information in the converted format.
Further, in step S12, a back-propagation graph for gradient computation is constructed according to the inter-layer dependencies, the parameters of each neural network layer, and the parameters of the neural network after they have been converted into the training format. In this embodiment, as deep learning requires, the back-propagation graph for gradient computation is built at the same time as, or after, the neural network graph, so that iterative queries can be performed later. That is, the method may include step S12': iteratively querying, according to the constructed back-propagation graph, the dependencies among the layers in the neural network graph, the parameters of each neural network layer, and the parameters of the neural network, and optimizing the neural network graph according to the result of the iterative query. The constructed back-propagation graph is used to iteratively check information such as the number of nodes in the network, the dependencies among nodes, whether any parameters of the neural network are missing, and whether the parameters of each neural network layer are accurate; the constructed neural network graph is then optimized according to the result, which improves the accuracy of the subsequently trained deep learning model.
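The iterative consistency check described above can be illustrated as a simple walk over the layers of a graph, flagging missing parameters and broken dependencies. The field names (`nodes`, `depends_on`) and the check rules are assumptions for illustration, not the patent's actual query mechanism.

```python
# Hypothetical iterative query over a network graph: report layers with
# missing parameters or dependencies on unknown layers.
def validate_graph(layers, required=("nodes",)):
    problems = []
    names = set()
    for i, layer in enumerate(layers):
        for key in required:
            if key not in layer:
                problems.append(f"layer {i}: missing '{key}'")
        # every layer after the first must depend on an already-seen layer
        if i > 0 and layer.get("depends_on") not in names:
            problems.append(f"layer {i}: unknown dependency {layer.get('depends_on')!r}")
        names.add(layer.get("name"))
    return problems

issues = validate_graph([
    {"name": "input",  "nodes": 4},
    {"name": "hidden", "nodes": 8, "depends_on": "input"},
    {"name": "output", "depends_on": "hiden"},  # typo and missing node count
])
```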
In an embodiment of the present application, in step S14, the obtained test data set containing actual label values is predicted on the deep learning model to generate predicted label values, and the accuracy of the deep learning model test is obtained by comparing the predicted label values with the actual label values. A deep learning model is trained with the training set; the test data set containing actual labels is then evaluated on the trained model, the predicted label values are checked against the real label values, and the test accuracy of the model is output. For example, given the historical user features and behavior data above (age, gender, time of last login to the website, login period), with each user labeled as fraudulent or not, the deep learning model trained on this data also classifies part of the existing data by its features; the resulting judgment label, namely whether the model considers that the user has committed fraud, is compared with the user's real label to obtain the accuracy of the model on this data. It should be noted that the test accuracy is determined by parameters of the neural network, such as the number of iterations and the learning rate, and can be improved by adjusting these parameters; in addition, the test method may be cross-validation.
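The accuracy computation described above amounts to comparing predicted labels with actual labels over the test set, as in this minimal sketch (the example label values are invented for illustration):

```python
# Fraction of test samples whose predicted label matches the actual label.
def accuracy(predicted, actual):
    assert len(predicted) == len(actual), "one prediction per test sample"
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# e.g. fraud (1) / no-fraud (0) labels for five test users
acc = accuracy([1, 0, 0, 1, 1], [1, 0, 1, 1, 0])
```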
In an embodiment of the present application, in step S13, the data format of the deep learning model is converted into a preset uniform data format. The deep learning model is exported by the converter in a uniform format, which may be, for example:
Neural network model: {
    Neural network layer 1: {
        Node 1: {
            Weight: xx
            Bias: xxx
        },
        Node 2: {
            Weight: xx
            Bias: xxx
        }
    }
}
In this way, the deep learning model has good extensibility: even as the set of supported deep learning frameworks changes, the user only needs to inherit from and implement the abstract deep learning framework adapter, executor, and converter in order to run on multiple different frameworks, which greatly reduces the user's development and learning costs and greatly improves research efficiency.
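The uniform-format export and re-import can be sketched as a simple serialization round trip mirroring the example format above; using JSON here, and the particular weight and bias values, are assumptions for illustration rather than the patent's prescribed encoding.

```python
import json

# A trained model laid out per the uniform format: layers -> nodes -> weight/bias.
model = {
    "neural network layer 1": {
        "node 1": {"weight": 0.5,  "bias": 0.1},
        "node 2": {"weight": -0.3, "bias": 0.0},
    }
}

serialized = json.dumps(model)    # converter exports the uniform format
restored = json.loads(serialized) # read directly next time, framework-independent
```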
In an embodiment of the present application, in step S14, a data set without labels is predicted through the deep learning model according to the accuracy of the deep learning model test, so as to generate a predicted data set with labels. Here, when the test accuracy meets the requirement, the model's output at prediction time can be considered close to the true value, so the unlabeled data set is predicted on the trained deep learning model, stored in the uniform format, whose accuracy meets the requirement, and the predicted result is output. For example, with a fraud-prediction model generated by training: when a new user registers, although no fraud has yet occurred and no label exists, the model can judge from the user's behavior features whether the user is likely to commit fraud in the future, so that precautions can be taken in advance. It should be noted that when the tested deep learning model is used for prediction, the input must conform to the format accepted by the trained model; that is, the data to be predicted must have the same format as the data used in training.
FIG. 2 shows a general deep learning framework design in an embodiment of the present application, comprising a network topology, a deep learning adapter factory containing different adapters, a deep learning executor factory containing a plurality of different executors, and a model converter. The network topology is constructed by defining the intermediate language layer, which uniformly represents the topology of the neural network. The adapter required by the target deep learning framework, such as a TensorFlow adapter or an MXNet adapter, is selected from the deep learning adapter factory, and the selected adapter converts the uniform data format of the intermediate language layer into code for the corresponding framework. The executor pre-installed with each deep learning framework then runs the code generated by the adapter, produces a deep learning model, and passes it to the converter layer, which converts the model into the uniform format so that it can be read directly next time. A universal interface layer is thus erected over the deep learning frameworks, ensuring that a neural network structure designed by the user runs seamlessly on multiple different frameworks. The design is mainly applicable to the fields of data mining, machine learning, and deep learning; it greatly reduces the user's development and learning costs and greatly improves research efficiency. With this general framework design, the user only needs to define the deep learning network topology in the intermediate language layer, and the adapter translates it into the programming language supported by the target framework, so the topology is not tied to any programming language.
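The adapter-factory part of the FIG. 2 design can be sketched as follows: a factory maps a framework name to a concrete adapter that subclasses a common abstract adapter. The class names, the `translate` method, and the emitted strings are assumptions for illustration, not the patent's actual interfaces.

```python
# Hypothetical abstract adapter plus a factory keyed by framework name,
# mirroring the "deep learning adapter factory" of FIG. 2.
class Adapter:
    def translate(self, topology):
        raise NotImplementedError

class TensorFlowAdapter(Adapter):
    def translate(self, topology):
        return f"# TensorFlow code for {len(topology['layers'])} layers"

class MXNetAdapter(Adapter):
    def translate(self, topology):
        return f"# MXNet code for {len(topology['layers'])} layers"

ADAPTERS = {"tensorflow": TensorFlowAdapter, "mxnet": MXNetAdapter}

def adapter_factory(framework):
    """Select the adapter for the target framework; KeyError if unsupported."""
    return ADAPTERS[framework]()

code = adapter_factory("tensorflow").translate({"layers": [{}, {}, {}]})
```

Supporting a new framework then means registering one more adapter class, leaving the intermediate-language topology and the rest of the pipeline unchanged, which is the extensibility the design aims for.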
Fig. 3 shows a schematic structural diagram of a device for implementing deep learning according to another aspect of the present application. The device 1 comprises a defining device 11, a constructing device 12, a training device 13, and an implementing device 14. The defining device 11 is used for defining an intermediate language layer, wherein the intermediate language layer describes, in a uniform data format, the dependencies among the layers of a neural network, the parameters of each neural network layer, and the parameters of the neural network; the constructing device 12 is used for acquiring the data format in the intermediate language layer through an adapter, converting it into the format required for training, and constructing a neural network graph; the training device 13 is used for selecting a training data set containing labels to train the neural network graph, so as to obtain a deep learning model; and the implementing device 14 is used for testing the deep learning model with a test data set containing labels, and realizing deep learning according to the test result. With the defined intermediate language layer and the adapters of the deep learning frameworks, no matter which programming language a given framework supports, the user only needs to define the deep learning network topology in the intermediate language layer, and an adapter translates it into the programming language supported by the target framework; the user does not need to learn the programming-language interface and the corresponding structure of each framework, which greatly reduces the learning cost and eliminates most repeated labor.
Here, the device 1 includes, but is not limited to, user equipment, or equipment formed by integrating user equipment and a network device through a network. The user equipment includes, but is not limited to, any mobile electronic product capable of human-computer interaction through a touch panel, such as a smartphone or a PDA, running any operating system, such as Android or iOS. The network device includes an electronic device capable of automatically performing numerical calculation and information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like. The network includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a VPN, a wireless ad hoc network, and the like. Preferably, the device 1 may also be a script program running on the user equipment, or on equipment formed by integrating the user equipment with a network device, a touch terminal, or a network device and a touch terminal through a network. Of course, those skilled in the art will appreciate that the above device 1 is merely exemplary, and other existing or future devices, as may be suitable for the present application, are also intended to be encompassed within its scope and are hereby incorporated by reference.
Specifically, the defining device 11 is configured to define an intermediate language layer, where the intermediate language layer comprises, in a uniform data format, the dependency relationships between the layers of a neural network, the parameters of each neural network layer, and the parameters of the neural network. The dependency relationship may include: each node in each neural network layer of the neural network depends on all nodes of the neural network layer above it. In one embodiment of the application, a universal intermediate language layer is defined to express the network topology structures of various deep learning frameworks; these topology structures express the dependency relationships between neural network layers, namely that each node in each layer depends on all nodes of the layer above. In this embodiment, the defined intermediate language layer expresses these dependencies by defining each node of the neural network, and contains both parameters of the neural network as a whole, such as the learning rate, number of iterations and decay coefficient, and parameters of each neural network layer, such as the number of nodes and the activation function, which need not be the same from layer to layer. By defining a universal intermediate language layer, a user can test against all frameworks supported by that layer while defining only one deep learning network topology, which gives the approach good extensibility.
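As a concrete illustration, the intermediate language layer described above might be expressed as a nested data structure holding the network-level parameters, the per-layer parameters, and the fully-connected dependencies. This is a minimal sketch; every field name here is an illustrative assumption, not the patent's actual schema:

```python
# Hypothetical sketch of an intermediate-language network definition.
# Field names ("network_params", "layers", "dependencies") are assumptions.
network_definition = {
    "network_params": {          # parameters of the neural network as a whole
        "learning_rate": 0.01,
        "iterations": 1000,
        "decay": 0.9,
    },
    "layers": [                  # parameters of each neural network layer
        {"name": "input",  "nodes": 4, "activation": None},
        {"name": "hidden", "nodes": 8, "activation": "relu"},
        {"name": "output", "nodes": 1, "activation": "sigmoid"},
    ],
    # Fully-connected dependency: every node in a layer depends on
    # all nodes of the layer above it.
    "dependencies": [("input", "hidden"), ("hidden", "output")],
}
```

An adapter for any target framework would read this one definition rather than requiring the topology to be rewritten per framework.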
Specifically, the constructing device 12 is configured to obtain the data format of the intermediate language layer through an adapter, convert it into the training format required for training, and construct a neural network graph. In an embodiment of the present application, the unified data format of the intermediate language layer is converted into different code through the adapter of each deep learning framework. Specifically, the adapter reads the intermediate language layer containing the network topology; obtains the dependency relationships, the parameters of the neural network and the parameters of each neural network layer in their unified data format; and converts this parameter information into the training format required for training, that is, into the data format of the target deep learning framework. For example, if the target deep learning framework is TensorFlow, the adapter converts the intermediate language layer into code against the Python API (application programming interface) provided by TensorFlow, in the format required for TensorFlow training. The model parameters are then initialized in the language of the target deep learning framework, and the corresponding neural network graph is constructed layer by layer. As will be appreciated by those skilled in the art, Python is an object-oriented, interpreted computer programming language.
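The layer-by-layer construction described above can be sketched as follows. This is an illustrative assumption of what an adapter's graph-building step might look like, not the patent's actual implementation; a real adapter would emit framework-specific (e.g. TensorFlow) calls instead of plain dictionaries:

```python
# Illustrative adapter sketch: walk the unified layer list and build a
# framework-agnostic graph description layer by layer. Class and field
# names are assumptions for illustration only.
class Adapter:
    def build(self, definition):
        graph = []
        prev_nodes = None  # no incoming weights for the input layer
        for layer in definition["layers"]:
            graph.append({
                "name": layer["name"],
                # weight matrix shape: (nodes in previous layer, nodes here)
                "shape": (prev_nodes, layer["nodes"]),
                "activation": layer["activation"],
            })
            prev_nodes = layer["nodes"]
        return graph
```

Swapping the body of `build` for calls into a target framework's API is exactly the per-framework work the adapter encapsulates on the user's behalf.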
Specifically, the training device 13 is configured to select a labeled training data set to train the neural network graph and obtain a deep learning model. In an embodiment of the application, a batch of labeled data sets is received and randomly divided into a training data set and a test data set; the neural network is trained under the target deep learning framework, and a deep learning model is output. For example, suppose there is a set of historical user characteristic and behavior data, such as age, gender, time of last login to the website and login period, and each user carries a label: whether the user has exhibited fraudulent behavior. Training on this data with a deep learning algorithm produces a deep learning model that can, to a certain degree, identify the label from the characteristics.
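The random split described above (one labeled data set divided into training and test subsets) can be sketched as below. The 80/20 ratio, the fixed seed, and the function name are illustrative assumptions; the patent does not specify them:

```python
import random

# Sketch of randomly dividing a labeled data set into training and
# test subsets. The 80/20 split fraction is an assumption.
def split_dataset(samples, train_fraction=0.8, seed=42):
    rng = random.Random(seed)     # fixed seed for reproducibility
    shuffled = samples[:]         # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

The training subset is then fed to the target framework's trainer, while the test subset is held back for the accuracy check described below.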
Then, the implementing device 14 is used for testing the deep learning model with the labeled test data set and implementing deep learning according to the test result. In this embodiment, the labeled test data set is predicted on the trained model to obtain the accuracy of the model test; according to that accuracy, the label values of unlabeled data sets can then be predicted, or the model can be used for subsequent data mining, machine learning and the like. This greatly reduces the user's learning cost: the network topology does not need to be re-implemented under each deep learning framework, most of the repeated labor is eliminated, and the user only needs to tune the network topology and its parameters to reach the best result.
In an embodiment of the present application, the constructing device 12 is configured to construct a neural network graph according to the dependency relationships among the layers, the parameters of each neural network layer and the parameters of the neural network, after they have been converted into the training format. The dependency relationships among the layers, the parameters of each neural network layer and the parameters of the neural network are first initialized; this initialization converts the unified data format of the parameter information into the data format required for training through the adapter, and each layer of the neural network is then constructed from the converted parameter information.
Further, the constructing device 12 is configured to construct a back propagation graph for gradient computation according to the dependency relationships among the layers, the parameters of each neural network layer and the parameters of the neural network, after conversion into the training format. In this embodiment, according to the requirements of deep learning, the back propagation graph for gradient computation is constructed at the same time as, or after, the neural network graph, so that iterative queries can be carried out later. That is, the apparatus 1 may comprise an optimization device 12' for iteratively querying the dependency relationships among the layers of the neural network graph, the parameters of each neural network layer and the parameters of the neural network according to the constructed back propagation graph, and optimizing the neural network graph according to the result of the iterative query. The constructed back propagation graph is used to iteratively query information such as the parameters of the neural network, for example the number of nodes, the dependencies among nodes, whether any parameter of the neural network is missing, and whether the parameters of each neural network layer are accurate; the constructed neural network graph is then optimized according to the query result, which improves the accuracy of the subsequently trained deep learning model.
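The iterative consistency check described above can be sketched as a walk over the graph that flags missing or mismatched parameter information. The function and field names are illustrative assumptions, and a real implementation would traverse the framework's own graph objects:

```python
# Sketch of iteratively querying a constructed graph: flag layers whose
# node count is missing or whose declared dependency does not point at
# the layer above. Field names ("nodes", "depends_on") are assumptions.
def validate_graph(layers):
    problems = []
    for i, layer in enumerate(layers):
        if layer.get("nodes") is None:
            problems.append(f"layer {i}: node count missing")
        # each layer (after the first) must depend on the layer above it
        if i > 0 and layer.get("depends_on") != layers[i - 1]["name"]:
            problems.append(f"layer {i}: dependency mismatch")
    return problems
```

Any problems reported by such a check would be fixed in the graph before training, improving the accuracy of the resulting model.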
In an embodiment of the present application, the implementing device 14 is configured to predict a test data set containing actual label values on the deep learning model, generate predicted label values, and obtain the accuracy result of the deep model test by comparing the predicted label values with the actual label values. The deep learning model is trained with the training set; the test data set containing actual labels is then run through the trained model, the predicted label values are verified against the real label values, and the accuracy of the model test is output. For example, suppose there is a set of historical user characteristic and behavior data, such as age, gender, time of last login to the website and login period, and each user carries a label: whether the user has exhibited fraudulent behavior. A deep learning model is trained on this data, and the model judges the characteristics of part of the existing data, producing the judgment label: whether the model considers that the user has committed fraud. Comparing this judgment label with the user's real label yields the accuracy of the deep learning model on this data. It should be noted that the accuracy of the test is influenced by parameters of the neural network, such as the number of iterations and the learning rate, and can be improved by adjusting those parameters; in addition, the test method may use multiple rounds of cross-validation.
In an embodiment of the present application, the training device 13 is configured to convert the data format of the deep learning model into a preset uniform data format. The deep learning model is exported by the converter in a common format, which may be, for example:
neural network model: {
    neural network layer 1: {
        node 1: {
            weight: xx
            bias: xxx
        },
        node 2: {
            weight: xx
            bias: xxx
        }
    }
}
Therefore, the deep learning model has good extensibility: even as the set of supported deep learning frameworks changes, a user only needs to inherit from and implement the abstract deep-learning-framework adapter, executor and converter to operate on a plurality of different frameworks, which greatly reduces the user's development and learning costs and greatly improves research efficiency.
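A converter exporting a trained model into the unified nested format shown above might look like the following sketch. The key names mirror the example, but the function name and input layout are assumptions, not the patent's actual schema:

```python
# Hedged sketch of a converter: serialize a trained model's per-node
# weights and biases into the unified nested format shown above.
# Input layout assumed: a list of layers, each a list of (weight, bias).
def export_model(model_layers):
    return {
        f"layer {i + 1}": {
            f"node {j + 1}": {"weight": w, "bias": b}
            for j, (w, b) in enumerate(layer)
        }
        for i, layer in enumerate(model_layers)
    }
```

Because every framework's executor hands its model to the same converter, the exported file can be read back directly next time regardless of which framework produced it.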
In an embodiment of the present application, the implementing device 14 is configured to predict an unlabeled data set through the deep learning model according to the accuracy result of the deep model test, and to generate a predicted, labeled data set. Here, when the accuracy of the deep model test meets the requirement, the output value of the model during prediction can be considered close to the true value, so the unlabeled data set is predicted on the trained deep learning model in the general format whose accuracy meets the requirement, and the predicted result is output. For example, a fraud prediction model generated through training may be introduced: when a new user registers, although no fraud has yet occurred and no label exists, the model can judge from the user's behavior characteristics whether the user is likely to commit fraud in the future, so that precautions can be taken in advance. It should be noted that when the tested deep learning model is used for prediction, the input values must conform to the format accepted by the pre-trained model; that is, the format of the data to be predicted must be the same as the format of the data used to train the deep learning model.
FIG. 2 is a diagram of a generic deep-learning framework design in an embodiment of the present application, comprising a network topology, a deep learning adapter factory containing different adapters, a deep learning executor factory containing a plurality of different executors, and a model converter. The network topology is constructed by defining an intermediate language layer that uniformly represents the network topology of the neural network. The adapter required by the target deep learning framework, such as a TensorFlow adapter or an MXNet adapter, is selected from the deep learning adapter factory, and the uniform data format of the intermediate language layer is converted into code under the different frameworks through the selected adapter. The code generated by the adapter is then executed by the executor pre-installed with each deep learning framework to generate a deep learning model, which is passed to the converter layer; the converter layer converts the deep learning model generated by the executor into a uniform format so that it can be read directly next time. A universal interface layer is thus erected between the deep learning frameworks, ensuring that the neural network structure designed by the user runs seamlessly on a plurality of different deep learning frameworks. The method is mainly applicable to the fields of data mining, machine learning and deep learning; it greatly reduces the user's development and learning costs and greatly improves research efficiency. With this universal deep-learning framework design, the user only needs to define the deep learning network topology once in the intermediate language layer, and the adapter translates it into the programming language supported by the target framework, so the user is no longer limited by programming language.
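The adapter-factory idea in FIG. 2 (select the adapter registered for the requested target framework) can be sketched as a simple registry; all class names, registry keys, and methods here are illustrative assumptions rather than the patent's actual API:

```python
# Sketch of the adapter-factory pattern from FIG. 2: adapters register
# under a framework name, and the factory instantiates the one requested.
class AdapterFactory:
    _registry = {}

    @classmethod
    def register(cls, name, adapter_cls):
        cls._registry[name] = adapter_cls

    @classmethod
    def create(cls, name):
        try:
            return cls._registry[name]()
        except KeyError:
            raise ValueError(f"no adapter registered for {name!r}")

class TensorFlowAdapter:
    target = "tensorflow"  # would emit TensorFlow Python API code

class MXNetAdapter:
    target = "mxnet"       # would emit MXNet code

AdapterFactory.register("tensorflow", TensorFlowAdapter)
AdapterFactory.register("mxnet", MXNetAdapter)
```

An executor factory and a converter registry could follow the same pattern, which is what lets a user add support for a new framework by implementing one adapter, executor and converter.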
According to yet another aspect of the present application, there is also provided a computing-based device, comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
defining an intermediate language layer, wherein the intermediate language layer comprises a dependency relationship among layers in a neural network with a uniform data format, parameters of the neural network layers and parameters of the neural network; acquiring a data format in the intermediate language layer through an adapter, converting the data format into a training format required by training, and constructing a neural network diagram; selecting a training data set containing a label to train the neural network graph to obtain a deep learning model; and testing the deep learning model through a test data set containing a label, and realizing deep learning according to a test result.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
It should be noted that the present application may be implemented in software and/or a combination of software and hardware, for example, implemented using Application Specific Integrated Circuits (ASICs), general purpose computers or any other similar hardware devices. In one embodiment, the software programs of the present application may be executed by a processor to implement the steps or functions described above. Likewise, the software programs (including associated data structures) of the present application may be stored in a computer readable recording medium, such as RAM memory, magnetic or optical drive or diskette and the like. Additionally, some of the steps or functions of the present application may be implemented in hardware, for example, as circuitry that cooperates with the processor to perform various steps or functions.
In addition, some of the present application may be implemented as a computer program product, such as computer program instructions, which when executed by a computer, may invoke or provide methods and/or techniques in accordance with the present application through the operation of the computer. Program instructions which invoke the methods of the present application may be stored on a fixed or removable recording medium and/or transmitted via a data stream on a broadcast or other signal-bearing medium and/or stored within a working memory of a computer device operating in accordance with the program instructions. An embodiment according to the present application comprises an apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to perform a method and/or a solution according to the aforementioned embodiments of the present application.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the apparatus claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.

Claims (15)

1. A method of deep learning implementation, wherein the method comprises:
defining an intermediate language layer, wherein the intermediate language layer comprises a dependency relationship among layers in a neural network with a uniform data format, parameters of the neural network layers and parameters of the neural network;
acquiring a data format in the intermediate language layer through an adapter, converting the data format into code under different deep learning frameworks, and constructing a neural network diagram;
executing the code generated in the adapter by using an executor pre-installed with each deep learning framework, and selecting a training data set containing labels to train the neural network graph to obtain a deep learning model, wherein the labels comprise labels, determined from historical user characteristic and behavior data, of whether the user has fraudulent behavior;
transferring the deep learning model to a converter layer to convert the deep learning model into a unified data format through the converter layer;
and testing the deep learning model through a test data set containing a label to obtain a test result of whether the user forms a fraudulent behavior, and realizing deep learning according to the test result.
2. The method of claim 1, wherein the dependency comprises:
each node in each neural network layer in the neural network depends on all nodes of the neural network layer above.
3. The method of claim 2, wherein the constructing a neural network map comprises:
and constructing a neural network diagram according to the dependency relationships among the layers, the parameters of each neural network layer and the parameters of the neural network after conversion into code under the different deep learning frameworks.
4. The method of claim 3, wherein the method comprises:
and constructing a back propagation diagram of gradient calculation according to the dependency relationships among the layers, the parameters of each neural network layer and the parameters of the neural network after conversion into code under the different deep learning frameworks.
5. The method of claim 4, wherein the method comprises:
and iteratively inquiring the dependency relationship among layers in the neural network diagram, the parameters of each neural network layer and the parameters of the neural network according to the constructed back propagation diagram, and optimizing the neural network diagram according to the result of iterative inquiry.
6. The method of claim 1, wherein testing the deep learning model through a test dataset containing tags comprises:
predicting the obtained test data set containing the actual label value on the deep learning model to generate a predicted label value;
and obtaining the accuracy result of the depth model test according to the comparison between the predicted label value and the actual label value.
7. The method of claim 6, wherein the implementing deep learning according to the results of the testing comprises:
and predicting the data set without the label through the deep learning model according to the accuracy result of the deep model test to generate a predicted data set with the label.
8. An apparatus for deep learning implementation, wherein the apparatus comprises:
the definition device is used for defining an intermediate language layer, wherein the intermediate language layer comprises the dependency relationship among layers in a neural network with a uniform data format, the parameters of each neural network layer and the parameters of the neural network;
the construction device is used for acquiring a data format in the intermediate language layer through an adapter, converting the data format into code under different deep learning frameworks, and constructing a neural network diagram;
the training device is used for executing the code generated in the adapter by using an executor pre-installed with each deep learning framework, and selecting a training data set containing labels to train the neural network graph to obtain a deep learning model, wherein the labels comprise labels, determined from historical user characteristic and behavior data, of whether the user has fraudulent behavior;
the implementation device is used for transmitting the deep learning model to the converter layer so as to convert the deep learning model into a uniform data format through the converter layer;
and the implementation device is used for testing the deep learning model through the test data set containing the label to obtain a test result of whether the user forms the fraudulent behavior, and implementing deep learning according to the test result.
9. The device of claim 8, wherein the dependency comprises:
each node in each neural network layer in the neural network depends on all nodes of the neural network layer above.
10. The apparatus of claim 9, wherein the building means is for:
and constructing a neural network diagram according to the dependency relationships among the layers, the parameters of each neural network layer and the parameters of the neural network after conversion into code under the different deep learning frameworks.
11. The apparatus of claim 10, wherein the building means is for:
and constructing a back propagation diagram of gradient calculation according to the dependency relationships among the layers, the parameters of each neural network layer and the parameters of the neural network after conversion into code under the different deep learning frameworks.
12. The apparatus of claim 11, wherein the apparatus comprises:
and the optimization device is used for iteratively inquiring the dependency relationship among layers in the neural network diagram, the parameters of each neural network layer and the parameters of the neural network according to the constructed back propagation diagram, and optimizing the neural network diagram according to the result of iterative inquiry.
13. The apparatus of claim 8, wherein the implementing means is to:
predicting the obtained test data set containing the actual label value on the deep learning model to generate a predicted label value;
and obtaining the accuracy result of the depth model test according to the comparison between the predicted label value and the actual label value.
14. The apparatus of claim 13, wherein the implementing means is to:
and predicting the data set without the label through the deep learning model according to the accuracy result of the deep model test to generate a predicted data set with the label.
15. A computing-based device, comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
defining an intermediate language layer, wherein the intermediate language layer comprises a dependency relationship among layers in a neural network with a uniform data format, parameters of the neural network layers and parameters of the neural network;
acquiring a data format in the intermediate language layer through an adapter, converting the data format into code under different deep learning frameworks, and constructing a neural network diagram;
executing the code generated in the adapter by using an executor pre-installed with each deep learning framework, and selecting a training data set containing labels to train the neural network graph to obtain a deep learning model, wherein the labels comprise labels, determined from historical user characteristic and behavior data, of whether the user has fraudulent behavior;
transferring the deep learning model to a converter layer to convert the deep learning model into a unified data format through the converter layer;
and testing the deep learning model through a test data set containing a label to obtain a test result of whether the user forms a fraudulent behavior, and realizing deep learning according to the test result.
CN201710250317.4A 2017-04-17 2017-04-17 Method and device for realizing deep learning Active CN107423817B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710250317.4A CN107423817B (en) 2017-04-17 2017-04-17 Method and device for realizing deep learning


Publications (2)

Publication Number Publication Date
CN107423817A CN107423817A (en) 2017-12-01
CN107423817B true CN107423817B (en) 2020-09-01

Family

ID=60424102

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710250317.4A Active CN107423817B (en) 2017-04-17 2017-04-17 Method and device for realizing deep learning

Country Status (1)

Country Link
CN (1) CN107423817B (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108038544B (en) * 2017-12-04 2020-11-13 华南师范大学 Neural network deep learning method and system based on big data and deep learning
CN109978148B (en) * 2017-12-28 2020-06-23 中科寒武纪科技股份有限公司 Integrated circuit chip device and related product
CN109496319A (en) * 2018-01-15 2019-03-19 深圳鲲云信息科技有限公司 Artificial intelligence process device hardware optimization method, system, storage medium, terminal
CN108319456B (en) * 2018-01-29 2021-03-09 徐磊 Development method of programming-free deep learning application
CN108279881B (en) * 2018-02-11 2021-05-28 深圳竹信科技有限公司 Cross-platform implementation framework and method based on deep learning prediction part
CN110308899B (en) * 2018-03-27 2023-12-29 上海寒武纪信息科技有限公司 Language source program generation method and device for neural network processor
CN108985448B (en) * 2018-06-06 2020-11-17 北京大学 Neural network representation standard framework structure
CN109542745B (en) * 2018-11-20 2021-11-19 郑州云海信息技术有限公司 IO test method, device, equipment and medium
CN110033091B (en) 2018-12-13 2020-09-01 阿里巴巴集团控股有限公司 Model-based prediction method and device
CN109670544A (en) * 2018-12-13 2019-04-23 广州小狗机器人技术有限公司 A kind of object detecting apparatus and its acquisition methods, object detecting system
CN110674923A (en) * 2019-08-15 2020-01-10 山东领能电子科技有限公司 Rapid model verification method among multiple neural network frames
CN110533170A (en) * 2019-08-30 2019-12-03 陕西思科锐迪网络安全技术有限责任公司 A kind of deep learning neural network building method of graphic programming
CN110705714B (en) * 2019-09-27 2022-07-22 上海联影医疗科技股份有限公司 Deep learning model detection method, deep learning platform and computer equipment
CN110928849A (en) * 2019-11-27 2020-03-27 上海眼控科技股份有限公司 Method and device for preprocessing meteorological data, computer equipment and storage medium
US11301754B2 (en) * 2019-12-10 2022-04-12 Sony Corporation Sharing of compressed training data for neural network training
CN111078480B (en) * 2019-12-17 2023-09-01 北京奇艺世纪科技有限公司 Exception recovery method and server
WO2022037689A1 (en) * 2020-08-20 2022-02-24 第四范式(北京)技术有限公司 Data form-based data processing method and machine learning application method
CN112270403B (en) * 2020-11-10 2022-03-29 北京百度网讯科技有限公司 Method, device, equipment and storage medium for constructing deep learning network model
CN112363856A (en) * 2020-11-19 2021-02-12 北京计算机技术及应用研究所 Method for realizing interoperation of deep learning framework and application program based on DDS
CN113222121B (en) * 2021-05-31 2023-08-29 杭州海康威视数字技术股份有限公司 Data processing method, device and equipment
CN114511100B (en) * 2022-04-15 2023-01-13 支付宝(杭州)信息技术有限公司 Graph model task implementation method and system supporting multi-engine framework

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104537303A (en) * 2014-12-30 2015-04-22 中国科学院深圳先进技术研究院 Distinguishing system and method for phishing website
CN106529673A (en) * 2016-11-17 2017-03-22 北京百度网讯科技有限公司 Deep learning network training method and device based on artificial intelligence




Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 200233 11-12 / F, building B, 88 Hongcao Road, Xuhui District, Shanghai

Patentee after: Star link information technology (Shanghai) Co.,Ltd.

Address before: 200233 11-12 / F, building B, 88 Hongcao Road, Xuhui District, Shanghai

Patentee before: TRANSWARP TECHNOLOGY (SHANGHAI) Co.,Ltd.

CP01 Change in the name or title of a patent holder