CN107450961A - A kind of distributed deep learning system and its building method, method of work based on Docker containers - Google Patents
- Publication number
- CN107450961A CN107450961A CN201710866197.0A CN201710866197A CN107450961A CN 107450961 A CN107450961 A CN 107450961A CN 201710866197 A CN201710866197 A CN 201710866197A CN 107450961 A CN107450961 A CN 107450961A
- Authority
- CN
- China
- Prior art keywords
- deep learning
- host
- distributed
- distributed deep
- docker containers
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
- G06F8/61—Installation
- G06F8/63—Image based installation; Cloning; Build to order
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Stored Programmes (AREA)
Abstract
The present invention relates to a distributed deep learning system based on Docker containers, together with methods for building and operating it. The system comprises a server host, a first distributed deep learning platform, and a second distributed deep learning platform. Using Docker containerization, the invention hosts multiple distributed deep learning systems simultaneously on a single server host. The improvements of the invention are reflected in three aspects: first, the whole system can be realized on a single server host, so no additional machines are needed and cost is saved; second, containers are created from a template image, which keeps the process simple, avoids repeated set-up, and eliminates the errors and wasted time of manual builds; third, the server's CPU is utilized to the fullest, so hardware resources are no longer wasted.
Description
Technical field
The present invention relates to a distributed deep learning system based on Docker containers and to methods for building and operating it, and belongs to the field of cloud computing virtualization technology.
Background technology
In essence, cloud computing means that user terminals obtain storage, computing, and database resources through remote connections. Virtualization is one of the core components of cloud computing technology and is the key to fully integrating and efficiently utilizing diverse computing and storage resources; it includes server virtualization and desktop virtualization. Docker, an emerging lightweight virtualization technology, is lighter than a traditional VM and starts faster, and hundreds or thousands of containers can run simultaneously on a single piece of hardware, making it well suited to scaling out by launching large numbers of containers during traffic peaks.
At present, deep learning platforms mostly run on a single machine, and distributed deep learning platforms are rarely used, first because they are more complicated to build and require more hardware. Compared with a single-machine platform, however, a distributed deep learning platform can compute faster.
Current technology on the market generally suffers from the following problems:
1) When a server is used as a single-machine deep learning platform, its CPU capacity is sufficient but cannot be fully utilized, wasting resources.
2) A distributed deep learning platform must be built across multiple hosts; since each host's CPU limits its computing power, a large-scale platform requires many hosts and is costly.
3) Building a distributed deep learning platform is cumbersome: with the multi-host approach, every host must go through the same steps, yet repeating the same steps tends to introduce different mistakes, slowing the build.
Chinese patent document CN106657248A discloses a network load balancing system based on Docker containers, together with methods for building and operating it. It uses Docker container technology as the system's foundation: because Docker containers save hardware resources, a large number of containers can be created on a single server host, so a complete network load balancing system is realized on one machine. Docker containers can be created from images in seconds, and containers created from the same image are guaranteed to be identical, so Web-server traffic or data flow can be conveniently split by adding containers from an image. That patent, however, has the following defect: when an image is created with a Dockerfile, it is impossible to visualize the image or test whether particular files inside it are configured successfully.
Summary of the invention
In view of the shortcomings of the prior art, the present invention provides a distributed deep learning system based on Docker containers.
The present invention also provides methods for building and operating the above distributed deep learning system.
Using Docker containerization, the invention hosts multiple distributed deep learning systems simultaneously on a single server host. Its improvements are reflected in four aspects: first, by using the docker commit command to generate an image from a fully configured container, the image can be inspected and the configuration of particular files inside it can be tested; second, the whole system can be realized on a single server host, so no additional machines are needed and cost is saved; third, containers are created from a template image, which keeps the process simple, avoids repeated set-up, and eliminates errors and wasted time; fourth, the server's CPU is utilized to the fullest, so hardware resources are no longer wasted.
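As a concrete illustration of the first improvement, the commit-based workflow can be sketched as follows; the container and image names (`dl-node`, `dl-template`) are illustrative assumptions, not part of the invention:

```shell
# Start an interactive container and configure it by hand (install the
# platforms, edit config files). Unlike a Dockerfile build, the running
# container can be entered, inspected, and tested before it is frozen.
docker run -it --name dl-node ubuntu:14.04 /bin/bash

# ... configure inside the container, verify the files, then exit ...

# Freeze the verified container into a reusable template image.
docker commit dl-node dl-template:v1
```

The design choice here is that configuration errors surface while the container is still live, rather than only after a Dockerfile build completes.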
Explanation of terms:
1. Hadoop distributed platform: a distributed system infrastructure developed by the Apache Software Foundation. Users can develop distributed programs without knowing the low-level details of distribution, making full use of a cluster's power for high-speed computation and storage. Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault-tolerant and designed to be deployed on inexpensive hardware; it provides high-throughput access to application data and is suitable for applications with very large data sets. HDFS relaxes some POSIX requirements so that data in the file system can be accessed in streaming form.
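As a hedged illustration of HDFS's streaming access model (the file and directory paths are hypothetical, and a running cluster with the `hdfs` client on the PATH is assumed):

```shell
# Copy a local training set into HDFS.
hdfs dfs -mkdir -p /data
hdfs dfs -put train.csv /data/train.csv

# Read it back as a stream: -cat writes the file sequentially to stdout
# rather than offering random access, reflecting HDFS's relaxed,
# stream-oriented POSIX semantics.
hdfs dfs -cat /data/train.csv | head
```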
2. Spark: a general-purpose parallel computing framework in the style of Hadoop MapReduce, open-sourced by UC Berkeley's AMPLab. Spark implements distributed computation based on the MapReduce model and retains the advantages of Hadoop MapReduce; unlike MapReduce, however, intermediate job output and results can be kept in memory, so HDFS no longer needs to be read and written between stages. Spark is therefore better suited to algorithms that require iterative MapReduce passes, such as data mining and machine learning.
3. NameNode: manages the file system namespace. It maintains the file system tree and the metadata of every file and directory in the tree. This information is stored persistently on local disk in two files: the namespace image file and the edit log file. The NameNode also records the data nodes on which each block of each file resides, but it does not persist block locations, because this information is rebuilt from the data nodes when the system starts.
The technical scheme of the present invention is as follows:
A distributed deep learning system based on Docker containers comprises a host machine and multiple Docker containers. The Hadoop distributed platform and Spark are installed on the host, and the host is also equipped with the first distributed deep learning platform or the second distributed deep learning platform; the Hadoop distributed platform and Spark are installed in each Docker container, and each Docker container is also equipped with the first or second distributed deep learning platform.
The server host serves as the host machine and as the hardware support for the whole platform. The first and second distributed deep learning platforms are two currently available distributed deep learning platforms, both open-sourced by Yahoo and both mainstream today.
The first and second distributed deep learning platforms are tools that help carry out deep learning and differ from single-machine deep learning platforms. As the hardware foundation of the whole distributed deep learning system, the server host must meet high requirements for processing power, stability, and reliability.
Preferably, the host is a DELL PowerEdge R730, the first distributed deep learning platform is CaffeOnSpark, and the second distributed deep learning platform is TensorFlowOnSpark.
The DELL PowerEdge R730 server is configured with a 48-core CPU, 96 GB of memory, and an 8 TB local hard drive. Caffe and TensorFlow are currently the two most popular single-machine deep learning platforms; CaffeOnSpark and TensorFlowOnSpark, built on them, are distributed deep learning platforms based on Hadoop/Spark and open-sourced by Yahoo.
The building method of the above distributed deep learning system based on Docker containers comprises the following steps:
(1) Prepare the host machine, i.e. the server host, and install the Ubuntu 14.04 operating system; as a relatively stable release among the Linux operating systems that support Docker, Ubuntu 14.04 allows the Docker environment to be installed and configured directly from the command line.
(2) Create the main folder required by the Docker containers under the host's root directory; the main folder contains a mountable folder used to hold the training models, training data sets, test data sets, code, and configuration files required for deep learning.
(3) Install the Hadoop distributed platform and Spark on the host to support the CaffeOnSpark or TensorFlowOnSpark distributed deep learning platform, and test whether Hadoop and Spark are installed successfully; if they are, proceed to step (4), otherwise repeat step (3).
(4) Install on the host the first distributed deep learning platform, CaffeOnSpark, or the second distributed deep learning platform, TensorFlowOnSpark, and configure the IP of the master node; when the system runs, the host acts as the master node.
(5) Create a blank container on the host.
(6) Install the Hadoop distributed platform and Spark in the blank container.
(7) Install CaffeOnSpark or TensorFlowOnSpark in the container prepared in step (6), and configure the IP of the slave node; when the system runs, the container acts as a slave node.
(8) Using the container installed in step (7) as a template, create an image with the docker commit command.
(9) Create multiple Docker containers from the image created in step (8), and configure the IP address of each container.
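Steps (5)-(9) might look like the following sketch; the image name, network name, and subnet are assumptions for illustration, since the patent does not specify how container IPs are assigned:

```shell
# (5)-(7) create a blank container, configure Hadoop, Spark, and
# CaffeOnSpark/TensorFlowOnSpark inside it by hand, then exit
docker run -it --name node0 ubuntu:14.04 /bin/bash

# (8) turn the configured container into a template image
docker commit node0 dl-node:v1

# (9) create several identical slave containers from the template,
# each with its own fixed IP on a user-defined Docker network
docker network create --subnet=172.18.0.0/16 dl-net
for i in 1 2 3; do
  docker run -d --name slave$i --net dl-net --ip 172.18.0.1$i dl-node:v1
done
```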
The steps for testing whether the Hadoop distributed platform is installed successfully are as follows. Format the NameNode: on success, the prompts "successfully formatted" and "Exitting with status 0" appear, whereas "Exitting with status 1" indicates an error. If this step reports the error "Error: JAVA_HOME is not set and could not be found.", the JAVA_HOME environment variable was not set beforehand; set the JAVA_HOME variable first according to the tutorial, otherwise the subsequent steps cannot proceed. Then start the NameNode and DataNode daemons, answering yes to any SSH prompt that appears.
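Assuming a standard Hadoop layout under `$HADOOP_HOME`, the check described above corresponds roughly to the following commands:

```shell
# Format the NameNode; success prints "successfully formatted" and
# "Exitting with status 0" in the log output.
$HADOOP_HOME/bin/hdfs namenode -format

# Start the NameNode and DataNode daemons (answer "yes" to SSH prompts).
$HADOOP_HOME/sbin/start-dfs.sh

# jps should now list NameNode and DataNode processes.
jps
```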
The steps for testing whether Spark is installed successfully are as follows. The spark/examples/src/main directory contains Spark example programs in Scala, Java, Python, R, and other languages. Run the example program SparkPi, which computes an approximation of π: execution produces a great deal of log output in which the result is hard to find, so the output can be filtered with the grep command; the filtered result gives an approximation of π to five decimal places.
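The filtering step can be illustrated on a sample of SparkPi's log output; the three log lines below are fabricated stand-ins for the real (much longer) output:

```shell
# run-example SparkPi prints many INFO lines; grep keeps only the result.
# Against a real installation the command would be (illustrative):
#   ./bin/run-example SparkPi 2>&1 | grep "Pi is roughly"
printf 'INFO SparkContext: Running Spark\nPi is roughly 3.14159\nINFO SparkContext: Stopped\n' \
  | grep "Pi is roughly"
# prints: Pi is roughly 3.14159
```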
The working method of the above distributed deep learning system based on Docker containers comprises the following steps:
(1) Start the Hadoop platform and Spark on the host, which serves as the master node of the whole distributed deep learning system, and start the Hadoop platform and Spark in the several Docker containers, which serve as the slave nodes of the whole distributed deep learning system.
(2) Store the training models, training data sets, test data sets, code, and configuration files required for deep learning training in the mountable folder on the host.
(3) Start the deep learning training with a script; the master node assigns the deep learning training task to each slave node for parallel training.
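A sketch of these three steps, assuming the standard Hadoop/Spark start scripts and a `spark-submit`-based training launcher (all paths, hostnames, and file names are illustrative assumptions):

```shell
# (1) start HDFS and the Spark master/workers across host and containers
$HADOOP_HOME/sbin/start-dfs.sh
$SPARK_HOME/sbin/start-all.sh

# (2) move the training data from the mounted folder into HDFS
hdfs dfs -put /root/dl-share/mnist_train.csv /data/

# (3) submit the training job; the master node distributes the deep
# learning tasks to the slave containers for parallel training
$SPARK_HOME/bin/spark-submit --master spark://master:7077 train_script.py
```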
The beneficial effects of the present invention are:
1. The present invention can set up a distributed deep learning platform using only a single server host.
2. When more distributed nodes are needed, containers can be opened and configured quickly to add nodes.
3. The CPU computing resources of the server are fully utilized.
Brief description of the drawings
Fig. 1 is a structural block diagram of the distributed deep learning system based on Docker containers of the present invention.
Embodiments
The present invention is further described below with reference to the accompanying drawing and the embodiments, but is not limited thereto.
Embodiment 1
A distributed deep learning system based on Docker containers, as shown in Fig. 1, comprises a host machine and multiple Docker containers. The Hadoop distributed platform and Spark are installed on the host, and the host is also equipped with the first distributed deep learning platform or the second distributed deep learning platform; the Hadoop distributed platform and Spark are installed in each Docker container, and each Docker container is also equipped with the first or second distributed deep learning platform.
The server host serves as the host machine and as the hardware support for the whole platform. The first and second distributed deep learning platforms are two currently available distributed deep learning platforms, both open-sourced by Yahoo and both mainstream today.
The first and second distributed deep learning platforms are tools that help carry out deep learning and differ from single-machine deep learning platforms. As the hardware foundation of the whole distributed deep learning system, the server host must meet high requirements for processing power, stability, and reliability.
The host is a DELL PowerEdge R730, the first distributed deep learning platform is CaffeOnSpark, and the second distributed deep learning platform is TensorFlowOnSpark.
The DELL PowerEdge R730 server is configured with a 48-core CPU, 96 GB of memory, and an 8 TB local hard drive. Caffe and TensorFlow are currently the two most popular single-machine deep learning platforms; CaffeOnSpark and TensorFlowOnSpark, built on them, are distributed deep learning platforms based on Hadoop/Spark and open-sourced by Yahoo.
Embodiment 2
The building method of the distributed deep learning system based on Docker containers described in embodiment 1 comprises the following steps:
(1) Prepare the host machine, i.e. the server host, and install the Ubuntu 14.04 operating system; as a relatively stable release among the Linux operating systems that support Docker, Ubuntu 14.04 allows the Docker environment to be installed and configured directly from the command line.
(2) Create the main folder required by the Docker containers under the host's root directory; the main folder contains a mountable folder used to hold the training models, training data sets, test data sets, code, and configuration files required for deep learning.
(3) Install the Hadoop distributed platform and Spark on the host to support the CaffeOnSpark or TensorFlowOnSpark distributed deep learning platform, and test whether Hadoop and Spark are installed successfully; if they are, proceed to step (4), otherwise repeat step (3).
The steps for testing whether the Hadoop distributed platform is installed successfully are as follows. Format the NameNode: on success, the prompts "successfully formatted" and "Exitting with status 0" appear, whereas "Exitting with status 1" indicates an error. If this step reports the error "Error: JAVA_HOME is not set and could not be found.", the JAVA_HOME environment variable was not set beforehand; set the JAVA_HOME variable first according to the tutorial, otherwise the subsequent steps cannot proceed. Then start the NameNode and DataNode daemons, answering yes to any SSH prompt that appears.
The steps for testing whether Spark is installed successfully are as follows. The spark/examples/src/main directory contains Spark example programs in Scala, Java, Python, R, and other languages. Run the example program SparkPi, which computes an approximation of π: execution produces a great deal of log output in which the result is hard to find, so the output can be filtered with the grep command; the filtered result gives an approximation of π to five decimal places.
(4) Install on the host the first distributed deep learning platform, CaffeOnSpark, or the second distributed deep learning platform, TensorFlowOnSpark, and configure the IP of the master node; when the system runs, the host acts as the master node.
(5) Create a blank container on the host.
(6) Install the Hadoop distributed platform and Spark in the blank container.
(7) Install CaffeOnSpark or TensorFlowOnSpark in the container prepared in step (6), and configure the IP of the slave node; when the system runs, the container acts as a slave node.
(8) Using the container installed in step (7) as a template, create an image with the docker commit command.
(9) Create multiple Docker containers from the image created in step (8), and configure the IP address of each container.
Embodiment 3
The working method of the distributed deep learning system based on Docker containers described in embodiment 1 comprises the following steps:
(1) Start the Hadoop platform and Spark on the host, which serves as the master node of the whole distributed deep learning system, and start the Hadoop platform and Spark in the several Docker containers, which serve as the slave nodes of the whole distributed deep learning system.
(2) Store the training models, training data sets, test data sets, code, and configuration files required for deep learning training in the mountable folder on the host.
(3) Start the deep learning training with a script; the master node assigns the deep learning training task to each slave node for parallel training.
Claims (4)
1. A distributed deep learning system based on Docker containers, characterized by comprising a host machine and multiple Docker containers, wherein the Hadoop distributed platform and Spark are installed on the host, and the host is also equipped with a first distributed deep learning platform or a second distributed deep learning platform; the Hadoop distributed platform and Spark are installed in each Docker container, and each Docker container is also equipped with the first or second distributed deep learning platform.
2. The distributed deep learning system based on Docker containers according to claim 1, characterized in that the host is a DELL PowerEdge R730, the first distributed deep learning platform is CaffeOnSpark, and the second distributed deep learning platform is TensorFlowOnSpark.
3. A building method for the distributed deep learning system based on Docker containers according to claim 1 or 2, characterized by comprising the following steps:
(1) prepare the host machine, the host being the server host;
(2) create the main folder required by the Docker containers under the host's root directory, the main folder containing a mountable folder for holding the training models, training data sets, test data sets, code, and configuration files required for deep learning;
(3) install the Hadoop distributed platform and Spark on the host, and test whether they are installed successfully; if they are, proceed to step (4), otherwise repeat step (3);
(4) install on the host the first distributed deep learning platform, CaffeOnSpark, or the second distributed deep learning platform, TensorFlowOnSpark, and configure the IP of the master node;
(5) create a blank container on the host;
(6) install the Hadoop distributed platform and Spark in the blank container;
(7) install CaffeOnSpark or TensorFlowOnSpark in the container from step (6), and configure the IP of the slave node;
(8) using the container installed in step (7) as a template, create an image with the docker commit command;
(9) create multiple Docker containers from the image created in step (8), and configure the IP address of each container.
4. A working method for the distributed deep learning system based on Docker containers according to claim 1 or 2, characterized by comprising the following steps:
(1) start the Hadoop platform and Spark on the host, the host serving as the master node of the whole distributed deep learning system, and start the Hadoop platform and Spark in the several Docker containers, which serve as the slave nodes of the whole distributed deep learning system;
(2) store the training models, training data sets, test data sets, code, and configuration files required for deep learning training in the mountable folder on the host;
(3) start the deep learning training with a script, the master node assigning the deep learning training task to each slave node for parallel training.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710866197.0A CN107450961B (en) | 2017-09-22 | 2017-09-22 | Distributed deep learning system based on Docker container and construction method and working method thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107450961A true CN107450961A (en) | 2017-12-08 |
CN107450961B CN107450961B (en) | 2020-10-16 |
Family
ID=60498100
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710866197.0A Active CN107450961B (en) | 2017-09-22 | 2017-09-22 | Distributed deep learning system based on Docker container and construction method and working method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107450961B (en) |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108255968A (en) * | 2017-12-26 | 2018-07-06 | 曙光信息产业(北京)有限公司 | A kind of design method of big data parallel file system |
CN108958892A (en) * | 2018-08-14 | 2018-12-07 | 郑州云海信息技术有限公司 | A kind of method and apparatus creating the container for deep learning operation |
CN109063842A (en) * | 2018-07-06 | 2018-12-21 | 无锡雪浪数制科技有限公司 | A kind of machine learning platform of compatible many algorithms frame |
CN109086134A (en) * | 2018-07-19 | 2018-12-25 | 郑州云海信息技术有限公司 | A kind of operation method and device of deep learning operation |
CN109146084A (en) * | 2018-09-06 | 2019-01-04 | 郑州云海信息技术有限公司 | A kind of method and device of the machine learning based on cloud computing |
CN109254830A (en) * | 2018-09-04 | 2019-01-22 | 郑州云海信息技术有限公司 | Visual management method and device in deep learning system |
CN109284184A (en) * | 2018-03-07 | 2019-01-29 | 中山大学 | A kind of building method of the distributed machines learning platform based on containerization technique |
CN109358944A (en) * | 2018-09-17 | 2019-02-19 | 深算科技(重庆)有限公司 | Deep learning distributed arithmetic method, apparatus, computer equipment and storage medium |
CN109522089A (en) * | 2018-11-02 | 2019-03-26 | 成都三零凯天通信实业有限公司 | Based on the distributed view of virtualized environment as recognition methods |
CN109961151A (en) * | 2017-12-21 | 2019-07-02 | 同方威视科技江苏有限公司 | For the system for calculating service of machine learning and for the method for machine learning |
CN110245003A (en) * | 2019-06-06 | 2019-09-17 | 中信银行股份有限公司 | A kind of machine learning uniprocessor algorithm arranging system and method |
CN110554995A (en) * | 2019-08-13 | 2019-12-10 | 武汉中海庭数据技术有限公司 | Deep learning model management method and system |
WO2020001564A1 (en) * | 2018-06-29 | 2020-01-02 | 杭州海康威视数字技术股份有限公司 | Method, apparatus, and system for processing tasks |
CN110866605A (en) * | 2018-08-27 | 2020-03-06 | 北京京东尚科信息技术有限公司 | Data model training method and device, electronic equipment and readable medium |
CN111343219A (en) * | 2018-12-18 | 2020-06-26 | 同方威视技术股份有限公司 | Computing service cloud platform |
CN112394944A (en) * | 2019-08-13 | 2021-02-23 | 阿里巴巴集团控股有限公司 | Distributed development method, device, storage medium and computer equipment |
WO2022134001A1 (en) * | 2020-12-25 | 2022-06-30 | 深圳晶泰科技有限公司 | Machine learning model framework development method and system based on containerization technology |
US11954521B2 (en) | 2018-03-30 | 2024-04-09 | Huawei Cloud Computing Technologies Co., Ltd. | Deep learning job scheduling method and system and related device |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105577503A (en) * | 2016-01-18 | 2016-05-11 | 浪潮集团有限公司 | Cloud switch system based on Docker and realization method thereof |
CN105740048A (en) * | 2016-01-26 | 2016-07-06 | 华为技术有限公司 | Image management method, device and system |
CN105871988A (en) * | 2015-12-14 | 2016-08-17 | 乐视云计算有限公司 | Service deployment method and device |
CN106657248A (en) * | 2016-11-01 | 2017-05-10 | 山东大学 | Docker container based network load balancing system and establishment method and operating method thereof |
US20170139816A1 (en) * | 2015-11-17 | 2017-05-18 | Alexey Sapozhnikov | Computerized method and end-to-end "pilot as a service" system for controlling start-up/enterprise interactions |
CN106850621A (en) * | 2017-02-07 | 2017-06-13 | 南京云创大数据科技股份有限公司 | A kind of method based on container cloud fast construction Hadoop clusters |
- 2017-09-22: application CN201710866197.0A filed in China; granted as patent CN107450961B, status Active
Non-Patent Citations (1)
Title |
---|
ODDBILLOW: "TensorFlowOnSpark安装教程" (TensorFlowOnSpark installation tutorial), https://blog.csdn.net/quitozang/article/details/71437179 * |
Also Published As
Publication number | Publication date |
---|---|
CN107450961B (en) | 2020-10-16 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |