CN107450961A - Distributed deep learning system based on Docker containers, and construction method and working method thereof


Publication number
CN107450961A
Authority
CN
China
Prior art keywords
deep learning
host
distributed
distributed deep
docker containers
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710866197.0A
Other languages
Chinese (zh)
Other versions
CN107450961B (en)
Inventor
张舒
吴大雷
张秀真
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ji'nan Junda Information Technology Co Ltd
Original Assignee
Ji'nan Junda Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-09-22
Filing date
2017-09-22
Publication date
2017-12-08
Application filed by Ji'nan Junda Information Technology Co Ltd
Priority to CN201710866197.0A
Publication of CN107450961A
Application granted
Publication of CN107450961B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • G06F8/61 Installation
    • G06F8/63 Image based installation; Cloning; Build to order
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Stored Programmes (AREA)

Abstract

The present invention relates to a distributed deep learning system based on Docker containers, together with its construction method and working method. The system comprises a server host, a first distributed deep learning platform and a second distributed deep learning platform. Using Docker containerization, the invention runs multiple distributed deep learning systems simultaneously on a single server host. The improvements are mainly reflected in three aspects. First, the whole system can be realized on one server host without additional machines, which saves cost. Second, containers are created from a template image, so the process is simple, nothing has to be built repeatedly, and errors and wasted time are avoided. Third, the server's CPU can be used to the fullest, so hardware resources are no longer wasted.

Description

Distributed deep learning system based on Docker containers, and construction method and working method thereof
Technical field
The present invention relates to a distributed deep learning system based on Docker containers and to its construction method and working method, and belongs to the field of cloud computing virtualization technology.
Background
In essence, cloud computing means that a user terminal obtains storage, computing and database resources through a remote connection. Virtualization is one of the core components of cloud computing; it is the key technology for fully integrating and efficiently using computing and storage resources, and it includes server virtualization and desktop virtualization. Docker is an emerging lightweight virtualization technology. Compared with a traditional VM it is lighter and starts faster, and hundreds or thousands of containers can run simultaneously on a single piece of hardware, so it is well suited to scaling out during traffic peaks by starting a large number of containers.
At present, deep learning platforms mostly run on a single machine, and distributed deep learning platforms are seldom used, chiefly because a distributed platform is more complicated to build and requires more hardware. Compared with a single-machine platform, however, a distributed deep learning platform can compute faster.
Technologies currently on the market generally have the following problems:
1) A single-machine deep learning platform built on a server has ample CPU computing capacity but cannot use it fully, so resources are wasted.
2) A distributed deep learning platform has to be built from multiple hosts. The CPU capacity of each host is limited, so building a large-scale platform requires many hosts and is expensive.
3) Building a distributed deep learning platform is tedious. When it is built host by host, every host has to go through the same steps, yet different errors can occur each time the same steps are repeated, which slows the build.
Chinese patent document CN106657248A discloses a network load balancing system based on Docker containers, together with its construction method and working method. It uses Docker container technology as the basis of the system. Because Docker containers are economical with hardware resources, a large number of containers can be created on one server host, and a complete network load balancing system is realized on that host. Because Docker containers can be created from an image within seconds, and containers created from the same image are guaranteed to be identical, Web servers can be added conveniently from the container image to share access volume or data traffic. That patent has the following defect, however: when an image is created with a Dockerfile, it is not possible to inspect the image visually and test whether particular file configurations in it succeeded.
Summary of the invention
In view of the deficiencies of the prior art, the invention provides a distributed deep learning system based on Docker containers.
The invention also provides a construction method and a working method for the above distributed deep learning system.
Using Docker containerization, the invention runs multiple distributed deep learning systems simultaneously on a single server host. The improvements are mainly reflected in four aspects. First, a configured container is turned into an image with the docker commit command, which makes it possible to inspect the result and to test whether particular file configurations in the image succeeded. Second, the whole system can be realized on one server host without additional machines, which saves cost. Third, containers are created from a template image, so the process is simple, nothing has to be built repeatedly, and errors and wasted time are avoided. Fourth, the server's CPU can be used to the fullest, so hardware resources are no longer wasted.
Explanation of terms:
1. Hadoop distributed platform: a distributed system infrastructure developed by the Apache Foundation. Users can develop distributed programs without knowing the low-level details of distribution, making full use of the power of a cluster for high-speed computation and storage. Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault tolerant and is designed to be deployed on inexpensive hardware. It provides high-throughput access to application data and suits applications with very large data sets. HDFS relaxes some POSIX requirements so that data in the file system can be accessed as a stream.
2. Spark: a general-purpose parallel computing framework in the style of Hadoop MapReduce, open-sourced by the UC Berkeley AMP Lab. Spark implements distributed computation based on the map-reduce model and has the advantages of Hadoop MapReduce. Unlike MapReduce, the intermediate output and results of a job can be kept in memory, so they need not be written to and read back from HDFS. Spark is therefore better suited to iterative map-reduce algorithms such as those used in data mining and machine learning.
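The benefit of keeping intermediate data in memory for iterative algorithms can be illustrated with a small sketch. This is plain Python, not Spark code; `CountingStore`, `iterate_reloading` and `iterate_cached` are hypothetical names introduced here, and the store is only a toy stand-in for HDFS.

```python
from functools import reduce


class CountingStore:
    """Toy stand-in for HDFS that counts how often the data set is read."""

    def __init__(self, records):
        self._records = list(records)
        self.reads = 0

    def load(self):
        self.reads += 1
        return list(self._records)


def step(data, estimate, rate=0.5):
    """One map-reduce pass: move the estimate toward the mean of the data."""
    gradients = map(lambda x: estimate - x, data)        # map phase
    total = reduce(lambda a, b: a + b, gradients, 0.0)   # reduce phase
    return estimate - rate * total / len(data)


def iterate_reloading(store, iterations):
    """MapReduce-style loop: every iteration reads its input from storage."""
    estimate = 0.0
    for _ in range(iterations):
        estimate = step(store.load(), estimate)
    return estimate


def iterate_cached(store, iterations):
    """Spark-style loop: the data set is read once and kept in memory."""
    data = store.load()
    estimate = 0.0
    for _ in range(iterations):
        estimate = step(data, estimate)
    return estimate
```

Both loops produce the same estimate; the difference is that the cached variant touches storage once, whereas the reloading variant touches it once per iteration.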
3. NameNode: manages the namespace of the file system. It maintains the file system tree and all the files and directories in the tree. This information is stored permanently on the local disk in two files: the namespace image file and the edit log file. The NameNode also records the DataNodes on which each block of each file is located, but it does not persist block locations, because this information is rebuilt from the DataNodes when the system starts.
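The split between persisted namespace data and rebuilt block locations can be sketched with a toy model. `ToyNameNode` and its methods are hypothetical names used only for illustration; the real NameNode is far more involved.

```python
class ToyNameNode:
    """Toy model: the namespace is persisted, block locations are not."""

    def __init__(self):
        self.fsimage = {}          # persisted snapshot: path -> block ids
        self.edit_log = []         # persisted operations since the snapshot
        self.block_locations = {}  # memory only: block id -> DataNode names

    def create_file(self, path, block_ids):
        # Namespace changes are appended to the edit log.
        self.edit_log.append(("create", path, list(block_ids)))

    def namespace(self):
        """Replay the edit log on top of the image to get the current tree."""
        tree = dict(self.fsimage)
        for op, path, blocks in self.edit_log:
            if op == "create":
                tree[path] = blocks
        return tree

    def checkpoint(self):
        """Fold the edit log into a new image and start an empty log."""
        self.fsimage = self.namespace()
        self.edit_log = []

    def restart(self):
        """A restart keeps image and log but forgets block locations."""
        self.block_locations = {}

    def receive_block_report(self, datanode, block_ids):
        """DataNodes report their blocks, which rebuilds the location map."""
        for block in block_ids:
            self.block_locations.setdefault(block, set()).add(datanode)
```

After `restart()`, `namespace()` still returns every file, while `block_locations` stays empty until the DataNodes report in again.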
The technical solution of the invention is as follows:
A distributed deep learning system based on Docker containers comprises a host machine and multiple Docker containers. The Hadoop distributed platform and Spark are installed on the host machine, together with either the first distributed deep learning platform or the second distributed deep learning platform. The Hadoop distributed platform and Spark are likewise installed in each Docker container, together with either the first or the second distributed deep learning platform.
The server host serves as the host machine and provides the hardware support for the whole platform. The first and second distributed deep learning platforms are two currently available distributed deep learning platforms; both were open-sourced by Yahoo and are mainstream distributed deep learning platforms at present.
The first and second distributed deep learning platforms are tools that help carry out deep learning and differ from single-machine deep learning platforms. As the hardware foundation of the whole distributed deep learning system, the server host must meet requirements such as high processing capacity, stability and reliability.
Preferably according to the invention, the host machine is a DELL PowerEdge R730, the first distributed deep learning platform is CaffeOnSpark, and the second distributed deep learning platform is TensorFlowOnSpark.
The DELL PowerEdge R730 server is configured with a 48-core CPU, 96 GB of memory and 8 TB of local disk. Caffe and TensorFlow are two of the most popular single-machine deep learning platforms at present; CaffeOnSpark and TensorFlowOnSpark, which are based on Caffe and TensorFlow respectively, are distributed deep learning platforms built on Hadoop/Spark and open-sourced by Yahoo.
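The patent does not say how the 48 cores and 96 GB are divided among containers. As a rough illustration only, the sketch below splits what remains after a reserve for the host machine evenly across the containers and renders the result as the `--cpus` and `--memory` options of `docker run`. The reserve sizes and the function name `split_resources` are assumptions made here.

```python
def split_resources(total_cores, total_mem_gb, containers,
                    host_cores=4, host_mem_gb=8):
    """Return docker run resource flags for one of `containers` equal shares.

    host_cores and host_mem_gb are an assumed reserve for the host machine,
    which also runs Hadoop and Spark as the master node.
    """
    if containers < 1:
        raise ValueError("at least one container is required")
    cores = (total_cores - host_cores) / containers
    mem_gb = (total_mem_gb - host_mem_gb) // containers
    if cores <= 0 or mem_gb <= 0:
        raise ValueError("not enough resources left for the containers")
    return ["--cpus={:.2f}".format(cores), "--memory={}g".format(mem_gb)]
```

With the server described above, `split_resources(48, 96, 11)` yields 4 cores and 8 GB per container.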
The construction method of the above distributed deep learning system based on Docker containers comprises the following steps:
(1) Prepare the host machine, which is the server host, and install the Ubuntu 14.04 operating system. Ubuntu 14.04 is a relatively stable release among the Linux operating systems that support Docker, and the Docker environment can be installed and configured on it directly from the command line;
(2) Create the main folder needed by the Docker containers under the root directory of the host machine. The main folder includes a mountable folder used to store the training model, training data set, test data set, code and configuration files required for deep learning;
(3) Install the Hadoop distributed platform and Spark on the host machine to support the CaffeOnSpark or TensorFlowOnSpark distributed deep learning platform. Test whether the Hadoop distributed platform and Spark were installed successfully; if so, go to step (4), otherwise repeat step (3);
(4) Install the first distributed deep learning platform (CaffeOnSpark) or the second distributed deep learning platform (TensorFlowOnSpark) on the host machine and configure the IP of the master node. The host machine acts as the master node when the system runs;
(5) Create an empty container on the host machine;
(6) Install the Hadoop distributed platform and Spark in the empty container;
(7) Install the first distributed deep learning platform (CaffeOnSpark) or the second distributed deep learning platform (TensorFlowOnSpark) in the container prepared in step (6) and configure the IP of the slave node. The container acts as a slave node when the system runs;
(8) Use the docker commit command to create an image, taking the container prepared in step (7) as the template;
(9) Create multiple Docker containers from the image created in step (8) and configure the IP address of each Docker container.
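The container-related steps (5), (8) and (9) can be sketched as a sequence of Docker commands. The function below only assembles the command strings and does not run them. The container names, network name, subnet, base image and address offset are assumptions chosen for illustration; the patent does not specify them. The commands and options used (`docker network create --subnet`, `docker run -v/--network/--ip`, `docker commit`) are standard Docker CLI.

```python
import ipaddress
import shlex


def build_plan(image_tag, slave_count, shared_dir,
               base_image="ubuntu:14.04", network="dlnet",
               subnet="172.18.0.0/16", first_offset=11):
    """Return the docker commands for steps (5), (8) and (9) as strings."""
    base = ipaddress.ip_network(subnet).network_address
    mount = "{0}:{0}".format(shared_dir)
    commands = [
        # User-defined network, needed so that --ip can assign fixed addresses.
        ["docker", "network", "create", "--subnet", subnet, network],
        # Step (5): an empty container with the shared folder mounted.
        ["docker", "run", "-dit", "--name", "template", "-v", mount,
         base_image, "/bin/bash"],
        # Steps (6)-(7) are carried out by hand inside the container.
        # Step (8): freeze the configured container as the template image.
        ["docker", "commit", "template", image_tag],
    ]
    # Step (9): one container per slave node, each with its own address.
    for i in range(slave_count):
        name = "slave{}".format(i + 1)
        commands.append(
            ["docker", "run", "-dit", "--name", name, "--hostname", name,
             "--network", network, "--ip", str(base + first_offset + i),
             "-v", mount, image_tag, "/bin/bash"])
    return [" ".join(shlex.quote(part) for part in cmd) for cmd in commands]
```

Because the image is committed only after the container has been configured and checked by hand, every slave created in step (9) starts from a state that has already been tested.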
The steps for testing whether the Hadoop distributed platform was installed successfully are as follows. Format the NameNode. On success, the messages "successfully formatted" and "Exitting with status 0" appear; "Exitting with status 1" indicates an error. If this step reports "Error: JAVA_HOME is not set and could not be found.", the JAVA_HOME environment variable was not set correctly earlier; set JAVA_HOME first, following the tutorial, or the subsequent steps cannot proceed. Then start the NameNode and DataNode daemons; if an SSH prompt appears, enter yes.
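The check described above can be automated by scanning the captured console output of the format step. The sketch below is an assumption about how such a check might look; `check_namenode_format` is a hypothetical helper, and it matches only the message fragments quoted in the text.

```python
def check_namenode_format(output):
    """Classify the console output of a NameNode format run."""
    if "JAVA_HOME is not set" in output:
        return "java_home_missing"
    # Matching on "with status 0" tolerates spelling variants of the
    # word that precedes it in the log line.
    if "successfully formatted" in output and "with status 0" in output:
        return "ok"
    return "failed"
```

A build script could repeat step (3) whenever the result is anything other than `"ok"`.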
The steps for testing whether Spark was installed successfully are as follows. The spark/examples/src/main directory contains Spark example programs in Scala, Java, Python, R and other languages. Run the example program SparkPi, which computes an approximation of π. Execution prints a great deal of run-time information, so the result is hard to spot; filtering the output with the grep command leaves the result, an approximation of π to 5 decimal places.
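The same filtering can be done programmatically. The sketch below assumes that the result line begins with "Pi is roughly", as printed by the stock SparkPi example; the helper names and the tolerance are choices made here, not part of the patent.

```python
import re

# Assumed format of the result line printed by the SparkPi example.
PI_LINE = re.compile(r"Pi is roughly (\d+\.\d+)")


def extract_pi(output):
    """Return the value reported by the example, or None if no result line exists."""
    match = PI_LINE.search(output)
    return float(match.group(1)) if match else None


def spark_pi_ok(output, tolerance=0.05):
    """Treat the run as successful if the reported value is close to pi."""
    value = extract_pi(output)
    return value is not None and abs(value - 3.14159) <= tolerance
```

The tolerance is deliberately loose because SparkPi estimates π by random sampling, so the reported digits vary from run to run.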
The working method of the above distributed deep learning system based on Docker containers comprises the following steps:
(1) Start the Hadoop platform and Spark on the host machine, which serves as the master node of the whole distributed deep learning system, and start the Hadoop platform and Spark in several of the Docker containers, which serve as the slave nodes of the whole distributed deep learning system;
(2) Store the training model, training data set, test data set, code and configuration files required for deep learning training in the mountable folder on the host machine;
(3) Start deep learning training with a script; the master node distributes the deep learning training task to the slave nodes, which train in parallel.
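Step (3) does not describe how the training task is split. One common scheme is data parallelism: each slave computes a gradient on its own shard of the training data and the master combines the results. The sketch below illustrates that idea for a one-parameter linear model, using threads in place of containers. It is an assumption for illustration and not how CaffeOnSpark or TensorFlowOnSpark are implemented; all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def shard(samples, slaves):
    """Assign samples to slave nodes round-robin."""
    return {name: samples[i::len(slaves)] for i, name in enumerate(slaves)}


def local_gradient(weight, shard_samples):
    """Gradient of 0.5 * (weight * x - y)^2 summed over one shard."""
    grad = sum((weight * x - y) * x for x, y in shard_samples)
    return grad, len(shard_samples)


def training_step(weight, samples, slaves, rate=0.1):
    """Master step: fan out to the slaves, then average their gradients."""
    shards = shard(samples, slaves)
    with ThreadPoolExecutor(max_workers=len(slaves)) as pool:
        results = list(pool.map(lambda s: local_gradient(weight, s),
                                shards.values()))
    total = sum(grad for grad, _ in results)
    count = sum(n for _, n in results)
    return weight - rate * total / count
```

Repeating `training_step` on samples drawn from y = 2x drives the weight toward 2, regardless of how many slaves share the work.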
The beneficial effects of the invention are:
1. The invention can set up a distributed deep learning platform using only one server host.
2. When more distributed nodes are needed, a container can be started quickly and added as a node once it is configured.
3. The CPU computing resources of the server are fully used.
Brief description of the drawings
Fig. 1 is a structural block diagram of the distributed deep learning system based on Docker containers according to the invention.
Detailed description of the embodiments
The invention is further described below with reference to the drawing and the embodiments, but is not limited thereto.
Embodiment 1
A distributed deep learning system based on Docker containers, as shown in Fig. 1, comprises a host machine and multiple Docker containers. The Hadoop distributed platform and Spark are installed on the host machine, together with either the first distributed deep learning platform or the second distributed deep learning platform. The Hadoop distributed platform and Spark are likewise installed in each Docker container, together with either the first or the second distributed deep learning platform.
The server host serves as the host machine and provides the hardware support for the whole platform. The first and second distributed deep learning platforms are two currently available distributed deep learning platforms; both were open-sourced by Yahoo and are mainstream distributed deep learning platforms at present.
The first and second distributed deep learning platforms are tools that help carry out deep learning and differ from single-machine deep learning platforms. As the hardware foundation of the whole distributed deep learning system, the server host must meet requirements such as high processing capacity, stability and reliability.
The host machine is a DELL PowerEdge R730, the first distributed deep learning platform is CaffeOnSpark, and the second distributed deep learning platform is TensorFlowOnSpark.
The DELL PowerEdge R730 server is configured with a 48-core CPU, 96 GB of memory and 8 TB of local disk. Caffe and TensorFlow are two of the most popular single-machine deep learning platforms at present; CaffeOnSpark and TensorFlowOnSpark, which are based on Caffe and TensorFlow respectively, are distributed deep learning platforms built on Hadoop/Spark and open-sourced by Yahoo.
Embodiment 2
The construction method of the distributed deep learning system based on Docker containers described in Embodiment 1 comprises the following steps:
(1) Prepare the host machine, which is the server host, and install the Ubuntu 14.04 operating system. Ubuntu 14.04 is a relatively stable release among the Linux operating systems that support Docker, and the Docker environment can be installed and configured on it directly from the command line;
(2) Create the main folder needed by the Docker containers under the root directory of the host machine. The main folder includes a mountable folder used to store the training model, training data set, test data set, code and configuration files required for deep learning;
(3) Install the Hadoop distributed platform and Spark on the host machine to support the CaffeOnSpark or TensorFlowOnSpark distributed deep learning platform. Test whether the Hadoop distributed platform and Spark were installed successfully; if so, go to step (4), otherwise repeat step (3);
The steps for testing whether the Hadoop distributed platform was installed successfully are as follows. Format the NameNode. On success, the messages "successfully formatted" and "Exitting with status 0" appear; "Exitting with status 1" indicates an error. If this step reports "Error: JAVA_HOME is not set and could not be found.", the JAVA_HOME environment variable was not set correctly earlier; set JAVA_HOME first, following the tutorial, or the subsequent steps cannot proceed. Then start the NameNode and DataNode daemons; if an SSH prompt appears, enter yes.
The steps for testing whether Spark was installed successfully are as follows. The spark/examples/src/main directory contains Spark example programs in Scala, Java, Python, R and other languages. Run the example program SparkPi, which computes an approximation of π. Execution prints a great deal of run-time information, so the result is hard to spot; filtering the output with the grep command leaves the result, an approximation of π to 5 decimal places.
(4) Install the first distributed deep learning platform (CaffeOnSpark) or the second distributed deep learning platform (TensorFlowOnSpark) on the host machine and configure the IP of the master node. The host machine acts as the master node when the system runs;
(5) Create an empty container on the host machine;
(6) Install the Hadoop distributed platform and Spark in the empty container;
(7) Install the first distributed deep learning platform (CaffeOnSpark) or the second distributed deep learning platform (TensorFlowOnSpark) in the container prepared in step (6) and configure the IP of the slave node. The container acts as a slave node when the system runs;
(8) Use the docker commit command to create an image, taking the container prepared in step (7) as the template;
(9) Create multiple Docker containers from the image created in step (8) and configure the IP address of each Docker container.
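Configuring the addresses in steps (4), (7) and (9) usually also means telling every node how to reach the others. The sketch below plans consecutive addresses for the slave containers and renders two small text files: host name entries, and the worker list that Hadoop 2.x and Spark standalone read from a `slaves` file. The subnet, the host names, the starting offset and the master address are assumptions; the patent does not give them.

```python
import ipaddress


def plan_addresses(subnet, slave_count, first_offset=11):
    """Assign consecutive addresses inside the subnet to the slave containers."""
    net = ipaddress.ip_network(subnet)
    entries = []
    for i in range(slave_count):
        address = net.network_address + first_offset + i
        if address not in net:
            raise ValueError("subnet too small for the requested containers")
        entries.append(("slave{}".format(i + 1), address))
    return entries


def hosts_text(master_ip, entries):
    """Render host name entries for the master node and every slave."""
    lines = ["{}\tmaster".format(master_ip)]
    lines += ["{}\t{}".format(addr, name) for name, addr in entries]
    return "\n".join(lines) + "\n"


def slaves_text(entries):
    """Render the list of slave host names, one per line."""
    return "\n".join(name for name, _ in entries) + "\n"
```

For example, `plan_addresses("172.18.0.0/16", 2)` assigns 172.18.0.11 and 172.18.0.12 to `slave1` and `slave2`.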
Embodiment 3
The working method of the distributed deep learning system based on Docker containers described in Embodiment 1 comprises the following steps:
(1) Start the Hadoop platform and Spark on the host machine, which serves as the master node of the whole distributed deep learning system, and start the Hadoop platform and Spark in several of the Docker containers, which serve as the slave nodes of the whole distributed deep learning system;
(2) Store the training model, training data set, test data set, code and configuration files required for deep learning training in the mountable folder on the host machine;
(3) Start deep learning training with a script; the master node distributes the deep learning training task to the slave nodes, which train in parallel.

Claims (4)

1. A distributed deep learning system based on Docker containers, characterized in that it comprises a host machine and multiple Docker containers; the Hadoop distributed platform and Spark are installed on the host machine, and the first distributed deep learning platform or the second distributed deep learning platform is also installed on the host machine; the Hadoop distributed platform and Spark are installed in each Docker container, and the first distributed deep learning platform or the second distributed deep learning platform is also installed in each Docker container.
2. The distributed deep learning system based on Docker containers according to claim 1, characterized in that the host machine is a DELL PowerEdge R730, the first distributed deep learning platform is CaffeOnSpark, and the second distributed deep learning platform is TensorFlowOnSpark.
3. A construction method of the distributed deep learning system based on Docker containers according to claim 1 or 2, characterized in that it comprises the following steps:
(1) prepare the host machine, which is a server host;
(2) create the main folder needed by the Docker containers under the root directory of the host machine, the main folder including a mountable folder used to store the training model, training data set, test data set, code and configuration files required for deep learning;
(3) install the Hadoop distributed platform and Spark on the host machine; test whether the Hadoop distributed platform and Spark were installed successfully; if so, go to step (4), otherwise repeat step (3);
(4) install the first distributed deep learning platform (CaffeOnSpark) or the second distributed deep learning platform (TensorFlowOnSpark) on the host machine and configure the IP of the master node;
(5) create an empty container on the host machine;
(6) install the Hadoop distributed platform and Spark in the empty container;
(7) install the first distributed deep learning platform (CaffeOnSpark) or the second distributed deep learning platform (TensorFlowOnSpark) in the container prepared in step (6) and configure the IP of the slave node;
(8) use the docker commit command to create an image, taking the container prepared in step (7) as the template;
(9) create multiple Docker containers from the image created in step (8) and configure the IP address of each Docker container.
4. A working method of the distributed deep learning system based on Docker containers according to claim 1 or 2, characterized in that it comprises the following steps:
(1) start the Hadoop platform and Spark on the host machine, which serves as the master node of the whole distributed deep learning system, and start the Hadoop platform and Spark in several of the Docker containers, which serve as the slave nodes of the whole distributed deep learning system;
(2) store the training model, training data set, test data set, code and configuration files required for deep learning training in the mountable folder on the host machine;
(3) start deep learning training with a script; the master node distributes the deep learning training task to the slave nodes, which train in parallel.
CN201710866197.0A 2017-09-22 2017-09-22 Distributed deep learning system based on Docker container and construction method and working method thereof Active CN107450961B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710866197.0A CN107450961B (en) 2017-09-22 2017-09-22 Distributed deep learning system based on Docker container and construction method and working method thereof

Publications (2)

Publication Number Publication Date
CN107450961A true CN107450961A (en) 2017-12-08
CN107450961B CN107450961B (en) 2020-10-16

Family

ID=60498100

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710866197.0A Active CN107450961B (en) 2017-09-22 2017-09-22 Distributed deep learning system based on Docker container and construction method and working method thereof

Country Status (1)

Country Link
CN (1) CN107450961B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105577503A (en) * 2016-01-18 2016-05-11 浪潮集团有限公司 Cloud switch system based on Docker and realization method thereof
CN105740048A (en) * 2016-01-26 2016-07-06 华为技术有限公司 Image management method, device and system
CN105871988A (en) * 2015-12-14 2016-08-17 乐视云计算有限公司 Service deployment method and device
CN106657248A (en) * 2016-11-01 2017-05-10 山东大学 Docker container based network load balancing system and establishment method and operating method thereof
US20170139816A1 (en) * 2015-11-17 2017-05-18 Alexey Sapozhnikov Computerized method and end-to-end "pilot as a service" system for controlling start-up/enterprise interactions
CN106850621A (en) * 2017-02-07 2017-06-13 南京云创大数据科技股份有限公司 A kind of method based on container cloud fast construction Hadoop clusters

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ODDBILLOW: "TensorFlowOnSpark安装教程", 《HTTPS://BLOG.CSDN.NET/QUITOZANG/ARTICLE/DETAILS/71437179》 *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109961151B (en) * 2017-12-21 2021-05-14 同方威视科技江苏有限公司 System of computing services for machine learning and method for machine learning
CN109961151A (en) * 2017-12-21 2019-07-02 同方威视科技江苏有限公司 For the system for calculating service of machine learning and for the method for machine learning
CN108255968A (en) * 2017-12-26 2018-07-06 曙光信息产业(北京)有限公司 A kind of design method of big data parallel file system
CN109284184A (en) * 2018-03-07 2019-01-29 中山大学 A kind of building method of the distributed machines learning platform based on containerization technique
US11954521B2 (en) 2018-03-30 2024-04-09 Huawei Cloud Computing Technologies Co., Ltd. Deep learning job scheduling method and system and related device
WO2020001564A1 (en) * 2018-06-29 2020-01-02 杭州海康威视数字技术股份有限公司 Method, apparatus, and system for processing tasks
CN109063842A (en) * 2018-07-06 2018-12-21 无锡雪浪数制科技有限公司 A kind of machine learning platform of compatible many algorithms frame
CN109086134A (en) * 2018-07-19 2018-12-25 郑州云海信息技术有限公司 A kind of operation method and device of deep learning operation
CN108958892A (en) * 2018-08-14 2018-12-07 郑州云海信息技术有限公司 A kind of method and apparatus creating the container for deep learning operation
CN110866605A (en) * 2018-08-27 2020-03-06 北京京东尚科信息技术有限公司 Data model training method and device, electronic equipment and readable medium
CN109254830A (en) * 2018-09-04 2019-01-22 郑州云海信息技术有限公司 Visual management method and device in deep learning system
CN109146084A (en) * 2018-09-06 2019-01-04 郑州云海信息技术有限公司 A kind of method and device of the machine learning based on cloud computing
CN109358944A (en) * 2018-09-17 2019-02-19 深算科技(重庆)有限公司 Deep learning distributed arithmetic method, apparatus, computer equipment and storage medium
CN109522089A (en) * 2018-11-02 2019-03-26 成都三零凯天通信实业有限公司 Based on the distributed view of virtualized environment as recognition methods
CN111343219A (en) * 2018-12-18 2020-06-26 同方威视技术股份有限公司 Computing service cloud platform
CN111343219B (en) * 2018-12-18 2022-08-02 同方威视技术股份有限公司 Computing service cloud platform
CN110245003A (en) * 2019-06-06 2019-09-17 中信银行股份有限公司 A kind of machine learning uniprocessor algorithm arranging system and method
CN112394944A (en) * 2019-08-13 2021-02-23 阿里巴巴集团控股有限公司 Distributed development method, device, storage medium and computer equipment
CN110554995A (en) * 2019-08-13 2019-12-10 武汉中海庭数据技术有限公司 Deep learning model management method and system
WO2022134001A1 (en) * 2020-12-25 2022-06-30 深圳晶泰科技有限公司 Machine learning model framework development method and system based on containerization technology

Also Published As

Publication number Publication date
CN107450961B (en) 2020-10-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant