CN107733696A - Deployment method for a machine learning and artificial intelligence application all-in-one machine - Google Patents

Deployment method for a machine learning and artificial intelligence application all-in-one machine

Info

Publication number
CN107733696A
CN107733696A (application CN201710881113.0A)
Authority
CN
China
Prior art keywords
machine learning
artificial intelligence
layer
frame
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710881113.0A
Other languages
Chinese (zh)
Inventor
李云鹏
倪岭
任义龙
张建
刘伟佳
赵志强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Days Mdt Infotech Ltd
Original Assignee
Nanjing Days Mdt Infotech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Days Mdt Infotech Ltd filed Critical Nanjing Days Mdt Infotech Ltd
Priority to CN201710881113.0A priority Critical patent/CN107733696A/en
Publication of CN107733696A publication Critical patent/CN107733696A/en
Pending legal-status Critical Current


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1097Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465Query processing support for facilitating data mining operations in structured databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08Configuration management of networks or network elements
    • H04L41/0803Configuration setting
    • H04L41/0823Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12Discovery or management of network topologies
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/30Definitions, standards or architectural aspects of layered protocol stacks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Fuzzy Systems (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a deployment method for a machine learning and artificial intelligence application all-in-one machine. The method is realized through the following steps: (1) build the system architecture, which is logically divided into an application layer, a computation layer and a storage layer; (2) build the network architecture, which is logically divided into an external network, a management network, a computation network and a storage network; (3) optimize the scalability of the system: computing resources are added through a horizontally scalable architecture to improve performance, and storage capacity is added through the layered architecture. By combining a rich algorithm library, a high-performance computing engine, a functionally integrated system architecture, flexible networking topologies and excellent system scalability, the disclosed deployment method enables the all-in-one machine to accelerate big-data machine learning and improve the running efficiency of artificial intelligence analysis programs.

Description

Deployment method for a machine learning and artificial intelligence application all-in-one machine
Technical field
The present invention relates to the fields of machine learning and artificial intelligence, and in particular to a deployment method for a machine learning and artificial intelligence application all-in-one machine.
Background technology
Artificial intelligence was proposed as early as the 1950s. It is an interdisciplinary field in which cybernetics, information theory, computer science, mathematical logic, neurophysiology, psychology, linguistics, pedagogy, medicine, engineering and philosophy interpenetrate. People dreamed of using the then-nascent computer to construct complex machines with the same essential characteristics as human intelligence: all-capable machines with all of our perception (or even more than a human's) and all of our rationality, able to think as we do. Machine learning studies how computers can simulate or realize human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve their own performance. It is the core of artificial intelligence and the fundamental way to endow computers with intelligence; its applications span every field of artificial intelligence. The most basic approach of machine learning is to parse data with algorithms, learn from the data, and then make decisions and predictions about events in the real world. Unlike traditional hard-coded software written to solve a particular task, machine learning "trains" various algorithms with large amounts of data, teaching them how to complete tasks by learning from the data.
Machine learning is a very popular field in the development of artificial intelligence. Its research purpose is to give computers the ability, like humans, to acquire knowledge from the real world, while establishing a computational theory of learning and constructing various learning systems applied in every field. Machine learning research has three main directions. The first starts from simulating the human learning process and aims to establish cognitive and physiological models of learning; this direction is closely related to the development of cognitive science. The second is basic research: developing learning theories suited to machine characteristics, exploring all possible learning methods, and comparing the similarities, differences and connections between human learning and machine learning. The third is applied research: building practical learning systems or knowledge-acquisition aids, establishing automatic knowledge-acquisition systems in the application fields of artificial intelligence, accumulating experience, and improving knowledge bases and control knowledge, so that the intelligence level of machines approaches that of humans.
At present, technology giants including Baidu and Google invested between 20 and 30 billion dollars in artificial intelligence in 2016, of which 90% went into research, development and deployment, and the remaining 10% into artificial-intelligence acquisitions. The current growth rate of artificial-intelligence investment is three times that of external investment since 2013. Artificial-intelligence development is concentrated mainly in high-tech/telecom, automotive/assembly and financial services. Machine learning can genuinely help industry solve problems; in particular, current hot topics such as deep learning, autonomous driving and artificial-intelligence assistants exert a huge influence on industry.
Big data has promoted the development of artificial intelligence; at the same time, the development of artificial intelligence lets data produce enormous value, turning it into "intelligent data". Artificial intelligence is now applied in all kinds of big-data applications, such as search recommendation, shopping recommendation, speech recognition, image recognition, chatbots and intelligent healthcare. Machine learning and artificial intelligence grow continuously on the basis of big data. To make disorderly masses of data produce value, the data must be analyzed extensively with complex network models before high-accuracy models can be trained, which requires an enormous amount of computation. Computing power is therefore becoming more and more important to the development of machine learning and artificial intelligence.
At present, big-data machine learning algorithms and artificial intelligence analysis are relatively inefficient and occupy many resources. Massive data is processed slowly, and processing it places high demands on hardware; existing approaches cannot meet the intelligent-computing requirements of rapidly growing data-driven enterprises.
Summary of the invention
To remedy the deficiencies of the prior art, the object of the present invention is to provide a deployment method for a machine learning and artificial intelligence application all-in-one machine. The method employs a special design and a variety of optimization techniques so that the all-in-one machine has ultra-high computing performance, can significantly speed up the running of programs, and is suited to machine learning and artificial intelligence applications in big-data environments.
In order to achieve the above goal, the present invention adopts the following technical scheme: a deployment method for a machine learning and artificial intelligence application all-in-one machine, characterized by comprising the following steps:
Step 1: separate data storage from data processing, build the overall system architecture on a highly scalable Shared-Nothing architecture, and logically divide the system architecture into an application layer, a computation layer and a storage layer, each of which uses a distributed architecture;
Step 2: build the network architecture, which takes the form of either a single-rack networking topology or a multi-rack networking topology and is logically divided into an external network, a management network, a computation network and a storage network;
Step 3: optimize the design of the system's scalability.
Further, the application layer configures a varying number of application nodes according to actual needs; the computation layer configures a varying number of compute nodes according to actual needs; the storage layer configures a varying number of storage nodes according to actual needs.
Further, the compute nodes are configured with the following software stack:
support for multiple programming languages;
APIs for machine learning and deep learning;
the deep learning framework TensorFlow;
an optimized version of the distributed computing framework Spark;
an optimized version of the distributed in-memory file system Alluxio to accelerate data reads and writes;
optimized RDMA support.
Further, the storage nodes provide two kinds of storage service: databases and a general-purpose file system. The databases comprise the relational database PostgreSQL and a time-series database; the relational database PostgreSQL uses the HAWQ distributed architecture, and the time-series database uses an OpenTSDB+HBase distributed architecture. The general-purpose file system uses a mixed HDFS+Ceph structure, and the HAWQ bottom layer uses HDFS.
Further, an external-network NIC and a management-network NIC are deployed on each application node; a management-network NIC and a computation/storage-network NIC are deployed on each compute node; and a management-network NIC and a computation/storage-network NIC are deployed on each storage node.
Further, the single-rack networking topology comprises one rack and is constructed as follows:
provide one Ethernet switch whose port count is greater than or equal to the total number of nodes in the rack;
provide one computation/storage network switch whose port count is greater than or equal to the total number of nodes in the rack;
provide one external-network switch.
Further, the multi-rack networking topology comprises multiple racks and is constructed as follows:
equip each rack with one Ethernet switch whose port count exceeds the total number of nodes in the rack, with ports reserved for connecting the other racks;
equip each rack with one computation/storage network switch whose port count exceeds the total number of nodes in the rack, with ports reserved for connecting the other racks;
provide an appropriate number of external-network switches;
provide core switches, and connect the management-network switch of each rack to a core switch in a simple tree.
Further, the computation/storage network switches are InfiniBand switches; the InfiniBand switches of the racks connect to multiple core switches to form a fat-tree structure.
Further, optimizing the design of the system's scalability comprises the following steps:
improve performance with the horizontally scalable architecture;
increase storage capacity with the layered architecture.
Further, the step of improving performance with the horizontally scalable architecture is:
increase the number of compute nodes in the computation layer;
add an appropriate number of network switches.
The present invention is advantageous in that:
(1) The all-in-one machine separates data storage from data processing. Using a highly scalable Shared-Nothing architecture, clients, data processing and data storage are separated and logically divided into three levels: the application layer, the computation layer and the storage layer. Each level uses a distributed architecture, which achieves high computational concurrency and high data read/write concurrency while giving the whole system good scalability, reliability and maintainability.
(2) The distributed hyper-converged hardware architecture and its ingenious pairing with the software stack avoid wasting storage and computing resources, guarantee the stability of the data-analysis pipeline, and raise analysis efficiency. Every aspect of the hardware architecture, including CPU, memory, tiered storage and GPU, has been specially optimized to fully exploit the hardware. Frameworks such as TensorFlow are deeply integrated, and the distributed machine learning algorithms and communication mechanisms are heavily optimized.
(3) Through the architecture, algorithm improvements and full use of the hardware, order-of-magnitude computational acceleration is achieved, reducing the enterprise's investment in big-data infrastructure and manpower. Data cleansing and modeling analysis yield high-quality, meaningful information, thereby mining the value of the data.
Brief description of the drawings
Fig. 1 is a flow chart of the present invention;
Fig. 2 is a schematic diagram of the overall framework of the system;
Fig. 3 is a schematic diagram of the component-deployment framework of the system.
Detailed description of the embodiments
The present invention is described in detail below in conjunction with the drawings and specific embodiments.
Referring to Fig. 1, a deployment method for a machine learning and artificial intelligence application all-in-one machine of the present invention comprises the following steps:
Step 1: the all-in-one machine separates data storage from data processing, using a highly scalable Shared-Nothing architecture that divides the whole into three layers: the application layer, the computation layer and the storage layer. Because this layered architecture provides fully redundant hardware protection, the failure of any single compute node or storage node neither loses data nor stops the all-in-one machine from working normally, which greatly improves system reliability.
Clients, data processing and data storage are separated into three logical levels, each using a distributed architecture. This achieves high computational concurrency and high data read/write concurrency while giving the whole system good scalability, reliability and maintainability.
The application layer mainly runs user-facing services, handling login, monitoring, management, and computing-task orchestration and submission; it needs medium CPU and memory configuration and little storage capacity. The computation layer executes the computing tasks users submit; it needs high CPU and memory configuration and little storage capacity. The storage layer mainly provides mass storage for the compute nodes; it needs low CPU and memory configuration and large storage capacity. As shown in Fig. 2, the all-in-one machine can flexibly configure different numbers of application nodes, compute nodes and storage nodes according to actual needs, is highly scalable, and integrates functions such as a Web UI, resource control, system monitoring, resource scheduling and task management.
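The three-layer division above can be summarized as a small lookup table. The sketch below is purely illustrative and not from the patent: the profile labels, the task-kind names and the chooser function are all my own assumptions, used only to make the layer/resource pairing concrete.

```python
# Illustrative sketch: resource profiles per layer as described in the
# text (medium/high/low CPU+memory demand, low/high storage demand),
# plus a toy router assigning work to the layer responsible for it.
# All names here are hypothetical, not defined by the patent.

LAYER_PROFILES = {
    "application": {"cpu_mem": "medium", "storage": "low"},
    "computation": {"cpu_mem": "high", "storage": "low"},
    "storage": {"cpu_mem": "low", "storage": "high"},
}

def layer_for_task(task_kind: str) -> str:
    """Route a task to the layer the text assigns it to."""
    routing = {
        "login": "application",
        "monitoring": "application",
        "task_submission": "application",
        "training_job": "computation",
        "bulk_storage": "storage",
    }
    return routing[task_kind]
```

For example, a submitted training job lands on the computation layer, whose profile calls for high CPU and memory but little local storage.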
Referring to Fig. 3: an application node provides a Web UI through which users can conveniently perform task management, system monitoring, resource management and other administration. As the outermost layer of the all-in-one machine, the application node is exposed to user operation. Specifically, an application node provides the following functional interfaces: application management (application submission / application deletion / application status query), data storage and query (structured storage interface / unstructured storage interface), file management (copy / paste / upload / download / create / move / delete), resource monitoring (GPU / CPU / memory / network / disk / others), and administration (resource management / role management / user management / warranty management / node administration).
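The functional interfaces above suggest a service API, though the patent defines none. The sketch below is purely hypothetical: every route string is my own illustrative naming choice, meant only to show how the listed interface groups might map onto endpoints.

```python
# Hypothetical sketch: grouping the application node's functional
# interfaces into an endpoint table. None of these routes appear in the
# patent; they are invented here for illustration only.

ENDPOINTS = {
    "application_management": [
        "POST /apps",            # application submission
        "DELETE /apps/{id}",     # application deletion
        "GET /apps/{id}/status", # application status query
    ],
    "file_management": [
        "POST /files", "GET /files/{path}", "DELETE /files/{path}",
    ],
    "resource_monitoring": [
        "GET /metrics/gpu", "GET /metrics/cpu", "GET /metrics/disk",
    ],
}

def routes_for(interface: str) -> list:
    """Look up the hypothetical routes behind one functional interface."""
    return ENDPOINTS.get(interface, [])
```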
A compute node is optimized for heavy consumption of computing resources and uses a specially designed software stack:
a. Support for multiple programming languages, such as Python, R, Java and Scala;
b. APIs for machine learning and deep learning, alongside some other general-purpose computing APIs;
c. The deep learning framework TensorFlow. A large number of applications are developed on TensorFlow; the all-in-one machine integrates the framework so that applications developed on it can run directly.
d. An optimized version of the distributed computing framework Spark. Spark is an efficient distributed computing system; on this basis, Spark's underlying algorithm library is optimized so that distributed tasks run faster on each compute node.
e. An optimized version of the distributed in-memory file system Alluxio to accelerate data reads and writes. Alluxio is a distributed in-memory file system that allows files to be shared reliably at memory speed within a cluster framework; it is further optimized here so that the scheduling framework can better exploit Alluxio's distributed-memory characteristics.
f. Optimized RDMA support: JXIO. RDMA (Remote Direct Memory Access) technology solves the latency of server-side data processing during network transmission. RDMA passes data over the network directly into a computer's memory area, quickly moving it from one system into a remote system's memory without any impact on the operating system, so it uses very few CPU resources. It eliminates external memory copies and context-switch operations, freeing memory bandwidth and CPU cycles to improve application performance.
Further, in keeping with the all-in-one machine's distributed architecture, the optimized computing platform can be deployed onto every compute node. Because the upper layer uses Mesos for task scheduling and resource management, the role of every compute node is identical; there is no distinction between master and worker nodes.
A storage node of the all-in-one machine is responsible for providing storage, chiefly two kinds of storage service: databases and a general-purpose file system. The databases fall into two classes, the relational database PostgreSQL and a time-series database. The distributed cluster scheme uses HAWQ for PostgreSQL and OpenTSDB+HBase for the time-series database. Referring to Fig. 3, the file system likewise uses a distributed structure, a mixed HDFS+Ceph structure: the HAWQ bottom layer uses HDFS, and the other components store on Ceph.
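The backend assignments above can be captured in a small routing function. This is a sketch under stated assumptions: the function name, the data-kind labels and the `hawq_managed` flag are illustrative inventions; only the backend mapping itself follows the description.

```python
# Sketch: route each data kind to the backend the text names for it.
# relational data  -> PostgreSQL via HAWQ, whose bottom layer is HDFS
# time-series data -> OpenTSDB on HBase
# general files    -> HDFS when HAWQ-managed, otherwise Ceph
# The labels and the hawq_managed flag are assumptions for illustration.

def storage_backend(data_kind: str, hawq_managed: bool = False) -> str:
    if data_kind == "relational":
        return "HAWQ (PostgreSQL) on HDFS"
    if data_kind == "time_series":
        return "OpenTSDB + HBase"
    if data_kind == "file":
        return "HDFS" if hawq_managed else "Ceph"
    raise ValueError(f"unknown data kind: {data_kind}")
```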
Accordingly, the upper-layer data-management tools can automatically choose, from a file's storage mode, whether the data lives on HDFS or on Ceph. The database software layer is deployed in the computing cluster and the file-system software in the storage cluster, positioning each cluster's data-management functions as follows:
1) Application cluster:
provides the unified data-access interface;
provides import/export interfaces for large-scale data;
deploys the databases' management-software clients;
deploys database-status monitoring tools.
2) Computing cluster:
deploys the database management systems;
provides SQL/REST API interfaces.
3) Storage cluster:
uses the mixed HDFS and Ceph distributed file systems;
supports block storage and object storage.
Step 2: build the network architecture. It takes one of two forms, a single-rack networking topology or a multi-rack networking topology, and is logically divided into an external network, a management network, a computation network and a storage network.
External network: connects to the users' switches and exposes the all-in-one machine's services to the outside. The external network interface uses ordinary 1 Gbps Ethernet; external-network NICs are deployed only on application nodes.
Management network: used to monitor and manage every node of the all-in-one machine and to submit computing tasks to the compute nodes. These tasks demand little network bandwidth or latency, and to avoid disturbing the computation and storage networks, the management network uses ordinary 1 Gbps Ethernet independent of them; a management-network NIC must be deployed on every node.
Computation network: connects the compute nodes; its latency requirements are very high (the high-end version uses InfiniBand NICs).
Storage network: connects the storage nodes; its bandwidth and latency requirements are very high, so high-bandwidth, low-latency 56 Gbps InfiniBand (the standard version can use 10 Gbps RoCE NICs) is used and the storage network and computation network are merged into one network. InfiniBand NICs are deployed on all compute and storage nodes; since application nodes may also need to access storage-node data, deploying InfiniBand NICs on the application nodes can also be considered.
The all-in-one machine is installed in units of racks; each rack can contain several application nodes, compute nodes and storage nodes. Each rack is equipped with one Ethernet switch (management network) and one InfiniBand switch (computation/storage network), and the port count of each switch should be no less than the total number of nodes in the rack. If multiple racks are to be connected, each switch must also reserve some ports for links to the other racks. As for external-network switches, application nodes are relatively few, so several racks may share one switch. Networking multiple racks requires extra core switches to connect them: the management-network switches of the racks can converge onto a single core switch in a simple tree, while the InfiniBand computation/storage network needs multiple core switches arranged in a fat tree to guarantee a full-bandwidth path between any two nodes.
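The port-count rule above (at least one port per node, plus reserved uplinks when racks are networked together) can be sketched as a small sizing helper. This is illustrative only: the helper name and the single "uplinks per peer rack" parameter are my assumptions; the patent just says "a certain port number" is reserved.

```python
# Sketch of the in-rack switch sizing rule: one port per node in the
# rack, plus reserved ports to reach each of the other racks. Applies
# equally to the management (Ethernet) switch and the computation/
# storage (InfiniBand) switch. The uplink parameter is an assumption.

def min_switch_ports(nodes_in_rack: int, total_racks: int,
                     uplinks_per_peer_rack: int = 1) -> int:
    """Minimum port count for one in-rack switch."""
    reserved = (total_racks - 1) * uplinks_per_peer_rack
    return nodes_in_rack + reserved
```

With a single rack the reserved term vanishes and the rule reduces to "port count >= total node number in the rack", matching the single-rack topology.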
Step 3: the all-in-one machine's deployment is designed to give it good system scalability, broadly divided into performance scaling and storage scaling.
The all-in-one machine uses a horizontally scalable architecture: adding compute nodes to the computation layer increases the overall computing resources (GPU/CPU/memory) and thereby raises the application running speed. When the number of compute nodes grows sharply, the network may become the bottleneck; to keep computing resources, storage and network in balance, an appropriate number of network switches must be added to remove it. Because the all-in-one machine's layered architecture separates the data-storage units from the computation-processing units, storage capacity can be scaled out directly and very conveniently whenever large amounts of storage are needed.
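Because storage is decoupled from computation, usable capacity grows linearly with the number of storage nodes. A minimal sketch of that scaling arithmetic follows; the replication factor is my assumption (3 is a common default for HDFS-style distributed storage), not a figure from the patent.

```python
# Sketch: horizontal storage scaling under the layered architecture.
# Raw capacity is nodes * per-node capacity; usable capacity divides by
# the replication factor. The factor of 3 is an assumed default, not
# specified by the patent.

def usable_capacity_tb(storage_nodes: int, tb_per_node: float,
                       replication: int = 3) -> float:
    """Usable capacity (TB) after replication when scaling out storage."""
    return storage_nodes * tb_per_node / replication
```

Doubling the storage nodes doubles usable capacity without touching the computation layer, which is the convenience the text claims for the layered design.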
The basic principles, principal features and advantages of the present invention have been shown and described above. Those skilled in the art should understand that the present invention is in no way limited to the above embodiments; all technical schemes obtained by equivalent substitution or equivalent transformation fall within the protection scope of the present invention.

Claims (10)

1. A deployment method for a machine learning and artificial intelligence application all-in-one machine, characterized by comprising the following steps:
Step 1: separate data storage from data processing, build the overall system architecture on a highly scalable Shared-Nothing architecture, and logically divide the system architecture into an application layer, a computation layer and a storage layer, each of which uses a distributed architecture;
Step 2: build the network architecture, which takes the form of either a single-rack networking topology or a multi-rack networking topology and is logically divided into an external network, a management network, a computation network and a storage network;
Step 3: optimize the design of the system's scalability.
2. The deployment method for a machine learning and artificial intelligence application all-in-one machine according to claim 1, characterized in that in step 1 the application layer configures a varying number of application nodes according to actual needs, the computation layer configures a varying number of compute nodes according to actual needs, and the storage layer configures a varying number of storage nodes according to actual needs.
3. The machine learning and artificial intelligence application all-in-one machine deployment method according to claim 2, characterized in that the compute nodes are configured with the following software stack:
a. support for multiple programming languages;
b. APIs for machine learning and deep learning;
c. the integrated deep learning framework TensorFlow;
d. the integrated and optimized distributed computing framework Spark;
e. the integrated and optimized distributed in-memory file system Alluxio, to accelerate data reads and writes;
f. integrated and optimized RDMA features.
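As a checkable summary rather than executable integration code, the stack of claim 3 can be written as a manifest; the dictionary keys and the example language list are assumptions for illustration only.

```python
COMPUTE_NODE_STACK = {
    "languages": ["Python", "Scala", "Java"],          # (a) examples assumed
    "ml_api": "machine learning / deep learning API",  # (b)
    "dl_framework": "TensorFlow",                      # (c)
    "compute_framework": "Spark (optimized)",          # (d)
    "memory_fs": "Alluxio (optimized)",                # (e) accelerates I/O
    "network": "RDMA (optimized)",                     # (f)
}

def stack_complete(stack: dict) -> bool:
    """Check that all six components (a)-(f) are configured."""
    required = {"languages", "ml_api", "dl_framework",
                "compute_framework", "memory_fs", "network"}
    return required <= stack.keys()

assert stack_complete(COMPUTE_NODE_STACK)
```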
4. The machine learning and artificial intelligence application all-in-one machine deployment method according to claim 2, characterized in that the storage nodes provide two kinds of storage services: databases and a universal file system; the databases comprise the relational database PostgreSQL and a time-series database, the relational database PostgreSQL using the HAWQ distributed architecture and the time-series database using the OpenTSDB + HBase distributed architecture; the universal file system uses a hybrid HDFS + Ceph structure, and the HAWQ bottom layer uses HDFS.
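A minimal routing sketch of the storage services in claim 4, assuming a hypothetical `storage_backend` helper; the mapping strings merely restate the claim.

```python
def storage_backend(data_kind: str) -> str:
    """Route a workload to the storage service described in claim 4."""
    backends = {
        "relational": "PostgreSQL on the HAWQ distributed architecture (HDFS-backed)",
        "time_series": "OpenTSDB + HBase",
        "file": "HDFS + Ceph hybrid",
    }
    return backends[data_kind]

assert storage_backend("time_series") == "OpenTSDB + HBase"
assert "Ceph" in storage_backend("file")
```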
5. The machine learning and artificial intelligence application all-in-one machine deployment method according to claim 2, characterized in that an external-network NIC and a management-network NIC are deployed on the application nodes, a management-network NIC and a compute/storage-network NIC are deployed on the compute nodes, and a management-network NIC and a compute/storage-network NIC are deployed on the storage nodes.
6. The machine learning and artificial intelligence application all-in-one machine deployment method according to any one of claims 1 to 5, characterized in that the single-frame networking topology in step 2 comprises one frame, and the construction method is:
equipping one Ethernet switch, the port count of the Ethernet switch being greater than or equal to the total number of nodes in the frame;
equipping one compute/storage network switch, the port count of the compute/storage network switch being greater than or equal to the total number of nodes in the frame;
equipping one external network switch.
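The port-count conditions of the single-frame topology can be verified mechanically; the function below is an illustrative sketch, not claim language, and the numbers are hypothetical.

```python
def single_frame_plan_ok(total_nodes: int,
                         ethernet_ports: int,
                         compute_storage_ports: int,
                         external_switches: int = 1) -> bool:
    """Single-frame conditions: both per-frame switches need at least
    one port per node, plus one external network switch."""
    return (ethernet_ports >= total_nodes
            and compute_storage_ports >= total_nodes
            and external_switches >= 1)

assert single_frame_plan_ok(20, 24, 24)      # 24-port switches cover 20 nodes
assert not single_frame_plan_ok(30, 24, 36)  # Ethernet switch too small
```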
7. The machine learning and artificial intelligence application all-in-one machine deployment method according to any one of claims 1 to 5, characterized in that the multi-frame networking topology in step 2 comprises multiple frames, and the construction method is:
equipping each frame with one Ethernet switch, the port count of the Ethernet switch being greater than the total number of nodes in the frame, with ports reserved for connecting other frames;
equipping each frame with one compute/storage network switch, the port count of the compute/storage network switch being greater than the total number of nodes in the frame, with ports reserved for connecting other frames;
equipping an appropriate number of external network switches;
equipping core switches, the management network switch of each frame being connected to the core switches in a simple tree topology.
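In the multi-frame case the inequality becomes strict, since each frame's switch must also reserve uplink ports toward other frames; a sketch with hypothetical numbers:

```python
def frame_switch_ok(frame_nodes: int, switch_ports: int,
                    reserved_uplinks: int) -> bool:
    """Per-frame multi-frame condition: strictly more ports than nodes,
    with the surplus reserved for inter-frame links."""
    return (reserved_uplinks >= 1
            and switch_ports >= frame_nodes + reserved_uplinks)

assert frame_switch_ok(frame_nodes=20, switch_ports=24, reserved_uplinks=2)
# A fully occupied 24-port switch leaves no inter-frame uplinks:
assert not frame_switch_ok(frame_nodes=24, switch_ports=24, reserved_uplinks=2)
```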
8. The machine learning and artificial intelligence application all-in-one machine deployment method according to claim 7, characterized in that the compute/storage network switches are InfiniBand switches, and the InfiniBand switch of each frame connects to multiple core switches to form a fat-tree structure.
9. The machine learning and artificial intelligence application all-in-one machine deployment method according to any one of claims 2 to 5, characterized in that the specific steps of optimizing the design for the scalability of the system in step 1 are:
using the scale-out architecture to improve performance;
using the layered architecture to increase storage capacity.
10. The machine learning and artificial intelligence application all-in-one machine deployment method according to claim 9, characterized in that the step of using the scale-out architecture to improve performance is:
increasing the number of compute nodes in the computation layer;
adding an appropriate number of network switches.
CN201710881113.0A 2017-09-26 2017-09-26 A machine learning and artificial intelligence application all-in-one machine deployment method Pending CN107733696A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710881113.0A CN107733696A (en) 2017-09-26 2017-09-26 A machine learning and artificial intelligence application all-in-one machine deployment method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710881113.0A CN107733696A (en) 2017-09-26 2017-09-26 A machine learning and artificial intelligence application all-in-one machine deployment method

Publications (1)

Publication Number Publication Date
CN107733696A true CN107733696A (en) 2018-02-23

Family

ID=61206966

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710881113.0A Pending CN107733696A (en) A machine learning and artificial intelligence application all-in-one machine deployment method

Country Status (1)

Country Link
CN (1) CN107733696A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104767813A (en) * 2015-04-08 2015-07-08 江苏国盾科技实业有限责任公司 Public bank big data service platform based on openstack
CN105450553A (en) * 2014-09-24 2016-03-30 英特尔公司 Mechanism for management controllers to learn the control plane hierarchy in a data center environment
US20160162611A1 (en) * 2014-12-08 2016-06-09 Tata Consultancy Services Limited Modeling and simulation of infrastructure architecture for big data
CN106790367A (en) * 2016-11-15 2017-05-31 山东省科学院自动化研究所 The vehicle safety hidden danger early warning of big data treatment and accident reproduction system and method

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200117664A1 (en) * 2018-10-15 2020-04-16 Ocient Inc. Generation of a query plan in a database system
US11977545B2 (en) * 2018-10-15 2024-05-07 Oclient Inc. Generation of an optimized query plan in a database system
CN109460827A (en) * 2018-11-01 2019-03-12 郑州云海信息技术有限公司 A kind of deep learning environment is built and optimization method and system
CN109600440A (en) * 2018-12-13 2019-04-09 国网河北省电力有限公司石家庄供电分公司 A kind of electric power sale big data processing method
WO2020147601A1 (en) * 2019-01-16 2020-07-23 阿里巴巴集团控股有限公司 Graph learning system
CN111444309A (en) * 2019-01-16 2020-07-24 阿里巴巴集团控股有限公司 System for learning graph
CN111444309B (en) * 2019-01-16 2023-04-14 阿里巴巴集团控股有限公司 System for learning graph
CN110765201A (en) * 2019-09-16 2020-02-07 北京许继电气有限公司 Method and system for processing time series data under PostgreSQL database construction
CN113315794A (en) * 2020-02-26 2021-08-27 宝山钢铁股份有限公司 Hardware architecture of computing system network for online intelligent analysis of blast furnace production
CN114661637A (en) * 2022-02-28 2022-06-24 中国科学院上海天文台 Data processing system and method for radio astronomical data intensive scientific operation

Similar Documents

Publication Publication Date Title
CN107733696A (en) A kind of machine learning and artificial intelligence application all-in-one dispositions method
CN107391719A (en) Distributed stream data processing method and system in a kind of cloud environment
CN105183834B (en) A kind of traffic big data semantic applications method of servicing based on ontology library
CN111400326B (en) Smart city data management system and method thereof
CN103246749B (en) The matrix database system and its querying method that Based on Distributed calculates
CN106790718A (en) Service call link analysis method and system
CN106339509A (en) Power grid operation data sharing system based on large data technology
CN110222005A (en) Data processing system and its method for isomery framework
CN106815338A (en) A kind of real-time storage of big data, treatment and inquiry system
CN107679192A (en) More cluster synergistic data processing method, system, storage medium and equipment
CN104111996A (en) Health insurance outpatient clinic big data extraction system and method based on hadoop platform
CN105468702A (en) Large-scale RDF data association path discovery method
CN107077364A (en) The compiling of the program specification based on figure of the automatic cluster of figure component is used based on the identification that specific FPDP is connected
CN104268695A (en) Multi-center watershed water environment distributed cluster management system and method
CN107104894A (en) Controller in network control system
CN103177094B (en) Cleaning method of data of internet of things
CN106951552A (en) A kind of user behavior data processing method based on Hadoop
CN108595473A (en) A kind of big data application platform based on cloud computing
CN105469204A (en) Reassembling manufacturing enterprise integrated evaluation system based on deeply integrated big data analysis technology
CN104539730B (en) Towards the load-balancing method of video in a kind of HDFS
CN106453618A (en) Remote sensing image processing service cloud platform system based on G-Cloud cloud computing
CN107343021A (en) A kind of Log Administration System based on big data applied in state's net cloud
CN110347636A (en) Data execute body and its data processing method
CN108108466A (en) A kind of distributed system journal query analysis method and device
CN103646051A (en) Big-data parallel processing system and method based on column storage

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 4th floor, Building 5, No. 180 Software Avenue, Yuhuatai District, Nanjing, Jiangsu, 210000

Applicant after: Nanjing Tian Zhi Zhi Technology Co., Ltd.

Address before: 4th floor, Building 5, No. 180 Software Avenue, Yuhuatai District, Nanjing, Jiangsu, 210000

Applicant before: Nanjing days Mdt InfoTech Ltd

CB02 Change of applicant information

Address after: 201100 no.1628, sushao Road, Minhang District, Shanghai

Applicant after: Shanghai Tiantian smart core semiconductor Co., Ltd

Address before: 4th floor, No. 180 Software Avenue, Yuhuatai District, Nanjing, Jiangsu, 210000

Applicant before: ILUVATAR COREX Inc.

RJ01 Rejection of invention patent application after publication

Application publication date: 20180223