FR3055438B1 - Compilation for node device GPU-based parallel processing - Google Patents
Compilation for node device GPU-based parallel processing
- Publication number
- FR3055438B1
- Authority
- FR
- France
- Prior art keywords
- gpu
- task routine
- task
- node device
- compilation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/3004—Arrangements for executing specific machine instructions to perform operations on memory
- G06F9/30043—LOAD or STORE instructions; Clear instruction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5066—Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5044—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
- G06F8/456—Parallelism detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30145—Instruction analysis, e.g. decoding, instruction word fields
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30181—Instruction operation extension or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3887—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/448—Execution paradigms, e.g. implementations of programming paradigms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5055—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering software capabilities, i.e. software resources associated or available to the machine
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5072—Grid computing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0654—Management of faults, events, alarms or notifications using network fault recovery
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1004—Server selection for load balancing
- H04L67/1008—Server selection for load balancing based on parameters of servers, e.g. available memory or workload
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/561—Adding application-functional data or data for application control, e.g. adding metadata
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/509—Offload
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Mathematical Physics (AREA)
- Library & Information Science (AREA)
- Debugging And Monitoring (AREA)
- Devices For Executing Special Programs (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Multi Processors (AREA)
- Advance Control (AREA)
- Stored Programmes (AREA)
Abstract
An apparatus may include a processor and a memory to store instructions that cause the processor to perform operations including: in response to a determination that a GPU of a node device is available, determining whether a task routine can be compiled to generate a GPU task routine for execution by the GPU to cause performance of multiple instances of a task of the task routine at least partially in parallel without dependencies among them; and in response to a determination that the task routine can be compiled to generate the GPU task routine: employing a conversion rule to convert the task routine into the GPU task routine; compiling the GPU task routine for execution by the GPU; and assigning performance of the task with a data set partition to the node device to enable performance of the multiple instances with the data set partition by the GPU.
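The decision flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: every name here (`NodeDevice`, `TaskRoutine`, `Assignment`, `assign_task`, the `serial_for`/`parallel_for` markers) is hypothetical, the dependency check is reduced to a boolean flag, the conversion rule is a toy text substitution, and the compile step is a placeholder string. The CPU fallback is also an assumption; the abstract only describes the GPU path.

```python
from dataclasses import dataclass


@dataclass
class NodeDevice:
    name: str
    gpu_available: bool


@dataclass
class TaskRoutine:
    name: str
    source: str
    # Hypothetical flag: True if one instance of the task needs another's output.
    instances_depend_on_each_other: bool


@dataclass
class Assignment:
    node: str
    partition: str
    target: str   # "gpu" or "cpu"
    routine: str  # routine text that would be executed on the target


def can_compile_for_gpu(routine: TaskRoutine) -> bool:
    # Stand-in for the analysis step: instances must be independent.
    return not routine.instances_depend_on_each_other


def apply_conversion_rule(routine: TaskRoutine) -> str:
    # Toy conversion rule: swap a serial loop marker for a parallel one.
    return routine.source.replace("serial_for", "parallel_for")


def compile_for_gpu(gpu_source: str) -> str:
    # Placeholder for a real GPU compiler invocation.
    return f"gpu_binary({gpu_source})"


def assign_task(node: NodeDevice, routine: TaskRoutine, partition: str) -> Assignment:
    # GPU path: only when a GPU is available and the routine is convertible.
    if node.gpu_available and can_compile_for_gpu(routine):
        gpu_routine = compile_for_gpu(apply_conversion_rule(routine))
        return Assignment(node.name, partition, "gpu", gpu_routine)
    # Assumed fallback (not stated in the abstract): run the original routine on the CPU.
    return Assignment(node.name, partition, "cpu", routine.source)
```

For example, `assign_task(NodeDevice("node-1", True), TaskRoutine("sum", "serial_for row: total(row)", False), "partition-0")` takes the GPU path, while the same call with a routine whose instances depend on each other falls back to the CPU.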
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201662379512P | 2016-08-25 | 2016-08-25 | |
US62379512 | 2016-08-25 | | |
US201662394411P | 2016-09-14 | 2016-09-14 | |
US62394411 | 2016-09-14 | | |
US15/422,285 US9760376B1 (en) | 2016-02-01 | 2017-02-01 | Compilation for node device GPU-based parallel processing |
Publications (2)
Publication Number | Publication Date |
---|---|
FR3055438A1 (fr) | 2018-03-02 |
FR3055438B1 (fr) | 2022-07-29 |
Family
ID=59778869
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR1757193A Active FR3055438B1 (fr) | 2016-08-25 | 2017-07-28 | Compilation for node device GPU-based parallel processing |
Country Status (9)
Country | Link |
---|---|
CN (1) | CN107783782B (fr) |
BE (1) | BE1025002B1 (fr) |
CA (1) | CA2974556C (fr) |
DE (1) | DE102017213160B4 (fr) |
DK (1) | DK179709B1 (fr) |
FR (1) | FR3055438B1 (fr) |
GB (1) | GB2553424B (fr) |
HK (1) | HK1245439B (fr) |
NO (1) | NO343250B1 (fr) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111327921A (zh) * | 2018-12-17 | 2020-06-23 | 深圳市炜博科技有限公司 | Video data processing method and device |
CN109743453B (zh) * | 2018-12-29 | 2021-01-05 | 出门问问信息科技有限公司 | Split-screen display method and device |
CN110163791B (zh) * | 2019-05-21 | 2020-04-17 | 中科驭数(北京)科技有限公司 | GPU processing method and device for a data computation flow graph |
CN111984322B (zh) * | 2020-09-07 | 2023-03-24 | 北京航天数据股份有限公司 | Control instruction transmission method and device |
CN112783506B (zh) * | 2021-01-29 | 2022-09-30 | 展讯通信(上海)有限公司 | Model running method and related device |
CN118227384A (zh) * | 2024-05-24 | 2024-06-21 | 北京蓝耘科技股份有限公司 | Local-area GPU data sharing method, *** and storage medium |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8134561B2 (en) * | 2004-04-16 | 2012-03-13 | Apple Inc. | System for optimizing graphics operations |
US8549500B2 (en) * | 2007-02-14 | 2013-10-01 | The Mathworks, Inc. | Saving and loading graphical processing unit (GPU) arrays providing high computational capabilities in a computing environment |
US8938723B1 (en) * | 2009-08-03 | 2015-01-20 | Parallels IP Holdings GmbH | Use of GPU for support and acceleration of virtual machines and virtual environments |
US8310492B2 (en) * | 2009-09-03 | 2012-11-13 | Ati Technologies Ulc | Hardware-based scheduling of GPU work |
US8769510B2 (en) * | 2010-04-08 | 2014-07-01 | The Mathworks, Inc. | Identification and translation of program code executable by a graphical processing unit (GPU) |
DE102013208418A1 (de) * | 2012-05-09 | 2013-11-14 | Nvidia Corp. | Method and system for separate compilation of device code embedded in host code |
US9152601B2 (en) * | 2013-05-09 | 2015-10-06 | Advanced Micro Devices, Inc. | Power-efficient nested map-reduce execution on a cloud of heterogeneous accelerated processing units |
EP2887219A1 (fr) * | 2013-12-23 | 2015-06-24 | Deutsche Telekom AG | System and method for scheduling mobile augmented reality tasks |
US9632761B2 (en) * | 2014-01-13 | 2017-04-25 | Red Hat, Inc. | Distribute workload of an application to a graphics processing unit |
US9235871B2 (en) * | 2014-02-06 | 2016-01-12 | Oxide Interactive, LLC | Method and system of a command buffer between a CPU and GPU |
-
2017
- 2017-07-26 CA CA2974556A patent/CA2974556C/fr active Active
- 2017-07-27 BE BE2017/5528A patent/BE1025002B1/fr active IP Right Grant
- 2017-07-28 GB GB1712171.6A patent/GB2553424B/en active Active
- 2017-07-28 FR FR1757193A patent/FR3055438B1/fr active Active
- 2017-07-31 DE DE102017213160.8A patent/DE102017213160B4/de active Active
- 2017-08-01 CN CN201710647374.6A patent/CN107783782B/zh active Active
- 2017-08-01 DK DKPA201770596A patent/DK179709B1/en active IP Right Grant
- 2017-08-01 NO NO20171277A patent/NO343250B1/en unknown
-
2018
- 2018-04-04 HK HK18104475.6A patent/HK1245439B/zh unknown
Also Published As
Publication number | Publication date |
---|---|
DK201770596A1 (en) | 2018-03-12 |
BE1025002A1 (fr) | 2018-09-14 |
FR3055438A1 (fr) | 2018-03-02 |
GB201712171D0 (en) | 2017-09-13 |
CA2974556C (fr) | 2018-06-05 |
DE102017213160A1 (de) | 2018-03-01 |
BE1025002B1 (fr) | 2018-09-17 |
GB2553424B (en) | 2018-11-21 |
NO20171277A1 (en) | 2018-02-26 |
DK179709B1 (en) | 2019-04-09 |
NO343250B1 (en) | 2018-12-27 |
DE102017213160B4 (de) | 2023-05-25 |
CN107783782A (zh) | 2018-03-09 |
HK1245439B (zh) | 2019-12-06 |
CN107783782B (zh) | 2019-03-15 |
GB2553424A (en) | 2018-03-07 |
CA2974556A1 (fr) | 2018-02-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
FR3055438B1 (fr) | | Compilation for node device GPU-based parallel processing |
US10606654B2 (en) | | Data processing method and apparatus |
BR112018075358A8 (pt) | | Hierarchical selection process |
BR112018071814A2 (pt) | | A method and system for controlling the performance of a contract using a distributed hash table and a peer-to-peer distributed ledger |
BR112018003372A2 (pt) | | Method for providing staged shaving recommendations, computer program executable on a processing unit, personal care system, and shaving apparatus |
EP2778907A3 (fr) | | Parallelizing loops in the presence of possible memory aliases |
AR051014A1 (es) | | System and method for migrating a product in multiple languages |
MA38014B1 (fr) | | Modal workload scheduling in a heterogeneous multi-processor system on a chip |
JP2010204979A5 (ja) | | Compilation method |
WO2015015225A3 (fr) | | Software development tool |
GB2479479A (en) | | Methods and system for document reconstruction |
BR112019013067B8 (pt) | | Method and device for service processing |
BR112015031100A2 (pt) | | Automated generation of written and manual test cases |
NO20171576A1 (en) | | Enhancing oilfield operations with cognitive computing |
MX2019001134A (es) | | Systems and methods for the detection of chemiluminescent reactions |
BR112015023786A2 (pt) | | Non-deterministic disambiguation and matching of business local data |
ATE429673T1 (de) | | Dynamic BIOS execution and concurrent update for a blade server |
Podobas et al. | | Evaluating high-level design strategies on FPGAs for high-performance computing |
Dreuning et al. | | A beginner’s guide to estimating and improving performance portability |
RU2014125439A (ru) | | Method for processing a search query and server |
FR3028974B1 (fr) | | Methods and systems for generating performance test scenarios for a server application |
Malyshkin et al. | | Control flow usage to improve performance of fragmented programs execution |
Taylor | | High Performance Computing of Hydrologic Models Using HTCondor |
ES2548033T3 (es) | | Method for operating a dosing device |
Lu et al. | | Efficient utilization of launched threads on GPUs: The spherical harmonic transform as a case study |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PLFP | Fee payment | Year of fee payment: 2 |
| | PLFP | Fee payment | Year of fee payment: 4 |
| | PLFP | Fee payment | Year of fee payment: 5 |
| | PLSC | Publication of the preliminary search report | Effective date: 20211203 |
| | PLFP | Fee payment | Year of fee payment: 6 |
| | PLFP | Fee payment | Year of fee payment: 7 |