CN103177414A - Structure-based dependency graph node similarity concurrent computation method - Google Patents


Info

Publication number: CN103177414A (application number CN201310102281.7A; granted as CN103177414B)
Authority: CN (China)
Prior art keywords: matrix, similarity, GPU, node, CPU
Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other versions: CN103177414B (in Chinese)
Inventors: 冯伟, 万亮, 谭志羽, 鲁志超, 江健民
Current assignee: Shenzhen kanghongtai Technology Co.,Ltd. (the listed assignees may be inaccurate)
Original assignee: Tianjin University (the application was filed by Tianjin University)
Priority date: 2013-03-27; filing date: 2013-03-27
Current legal status: Active

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a structure-based parallel computation method for graph-node similarity. The method comprises the following steps: the CPU end, acting as the host, reads a number of story texts or images and builds a graph model to obtain the adjacency matrix of the graph; the GPU end, acting as the device, receives the adjacency matrix from the CPU end and computes on it; the GPU end then obtains the result and transmits it back to the CPU end. With this parallel method, similarity computation is greatly accelerated while high accuracy is preserved, meeting the efficiency and precision requirements of large-scale media computation and its applications. Experimental results show that, at comparable accuracy, the proposed acceleration algorithm achieves an average speed-up of more than 100x over the existing algorithm.

Description

A structure-based parallel computation method for graph-node similarity
Technical field
The present invention relates to the field of media computation, and in particular to a structure-based parallel computation method for graph-node similarity.
Background technology
At present, in the field of media computation, problems such as image segmentation, content retrieval, and matching are solved by designing a graph model and diffusing similarity between nodes to obtain the corresponding result. Put simply, graph-node similarity computation is a means of evaluating the structural similarity between nodes in a graph (for example, superpixels).
In the prior art, descriptors are usually used to measure the similarity between two nodes, and similarity is then diffused along the similarity relations and adjacency relations between neighboring nodes.
In realizing the present invention, the inventors found that the prior art has at least the following shortcomings and defects:
As the scale of the graph grows, the running time of similarity diffusion increases greatly and the computational complexity rises as well, reaching up to O(kn^4), which cannot satisfy the needs of practical applications.
Summary of the invention
The invention provides a structure-based parallel computation method for graph-node similarity. The method reduces computational complexity and running time and satisfies the needs of practical applications, as described in detail below.
A structure-based parallel computation method for graph-node similarity comprises the following steps:
(1) the CPU end, as the host, reads a number of story texts or images, builds a graph model, and obtains the adjacency matrix;
(2) the GPU end, as the device, receives the adjacency matrix transmitted by the CPU end and computes on it;
(3) the GPU end obtains the adjacency matrix and transfers it to the CPU end.
When the CPU end reads a number of story texts as the host, the step in which the GPU end computes the adjacency matrix W is specifically: the GPU end computes the first adjacency matrix, namely:
1) from the location indices of nodes a and b in the first adjacency matrix, compute the block index and thread index in the grid corresponding to the node pair (a, b), where the grid is the grid of the GPU kernel function, a block is a thread block in the grid, and a thread is a thread in a thread block;
2) the GPU end assigns one thread to the similarity computation of each node pair (a, b) in the first adjacency matrix; that is, the thread corresponding to (a, b) is located by its block index and thread index, and that thread computes the similarity of the pair (a, b) on the GPU.
When the CPU end reads a number of images as the host, the step in which the GPU end computes the adjacency matrix W is specifically: the GPU end computes the second adjacency matrix, comprising:
1) search the location matrix P_{k-1} of the (k-1)-th iteration for nonzero values, recording the row index, column index, and value of each nonzero element into three arrays: row, col, and value;
2) compute the location matrix P_k of the k-th iteration from P_{k-1};
3) compute the sum of the diagonal elements;
4) add the M_k obtained in this iteration into S(a, b): S(a, b) = S(a, b) + M_k.
When the CPU end reads a number of images as the host, the method further comprises:
the CPU end obtains the transition matrix T, and the GPU end, as the device, receives the transition matrix T.
When the CPU end reads a number of images as the host, the method may further comprise: the transition matrix T is stored as a sparse matrix in compressed row storage, and the step in which the GPU end computes the adjacency matrix W is specifically: the GPU end computes the second adjacency matrix, comprising:
1) the CPU end loops K times, each time calling the GPU kernel function to compute similarity in parallel;
2) the GPU end passes the result back to the CPU end;
3) the GPU end computes the diagonal sum M_k of the location matrix P_k:
M_k = Σ_{i=1..n} P_k[i, i]
4) the GPU end adds into the similarity matrix S the value S(a, b) of the corresponding element: S(a, b) = S(a, b) + M_k.
The K calls from the CPU loop to the GPU kernel function specifically comprise:
A) compute the index x of a nonzero value in T_i;
B) compute the index y of a nonzero value in T_j;
C) compute the similarity for the corresponding indices;
D) compute the position of the node pair (a, b) in the location matrix;
E) update the location matrix P_k:
P_k(a, b) = P_k(a, b) + s
The beneficial effect of the technical scheme provided by the invention is the following: the CPU end, as the host, reads a number of story texts or images, builds a graph model, and obtains the adjacency matrix; the GPU end computes the adjacency matrix and transfers it back to the CPU end. The method improves the precision of graph-node similarity computation, reduces computational complexity and running time, and satisfies the needs of practical applications. Experimental results show that, at comparable accuracy, the proposed acceleration algorithm achieves an average speed-up of more than 100x.
Description of the drawings
Fig. 1(a) and Fig. 1(b) are the original images;
Fig. 1(c) and Fig. 1(d) show the saliency detection results computed by the present method;
Fig. 1(e) and Fig. 1(f) show the saliency detection results computed by the prior art;
Fig. 2 is a flow chart of the structure-based parallel computation method for graph-node similarity;
Fig. 3 is another flow chart of the method;
Fig. 4 is a further flow chart of the method.
Embodiments
To make the purpose, technical solution, and advantages of the present invention clearer, the embodiments of the invention are described in further detail below with reference to the drawings.
The CUDA programming model distinguishes two roles: host and device. The computer's CPU is commonly called the host and its GPU the device; a typical system has one host and one or more devices.
When programming with the CUDA model, tasks can be assigned to the host end and the device end separately. The host is responsible for program logic, transaction handling, and computation suited to serial execution, while the device carries out thread-level tasks suited to high parallelism. The CPU and GPU have separate memory address spaces: the host's main memory and the device's video memory. CUDA manipulates main memory with the same syntax and functions as ordinary C, whereas operating on video memory requires the memory-management functions of the CUDA API, which cover allocating, freeing, and initializing video memory space and copying data between main memory and video memory. After analyzing which parts of a program are parallelizable, those parallel tasks can be offloaded to the GPU for computation.
To reduce computational complexity and running time and satisfy the needs of practical applications, the embodiments of the present invention provide a structure-based parallel computation method for graph-node similarity, comprising the following steps:
Embodiment 1
101: the CPU end, as the host, reads a number of story texts, builds a graph model, and obtains the first adjacency matrix W of the graph;
Each node in the graph represents a word in a story, and each edge represents the similarity between the two nodes it connects. Edges are established in the graph model between words within a story and between words of different stories according to the first similarity measurement rule, yielding the first adjacency matrix W of the graph.
The first similarity measurement rule is set according to the needs of the application. For example: the similarity between two words within a story is determined jointly by the frequency of word A, the frequency of word B, the frequency of their co-occurrence, and the number of times the distance between the two words is below a preset value; the similarity between words of different stories is determined by the frequencies of word A and word B. The preset value is set according to the needs of the application.
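As an illustration of such a measurement rule, the following serial Python sketch builds a word-graph adjacency matrix from word frequencies and co-occurrence counts. The patent leaves the exact formula open, so the normalized-co-occurrence weight used here is an assumption of this sketch, not the patent's rule:

```python
from collections import Counter
from itertools import combinations

def build_word_adjacency(stories):
    """Build a word co-occurrence adjacency matrix W for a list of
    tokenized stories. Each node is a word; the edge weight between two
    words grows with how often they appear in the same story.
    (Illustrative rule only; the patent leaves the exact measure open.)"""
    vocab = sorted({w for story in stories for w in story})
    index = {w: i for i, w in enumerate(vocab)}
    n = len(vocab)
    W = [[0.0] * n for _ in range(n)]
    freq = Counter(w for story in stories for w in story)
    cooc = Counter()
    for story in stories:
        for a, b in combinations(sorted(set(story)), 2):
            cooc[(a, b)] += 1
    for (a, b), c in cooc.items():
        # normalized co-occurrence: joint count over the geometric mean
        # of the two word frequencies (an assumed normalization)
        w = c / (freq[a] * freq[b]) ** 0.5
        W[index[a]][index[b]] = W[index[b]][index[a]] = w
    return vocab, W
```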
102: the GPU end, as the device, receives the first adjacency matrix W transmitted by the CPU end, and the GPU end computes the first adjacency matrix W;
The first adjacency matrix W is determined by the similarities of all node pairs in the matrix. This step specifically comprises:
1) assign computation tasks to GPU threads: from the location indices of nodes a and b in the first adjacency matrix W, compute the block index and thread index in the grid corresponding to the node pair (a, b), where the grid is the grid of the GPU kernel function, a block is a thread block in the grid, and a thread is a thread in a thread block;
2) the GPU end assigns one thread to the similarity computation of each node pair (a, b) in the first adjacency matrix W; that is, the thread corresponding to (a, b) is located by its block index and thread index, and that thread computes the similarity of the pair (a, b) on the GPU:
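The index mapping of step 1) can be sketched serially as follows. The row-major flattening to a*n + b and the block size of 256 are illustrative assumptions; the patent does not fix the grid geometry:

```python
def pair_to_thread(a, b, n, threads_per_block=256):
    """Map node pair (a, b) of an n-node graph to a (block, thread)
    index pair by flattening the pair to row-major position a*n + b.
    The block size 256 is an illustrative choice, not the patent's."""
    flat = a * n + b
    return flat // threads_per_block, flat % threads_per_block

def thread_to_pair(block, thread, n, threads_per_block=256):
    """Inverse mapping: recover the node pair a thread is responsible for."""
    flat = block * threads_per_block + thread
    return flat // n, flat % n
```

On the GPU the inverse mapping is what each thread performs from its built-in block and thread indices to find "its" pair.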
For a given directed graph G, let s(a, b) denote the similarity between node a and node b. The SimRank similarity of the two nodes is defined as follows:
If a = b, then s(a, b) = 1; if a ≠ b, s(a, b) is computed as:
s(a, b) = (C / (|I(a)|·|I(b)|)) · Σ_{i=1..|I(a)|} Σ_{j=1..|I(b)|} s(I_i(a), I_j(b))
where C is a constant coefficient between 0 and 1; |I(a)| and |I(b)| are the numbers of in-neighbors of a and b respectively; I_i(a) denotes the i-th in-neighbor of a, and I_j(b) the j-th in-neighbor of b.
In this all-pairs algorithm, the similarities of all node pairs are computed simultaneously for use in the next iteration. Let R_k(a, b) denote the SimRank similarity of the pair (a, b) at the k-th iteration; then s(a, b) = lim_{k→∞} R_k(a, b).
R_k(a, b) is computed iteratively. At initialization, R_0(a, b) = 0 when a ≠ b and R_0(a, b) = 1 otherwise; the iteration then proceeds by:
R_{k+1}(a, b) = (C / (|I(a)|·|I(b)|)) · Σ_{i=1..|I(a)|} Σ_{j=1..|I(b)|} R_k(I_i(a), I_j(b))
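The iteration above can be sketched serially in Python. The patent assigns the update of each pair (a, b) to one GPU thread; here the pairs are simply processed in a loop:

```python
def simrank(in_neighbors, C=0.5, iterations=7):
    """Naive SimRank iteration. in_neighbors[v] lists the in-neighbors
    of node v; C is the decay constant in (0, 1). Returns the matrix R
    of pairwise similarities after the given number of iterations."""
    n = len(in_neighbors)
    R = [[1.0 if a == b else 0.0 for b in range(n)] for a in range(n)]
    for _ in range(iterations):
        Rn = [[1.0 if a == b else 0.0 for b in range(n)] for a in range(n)]
        for a in range(n):
            for b in range(n):
                # pairs with a node lacking in-neighbors keep similarity 0
                if a == b or not in_neighbors[a] or not in_neighbors[b]:
                    continue
                s = sum(R[i][j] for i in in_neighbors[a] for j in in_neighbors[b])
                Rn[a][b] = C * s / (len(in_neighbors[a]) * len(in_neighbors[b]))
        R = Rn
    return R
```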
103: the GPU end obtains the first adjacency matrix W and transfers it to the CPU end.
Embodiment 2
201: the CPU end, as the host, reads a number of images, builds a graph model, and obtains the second adjacency matrix W and the transition matrix T of the graph;
Each node in the graph represents a superpixel in an image, and each edge represents the similarity between the two nodes it connects. Edges are established in the graph model between superpixels within an image and between superpixels of different images according to the second similarity measurement rule; the second adjacency matrix W is then computed, the transition matrix T is computed from W, and the algorithm parameters, the constant decay factor C and the error err, are read in.
The second similarity measurement rule is set according to the needs of the application. For example: a distance descriptor is computed for each superpixel; the pairwise similarity between superpixels within an image is computed from their distance descriptors, and likewise between superpixels of different images.
In a concrete implementation, the CPU end also needs to:
1) initialize the similarity matrix and the location matrix: the similarity matrix is initialized to zero, s(a, b) = 0; the location matrix is initialized to 1.
This relies on a property of the algorithm: the similarity between nodes a and b equals the probability that two random walkers, setting out from a and b respectively and walking at random on the reversed directed graph, meet for the first time. P_k denotes the location matrix after k steps for walkers a and b that have not yet met; its element [i, j] is the probability that after k steps walkers a and b arrive at nodes i and j respectively.
2) compute the number of iterations K: K = (int) log_Δ C,
where C denotes the decay factor, C = 0.5, and Δ denotes the error, Δ = 0.01.
202: the GPU end, as the device, receives the second adjacency matrix W and the transition matrix T transmitted by the CPU end, and the GPU end computes the second adjacency matrix W;
This step specifically comprises:
1) search the matrix P_{k-1} (the location matrix of the (k-1)-th iteration) for nonzero values, recording the row index, column index, and value of each nonzero element into three arrays: row, col, and value;
While scanning P_{k-1}, whenever a nonzero element is met, its row number is stored into the array row, its column number into the array col, and its value into the array value.
2) compute the location matrix P_k of the k-th iteration from P_{k-1}, by the formula:
P_k = C · Σ_{i=1..|V|} Σ_{j=1..|V|, j≠i} (P_{k-1})_{ij} · (T_i′ T_j)
where T_i and T_j are row vectors of the transition matrix T, T_i′ is the transpose of T_i, and |V| is the number of nodes.
In a concrete implementation, the corresponding kernel function works as follows: first, compute from the built-in variables the row and column coordinates to be computed in the location matrix; then initialize the location matrix P_k, setting P_k[a, b] = 0; finally, loop n times (n being the number of graph nodes) computing:
P_k[a, b] = P_k[a, b] + T[i, a] * T[j, b] * value[m]
where T[i, a] is the element in row i, column a of the transition matrix T; T[j, b] is the element in row j, column b; and value[m] is the corresponding nonzero value taken from P_{k-1}.
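The kernel's element-wise accumulation can be mirrored by a serial Python sketch over dense matrices. This is illustrative only; in the patent's scheme each (a, b) cell of P_k would be handled by one GPU thread:

```python
def update_position_matrix(P_prev, T, C=0.5):
    """One iteration P_k = C * sum over i != j of P_{k-1}[i][j] * outer(T_i, T_j),
    written element-wise: P_k[a][b] += C * T[i][a] * T[j][b] * P_{k-1}[i][j].
    P_prev and T are dense n x n lists of lists."""
    n = len(T)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or P_prev[i][j] == 0.0:
                continue  # the formula excludes the diagonal; zeros contribute nothing
            v = P_prev[i][j]
            for a in range(n):
                tia = T[i][a]
                if tia == 0.0:
                    continue
                for b in range(n):
                    P[a][b] += C * tia * T[j][b] * v
    return P
```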
3) compute the sum of the diagonal elements:
M_k = Σ_{i=1..n} P_k[i, i]
4) add the M_k obtained in this iteration into S(a, b): S(a, b) = S(a, b) + M_k.
203: the GPU end obtains the second adjacency matrix W and transfers it to the CPU end.
Embodiment 3
When the graph is large and sparse, the method can also compute the second adjacency matrix via step 302 below to improve speed.
301: the CPU end, as the host, reads a number of images, builds a graph model, obtains the second adjacency matrix W and the transition matrix T of the graph, and stores T as a sparse matrix in CRS (Compressed Row Storage) format;
In this storage format, three vectors occupy contiguous memory: the array val stores the nonzero matrix elements in row-major order, the array col stores the column index of each element of val, and the vector rowptr stores, for each row, the position in val at which that row begins.
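A minimal Python sketch of this CRS layout, using the val/col/rowptr names from the text:

```python
def to_crs(M):
    """Convert a dense matrix (list of lists) to CRS arrays (val, col, rowptr):
    val holds the nonzeros row by row, col their column indices, and
    rowptr[i] the offset in val where row i begins."""
    val, col, rowptr = [], [], [0]
    for row in M:
        for j, x in enumerate(row):
            if x != 0:
                val.append(x)
                col.append(j)
        rowptr.append(len(val))
    return val, col, rowptr

def crs_get(val, col, rowptr, i, j):
    """Read element (i, j) back from the CRS arrays."""
    for k in range(rowptr[i], rowptr[i + 1]):
        if col[k] == j:
            return val[k]
    return 0
```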
Each node in the graph represents a superpixel in an image, and each edge represents the similarity between the two nodes it connects. Edges are established in the graph model between superpixels within an image and between superpixels of different images according to the second similarity measurement rule; the second adjacency matrix W is then computed, the transition matrix T is computed from W, and the algorithm parameters, the constant C and the error err, are read in.
The second similarity measurement rule is set according to the needs of the application. For example: a distance descriptor is computed for each superpixel; the pairwise similarity between superpixels within an image is computed from their distance descriptors, and likewise between superpixels of different images.
In a concrete implementation, the CPU end also needs to:
1) initialize the similarity matrix and the location matrix: the similarity matrix is initialized to zero, S(a, b) = 0; the location matrix is initialized to 1.
As before, this relies on a property of the algorithm: the similarity between nodes a and b equals the probability that two random walkers, setting out from a and b respectively and walking at random on the reversed directed graph, meet for the first time. P_k denotes the location matrix after k steps for walkers a and b that have not yet met; its element [i, j] is the probability that after k steps walkers a and b arrive at nodes i and j respectively.
2) compute the number of iterations K: K = (int) log_Δ C.
302: the GPU end, as the device, receives the second adjacency matrix W and the transition matrix T transmitted by the CPU end, and the GPU end computes the second adjacency matrix W;
This step specifically comprises:
1) the CPU end loops K times, each time calling the GPU kernel function to compute similarity in parallel;
For each pair of corresponding nonzero elements of the transition matrix T, the kernel function is called and one thread is started; after computing from the corresponding nonzero values, the thread adds its result into the corresponding position of the location matrix P_k. This mainly comprises the following steps:
A) compute the index x of a nonzero value in T_i: x indicates that this thread uses the x-th nonzero value of T_i.
B) compute the index y of a nonzero value in T_j: y indicates that this thread uses the y-th nonzero value of T_j.
C) compute the similarity for the corresponding indices: fetch the values of T_i and T_j addressed by x and y and compute; denote the resulting similarity s.
D) compute the position of the node pair (a, b) in the location matrix: from the indices x and y, compute the index in the location matrix at which s should be inserted;
Here the transition matrix T is the result of column-normalizing the adjacency matrix of the reversed directed graph.
E) update the location matrix P_k:
P_k(a, b) = P_k(a, b) + s
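Steps A) through E) can be sketched serially against the CRS arrays of T; each (x, y) pairing in the inner loops below would be one GPU thread in the patent's scheme. Variable names beyond val/col/rowptr are this sketch's own:

```python
def sparse_update(row, col, value, Tval, Tcol, Trowptr, n, C=0.5):
    """Sparse version of the P_k update. (row, col, value) hold the
    nonzeros of P_{k-1}; (Tval, Tcol, Trowptr) store T in CRS form.
    For each nonzero P_{k-1}[i][j], every pairing of a nonzero in row
    T_i with a nonzero in row T_j contributes to one cell of P_k."""
    P = [[0.0] * n for _ in range(n)]
    for i, j, v in zip(row, col, value):
        if i == j:
            continue  # the formula excludes the diagonal
        for x in range(Trowptr[i], Trowptr[i + 1]):      # step A: index into T_i
            for y in range(Trowptr[j], Trowptr[j + 1]):  # step B: index into T_j
                s = C * Tval[x] * Tval[y] * v            # step C: similarity term
                a, b = Tcol[x], Tcol[y]                  # step D: target position
                P[a][b] += s                             # step E: update P_k
    return P
```

On a 2-node example this reproduces the dense formula's result.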
2) the GPU end passes the result back to the CPU end: the matrix P_k is transferred to the CPU end;
3) the GPU end computes the diagonal sum M_k of the location matrix P_k:
M_k = Σ_{i=1..n} P_k[i, i]
4) the GPU end adds into the similarity matrix S the value S(a, b) of the corresponding element: S(a, b) = S(a, b) + M_k.
303: the GPU end obtains the second adjacency matrix W and transfers it to the CPU end.
The feasibility of the method is verified below with concrete experiments, described in detail hereinafter.
1. Story texts
Five story texts were chosen, each corresponding to a topic id: 20001, 20015, 20039, 20070, and 20076. The five texts were processed with both the prior art and the present method to obtain the corresponding adjacency matrices and, from them, the similarity of each story text; the average intra-topic similarity, the average inter-topic similarity, and the average running time were then computed, as shown in Table 1 and Table 2 respectively.
Table 1
(table reproduced as an image in the original)
Table 2
(table reproduced as an image in the original)
Comparing Table 1 and Table 2 shows that the ratios of average intra-topic similarity to average inter-topic similarity obtained by the prior art and by the present method differ little, while the average running time of the present method is far smaller than that of the prior art.
2. Cooperative multi-image saliency detection
Fig. 1(a) and Fig. 1(b) were processed with both the prior art and the present method. Fig. 1(c) and Fig. 1(d) show the saliency detection results of the present algorithm, and Fig. 1(e) and Fig. 1(f) those of the original serial algorithm. As the figures show, while the saliency results differ little, the running time of the present method is far smaller than that of the prior art; see Table 3.
Table 3
                    Running time
Serial SimRank      1.98 minutes
Parallel SimRank    1.27 minutes
Those skilled in the art will appreciate that the drawings are schematic diagrams of a preferred embodiment, and that the serial numbers of the above embodiments are for description only and do not indicate their relative merit.
The above are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (6)

1. A structure-based parallel computation method for graph-node similarity, characterized in that the method comprises the following steps:
(1) the CPU end, as the host, reads a number of story texts or images, builds a graph model, and obtains the adjacency matrix;
(2) the GPU end, as the device, receives the adjacency matrix transmitted by the CPU end and computes on it;
(3) the GPU end obtains the adjacency matrix and transfers it to the CPU end.
2. The structure-based parallel computation method for graph-node similarity according to claim 1, characterized in that, when the CPU end reads a number of story texts as the host, the step in which the GPU end computes the adjacency matrix W is specifically: the GPU end computes the first adjacency matrix, namely:
1) from the location indices of nodes a and b in the first adjacency matrix, compute the block index and thread index in the grid corresponding to the node pair (a, b), where the grid is the grid of the GPU kernel function, a block is a thread block in the grid, and a thread is a thread in a thread block;
2) the GPU end assigns one thread to the similarity computation of each node pair (a, b) in the first adjacency matrix; that is, the thread corresponding to (a, b) is located by its block index and thread index, and that thread computes the similarity of the pair (a, b) on the GPU.
3. The structure-based parallel computation method for graph-node similarity according to claim 2, characterized in that, when the CPU end reads a number of images as the host, the step in which the GPU end computes the adjacency matrix W is specifically: the GPU end computes the second adjacency matrix, comprising:
1) search the location matrix P_{k-1} of the (k-1)-th iteration for nonzero values, recording the row index, column index, and value of each nonzero element into three arrays: row, col, and value;
2) compute the location matrix P_k of the k-th iteration from P_{k-1};
3) compute the sum of the diagonal elements;
4) add the M_k obtained in this iteration into S(a, b): S(a, b) = S(a, b) + M_k.
4. The structure-based parallel computation method for graph-node similarity according to claim 1, characterized in that, when the CPU end reads a number of images as the host, the method further comprises:
the CPU end obtains the transition matrix T, and the GPU end, as the device, receives the transition matrix T.
5. The structure-based parallel computation method for graph-node similarity according to claim 4, characterized in that, when the CPU end reads a number of images as the host, the method further comprises storing the transition matrix T as a sparse matrix in compressed row storage, and the step in which the GPU end computes the adjacency matrix W is specifically: the GPU end computes the second adjacency matrix, comprising:
1) the CPU end loops K times, each time calling the GPU kernel function to compute similarity in parallel;
2) the GPU end passes the result back to the CPU end;
3) the GPU end computes the diagonal sum M_k of the location matrix P_k:
M_k = Σ_{i=1..n} P_k[i, i]
4) the GPU end adds into the similarity matrix S the value S(a, b) of the corresponding element: S(a, b) = S(a, b) + M_k.
6. The structure-based parallel computation method for graph-node similarity according to claim 5, characterized in that the K calls from the CPU loop to the GPU kernel function specifically comprise:
A) compute the index x of a nonzero value in T_i;
B) compute the index y of a nonzero value in T_j;
C) compute the similarity for the corresponding indices;
D) compute the position of the node pair (a, b) in the location matrix;
E) update the location matrix P_k:
P_k(a, b) = P_k(a, b) + s
CN201310102281.7A 2013-03-27 2013-03-27 A structure-based parallel computation method for graph-node similarity Active CN103177414B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310102281.7A CN103177414B (en) 2013-03-27 2013-03-27 A structure-based parallel computation method for graph-node similarity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310102281.7A CN103177414B (en) 2013-03-27 2013-03-27 A structure-based parallel computation method for graph-node similarity

Publications (2)

Publication Number Publication Date
CN103177414A true CN103177414A (en) 2013-06-26
CN103177414B CN103177414B (en) 2015-12-09

Family

ID=48637247

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310102281.7A Active CN103177414B (en) 2013-03-27 2013-03-27 A structure-based parallel computation method for graph-node similarity

Country Status (1)

Country Link
CN (1) CN103177414B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103427844A (en) * 2013-07-26 2013-12-04 华中科技大学 High-speed lossless data compression method based on GPU-CPU hybrid platform
CN104360985A (en) * 2014-10-20 2015-02-18 浪潮电子信息产业股份有限公司 Method and device for realizing clustering algorithm based on MIC
WO2016138836A1 (en) * 2015-03-03 2016-09-09 华为技术有限公司 Similarity measurement method and equipment
CN106204669A (en) * 2016-07-05 2016-12-07 电子科技大学 A parallel compressed-sensing method for images based on a GPU platform
CN106202224A (en) * 2016-06-29 2016-12-07 北京百度网讯科技有限公司 Search processing method and device
WO2018149299A1 (en) * 2017-02-20 2018-08-23 平安科技(深圳)有限公司 Method of identifying social insurance fraud, device, apparatus, and computer storage medium
CN110263209A (en) * 2019-06-27 2019-09-20 北京百度网讯科技有限公司 Method and apparatus for generating information
CN110851987A (en) * 2019-11-14 2020-02-28 上汽通用五菱汽车股份有限公司 Method, apparatus and storage medium for predicting calculated duration based on acceleration ratio
CN111078957A (en) * 2019-12-18 2020-04-28 无锡恒鼎超级计算中心有限公司 Storage method based on graph storage structure
CN111860588A (en) * 2020-06-12 2020-10-30 华为技术有限公司 Training method for graph neural network and related equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050231521A1 (en) * 2004-04-16 2005-10-20 John Harper System for reducing the number of programs necessary to render an image
CN102436545A (en) * 2011-10-13 2012-05-02 苏州东方楷模医药科技有限公司 Diversity analysis method based on chemical structure with CPU (Central Processing Unit) acceleration


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
K.A. Hawick et al.: "Parallel graph component labelling with GPUs and CUDA", Parallel Computing *
Pawan Harish et al.: "Accelerating Large Graph Algorithms on the GPU Using CUDA", High Performance Computing – HiPC 2007, Lecture Notes in Computer Science *
Yongpeng Zhang et al.: "Large-Scale Multi-Dimensional Document Clustering on GPU Clusters", Parallel & Distributed Processing (IPDPS), 2010 IEEE International Symposium on *
Zhang Cong et al.: "Research on GPU-based parallel acceleration of mathematical morphology operations", Electronic Design Engineering (电子设计工程) *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103427844A (en) * 2013-07-26 2013-12-04 华中科技大学 High-speed lossless data compression method based on GPU-CPU hybrid platform
CN103427844B (en) * 2013-07-26 2016-03-02 华中科技大学 High-speed lossless data compression method based on a GPU-CPU hybrid platform
CN104360985A (en) * 2014-10-20 2015-02-18 浪潮电子信息产业股份有限公司 Method and device for realizing clustering algorithm based on MIC
CN105989154B (en) * 2015-03-03 2020-07-14 华为技术有限公司 Similarity measurement method and equipment
US10579703B2 (en) 2015-03-03 2020-03-03 Huawei Technologies Co., Ltd. Similarity measurement method and device
CN105989154A (en) * 2015-03-03 2016-10-05 华为技术有限公司 Similarity measurement method and equipment
WO2016138836A1 (en) * 2015-03-03 2016-09-09 华为技术有限公司 Similarity measurement method and equipment
CN106202224A (en) * 2016-06-29 2016-12-07 北京百度网讯科技有限公司 Search processing method and device
CN106202224B (en) * 2016-06-29 2022-01-07 北京百度网讯科技有限公司 Search processing method and device
CN106204669A (en) * 2016-07-05 2016-12-07 电子科技大学 Parallel image compressed sensing method based on a GPU platform
WO2018149299A1 (en) * 2017-02-20 2018-08-23 平安科技(深圳)有限公司 Method of identifying social insurance fraud, device, apparatus, and computer storage medium
CN110263209B (en) * 2019-06-27 2021-07-09 北京百度网讯科技有限公司 Method and apparatus for generating information
CN110263209A (en) * 2019-06-27 2019-09-20 北京百度网讯科技有限公司 Method and apparatus for generating information
CN110851987A (en) * 2019-11-14 2020-02-28 上汽通用五菱汽车股份有限公司 Method, apparatus and storage medium for predicting calculated duration based on acceleration ratio
CN110851987B (en) * 2019-11-14 2022-09-09 上汽通用五菱汽车股份有限公司 Method, apparatus and storage medium for predicting calculated duration based on acceleration ratio
CN111078957A (en) * 2019-12-18 2020-04-28 无锡恒鼎超级计算中心有限公司 Storage method based on graph storage structure
CN111860588A (en) * 2020-06-12 2020-10-30 华为技术有限公司 Training method for graph neural network and related equipment
CN111860588B (en) * 2020-06-12 2024-06-21 华为技术有限公司 Training method for graph neural network and related equipment

Also Published As

Publication number Publication date
CN103177414B (en) 2015-12-09

Similar Documents

Publication Publication Date Title
CN103177414A (en) Structure-based dependency graph node similarity concurrent computation method
US8400458B2 (en) Method and system for blocking data on a GPU
CN108921188B (en) Parallel CRF method based on Spark big data platform
CN108170639B (en) Tensor CP decomposition implementation method based on distributed environment
US20180121388A1 (en) Symmetric block sparse matrix-vector multiplication
CN108140061B (en) Method, storage medium, and system for determining co-occurrence in graph
CN105739951A (en) GPU-based L1 minimization problem fast solving method
CN106709503A (en) Large spatial data clustering algorithm K-DBSCAN based on density
CN106202224B (en) Search processing method and device
US11037356B2 (en) System and method for executing non-graphical algorithms on a GPU (graphics processing unit)
CN113435521A (en) Neural network model training method and device and computer readable storage medium
CN114138231B (en) Method, circuit and SOC for executing matrix multiplication operation
CN109460398A (en) Completion method and apparatus for time series data, and electronic device
CN106484532B (en) GPGPU parallel computing method for SPH fluid simulation
CN110264392B (en) Strong connection graph detection method based on multiple GPUs
CN114492753A (en) Sparse accelerator applied to on-chip training
CN104572588A (en) Matrix inversion processing method and device
CN117093538A (en) Sparse Cholesky decomposition hardware acceleration system and solving method thereof
CN104156268B (en) Load distribution and thread structure optimization method for MapReduce on GPU
CN106844024A (en) GPU/CPU scheduling method and system based on a self-learning runtime prediction model
CN110059813A (en) Method, apparatus and device for updating a convolutional neural network using a GPU cluster
DE102023105572A1 (en) Efficient matrix multiplication and addition with a group of warps
US20220129755A1 (en) Incorporating a ternary matrix into a neural network
CN114116208A (en) Short wave radiation transmission mode three-dimensional acceleration method based on GPU
US9600446B2 (en) Parallel multicolor incomplete LU factorization preconditioning processor and method of use thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220112

Address after: 518000 room 4009a, No. 38, Liuxian Third Road, district 72, Bao'an District, Shenzhen, Guangdong Province (office space)

Patentee after: Shenzhen kanghongtai Technology Co.,Ltd.

Address before: 300072 Tianjin City, Nankai District Wei Jin Road No. 92

Patentee before: Tianjin University

TR01 Transfer of patent right