CN111027619B - Memristor array-based K-means classifier and classification method thereof - Google Patents

Memristor array-based K-means classifier and classification method thereof Download PDF

Info

Publication number
CN111027619B
CN111027619B CN201911248887.5A CN201911248887A
Authority
CN
China
Prior art keywords
data
memristor array
classified
clustering center
memristor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911248887.5A
Other languages
Chinese (zh)
Other versions
CN111027619A (en)
Inventor
李祎
周厚继
陈佳
缪向水
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN201911248887.5A priority Critical patent/CN111027619B/en
Publication of CN111027619A publication Critical patent/CN111027619A/en
Application granted granted Critical
Publication of CN111027619B publication Critical patent/CN111027619B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133Distances to prototypes
    • G06F18/24137Distances to cluster centroïds

Abstract

The invention discloses a memristor array-based K-means classifier and a classification method thereof. The dimension information of the clustering centers of the K-means algorithm is used as the training weights and is mapped and stored in a memristor array, so that the neural network weights represent the dimension information of the clustering centers. The calculation of the Euclidean distance is realized based on the gradual conductance characteristic of the memristor, and the online update of each weight of the clustering centers is realized directly in the hardware circuit. Data clustering of a large amount of non-normalized data is thereby realized on the basis of a hardware circuit, reducing the computational complexity caused by data normalization and the circuit complexity caused by changing the computing weights in an external circuit; at the same time, the data complexity of the distance calculation is reduced, the data storage time and operating power consumption are reduced, the data interaction overhead is saved, and the calculation time is shorter.

Description

Memristor array-based K-means classifier and classification method thereof
Technical Field
The invention belongs to the technical field of artificial neural networks, and particularly relates to a memristor array-based K-means classifier and a classification method thereof.
Background
With the advent of the network age, the emergence of large amounts of data has made it increasingly difficult to classify data and extract the effective features within it. Data classification is the process of algorithmically identifying and grouping together data points that have identical or similar characteristics. The core of classification is to obtain the features of different sample data and calculate their generalized distance (or similarity) so as to distinguish different samples. As the data volume increases, the computation required by classification algorithms grows geometrically, which places higher demands on the data computation and processing capacity of the computing system's CPU. The "von Neumann bottleneck" of the existing computing architecture greatly limits data classification capability in a big-data environment. Memristors, with their efficient compute-in-memory and parallel computing capabilities, are considered one of the best candidates to break the "von Neumann bottleneck" limitation.
The K-means algorithm is a basic unsupervised clustering algorithm with notable advantages such as fast convergence, simple operation and few tunable parameters, and can effectively handle data clustering problems in a big-data environment. In existing applications, the problems in memristor-array-based data classification research are mainly reflected in the following: (1) applications of memristor arrays are built on networked structures; existing classification implemented with memristor arrays is mainly concentrated on complex network algorithms such as BP (back propagation) neural networks and multilayer perceptrons, whose network structures are realized by combining software and hardware, so classification cannot be realized by the memristor hardware structure alone, and research on non-networked clustering algorithms such as K-means is still at a preliminary stage. (2) In the traditional K-means algorithm, the clustering center is updated by a memoryless mean value, with no continuity between successive updates of the clustering center; it cannot be effectively combined with the weight-update mode of a neural network, online updating of the weights and complete expression of the Euclidean distance cannot be realized, and the computational complexity is high. (3) The K-means algorithm relies on calculating the Euclidean distance between the clustering centers and sample data to realize clustering; the traditional hardware-based Euclidean distance calculation method cannot compute the squared terms of the input data, so the error of the data classification result is large and the accuracy is low.
In summary, providing a K-means classifier with low computational complexity and high accuracy, and a classification method thereof, is an urgent problem to be solved.
Disclosure of Invention
In view of the above defects or improvement requirements of the prior art, the invention provides a memristor array-based K-means classifier and a classification method thereof, and aims to solve the problem of high computational complexity caused by the fact that online updating of weights and complete expression of Euclidean distances cannot be realized on hardware in the prior art.
In order to achieve the above object, in a first aspect, the present invention provides a memristor array-based K-means classifier, including a first control module, a memristor array, a second control module, a data comparison module, and an output module;
the memristor array comprises a first memristor array, a second memristor array, a third memristor array and a fourth memristor array, each bit line of the first memristor array is connected with each bit line of the fourth memristor array, each bit line of the second memristor array is connected with each bit line of the third memristor array, each word line of the first memristor array is connected with each word line of the second memristor array, and each word line of the third memristor array is connected with each word line of the fourth memristor array;
the first control module is used for randomly selecting a clustering center from an input data set to be classified, respectively storing the clustering center into a first memristor array and a second memristor array after being subjected to writing voltage coding, and respectively storing data to be classified in the data set to be classified into a third memristor array and a fourth memristor array after being subjected to writing voltage coding; after reading voltage coding is carried out on the data to be classified and the opposite numbers of the weights of the clustering center, the data to be classified and the opposite numbers of the weights of the clustering center are respectively applied to bit lines of the second memristor array and the first memristor array, wherein the information of each dimension of the clustering center is the weight;
the memristor array is used for realizing dot product operation between the data to be classified after the read voltage coding input by the first control module and the opposite number of each weight and the self-stored data on the clustering center and the row where the data to be classified are located, accumulating the obtained result according to the row and outputting the accumulated result to the second control module;
the second control module is used for subtracting the calculation results of the row where the data to be classified and the clustering center input by the memristor array are located to obtain the Euclidean distance between the clustering center and the data to be classified, and outputting the Euclidean distance to the data comparison module;
the data comparison module is used for dividing the data to be classified into the class where the clustering center closest to the data to be classified is located, and outputting the classification result to the second control module and the output module respectively;
the second control module is also used for determining the row where the clustering center to be updated is located according to the classification result input by the data comparison module, and respectively outputting the data to be classified in the memristor array and the row where the clustering center to be updated is located after reading voltage coding is carried out on the preset learning rate and the opposite number of the preset learning rate;
the memristor array is also used for realizing the dot product operation between the preset learning rate and the inverse number thereof input by the second control module and the self-stored data on the row where the to-be-classified data and the to-be-updated clustering center are respectively located, accumulating the obtained results according to columns to obtain each weight change value, and outputting the weight change value to the first control module;
the first control module is also used for respectively outputting each weight change value input by the memristor array to a memristor array bit line after being subjected to write coding;
the memristor array is also used for updating the weight of the clustering center to be updated based on each weight change value input on the bit line of the first control module;
and the output module is used for outputting the classification result of the data to be classified input by the data comparison module when the weight of the clustering center is not changed any more.
Further preferably, the memristor array is in translational symmetry with reference to a center line.
Further preferably, the memristor array size is (k+1) × 2M, where k is the number of cluster classes and M is the dimension of sample data; the first memristor array and the second memristor array are in translational symmetry with the center line as a reference, and are each formed by k rows and M columns of memristors; the third memristor array and the fourth memristor array are in translational symmetry with the center line as a reference, and are each formed by 1 row and M columns of memristors.
In a second aspect, the invention provides a memristor array-based K-means classification method, which comprises the following steps:
s1, randomly selecting k data from the data set to be classified as an initial clustering center, and respectively storing the k data into a first memristor array and a second memristor array after writing voltage coding, wherein k is the clustering number;
s2, selecting first data in a data set to be classified as data to be classified, and storing the data to be classified into a third memristor array and a fourth memristor array after writing voltage coding;
s3, after reading voltage coding is carried out on the data to be classified and the opposite numbers of the weights of the first clustering center, the data to be classified and the opposite numbers of the weights of the first clustering center are respectively applied to bit lines of a second memristor array and a first memristor array, dot product operation between the data to be classified and the opposite numbers of the weights of the first clustering center, which are input by the first control module, and self-stored data is respectively realized on the rows where the first clustering center and the data to be classified are located, and the obtained results are accumulated according to the rows and then subtracted to obtain the Euclidean distance between the first clustering center and the data to be classified;
s4, sequentially calculating Euclidean distances between the data to be classified and the rest clustering centers according to the method in the step S3;
s5, dividing the data to be classified into the class where the clustering center closest to the data to be classified is located, and determining the row where the clustering center to be updated is located according to the classification result;
s6, respectively inputting the preset learning rate and the inverse number thereof after the reading voltage coding to the row where the data to be classified and the cluster center to be updated are located, realizing the dot product operation between the preset learning rate and the inverse number thereof input by the second control module and the self-stored data, accumulating the obtained results according to columns to obtain the change value of each weight of the cluster center to be updated, writing the obtained change value into the memristor node of the cluster center to be updated, and updating the weight;
s7, sequentially dividing the residual data in the data set to be classified into corresponding categories according to the method of the steps S2-S6;
s8, repeating the steps S2-S7 to iterate until the weight of each cluster center is not changed;
the first memristor array is connected with each bit line of the fourth memristor array, the second memristor array is connected with each bit line of the third memristor array, the first memristor array is connected with each word line of the second memristor array, the third memristor array is connected with each word line of the fourth memristor array, and each dimension information of the clustering center is the weight.
Further preferably, after the data is written into the memristor by the writing voltage coding, the conductance value of the memristor is linearly related to the actual size of the data.
Further preferably, a first cluster center in the first memristor array is selected, a read voltage encoded coefficient-1 is applied to a bit line of the first cluster center, and the opposite number of each weight of the first cluster center is obtained.
Further preferably, the euclidean distance is determined by an amount of charge accumulation due to an output current on the memristor row, and the amount of accumulated charge is proportional to the euclidean distance.
Further preferably, the cluster center to be updated is the cluster center closest to the data to be classified.
Further preferably, the weight change value Δ W is represented by:
ΔW=η(Ui-Wp)
where η denotes the learning rate, Ui is the ith data to be classified, and Wp is the cluster center to be updated.
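For intuition, substituting ΔW into the stored weight gives the incremental update

Wp,new = Wp + ΔW = (1 − η)·Wp + η·Ui

so each assignment moves the cluster center a fraction η of the way toward the assigned sample Ui, and repeated assignments drive the stored center toward the mean of its class without ever recomputing that mean explicitly.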
In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects:
1. the invention provides a memristor array-based K-means classifier, which is characterized in that a memristor array structure is utilized, a K-means clustering center with practical significance is directly mapped and stored into an array node, the structural networking of an algorithm is realized, all dimension information of the clustering center is used as the weight of the network, the practical significance of the network weight is increased, the non-normalized input data clustering is realized, and the calculation complexity caused by data normalization is reduced; by applying the conductance value gradient characteristic of the memristor to calculation of Euclidean distance and weight updating of multi-dimensional data, the problem of high calculation complexity caused by the fact that complete expression of Euclidean distance and online updating of weight cannot be achieved on hardware in the prior art is solved.
2. The invention provides a memristor array-based K-means classification method, which is characterized in that all dimension information of a clustering center of a K-means algorithm is used as a training weight, the conductance value gradient characteristic of a memristor is applied to the calculation of Euclidean distance of multi-dimensional data, the problem of complete expression of the Euclidean distance on the memristor array is solved, the learning rate is directly applied to the memristor array through voltage coding, the online updating of the clustering center on a hardware circuit is further realized, the circuit complexity caused by the calculation weight change of an external circuit is greatly reduced, and the time and energy consumption of data interaction are saved.
3. The memristor gradient characteristic is applied to Euclidean distance simulation calculation, the method can be used for calculating the Euclidean distance between input data and a clustering center to realize K-means clustering, and the problem of similarity calculation of algorithms such as KNN (K nearest neighbor) and RBF (radial basis function) neural networks and the like in other similar algorithms in a hardware circuit can be solved.
4. According to the K-means classifier based on the memristor array, due to the high-density structure of the nanoscale memristor array and the information storage capacity of the memristor resistor, the circuit size is small, the energy consumption is lower than that of a traditional CMOS structure, the overall performance is better than that of an existing computing framework, and the K-means classifier based on the memristor array is more suitable for an edge computing scene.
Drawings
FIG. 1 is a structural schematic diagram of a memristor array-based K-means classifier provided by the invention;
FIG. 2 is a schematic diagram of a memristor array structure provided by the invention;
FIG. 3 is a flow chart of the K-means clustering algorithm provided by the present invention;
FIG. 4 is a diagram of a neural network weight mapping scheme provided by the present invention;
FIG. 5 is a schematic diagram of a memristor array based weight reading method provided by the invention;
FIG. 6 is a schematic diagram of a method for calculating Euclidean distances between data to be classified and a cluster center based on a memristor array according to the present invention;
FIG. 7 is a schematic diagram of an update method for a cluster center based on a memristor array provided by the present invention; wherein, the diagram (a) is a schematic diagram of a method for calculating a weight change value, and the diagram (b) is a process for writing the weight change value into a row to be updated.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
In order to achieve the above object, in a first aspect, the present invention provides a memristor array-based K-means classifier, as shown in fig. 1, including a first control module 1, a memristor array 2, a second control module 3, a data comparison module 4, and an output module 5;
the first control module 1 is bidirectionally connected with the memristor array 2, the memristor array 2 is bidirectionally connected with the second control module 3, the second control module 3 is bidirectionally connected with the data comparison module 4, and the data comparison module 4 is connected with the output module 5. As shown in fig. 2, the memristor array 2 includes a first memristor array 21, a second memristor array 22, a third memristor array 23 and a fourth memristor array 24; each bit line of the first memristor array 21 is connected with each bit line of the fourth memristor array 24, each bit line of the second memristor array 22 is connected with each bit line of the third memristor array 23, each word line of the first memristor array 21 is connected with each word line of the second memristor array 22, and each word line of the third memristor array 23 is connected with each word line of the fourth memristor array 24;
the first control module 1 is used for randomly selecting a clustering center from an input data set to be classified, respectively storing the clustering center into the first memristor array 21 and the second memristor array 22 after being subjected to writing voltage coding, and respectively storing data to be classified in the data set to be classified into the third memristor array 23 and the fourth memristor array 24 after being subjected to writing voltage coding; after reading voltage coding is carried out on the data to be classified and the opposite numbers of the weights of the clustering center, the data to be classified and the opposite numbers of the weights of the clustering center are respectively applied to bit lines of the second memristor array 22 and the first memristor array 21, wherein the information of each dimension of the clustering center is the weight;
the memristor array 2 is used for realizing dot product operation between the data to be classified after the read voltage coding input by the first control module 1 and the inverse number of each weight and the self-stored data on the clustering center and the row where the data to be classified are located, accumulating the obtained results according to the rows, and outputting the accumulated results to the second control module 3;
the second control module 3 is used for subtracting the calculation results of the row where the data to be classified and the clustering center input by the memristor array 2 are located to obtain the Euclidean distance between the clustering center and the data to be classified, and outputting the Euclidean distance to the data comparison module 4;
the data comparison module 4 is used for dividing the data to be classified into the class where the clustering center closest to the data to be classified is located, and outputting the classification result to the second control module 3 and the output module 5 respectively;
the second control module 3 is further configured to determine a row where the clustering center to be updated is located according to the classification result input by the data comparison module 4, and output the row where the clustering center to be updated and the data to be classified in the memristor array 2 are located respectively after reading voltage coding is performed on the preset learning rate and the inverse number thereof;
the memristor array 2 is further used for realizing dot product operation between the preset learning rate and the inverse number thereof input by the second control module 3 and self-stored data on the row where the to-be-classified data and the to-be-updated clustering center are located, accumulating the obtained results according to columns to obtain each weight change value, and outputting the weight change value to the first control module 1;
the first control module 1 is further configured to output each weight change value input by the memristor array 2 to a bit line of the memristor array 2 after being subjected to write coding;
the memristor array 2 is further used for updating the weight of the clustering center to be updated based on each weight change value input on the bit line of the first control module 1;
the output module 5 is used for outputting the classification result of the data to be classified input by the data comparison module 4 when the weight of the clustering center is not changed any more.
Specifically, the memristor array 2 has a data storage function and a data calculation function. The data storage function converts the data voltage obtained by write-voltage encoding the input data into a conductance value of a memristor node stored in the array; the data calculation function converts the read-voltage-encoded data voltage of the input data, acting on the conductance of the node, into current and accumulates the charge.
In this embodiment, for the data set to be classified S = {U1, U2, …, Ut}, each data to be classified Ui has M data dimensions, i.e. Ui = {xi1, xi2, …, xiM}. To divide the data in S into k classes, k cluster centers W = {W1, W2, …, Wk} are generated, and each cluster center, like the data to be classified, has M dimensions, i.e. Wj = {yj1, yj2, …, yjM}. As shown in fig. 2, the present embodiment employs a memristor array of (k+1) × 2M size, which is translationally symmetric about a center line: the first memristor array and the second memristor array are translationally symmetric about the center line and are each formed by k rows and M columns of memristors; the third memristor array and the fourth memristor array are translationally symmetric about the center line and are each formed by 1 row and M columns of memristors; here k is the number of cluster classes and M is the dimensionality of the classified data. The first memristor array and the second memristor array are used for storing the dynamically changing cluster centers W = {W1, W2, …, Wk}. In the memristor array, each node of the array represents one dimension of one piece of data. The M dimensions, namely the M weights, of one cluster center are stored sequentially from left to right in one row of memristor cells of the first memristor array, so the M × k memristor cells of the first memristor array can store all the weights of the k cluster centers. Similarly, the k cluster centers are stored into the second memristor array. The third memristor array and the fourth memristor array are used for storing the ith input data to be classified Ui. The first and second memristor arrays store identical data, as do the third and fourth memristor arrays, so the stored contents are translationally symmetric about the center line.
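A minimal software sketch of this storage layout is given below; the function name and variables are illustrative only (the array here is an abstract conductance matrix, not a device model), with W, U, k and M as defined above:

import numpy as np

def build_array_layout(W, U_i):
    """Arrange k cluster centers and one sample into the (k+1) x 2M layout.

    W   : (k, M) array, one cluster center per row
    U_i : (M,)  vector, the data to be classified
    Rows 0..k-1 hold one cluster center twice (first | second sub-array);
    row k holds the sample twice (fourth | third sub-array), so the two
    halves are translationally symmetric about the center line.
    """
    k, M = W.shape
    G = np.zeros((k + 1, 2 * M))
    G[:k, :M] = W      # first memristor array (k x M)
    G[:k, M:] = W      # second memristor array (k x M)
    G[k, :M] = U_i     # fourth memristor array (1 x M), shares bit lines with the first
    G[k, M:] = U_i     # third memristor array (1 x M), shares bit lines with the second
    return G

Calling build_array_layout on a (k, M) matrix of centers and an M-dimensional sample reproduces the storage pattern of fig. 2 in software.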
Specifically, the first control module 1 includes a data input unit 11, a first read-write encoding unit 12, a first buffer unit 13, and a second buffer unit 14; the second control module 3 comprises a third buffer unit 31, a second read-write encoding unit 32 and a subtraction unit 33; the output module 5 comprises an output buffer unit 51 and a result output unit 52;
wherein the output end of the data input unit 11 is connected with one end of the first read-write encoding unit 12; the other end of the first read-write encoding unit 12 is bidirectionally connected with one end of the first buffer unit 13 and one end of the second buffer unit 14 respectively; the bit lines of the first memristor array 21 and the fourth memristor array 24 are bidirectionally connected with the other end of the first buffer unit 13; the bit lines of the second memristor array 22 and the third memristor array 23 are bidirectionally connected with the other end of the second buffer unit 14; the word lines of the memristor array 2 are bidirectionally connected with one end of the third buffer unit 31; the other end of the third buffer unit 31 is bidirectionally connected with the second read-write encoding unit 32, one end of the subtraction unit 33 and one end of the data comparison module 4 respectively; the other end of the data comparison module 4 is connected with the input end of the output buffer unit 51; and the output end of the output buffer unit 51 is connected with the input end of the result output unit 52;
FIG. 3 is a flow chart of the K-means clustering algorithm, which mainly includes the data input stage (S1-S2), the distance calculation stage (S3), and the weight update stage (S4-S5).
Correspondingly, the functions of each module and unit in the K-means classifier in FIG. 1 are as follows:
Data input stage: the data input unit 11 receives the data set to be classified, selects the cluster center data and the data to be classified, and outputs them to the first read-write encoding unit 12. The first read-write encoding unit 12 encodes the cluster center data and the data to be classified input by the data input unit 11 as write voltages of fixed amplitude. The encoded cluster center data are input to the bit lines of the memristor array 2 through the first buffer unit 13 and the second buffer unit 14, so that the cluster centers are stored in the first memristor array 21 and the second memristor array 22 respectively; the write-encoded data to be classified input by the first read-write encoding unit 12 are likewise input to the bit lines of the memristor array 2, so that the data to be classified are stored in the third memristor array 23 and the fourth memristor array 24 respectively. This completes the storage of the data to be classified and of the dimension information of the cluster centers in the memristor array 2.
Distance calculation stage: with each dimension information of the cluster center as a weight, the data to be classified and the opposite numbers of the weights are read-voltage encoded and then applied, through the second buffer unit 14 and the first buffer unit 13 respectively, to the bit lines of the second memristor array 22 and the first memristor array 21. The memristor array 2 performs, on the rows where the data to be classified and the cluster center are located, the dot product operation between the input read-voltage-encoded data to be classified and opposite weight values and the self-stored data, accumulates the results by row, and outputs them through the third buffer unit 31 to the subtraction unit 33 for subtraction, obtaining the Euclidean distance between the data to be classified and each cluster center, which is output to the data comparison module 4 through the third buffer unit 31.
Weight update stage: the data comparison module 4 receives the Euclidean distances between the data to be classified and each cluster center input by the third buffer unit 31, compares them, divides the data to be classified into the class of the nearest cluster center, and outputs the classification result to the third buffer unit 31 and the output buffer unit 51, the temporary classification result being stored in the output buffer unit 51. The third buffer unit 31 determines the row of the cluster center to be updated according to the classification result input by the data comparison module 4, and outputs the read-voltage-encoded preset learning rate and its opposite number to the rows of the memristor array 2 where the data to be classified and the cluster center to be updated are located. The memristor array 2 performs, on these two rows, the dot product operation between the preset learning rate and its opposite number input by the third buffer unit 31 and the self-stored data, accumulates the results by column to obtain each weight change value, and outputs them to the first buffer unit 13 and the second buffer unit 14. The first buffer unit 13 and the second buffer unit 14 pass the weight change values to the first read-write encoding unit 12 for write encoding, after which the encoded values are applied through the first buffer unit 13 and the second buffer unit 14 to the bit lines of the memristor array 2, so that the memristor array updates the weights of the cluster center to be updated.
After each data to be classified in the data set has undergone the above process multiple times, when the category of each data in the data set no longer changes, the classification result is transferred to the result output unit 52, which outputs the final classification result.
In a second aspect, the invention provides a memristor array-based K-means classification method. The invention simplifies the K-means algorithm into a single-layer perceptron model: the inputs are all dimension information of the data to be classified, the output is the class of the data to be classified, and the training weights are all dimension information of the cluster centers. The perceptron model is realized with the memristor array, the data in the data set S are used repeatedly to train the cluster centers online, the update of all dimension information, namely the weights, is completed, and clustering is finally realized. Fig. 4 shows the neural network weight mapping method provided by the invention.
Specifically, the invention provides a memristor array-based K-means classification method, which comprises the following steps:
S1, from the data set to be classified S = {U1, U2, …, Ut}, randomly selecting k data as the initial cluster center weights W = {W1, W2, …, Wk}, which, after write voltage encoding, are stored into the first memristor array and the second memristor array respectively, where k is the number of clusters;
specifically, taking the K-means classifier provided in the first aspect of the invention as an example, the M dimensional data of the jth (j = 1, 2, …, k) cluster center are sequentially encoded by the first read-write encoding unit and, after passing through the first buffer unit and the second buffer unit, the encoded data are written into the jth row of memristor nodes of the first memristor array and of the second memristor array respectively.
S2, selecting the first data U1 in the data set to be classified as the data to be classified, which, after write voltage encoding, is stored into the third memristor array and the fourth memristor array respectively;
specifically, after data is written into a memristor by write voltage encoding, the conductance value of the memristor is linearly related to the actual value of the data, expressed as:

Gx = (X − Xmin)/(Xmax − Xmin) · (Gmax − Gmin) + Gmin

where Gx denotes the memristor conductance value after the data has been write-voltage encoded and written into the memristor, X denotes the data value, Gmax and Gmin denote the maximum and minimum conductance values respectively, and Xmax and Xmin denote the maximum and minimum values of the data. The actual data are thereby mapped to device conductances; since the encoding voltage amplitude is fixed, different numbers of voltage pulses must be applied for the data to reach a given conductance value, which is the write voltage encoding process. The write-encoding result is N = f(Gx), where the function f is the memristor pulse-conductance characteristic.
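A small numerical sketch of this linear mapping is shown below; the ranges used are illustrative values, not device parameters from the patent:

def data_to_conductance(x, x_min, x_max, g_min, g_max):
    """Linearly map a data value to a memristor conductance value."""
    return (x - x_min) / (x_max - x_min) * (g_max - g_min) + g_min

# example: map the value 5.0 from the data range [0, 10] onto an
# assumed conductance window of [1e-6 S, 1e-4 S]
g = data_to_conductance(5.0, 0.0, 10.0, 1e-6, 1e-4)   # -> 5.05e-05 S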
S3, the data to be classified U1 and the opposite numbers of the weights of the first cluster center, −W1, after read voltage encoding, are applied to the bit lines of the second memristor array and the first memristor array respectively; on the rows where the first cluster center W1 and the data to be classified U1 are located, the dot product operation between the read-voltage-encoded data to be classified and opposite weight values input by the first control module and the self-stored data is performed, and the results are accumulated by row and then subtracted, obtaining the Euclidean distance between the first cluster center and the data to be classified;
specifically, each dimension information of the cluster center is the weight. The first cluster center W1 in the first memristor array is selected, a read-voltage-encoded coefficient −1 is applied to its bit lines, and the opposite number of each weight of the first cluster center is obtained based on Ohm's law. Fig. 5 is a schematic diagram of the memristor-array-based weight reading method provided by the invention: a read voltage is applied through the third buffer unit to the row of the first cluster center W1, and by Ohm's law it acts on the conductance values of the memristors on that row to yield the opposite numbers of the weights of W1. Specifically, the read-voltage-encoded coefficient −1 is input through the third buffer unit to the row of the first cluster center W1, the first buffer unit collects the current values on each bit line of the first memristor array, the corresponding conductance values of W1 are obtained, and through the mapping relation between conductance and actual data the opposite numbers of the weights, −y11, −y12, …, −y1M, are obtained and stored in the first buffer unit.
Specifically, as shown in FIG. 6, the opposite numbers of the weights of the first cluster center (−y11, −y12, …, −y1M) and the data to be classified (x11, x12, …, x1M), each of M dimensions, are read-voltage encoded by the first read-write encoding unit; the encoded opposite weight values are input through the first buffer unit to the first memristor array and the fourth memristor array, and the encoded data to be classified are input through the second buffer unit to the second memristor array and the third memristor array. The dot product operation between the read-voltage-encoded data to be classified and opposite weight values input by the first control module and the self-stored data is thus performed, and the results are accumulated by row and then subtracted to obtain the Euclidean distance between the first cluster center and the data to be classified. The essence of this process is that the input read-voltage-encoded data voltages are converted into currents by the conductances of the nodes, and these currents represent the result of the summation after vector dot multiplication; the third buffer unit collects the charge of the row of the first cluster center W1 and of the row of the data to be classified U1, each of which represents the result of the summed vector dot products. The charge of the data row characterizes U1·U1 − U1·W1, and the charge of the row of the first cluster center W1 characterizes U1·W1 − W1·W1. These are output through the third buffer unit to the subtraction unit and subtracted, i.e.

(U1·U1 − U1·W1) − (U1·W1 − W1·W1) = ‖U1 − W1‖²,

which characterizes the Euclidean distance between the data to be classified and the cluster center W1; the obtained Euclidean distance is stored in the third buffer unit. Further, the Euclidean distance is determined by the accumulated charge brought by the current, and the accumulated charge is proportional to the Euclidean distance.
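A software sketch of this charge-based distance readout, following the row and column arrangement described above (the function name and variables are illustrative, not from the patent), is:

import numpy as np

def row_charge_distance(W_j, U):
    """Emulate the charge-difference readout for one cluster center.

    W_j : (M,) stored cluster center (rows of the first/second sub-arrays)
    U   : (M,) stored data to be classified (rows of the third/fourth sub-arrays)
    Read inputs on the bit lines are U (second/third sub-arrays) and -W_j
    (first/fourth sub-arrays); each word line accumulates a charge equal to
    the dot product of its stored row with the applied inputs.
    """
    charge_data_row = U @ U - W_j @ U        # row holding the sample:  U.U - W_j.U
    charge_center_row = U @ W_j - W_j @ W_j  # row holding the center:  U.W_j - W_j.W_j
    return charge_data_row - charge_center_row   # = ||U - W_j||**2

# example: the result equals the squared Euclidean distance
U = np.array([1.0, 2.0, 3.0])
W1 = np.array([0.0, 2.0, 5.0])
assert np.isclose(row_charge_distance(W1, U), np.sum((U - W1) ** 2))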
The method applies the gradual conductance-change characteristic of the memristor to the calculation of the Euclidean distance; it can be used not only to calculate the Euclidean distance between input data and cluster centers to realize K-means clustering, but also to solve the similarity calculation problem of algorithms such as KNN and RBF neural networks in hardware circuits.
S4, according to the method of step S3, sequentially calculating the Euclidean distances between the data to be classified and the remaining cluster centers W2, W3, …, Wk;
S5, comparing the Euclidean distances between the data to be classified and each cluster center W1, W2, …, Wk, dividing the data to be classified into the class of the cluster center closest to it, and determining the row of the cluster center to be updated according to the classification result, the cluster center to be updated being the cluster center closest to the data to be classified.
Specifically, the charges representing the Euclidean distances stored in the third buffer unit are transmitted to the data comparison module; by comparing the magnitudes of the charges, the data comparison module divides the data to be classified into the class of the nearest cluster center, feeds the classification result back to the third buffer unit and outputs it to the output buffer unit.
S6, the preset learning rate η and its opposite number −η, after read voltage encoding, are respectively input to the rows where the data to be classified and the cluster center to be updated are located; the dot product operation between the preset learning rate and its opposite number input by the second control module and the self-stored data is performed, the results are accumulated by column to obtain the change value of each weight of the cluster center to be updated, and the obtained change values are written into the memristor nodes of the cluster center to be updated to update the weights;
specifically, as shown in fig. 7(a), the voltage pulses corresponding to the read-voltage-encoded opposite learning rate −η are input through the third buffer unit to the row of the cluster center Wp closest to the data to be classified, while the voltage pulses corresponding to the read-voltage-encoded learning rate η are input through the third buffer unit to the row of the data to be classified U1. In this embodiment, the learning rate is η = 0.1, and the number of voltage pulses corresponding to the learning rate is set to 1, the minimum number of pulses. Based on Ohm's law, the current values on the row of the nearest cluster center and on the row of the data to be classified are obtained respectively, and after accumulation by column the weight change value of the nearest cluster center is calculated as ΔW = η(Ui − Wp), where η denotes the learning rate, Ui is the ith data to be classified, and Wp is the cluster center closest to the data to be classified. Then, as shown in fig. 7(b), ΔW is encoded by the write encoding unit and written into the row of the memristor array holding the cluster center Wp closest to the data to be classified, thereby realizing the update of the cluster center on the memristor array.
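A minimal sketch of this update step in software (assuming, as above, an abstract weight matrix rather than a device model; names are illustrative) is:

import numpy as np

def update_cluster_center(W, p, U_i, eta=0.1):
    """Move the p-th stored cluster center a fraction eta toward sample U_i.

    Emulates applying +eta pulses to the sample row and -eta pulses to the
    row of the nearest center, accumulating column currents to obtain
    dW = eta * (U_i - W[p]), and write-encoding dW back into row p.
    """
    dW = eta * (U_i - W[p])   # column-wise accumulated change values
    W[p] = W[p] + dW          # write the change into the center's row
    return W

# example: the center drifts toward a repeatedly assigned sample
W = np.array([[0.0, 0.0], [5.0, 5.0]])
for _ in range(50):
    W = update_cluster_center(W, 0, np.array([1.0, 2.0]), eta=0.1)
# W[0] is now close to (1, 2)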
S7, sequentially dividing the residual data in the data set to be classified into corresponding categories according to the method of the steps S2-S6;
s8, repeating the steps S2-S7 to iterate until the weight of each cluster center is not changed;
the first memristor array is connected with each bit line of the fourth memristor array, the second memristor array is connected with each bit line of the third memristor array, the first memristor array is connected with each word line of the second memristor array, and the third memristor array is connected with each word line of the fourth memristor array.
The invention provides a memristor array-based K-means classifier and a classification method thereof, which take all dimension information of the cluster centers of the K-means algorithm as training weights and map them into the memristor array, and creatively provide a memristor-array-based Euclidean distance calculation method, solving the problem of completely expressing the Euclidean distance on the memristor array; the method can be used to realize data clustering of large amounts of data on the basis of a hardware circuit. The invention uses the memristor array to reduce the data complexity of the Euclidean distance calculation and to reduce data storage time and operating power consumption, and can be applied to edge-computing scenarios in the future.
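Tying the steps S1-S8 together, a compact software emulation of the whole classification loop could look like the sketch below; it mirrors the algorithm (charge-difference distances and the incremental center update), not the circuit, and the convergence test on the class labels is one reasonable reading of the stopping condition:

import numpy as np

def memristor_style_kmeans(data, k, eta=0.1, max_iter=100, seed=0):
    """K-means with the incremental, hardware-style center update dW = eta*(U - W_p)."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), k, replace=False)].copy()   # step S1
    labels = np.full(len(data), -1)
    for _ in range(max_iter):                                        # step S8
        changed = False
        for i, u in enumerate(data):                                 # steps S2-S7
            # squared distances via the charge-difference formulation
            d = np.array([(u @ u - w @ u) - (u @ w - w @ w) for w in centers])
            p = int(np.argmin(d))                                    # step S5
            if p != labels[i]:
                changed = True
            labels[i] = p
            centers[p] += eta * (u - centers[p])                     # step S6
        if not changed:
            break
    return labels, centers

# example on two well-separated blobs
pts = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 6.0])
labels, centers = memristor_style_kmeans(pts, k=2)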
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (9)

1. A memristor array-based K-means classifier, comprising: the memristor array comprises a first control module, a memristor array, a second control module, a data comparison module and an output module;
the memristor array comprises a first memristor array, a second memristor array, a third memristor array and a fourth memristor array, each bit line of the first memristor array is connected with each bit line of the fourth memristor array, each bit line of the second memristor array is connected with each bit line of the third memristor array, each word line of the first memristor array is connected with each word line of the second memristor array, and each word line of the third memristor array is connected with each word line of the fourth memristor array;
the first control module is used for randomly selecting a clustering center from an input data set to be classified, respectively storing the clustering center into a first memristor array and a second memristor array after being subjected to writing voltage coding, and respectively storing data to be classified in the data set to be classified into a third memristor array and a fourth memristor array after being subjected to writing voltage coding; after reading voltage coding is carried out on the data to be classified and the opposite numbers of the weights of the clustering center, the data to be classified and the opposite numbers of the weights of the clustering center are respectively applied to bit lines of the second memristor array and the first memristor array, wherein the information of each dimension of the clustering center is the weight;
the memristor array is used for realizing, on the rows where the clustering center and the data to be classified are respectively located, the dot product operation between the read-voltage-encoded data to be classified and the opposite numbers of the weights input by the first control module and the self-stored data, accumulating the obtained results by row and then outputting them to the second control module;
the second control module is used for subtracting the calculation results of the row where the data to be classified and the clustering center input by the memristor array are located to obtain the Euclidean distance between the clustering center and the data to be classified, and outputting the Euclidean distance to the data comparison module;
the data comparison module is used for dividing the data to be classified into the class where the clustering center closest to the data to be classified is located, and outputting the classification result to the second control module and the output module respectively;
the second control module is further used for determining a row where the clustering center to be updated is located according to the classification result input by the data comparison module, and outputting the data to be classified in the memristor array and the row where the clustering center to be updated is located after reading voltage coding is carried out on the preset learning rate and the opposite number of the preset learning rate;
the memristor array is further used for realizing dot product operation between the preset learning rate input by the second control module and the data to be classified and between the opposite number of the preset learning rate and the clustering center to be updated on the row where the data to be classified and the clustering center to be updated are located respectively, accumulating the obtained results according to columns to obtain each weight change value, and outputting the weight change value to the first control module;
the first control module is further used for outputting each weight change value input by the memristor array to a memristor array bit line after being subjected to write coding;
the memristor array is further used for updating the weight of the clustering center to be updated based on each weight change value input on the bit line of the first control module;
and the output module is used for outputting the classification result of the data to be classified input by the data comparison module when the weight of the clustering center is not changed any more.
2. The memristor array-based K-means classifier of claim 1, wherein the memristor array is translational symmetric with respect to a centerline.
3. The memristor array-based K-means classifier according to claim 1, wherein the memristor array size is (K +1) x 2M, where K is the number of cluster classes and M is the dimension of sample data; the first memristor array and the second memristor array are in translational symmetry by taking a central line as a reference, and are formed by k rows of memristors and M columns of memristors; the third memristor array and the fourth memristor array are in translational symmetry with the center line as a reference, and are formed by 1 row and M columns of memristors.
4. A memristor array-based K-means classification method is characterized by comprising the following steps:
s1, randomly selecting k data from the data set to be classified as an initial clustering center, and respectively storing the k data into a first memristor array and a second memristor array after writing voltage coding, wherein k is the clustering number;
s2, selecting first data in a data set to be classified as data to be classified, and storing the data to be classified into a third memristor array and a fourth memristor array after writing voltage coding;
s3, after read voltage encoding is carried out on the data to be classified and the opposite numbers of the weights of the first clustering center, the data to be classified and the opposite numbers of the weights are respectively applied to the bit lines of the second memristor array and the first memristor array; on the rows where the first clustering center and the data to be classified are respectively located, the dot product operation between the read-voltage-encoded data to be classified and opposite weight values input by the first control module and the self-stored data is realized, and the obtained results are accumulated by row and then subtracted to obtain the Euclidean distance between the first clustering center and the data to be classified;
s4, sequentially calculating Euclidean distances between the data to be classified and the rest clustering centers according to the method in the step S3;
s5, dividing the data to be classified into the class where the clustering center closest to the data to be classified is located, and determining the row where the clustering center to be updated is located according to the classification result;
s6, respectively inputting the preset learning rate and the opposite number of the preset learning rate after the reading voltage coding to the row where the data to be classified and the clustering center to be updated are located, realizing the dot product operation between the preset learning rate and the data to be classified input by the second control module and between the opposite number of the preset learning rate and the clustering center to be updated, accumulating the obtained results according to columns to obtain the change value of each weight of the clustering center to be updated, writing the obtained change value into the memristor node of the clustering center to be updated, and updating the weight;
s7, sequentially dividing the residual data in the data set to be classified into corresponding categories according to the method of the steps S2-S6;
s8, repeating the steps S2-S7 to iterate until the weight of each cluster center is not changed;
the first memristor array is connected with each bit line of the fourth memristor array, the second memristor array is connected with each bit line of the third memristor array, the first memristor array is connected with each word line of the second memristor array, and the third memristor array is connected with each word line of the fourth memristor array; and the information of each dimension of the clustering center is the weight.
5. The classification method as claimed in claim 4, wherein after writing voltage encoding data to the memristor, the memristor conductance value is linearly related to the actual size of the data.
6. The classification method according to claim 4, wherein a first cluster center in the first memristor array is selected, and a read voltage encoded coefficient-1 is applied to its bit line, resulting in the opposite of each weight of the first cluster center.
7. The classification method according to claim 4, wherein the Euclidean distance is determined by an amount of charge accumulation due to an output current on the memristor row, the amount of accumulated charge being proportional to the Euclidean distance.
8. The classification method according to claim 4, wherein the cluster center to be updated is the cluster center closest to the data to be classified.
9. The classification method according to claim 4, wherein the variation Δ W of the weight is represented as:
ΔW=η(Ui-Wp)
where η denotes the learning rate, Ui is the ith data to be classified, and Wp is the clustering center to be updated.
CN201911248887.5A 2019-12-09 2019-12-09 Memristor array-based K-means classifier and classification method thereof Active CN111027619B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911248887.5A CN111027619B (en) 2019-12-09 2019-12-09 Memristor array-based K-means classifier and classification method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911248887.5A CN111027619B (en) 2019-12-09 2019-12-09 Memristor array-based K-means classifier and classification method thereof

Publications (2)

Publication Number Publication Date
CN111027619A CN111027619A (en) 2020-04-17
CN111027619B true CN111027619B (en) 2022-03-15

Family

ID=70207600

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911248887.5A Active CN111027619B (en) 2019-12-09 2019-12-09 Memristor array-based K-means classifier and classification method thereof

Country Status (1)

Country Link
CN (1) CN111027619B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111553415B (en) * 2020-04-28 2022-11-15 宁波工程学院 Memristor-based ESN neural network image classification processing method
CN111599409B (en) * 2020-05-20 2022-05-20 电子科技大学 circRNA recognition method based on MapReduce parallelism
CN111815640B (en) * 2020-07-21 2022-05-03 江苏经贸职业技术学院 Memristor-based RBF neural network medical image segmentation algorithm
CN111983429B (en) * 2020-08-19 2023-07-18 Oppo广东移动通信有限公司 Chip verification system, chip verification method, terminal and storage medium
CN112819036B (en) * 2021-01-12 2024-03-19 华中科技大学 Spherical data classification device based on memristor array and operation method thereof
CN113191402B (en) * 2021-04-14 2022-05-20 华中科技大学 Memristor-based naive Bayes classifier design method, system and classifier
CN113517007B (en) * 2021-04-29 2023-07-25 西安交通大学 Flowing water processing method and system and memristor array

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105593879A (en) * 2013-05-06 2016-05-18 Knowm Tech LLC Universal machine learning building block
CN109791626A (en) * 2017-12-29 2019-05-21 清华大学 The coding method of neural network weight, computing device and hardware system
CN109800870A (en) * 2019-01-10 2019-05-24 华中科技大学 A kind of Neural Network Online learning system based on memristor
CN110007764A (en) * 2019-04-11 2019-07-12 北京华捷艾米科技有限公司 A kind of gesture skeleton recognition methods, device, system and storage medium
US10380386B1 (en) * 2018-04-30 2019-08-13 Hewlett Packard Enterprise Development Lp Accelerator for k-means clustering with memristor crossbars
CN110163334A (en) * 2018-02-11 2019-08-23 上海寒武纪信息科技有限公司 Integrated circuit chip device and Related product
CN110443168A (en) * 2019-07-23 2019-11-12 华中科技大学 A kind of Neural Network for Face Recognition system based on memristor

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10049321B2 (en) * 2014-04-04 2018-08-14 Knowmtech, Llc Anti-hebbian and hebbian computing with thermodynamic RAM

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105593879A (en) * 2013-05-06 2016-05-18 Knowm Tech LLC Universal machine learning building block
CN109791626A (en) * 2017-12-29 2019-05-21 清华大学 The coding method of neural network weight, computing device and hardware system
CN110163334A (en) * 2018-02-11 2019-08-23 上海寒武纪信息科技有限公司 Integrated circuit chip device and Related product
US10380386B1 (en) * 2018-04-30 2019-08-13 Hewlett Packard Enterprise Development Lp Accelerator for k-means clustering with memristor crossbars
CN109800870A (en) * 2019-01-10 2019-05-24 华中科技大学 A kind of Neural Network Online learning system based on memristor
CN110007764A (en) * 2019-04-11 2019-07-12 北京华捷艾米科技有限公司 A kind of gesture skeleton recognition methods, device, system and storage medium
CN110443168A (en) * 2019-07-23 2019-11-12 华中科技大学 A kind of Neural Network for Face Recognition system based on memristor

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
K-means clustering in a memristive logic array; Jari Tissari et al.; 2015 IEEE 15th International Conference on Nanotechnology (IEEE-NANO); 2016-01-21; pp. 633-636 *
Review and Prospects of Artificial Neural Network Applications in Smart Grids; Jia Zhentang et al.; Proceedings of the 2015 National Academic Annual Conference on Smart Grid User-Side Energy Management; 2015-12-31; pp. 262-271 *

Also Published As

Publication number Publication date
CN111027619A (en) 2020-04-17

Similar Documents

Publication Publication Date Title
CN111027619B (en) Memristor array-based K-means classifier and classification method thereof
Tang et al. Extreme learning machine for multilayer perceptron
Sohn et al. Improved multimodal deep learning with variation of information
Schulz et al. Deep learning: Layer-wise learning of feature hierarchies
CN109284406B (en) Intention identification method based on difference cyclic neural network
CN109063719B (en) Image classification method combining structure similarity and class information
KR102305568B1 (en) Finding k extreme values in constant processing time
CN109273054B (en) Protein subcellular interval prediction method based on relational graph
JP2015506026A (en) Image classification
KR20220058897A (en) Perform XNOR equivalent operations by adjusting column thresholds of a compute-in-memory array
Bai et al. Coordinate CNNs and LSTMs to categorize scene images with multi-views and multi-levels of abstraction
CN115661550B (en) Graph data category unbalanced classification method and device based on generation of countermeasure network
CN112417289A (en) Information intelligent recommendation method based on deep clustering
Ravi Efficient on-device models using neural projections
CN112749274A (en) Chinese text classification method based on attention mechanism and interference word deletion
Shi et al. Dynamic barycenter averaging kernel in RBF networks for time series classification
Tripathi et al. Real time object detection using CNN
CN114741507B (en) Introduction network classification model establishment and classification of graph rolling network based on Transformer
CN115375877A (en) Three-dimensional point cloud classification method and device based on channel attention mechanism
CN115392357A (en) Classification model training and labeled data sample spot inspection method, medium and electronic equipment
JP3043539B2 (en) neural network
Skorpil et al. Back-propagation and k-means algorithms comparison
CN114766024A (en) Method and apparatus for pruning neural networks
Ahmed et al. Branchconnect: Image categorization with learned branch connections
Wang et al. A Covering Algorithm Based on Competition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant