CN113838497A - Simplified integrated circuit for data reading - Google Patents

Publication number
CN113838497A
Authority
CN
China
Prior art keywords
data
storage
calculation
array
unit
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111115907.9A
Other languages
Chinese (zh)
Inventor
索超
司鑫
郝午阳
吴强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Houmo Intelligent Technology Co ltd
Original Assignee
Nanjing Houmo Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C7/00 Arrangements for writing information into, or reading information out from, a digital store
    • G11C7/10 Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
    • G11C7/1051 Data output circuits, e.g. read-out amplifiers, data output buffers, data output registers, data output level conversion circuits
    • G11C7/1069 I/O lines read out arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50 Adding; Subtracting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52 Multiplying; Dividing
    • G06F7/523 Multiplying only
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443 Sum of products
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/57 Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations
    • G06F7/575 Basic arithmetic logic units, i.e. devices selectable to perform either addition, subtraction or one of several logical operations, using, at least partially, the same circuitry
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C8/00 Arrangements for selecting an address in a digital store
    • G11C8/10 Decoders

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Memory System (AREA)

Abstract

The embodiments of the present disclosure disclose a simplified storage and computation integrated circuit usable for data reading, a chip, and a computing device. The circuit comprises: a read address decoder for selecting a target storage calculation unit from a storage calculation unit array according to a data address to be read; a data input unit array for inputting first data to the target storage calculation unit and second data to the other storage calculation units; the storage calculation unit array, for calculating the first data and the second data, in a first preset manner, with the corresponding stored data and inputting the calculation results to an operation unit array; and the operation unit array, for calculating the input calculation results in a second preset manner to obtain the stored data in the target storage calculation unit. The embodiments of the present disclosure reduce the area and power consumption of the storage and computation integrated core, improve the operating frequency of its data read and write operations, are less affected by external interference, and provide more reliable data reading.

Description

Simplified integrated circuit for data reading
Technical Field
The present disclosure relates to the field of chip design technologies, and in particular to a simplified storage and computation integrated circuit usable for data reading, a chip, and a computing device.
Background
Currently, in a storage and computation integrated chip, the storage and computation integrated core mainly includes a storage structure, a multiply-add structure, and a logic control structure. For the storage structure, the core needs to write data into the storage calculation unit array and perform computation with the input multiplier. In conventional storage and computation integrated designs, the read/write function of the storage structure largely carries over the read/write architecture of the conventional static random access memory (SRAM).
At present, the storage structure of a conventional storage and computation integrated core is architecturally similar to a conventional SRAM: it provides two basic functions, reading and writing, and implements them in essentially the same way as a conventional SRAM. When the storage core is in the standby state, the bit line and the inverted bit line are precharged to a high potential, the word line is at a low potential, and the memory cell holds its data. When the rising edge of the clock arrives, the read/write function of the memory is started and the precharging of the bit line and the inverted bit line is turned off. In a read cycle, the word line corresponding to the read address is opened under the control of an internal clock, and the data in the memory cell is transferred to the bit line and the inverted bit line, which are floating at a high potential. The side of the memory cell that stores '0' gradually pulls down the voltage of its bit line (or inverted bit line), creating a voltage difference relative to the high potential of the line on the other side. This voltage difference is passed to the sense amplifier through the bit line and inverted bit line signals. When the voltage difference reaches a threshold, the sense amplifier is enabled, pulls the lower of the two lines down to '0', and holds the other line at the high potential '1'. The read data is then transferred to the output port through the output unit of the storage and computation integrated core, completing the read operation.
Disclosure of Invention
Embodiments of the present disclosure provide a simplified storage and computation integrated circuit usable for data reading. The circuit comprises a storage calculation unit array, a data input unit array, an operation unit array, and a read address decoder. The storage calculation units in the storage calculation unit array correspond one to one with the data input units in the data input unit array. The read address decoder is used for selecting the corresponding target storage calculation unit from the storage calculation unit array according to an input data address to be read. The data input unit array is used for inputting first data to the target storage calculation unit and second data to the other storage calculation units. The storage calculation unit array is used for calculating the first data and the second data, in a first preset manner, with the stored data in the corresponding storage calculation units and inputting the obtained calculation results to the operation unit array. The operation unit array is used for calculating the input calculation results in a second preset manner to obtain the stored data in the target storage calculation unit.
In some embodiments, the storage calculation units in the storage calculation unit array comprise multipliers, and the multipliers are used for multiplying the storage data in the storage calculation units with the data input by the corresponding data input units.
In some embodiments, the first data is a digital 1 and the second data is a digital 0.
In some embodiments, the storage calculation units in the storage calculation unit array respectively comprise a first preset number of single-bit storage calculation sub-units, each single-bit storage calculation sub-unit comprises a single-bit memory and a single-bit multiplier, and the single-bit multiplier is used for multiplying data in the corresponding single-bit memory and data input by the corresponding data input unit.
In some embodiments, the operation unit array includes an adder array for adding the calculation results input from the storage calculation unit to obtain the storage data in the target storage calculation unit.
In some embodiments, the adder array includes a second predetermined number of adder groups, the second predetermined number of adder groups are sequentially connected in a cascade manner, and input terminals of adders included in a first-stage adder group of the second predetermined number of adder groups are respectively connected to adjacent storage calculation unit groups.
In some embodiments, the circuit further comprises a main controller configured to: acquire a data address to be read and send it to the read address decoder; determine a target data input unit from the data input unit array based on the data address to be read; generate first data corresponding to the target data input unit and second data corresponding to the other data input units in the data input unit array; and send the first data to the target data input unit and the second data to the other data input units.
In some embodiments, the main controller is further configured to receive the data output from the operation unit array and determine the received data as the data read from the target storage calculation unit.
According to another aspect of the embodiments of the present disclosure, there is provided a chip including the above-described simplified storage and computation integrated circuit usable for data reading.
According to another aspect of the embodiments of the present disclosure, there is provided a computing device including the above chip.
Compared with the conventional data reading architecture, the simplified storage and computation integrated circuit, chip, and computing device provided by the embodiments of the present disclosure offer a new way of reading data. When data is read, the data input unit array inputs first data to the target storage calculation unit to be read and second data to the other storage calculation units; the storage calculation unit array performs calculation based on the input data and inputs the calculation results to the operation unit array; and the operation unit array outputs the calculated data, which is the data read from the target storage calculation unit. The circuit is thus further simplified relative to existing storage and computation integrated circuits: the read circuitry of a conventional static random access memory, such as the sense amplifier, is not required, which reduces the area, circuit scale, and power consumption of the storage and computation integrated core. In addition, because the read data architecture of the conventional static random access memory is not used, the operating frequency for storing data does not need to take the read operation frequency into account, which improves the operating frequency of data read and write operations in the core. Finally, because data is not read through the sense amplifier of a conventional static random access memory, the design is less affected by external interference factors such as process variation, read disturbance, voltage, and temperature, and the data reading operation is more reliable.
The technical solution of the present disclosure is further described in detail by the accompanying drawings and examples.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in more detail embodiments of the present disclosure with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure and not to limit the disclosure. In the drawings, like reference numbers generally represent like parts or steps.
Fig. 1 is a schematic structural diagram of a simplified storage and computation integrated circuit usable for data reading according to an exemplary embodiment of the present disclosure.
Fig. 2 is a schematic structural diagram of a single-bit storage calculation subunit according to an exemplary embodiment of the present disclosure.
Fig. 3 is a schematic structural diagram of an adder array according to an exemplary embodiment of the present disclosure.
Fig. 4 is another schematic structural diagram of a simplified storage and computation integrated circuit usable for data reading according to an exemplary embodiment of the present disclosure.
Detailed Description
Hereinafter, example embodiments according to the present disclosure will be described in detail with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a subset of the embodiments of the present disclosure and not all embodiments of the present disclosure, with the understanding that the present disclosure is not limited to the example embodiments described herein.
It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
It will be understood by those of skill in the art that the terms "first," "second," and the like in the embodiments of the present disclosure are used merely to distinguish one element from another, and imply neither any particular technical meaning nor any necessary logical order between them.
It is also understood that in embodiments of the present disclosure, "a plurality" may refer to two or more and "at least one" may refer to one, two or more.
It is also to be understood that any reference to any component, data, or structure in the embodiments of the disclosure, may be generally understood as one or more, unless explicitly defined otherwise or stated otherwise.
In addition, the term "and/or" in the present disclosure merely describes an association relationship between associated objects and indicates that three relationships may exist. For example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" in the present disclosure generally indicates that the associated objects before and after it are in an "or" relationship.
It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and the same or similar parts may be referred to each other, so that the descriptions thereof are omitted for brevity.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
Overview of the Application
For the design of the multiply-add unit, there are currently two main technical solutions. In the first, a large-scale array is built from many simple integer multiply-adders; this solution offers poor flexibility in data bit width and data volume, does not support floating-point operations, and requires the data to be quantized in advance on a computer before use. In the second, a general-purpose floating-point computing unit performs the multiply-add operations in a loop; this solution involves considerable redundant processing and is inefficient when the data volume is large.
Exemplary Structure
Fig. 1 is a schematic structural diagram of a simplified storage and computation integrated circuit usable for data reading according to an exemplary embodiment of the present disclosure. The components of the circuit may be integrated into a single chip, or may be implemented on different chips or circuit boards with data communication links established between them.
As shown in Fig. 1, the circuit includes a storage calculation unit array 101, a data input unit array 102, an operation unit array 103, and a read address decoder 104. The storage calculation units in the storage calculation unit array 101 correspond one to one with the data input units in the data input unit array 102. The storage calculation unit array 101 includes a plurality of storage calculation units, each of which includes a preset number of single-bit storage calculation subunits. As shown in Fig. 1, the four cells in each row (that is, four single-bit storage calculation subunits) represent one storage calculation unit, so one storage calculation unit can store 4 bits of data. Each data input unit in the data input unit array 102 corresponds to one storage calculation unit.
In this embodiment, the read address decoder 104 is configured to select the corresponding target storage calculation unit from the storage calculation unit array 101 according to an input data address to be read. The read address decoder 104 may activate the corresponding word line according to the data address to be read, thereby selecting the target storage calculation unit. For example, as shown in Fig. 1, the read address decoder 104 determines, according to the input data address add1 to be read, that the storage calculation unit in the first row is the target storage calculation unit, that is, the unit that stores the data to be read.
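The decoder's row selection can be pictured as a one-hot word-line vector. The following is a minimal behavioral sketch only, not the circuit itself; the function name `decode_read_address` and the convention that an address is a zero-based row index are assumptions made for illustration.

```python
def decode_read_address(address, num_rows):
    """Return a one-hot word-line vector: 1 for the selected row, 0 elsewhere.

    Assumption: the address is a zero-based row index into the array.
    """
    if not 0 <= address < num_rows:
        raise ValueError("address is outside the storage calculation unit array")
    return [1 if row == address else 0 for row in range(num_rows)]


# Hypothetical example: a 4-row array in which the first row is the target.
word_lines = decode_read_address(0, 4)
print(word_lines)  # [1, 0, 0, 0]
```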
In the present embodiment, the data input unit array 102 is used to input first data to the target storage computing unit and input second data to other storage computing units. The data input unit for transmitting the first data may be determined according to the address of the data to be read, and specifically, since the data input unit and the storage calculation unit are in one-to-one correspondence, the data input unit corresponding to the target storage calculation unit may be determined according to the address of the data to be read. Generally, the data input unit inputs one single-bit data to the corresponding storage calculation unit at a time, that is, the first data and the second data are single-bit data. For example, the first data may be 1 (or 0) and the second data may be 0 (or 1).
In this embodiment, the storage calculation unit array 101 is configured to calculate the first data and the second data, in a first preset manner, with the stored data in the corresponding storage calculation units, and to input the obtained calculation results to the operation unit array 103. The first preset manner may be any of various calculation methods set as required, such as a multiplication operation or a bitwise AND operation.
Optionally, the first preset manner may be multiplication: each storage calculation unit in the storage calculation unit array 101 includes a multiplier, which multiplies the stored data in the storage calculation unit by the data input by the corresponding data input unit. Optionally, the first data may be the number 1 and the second data may be the number 0. In this case, the stored data in the target storage calculation unit is multiplied by 1, and the result equals the stored data; the stored data in the other storage calculation units is multiplied by 0, and the results are all 0. The calculation results may subsequently be added to obtain a sum equal to the stored data in the target storage calculation unit.
It should be noted that the first data and the second data may also be set to other values. For example, the first data may be 0 and the second data may be 1; in this case, after being input into the storage calculation unit, the first data and the second data are first inverted and then multiplied.
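The multiply-then-add read described here can be modeled in a few lines. This is a hedged behavioral sketch under stated assumptions: the function name `read_by_multiply_add`, the use of Python integers for stored words, and the sample contents are all hypothetical and chosen only to illustrate the principle.

```python
def read_by_multiply_add(stored_words, target_row):
    """Read one stored word by multiplying every row by a 1/0 input and summing."""
    # First data (1) goes to the target unit, second data (0) to all others.
    inputs = [1 if row == target_row else 0 for row in range(len(stored_words))]
    # Each storage calculation unit multiplies its stored data by its input.
    products = [word * bit for word, bit in zip(stored_words, inputs)]
    # Adding the products leaves only the target unit's stored data.
    return sum(products)


# Hypothetical array contents, one 4-bit word per row.
stored = [0b1010, 0b0111, 0b0001, 0b1100]
print(bin(read_by_multiply_add(stored, 0)))  # 0b1010
```

Because every non-target product is 0, the sum never mixes data from different rows, which is why no sense amplifier is needed to resolve the stored value in this model.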
In the optional implementation manner, the multiplier is arranged in the storage calculation unit, so that the input data and the storage data can be multiplied, not only can the multiplication and addition operation of scenes such as a neural network be supported, but also the storage data can be read from the target storage calculation unit by utilizing the principle of multiplication and addition, and the application scene of the storage and calculation integrated circuit is expanded.
Optionally, the storage calculation units in the storage calculation unit array 101 respectively include a first preset number (as shown in fig. 1, the first preset number is 4) of single-bit storage calculation sub-units. As shown in fig. 2, each single-bit storage calculation subunit 201 includes a single-bit memory 2011 and a single-bit multiplier 2012, and the single-bit multiplier 2012 is used for multiplying the data in the corresponding single-bit memory 2011 and the data input by the corresponding data input unit. Each data input unit generally inputs one single-bit data at a time, and the single-bit data is multiplied by the single-bit bits stored in the respective single-bit memories included in the corresponding storage calculation unit to obtain the single-bit products respectively corresponding to the stored single-bit bits. The bits output by each single-bit multiplier included in a memory calculation unit constitute the data output by the memory calculation unit, and the data is input into the operation unit array 103.
As an example, as shown in Fig. 1, the storage calculation unit in the first row is the target storage calculation unit, and its stored data is W = 1010, that is, W[0] = 0, W[1] = 1, W[2] = 0, and W[3] = 1. The input first data is IN = 1, so the four single-bit multipliers in the target storage calculation unit calculate W[0] × IN, W[1] × IN, W[2] × IN, and W[3] × IN, respectively, and the resulting product data is S = 1010 = W. The other storage calculation units receive the second data 0 and, by the same calculation process, all output product data 0. As an example, the single-bit multiplier may comprise a NOR gate, which performs a NOR operation on the inverted stored bit and the inverted input bit to obtain the single-bit product.
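The NOR-based single-bit multiplier in this example relies on the identity NOR(NOT w, NOT in) = w AND in, which equals the product of two single bits. The sketch below checks that identity on the W = 1010 example; the function names are hypothetical and the model is purely logical, with no timing or electrical behavior.

```python
def nor(a, b):
    """Two-input NOR on single bits (0 or 1)."""
    return 1 - (a | b)


def single_bit_multiply(stored_bit, input_bit):
    """Multiply two bits by NOR-ing their inverted values."""
    return nor(1 - stored_bit, 1 - input_bit)


def storage_unit_output(stored_bits, input_bit):
    """Apply the single-bit multiplier to every bit of one storage calculation unit."""
    return [single_bit_multiply(bit, input_bit) for bit in stored_bits]


# W = 1010 listed as W[0]..W[3], following the example in the text.
w = [0, 1, 0, 1]
print(storage_unit_output(w, 1))  # [0, 1, 0, 1]
print(storage_unit_output(w, 0))  # [0, 0, 0, 0]
```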
The single-bit memory and the single-bit multiplier are arranged in the single-bit memory calculation subunit, so that the internal multiplication operation can be realized by using a simple architecture and a calculation process, and the efficiency of reading and storing data can be improved.
In this embodiment, the operation unit array 103 is configured to calculate the input calculation results in a second preset manner to obtain the stored data in the target storage calculation unit. The second preset manner may be set as required and may be, for example, an addition (implemented by an adder array) or a bitwise OR operation (implemented by an OR gate array).
The circuit provided by the above embodiment of the present disclosure offers a new way of reading data that is simplified compared with the conventional data reading architecture. When data is read, the data input unit array inputs first data to the target storage calculation unit to be read and second data to the other storage calculation units; the storage calculation unit array performs calculation based on the input data and inputs the calculation results to the operation unit array; and the operation unit array outputs the calculated data, which is the data read from the target storage calculation unit. The circuit is thus further simplified relative to existing storage and computation integrated circuits: the read circuitry of a conventional static random access memory, such as the sense amplifier, is not required, which reduces the area, circuit scale, and power consumption of the storage and computation integrated core. In addition, because the read data architecture of the conventional static random access memory is not used, the operating frequency for storing data does not need to take the read operation frequency into account, which improves the operating frequency of data read and write operations in the core. Finally, because data is not read through the sense amplifier of a conventional static random access memory, the design is less affected by external interference factors such as process variation, read disturbance, voltage, and temperature, and the data reading operation is more reliable.
In some alternative implementations, the operation unit array 103 includes an adder array for adding the calculation results input from the storage calculation units to obtain the stored data in the target storage calculation unit. Continuing the above example, the target storage calculation unit outputs the product of its stored data and the number 1 (that is, the stored data itself), and the other storage calculation units output 0; the adder array adds these calculation results, and the sum is the stored data in the target storage calculation unit.
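As one illustration of an alternative pairing of preset manners, a bitwise AND in the storage calculation units can be combined with a bitwise OR in the operation unit array. The sketch below is an assumption-laden model: the function name `read_by_and_or`, the `width` parameter, and the idea of broadcasting the single-bit input across all bits of a word as a mask are illustrative choices, not details given in the text.

```python
from functools import reduce
from operator import or_


def read_by_and_or(stored_words, target_row, width=4):
    """Read one word using bitwise AND per unit, then bitwise OR across units."""
    all_ones = (1 << width) - 1
    # A single-bit input of 1 ANDed with every stored bit keeps the word;
    # an input of 0 clears it. The mask models that per-bit AND.
    masked = [
        word & (all_ones if row == target_row else 0)
        for row, word in enumerate(stored_words)
    ]
    # OR-ing the masked words leaves only the target unit's stored data.
    return reduce(or_, masked, 0)


print(bin(read_by_and_or([0b1010, 0b0111, 0b0001], 1)))  # 0b111
```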
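The reason the sum equals the target's stored data can be stated in one line. The symbols are assumptions introduced here for illustration: $W_i$ is the data stored in the $i$-th storage calculation unit, $x_i$ is the single-bit input that unit receives, $t$ is the index of the target unit, and $N$ is the number of storage calculation units.

```latex
\[
x_i =
\begin{cases}
1, & i = t \\
0, & i \neq t
\end{cases}
\qquad\Longrightarrow\qquad
\sum_{i=1}^{N} W_i \, x_i = W_t
\]
```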
According to the implementation mode, the adder array is arranged, the calculation results output by the storage calculation unit array can be added to obtain the storage data in the target storage calculation unit, the multiplication and addition operation of scenes such as a neural network can be supported, the storage data can be read from the target storage calculation unit by utilizing the principle of multiplication and addition, and therefore the application scene of the storage and calculation integrated circuit is expanded.
In some optional implementations, the adder array includes a second preset number of adder groups, the second preset number of adder groups are sequentially connected in a cascade manner, and input ends of adders included in a first-stage adder group in the second preset number of adder groups are respectively connected to adjacent storage calculation unit groups.
As shown in Fig. 3, the operation unit array 103 consists of an adder array with M columns (M being the second preset number). The adders in each column form one adder group: the first column, labeled "adder_1", is the first-stage adder group; the second column, labeled "adder_2", is the second-stage adder group; and so on, up to the M-th column, labeled "adder_M", which is the M-th-stage adder group. From the second stage onward, each adder corresponds to two adders of the previous stage, that is, the outputs of two adders of the previous stage serve as the inputs of one adder of the next stage. Each adder in the first-stage adder group receives the calculation results input by its corresponding storage calculation units, and the M-th-stage adder group contains only one adder, whose output is the sum of the data output by the N storage calculation units.
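The cascaded adder groups behave like a binary reduction tree. The sketch below models that behavior under two assumptions that are not stated in the text: each first-stage adder takes the outputs of two adjacent storage calculation units, and a stage with an odd number of values is padded with 0. The function name `adder_tree_sum` is hypothetical.

```python
def adder_tree_sum(unit_outputs):
    """Sum storage calculation unit outputs stage by stage.

    Returns (total, number_of_stages), where number_of_stages plays the
    role of M in the description.
    """
    if not unit_outputs:
        raise ValueError("at least one storage calculation unit output is required")
    level = list(unit_outputs)
    stages = 0
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(0)  # assumption: pad an odd-sized stage with 0
        # Each adder in this stage adds two adjacent values from the previous stage.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        stages += 1
    return level[0], stages


# Hypothetical 8-row array: only the target row outputs nonzero product data.
total, stages = adder_tree_sum([0b1010, 0, 0, 0, 0, 0, 0, 0])
print(bin(total), stages)  # 0b1010 3
```

With 8 inputs the model needs 3 stages, the last of which contains a single adder, matching the description of the M-th-stage adder group.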
By arranging the adder groups in a cascade, this implementation can effectively reduce the area occupied by the adder array, improve the area utilization of the circuit, and shorten the data transmission lines in the circuit, which helps reduce power consumption when reading data.
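The cascaded structure can be illustrated with a minimal Python sketch. It is a behavioral model only, not the circuit itself. The function names are hypothetical, and the sketch assumes that N is a power of two and that each first-stage adder takes the results of two adjacent storage calculation units, so that M = log2(N) stages end in a single adder; the source does not state these relations explicitly.

```python
from typing import List


def cascaded_adder_sum(results: List[int]) -> int:
    """Sum calculation results through a cascade of adder groups.

    Assumption (not from the source): len(results) is a power of two,
    so each stage halves the number of values and the last stage has
    exactly one adder.
    """
    n = len(results)
    if n < 2 or n & (n - 1) != 0:
        raise ValueError("this sketch assumes N is a power of two and N >= 2")

    stage_values = list(results)
    while len(stage_values) > 1:
        # One adder group: each adder takes two adjacent values from the
        # previous stage and outputs their sum.
        stage_values = [
            stage_values[i] + stage_values[i + 1]
            for i in range(0, len(stage_values), 2)
        ]
    return stage_values[0]


def stage_count(n: int) -> int:
    """Number of adder groups M for N inputs, valid for powers of two."""
    return n.bit_length() - 1


# Only the target unit contributes a non-zero calculation result.
total = cascaded_adder_sum([0, 10, 0, 0])
```

With four inputs the model uses two adder groups, and the final adder outputs the single non-zero calculation result.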
In some optional implementations, as shown in fig. 4, the circuit further includes a main controller 105, which is configured to perform the following steps:
Step one, acquiring a data address to be read, and sending the data address to be read to the read address decoder 104.
The data address to be read may be obtained from a running data reading program. After the data address to be read, add1, is sent to the read address decoder 104, the read address decoder 104 selects the target storage calculation unit accordingly. As shown in fig. 4, the storage calculation units located in the first row are the target storage calculation units.
Step two, determining a target data input unit from the data input unit array based on the data address to be read.
Specifically, the main controller 105 may determine a target data input unit corresponding to the target storage calculation unit from the data input unit array 102 according to the address of the data to be read, based on the corresponding relationship between the data input unit and the storage calculation unit. As shown in fig. 4, the data input units located in the first row are target data input units.
Step three, generating first data corresponding to the target data input unit and second data corresponding to the other data input units in the data input unit array.
Step four, sending the first data to the target data input unit and sending the second data to the other data input units.
As an example, as shown in fig. 4, the first data is the number 1 and the second data is the number 0. The number 1 is input to the target storage calculation unit through the target data input unit, and the number 0 is input to the other storage calculation units.
By providing the main controller, this implementation can effectively control data transmission among the components of the circuit, which makes the data reading function of the storage and calculation integrated circuit more complete and improves the efficiency of reading data from it.
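The steps performed by the main controller can be sketched in Python as follows. This is an illustrative model under stated assumptions: the `ADDRESS_TO_ROW` table and the function name are hypothetical, and only the fact that address add1 selects the first row comes from the example in fig. 4.

```python
from typing import Dict, List

# Hypothetical address map: the source only states that address "add1"
# selects the first row in fig. 4; the other entries are invented here.
ADDRESS_TO_ROW: Dict[str, int] = {"add1": 0, "add2": 1, "add3": 2, "add4": 3}


def generate_input_data(address: str, first_data: int = 1,
                        second_data: int = 0) -> List[int]:
    """Return the data for every data input unit, indexed by row."""
    # Step two: determine the target data input unit from the address,
    # relying on the one-to-one correspondence between data input units
    # and storage calculation units.
    target_row = ADDRESS_TO_ROW[address]
    # Steps three and four: first data goes to the target data input
    # unit, second data goes to all other data input units.
    return [first_data if row == target_row else second_data
            for row in range(len(ADDRESS_TO_ROW))]


inputs = generate_input_data("add1")
```

For address add1 the model produces 1 for the first row and 0 for every other row, matching the example in which the number 1 reaches only the target storage calculation unit.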
In some optional implementations, the main controller 105 is further configured to receive the data output from the operation unit array 103 and determine the received data as the data read from the target storage calculation unit.
As shown in fig. 4, the storage calculation unit array 101 receives the first data and the second data input from the data input unit array 102, performs the calculation in each storage calculation unit, and inputs the resulting calculation results to the operation unit array 103. The operation unit array 103 then performs a further calculation on the input calculation results and inputs the obtained data "1010" to the main controller 105. After obtaining the storage data of the target storage calculation unit, the main controller 105 may go on to perform a series of operations on that storage data.
In this implementation, the data output by the operation unit array is received and taken as the data read from the target storage calculation unit, so the storage and calculation integrated circuit can be controlled to read data while in its storage-calculation mode, without being switched to a conventional data reading mode. This expands the application scenarios of the storage and calculation integrated circuit and simplifies the circuit architecture used for reading data.
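A minimal Python sketch can illustrate why no separate read mode is needed: the same multiply-add path serves both calculation and reading, and only the input data differs. The function name, the stored values other than 1010, and the calculation-mode inputs are hypothetical and chosen only for illustration.

```python
from typing import List


def multiply_add(stored: List[int], inputs: List[int]) -> int:
    """Multiply each stored value by its input, then add the products."""
    if len(stored) != len(inputs):
        raise ValueError("one input is required per storage calculation unit")
    products = [d * x for d, x in zip(stored, inputs)]
    return sum(products)


# Hypothetical array contents; only the first row (1010) follows fig. 4.
stored = [0b1010, 0b0111, 0b0001, 0b1100]

# Calculation mode: arbitrary inputs give a weighted sum.
weighted = multiply_add(stored, [2, 1, 0, 3])

# Read mode: first data 1 for the target row, second data 0 elsewhere.
read_back = multiply_add(stored, [1, 0, 0, 0])
read_bits = format(read_back, "04b")
```

In this model `read_bits` is the stored word of the first row, obtained without any datapath other than the one used for the weighted sum.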
Embodiments of the present disclosure also provide a chip on which the simplified storage and calculation integrated circuit usable for data reading is integrated. The technical details of this circuit are shown in figs. 1 to 4 and the related description, and are not repeated here.
Embodiments of the present disclosure also provide a computing device including the chip described in the above embodiments. The computing device may further include input devices, output devices, necessary memory, and the like. The input devices may include a mouse, a keyboard, a touch screen, a communication network connector, and the like, and are used for inputting the data address to be read and similar information. The output devices may include a display, a printer, a communication network and remote output devices connected to it, and the like, and are used for outputting the data read from the target storage calculation unit described in the above embodiments. The memory is used for storing the data input by the input devices and the data generated during operation of the simplified storage and calculation integrated circuit usable for data reading. The memory may include volatile memory and/or non-volatile memory. Volatile memory may include, for example, random access memory (RAM), cache memory, and the like. Non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the disclosure is not intended to be limited to the specific details so described.
In the present specification, the embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be understood by reference to one another.
The block diagrams of devices, apparatuses, and systems referred to in this disclosure are given only as illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the phrase "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
The circuitry of the present disclosure may be implemented in a number of ways. For example, the circuitry of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order of the steps of the method used in the circuit is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be implemented as a program recorded in a recording medium, the program including machine-readable instructions for implementing the functions of the circuit according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the functions of the circuit according to the present disclosure.
It is further noted that in the circuits of the present disclosure, components or steps may be decomposed and/or recombined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (10)

1. A simplified storage and calculation integrated circuit usable for data reading, comprising: a storage calculation unit array, a data input unit array, an operation unit array, and a read address decoder; wherein the storage calculation units included in the storage calculation unit array correspond one-to-one to the data input units included in the data input unit array;
the read address decoder is used for selecting a corresponding target storage calculation unit from the storage calculation unit array according to an input data address to be read;
the data input unit array is used for inputting first data to the target storage calculation unit and inputting second data to the other storage calculation units;
the storage calculation unit array is used for calculating, in a first preset mode, the first data and the second data respectively with the storage data in the corresponding storage calculation units, and inputting the obtained calculation results into the operation unit array;
and the operation unit array is used for calculating the input calculation results in a second preset mode to obtain the storage data in the target storage calculation unit.
2. The circuit of claim 1, wherein each storage calculation unit in the storage calculation unit array includes a multiplier for multiplying the storage data in that storage calculation unit by the data input by the corresponding data input unit.
3. The circuit of claim 2, wherein the first data is the number 1 and the second data is the number 0.
4. The circuit of claim 2, wherein each storage calculation unit in the storage calculation unit array comprises a first preset number of single-bit storage calculation subunits, each single-bit storage calculation subunit comprising a single-bit memory and a single-bit multiplier, the single-bit multiplier being configured to multiply the data in the corresponding single-bit memory by the data input by the corresponding data input unit.
5. The circuit of claim 1, wherein the operation unit array includes an adder array for adding the calculation results input from the storage calculation units to obtain the storage data in the target storage calculation unit.
6. The circuit of claim 5, wherein the adder array comprises a second preset number of adder groups, the second preset number of adder groups are sequentially connected in a cascade manner, and input terminals of adders included in a first-stage adder group of the second preset number of adder groups are respectively connected to adjacent storage calculation unit groups.
7. The circuit of claim 1, wherein the circuit further comprises a main controller configured for:
acquiring a data address to be read, and sending the data address to be read to the read address decoder;
determining a target data input unit from the data input unit array based on the address of the data to be read;
generating first data corresponding to the target data input unit and second data corresponding to other data input units in the data input unit array;
sending the first data to the target data input unit and sending the second data to the other data input units.
8. The circuit of claim 7, wherein the main controller is further configured for:
receiving data output by the operation unit array, and determining the received data as the data read from the target storage calculation unit.
9. A chip comprising the simplified storage and calculation integrated circuit usable for data reading according to any one of claims 1 to 8.
10. A computing device comprising a chip according to claim 9.
CN202111115907.9A 2021-09-23 2021-09-23 Simplified integrated circuit for data reading Pending CN113838497A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111115907.9A CN113838497A (en) 2021-09-23 2021-09-23 Simplified integrated circuit for data reading

Publications (1)

Publication Number Publication Date
CN113838497A (en) 2021-12-24

Family

ID=78969439

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111115907.9A Pending CN113838497A (en) 2021-09-23 2021-09-23 Simplified integrated circuit for data reading

Country Status (1)

Country Link
CN (1) CN113838497A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN212112470U (en) * 2020-04-24 2020-12-08 科大讯飞股份有限公司 Matrix multiplication circuit
US20210174848A1 (en) * 2019-12-06 2021-06-10 Xilinx, Inc. Data transfers between a memory and a distributed compute array
CN113257306A (en) * 2021-06-10 2021-08-13 中科院微电子研究所南京智能技术研究院 Storage and calculation integrated array and accelerating device based on static random access memory
CN113393879A (en) * 2021-04-27 2021-09-14 北京航空航天大学 Nonvolatile memory and SRAM mixed storage integrated data fast loading structure
CN113419705A (en) * 2021-07-05 2021-09-21 南京后摩智能科技有限公司 Memory multiply-add calculation circuit, chip and calculation device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination