CN117009265B - Data processing device applied to system on chip - Google Patents

Info

Publication number
CN117009265B
Authority
CN
China
Prior art keywords
data
task
address
descriptor
local storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311266234.6A
Other languages
Chinese (zh)
Other versions
CN117009265A (en
Inventor
王帝
李进
李传业
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Suiyuan Intelligent Technology Co ltd
Original Assignee
Beijing Suiyuan Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Suiyuan Intelligent Technology Co ltd filed Critical Beijing Suiyuan Intelligent Technology Co ltd
Priority to CN202311266234.6A priority Critical patent/CN117009265B/en
Publication of CN117009265A publication Critical patent/CN117009265A/en
Application granted granted Critical
Publication of CN117009265B publication Critical patent/CN117009265B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/40 Bus structure
    • G06F13/4063 Device-to-bus coupling
    • G06F13/4068 Electrical coupling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7807 System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention discloses a data processing device applied to a system on a chip. The device comprises a data processor, a controller, a configuration interface, an external bus and a local storage. The controller is communicatively connected to the external bus, the local storage and the configuration interface, respectively; the configuration interface is communicatively connected to the data processor; and the data processor is communicatively connected to the external bus and the local storage, respectively. The controller is configured to configure a data processing task, transmit configuration information corresponding to the data processing task to the configuration interface, and transmit a task descriptor corresponding to the data processing task to the local storage. The data processor is configured to acquire the configuration information through the configuration interface, acquire the corresponding task descriptor from the local storage according to the configuration information, parse the task descriptor, and access the local storage or the external bus according to the parsing result to perform data processing. Because the data processor processes data according to the task descriptor, the resource occupation of the controller is reduced and flexible data processing is achieved.

Description

Data processing device applied to system on chip
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data processing apparatus applied to a system on a chip.
Background
There are various forms of data communication within a system on chip. Traditionally, a system on chip performs data communication, such as data access and data movement, purely in software by means of CPU instructions. Alternatively, a simple direct memory access (DMA) engine performs a linear copy between a source address and a destination address.
Both approaches have problems. When software reads and writes a large amount of data, it occupies a large amount of CPU resources, so other tasks are blocked in the meantime and the efficiency of the system suffers; the hardware execution units may also sit idle, which wastes resources. For data at discrete addresses, the traditional DMA linear copy mode depends on software to frequently update information such as addresses, which is inefficient.
Disclosure of Invention
The invention provides a data processing device applied to a system on a chip, which is used for reducing the resource occupation of a controller and realizing flexible data processing.
According to an aspect of the present invention, there is provided a data processing apparatus applied to a system on a chip, the data processing apparatus comprising: a data processor, a controller, a configuration interface, an external bus and a local storage;
the controller is communicatively connected to the external bus, the local storage and the configuration interface, respectively; the configuration interface is communicatively connected to the data processor; the data processor is communicatively connected to the external bus and the local storage, respectively;
the controller is used for configuring the data processing task, transmitting configuration information corresponding to the data processing task to the configuration interface, and transmitting a task descriptor corresponding to the data processing task to the local storage;
the data processor is used for acquiring configuration information through the configuration interface and acquiring corresponding task descriptors in the local storage according to the configuration information; and analyzing the task descriptor, and accessing a local storage or an external bus to perform data processing according to the analysis result.
Optionally, the data processor includes at least one of: a data distributor, a data collector and a data comparator.
Optionally, when the data processor is a data distributor, the task descriptor includes: a data distribution address, distribution data, a burst transfer length threshold, and at least one of: an offset address flag, a data replacement flag, and a template data change flag;
correspondingly, the configuration information comprises: the storage address of the task descriptor, the task return information, and at least one of: an offset address value, data replacement content, and the storage address of a sub-class descriptor.
Optionally, the data distributor is specifically configured for at least one of the following:
acquiring the corresponding offset address value from the configuration information according to the offset address flag; determining a target distribution address according to the offset address value and the data distribution address, and distributing the distribution data to the target distribution address through the local storage or the external bus;
acquiring the corresponding data replacement content from the configuration information according to the data replacement flag; replacing the distribution data with the data replacement content, and distributing the replaced distribution data, through the local storage or the external bus, to the data distribution address or to the target distribution address determined according to the offset address flag;
acquiring the storage address of the corresponding sub-class descriptor from the configuration information according to the template data change flag; obtaining the sub-class descriptor information from the local storage according to the storage address of the sub-class descriptor, and replacing the corresponding information in the task descriptor with the sub-class descriptor information; and carrying out data distribution according to the replaced task descriptor.
Optionally, when the data processor is a data collector, the task descriptor includes: a data collection address, a burst transfer length threshold, and at least one of: an offset address flag, an external bus ID change flag, an invalid flag, and a last bit flag;
correspondingly, the configuration information comprises: the storage address of the task descriptor, the storage address of the collected data, the task return information, and an offset address value.
Optionally, the data collector is specifically configured for at least one of the following:
acquiring the corresponding offset address value from the configuration information according to the offset address flag; determining a target collection address according to the offset address value and the data collection address, collecting data at the target collection address through the local storage or the external bus, and storing the collected data at the storage address of the collected data;
allocating a new ID according to the external bus ID change flag, and collecting data using the new ID; while the data collector collects data, tasks with the same ID return data in order, whereas tasks with different IDs may return data interleaved and out of order;
dynamically enabling or disabling the task descriptor according to the invalid flag;
and ending the data collection task according to the last bit flag.
Optionally, when the data processor is a data comparator, the task descriptor includes: an address of the data to be compared, preset comparison data, comparison bits, a comparison mode, a match flag, a signaling address, signaling data and a last bit flag;
correspondingly, the configuration information comprises: the storage address of the task descriptor and the task return information.
Optionally, the data comparator is specifically configured to:
according to the match flag, when no match has yet succeeded, obtaining the data to be compared from the address of the data to be compared; comparing the data to be compared with the preset comparison data according to the comparison bits and the comparison mode;
updating the match flag, the signaling address and the signaling data according to the comparison result;
and ending the data comparison task according to the last bit flag.
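The comparison step listed above can be sketched in Python as follows. This is only an illustrative model, not the patented hardware: the dictionary keys, the treatment of the comparison bits as a bit mask, the two comparison modes ("eq" and "ne"), and the reading of "signaling" as writing the signaling data to the signaling address on a match are all assumptions that the text does not spell out.

```python
def compare_step(desc, memory, signals):
    """Run one comparison for a descriptor; return the (possibly updated) descriptor.

    desc    : dict with hypothetical keys cmp_addr, expected, cmp_bits, mode,
              matched, sig_addr, sig_data
    memory  : dict mapping address -> value (stands in for local storage / bus)
    signals : list collecting (address, data) pairs emitted on a match
    """
    if desc["matched"]:
        # A match already succeeded, so nothing is sampled again.
        return desc
    value = memory[desc["cmp_addr"]]
    mask = desc["cmp_bits"]                      # assumed: bits that take part in the comparison
    lhs, rhs = value & mask, desc["expected"] & mask
    hit = (lhs == rhs) if desc["mode"] == "eq" else (lhs != rhs)
    if hit:
        # Assumed signaling behaviour: emit the signaling data to the signaling address.
        signals.append((desc["sig_addr"], desc["sig_data"]))
        return {**desc, "matched": True}
    return desc


memory = {0x40: 0xAB12}
desc = {"cmp_addr": 0x40, "expected": 0x0012, "cmp_bits": 0x00FF, "mode": "eq",
        "matched": False, "sig_addr": 0x80, "sig_data": 1}
signals = []
after = compare_step(desc, memory, signals)
```

Only the low byte takes part in the comparison here, so the sampled value 0xAB12 matches the preset value 0x0012.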
Optionally, the configuration information further includes: priority information;
an arbiter is provided in the data processor.
Optionally, the data processor is specifically configured to:
acquiring priority information in the configuration information through the configuration interface, and establishing connection with a target configuration interface according to the priority information and an internally arranged arbiter;
acquiring target configuration information through a target configuration interface; acquiring corresponding task descriptors in local storage according to the target configuration information; and analyzing the task descriptor, and accessing a local storage or an external bus to perform data processing according to the analysis result.
According to the technical scheme of the embodiments of the invention, the data processing device comprises a data processor, a controller, a configuration interface, an external bus and a local storage. The controller is communicatively connected to the external bus, the local storage and the configuration interface, respectively; the configuration interface is communicatively connected to the data processor; and the data processor is communicatively connected to the external bus and the local storage, respectively. The controller configures a data processing task, transmits the configuration information corresponding to the data processing task to the configuration interface, and transmits the task descriptor corresponding to the data processing task to the local storage. The data processor acquires the configuration information through the configuration interface, acquires the corresponding task descriptor from the local storage according to the configuration information, parses it, and accesses the local storage or the external bus according to the parsing result to perform data processing. This addresses the problem of data processing in a system on chip, reduces the resource occupation of the controller during data processing, and achieves flexible data processing.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a data processing apparatus applied to a system on a chip according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a data processing apparatus applied to a system on a chip according to still another embodiment of the present invention;
FIG. 3 is a schematic layout diagram of task descriptors corresponding to a data distributor according to an embodiment of the present invention;
FIG. 4 is a schematic layout diagram of task descriptors corresponding to a data collector according to an embodiment of the present invention;
Fig. 5 is a schematic layout diagram of task descriptors corresponding to a data comparator according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Fig. 1 is a schematic structural diagram of a data processing apparatus applied to a system on a chip according to an embodiment of the present invention, where the embodiment is applicable to a case where the system on a chip performs data processing, for example, data distribution, data collection, or data comparison and the like.
As shown in fig. 1, the data processing apparatus 100 includes: a data processor 110, a controller 120, a configuration interface 130, an external bus 140, and local storage 150.
The external bus (External Bus) is a high-speed data bus interconnected with the various IPs elsewhere in the system on chip. The local storage (Local Memory) is the local memory of the device. The configuration interface (Program Interface) is the interface through which the controller issues DMA tasks. The controller may be a microcontroller unit (MCU) or another host on the external bus.
As shown in fig. 1, the specific structure of the data processing apparatus 100 is: the controller 120 is communicatively coupled to the external bus 140, the local store 150, and the configuration interface 130, respectively; the configuration interface 130 is communicatively coupled to the data processor 110; the data processor 110 is communicatively coupled to an external bus 140 and a local store 150, respectively.
Specifically, the controller is configured to perform data processing task configuration, transmit configuration information corresponding to the data processing task to the configuration interface, and transmit a task descriptor corresponding to the data processing task to the local storage.
The task descriptor may be used to indicate how a task is to be processed. Through specific settings in the task descriptor, operations such as temporarily replacing the data or the address involved in data processing, or sharing an overall descriptor template, can be performed. Because tasks are invoked from descriptors stored in advance, the configuration delay of the central processing unit is greatly reduced, and task configuration and bus operation can work asynchronously to form a pipeline, which greatly improves efficiency. Moreover, the special settings of the task descriptor, combined with the data processing flow, reduce repeated configuration for tasks that share an application scenario but differ in detail, improve the utilization of hardware resources, and make task configuration in data processing more flexible.
In the embodiment of the invention, the data processor is used for acquiring configuration information through the configuration interface and acquiring corresponding task descriptors in the local storage according to the configuration information; and analyzing the task descriptor, and accessing a local storage or an external bus to perform data processing according to the analysis result.
The configuration information and the task descriptor are used together, and between them they determine every processing operation in the data processing. For example, the configuration information may include the storage address of the task descriptor; the data processor obtains the task descriptor from the local storage according to this storage address, parses it, and performs data processing according to the parsing result. As another example, the configuration information may include an offset address value and the task descriptor may include an offset address flag. When the data processor parses the offset address flag in the task descriptor and finds that address replacement is required, it obtains the offset address value from the configuration information and generates the target address from it for data processing. As a further example, the configuration information may include task return information; when the data processor completes the data processing for a task descriptor, it can report the execution status of that task descriptor according to the task return information.
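The interplay between configuration information and task descriptors described above can be modelled with a short Python sketch. Everything concrete in it is an assumption made for illustration: the `ConfigInfo` class, the `kind` field used to pick a handler, the 4-byte address stride, and the representation of task return information as a tag are hypothetical and are not defined by the patent.

```python
from dataclasses import dataclass


@dataclass
class ConfigInfo:
    descriptor_addr: int  # storage address of the task descriptor in local storage
    return_tag: str       # task return information (hypothetical representation)


def run_task(config, local_storage, handlers, completions):
    """Fetch the descriptor named by the configuration, parse it, process it, report back."""
    descriptor = local_storage[config.descriptor_addr]  # fetch by storage address
    handler = handlers[descriptor["kind"]]              # "parse": pick the processing path
    handler(descriptor)                                  # access storage / bus
    completions.append((config.return_tag, "done"))     # feed back the task status


# A toy "scatter" handler that writes each data word to consecutive addresses
# (the 4-byte stride is an assumption).
written = {}


def scatter(desc):
    for i, word in enumerate(desc["data"]):
        written[desc["addr"] + 4 * i] = word


local_storage = {0x100: {"kind": "scatter", "addr": 0x2000, "data": [1, 2]}}
completions = []
run_task(ConfigInfo(0x100, "task-7"), local_storage, {"scatter": scatter}, completions)
```

The controller's only per-task work in this model is building the small `ConfigInfo` record; the descriptor itself was stored beforehand.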
In the case of specific data processing, the data processor may perform the data processing at a specific address via a local memory or an external bus, depending on the specific address to which the data processing relates.
In an alternative implementation of the embodiment of the present invention, the data processor includes at least one of: a data distributor, a data collector and a data comparator. The data distributor enables a host to distribute and configure discrete data to a plurality of slaves. The data collector enables a host to collect and store discrete data from a plurality of slaves. The data comparator enables a host to sample the data at one or more addresses and compare the sampled data.
Specifically, fig. 2 is a schematic structural diagram of yet another data processing apparatus applied to a system on chip according to an embodiment of the present invention. As shown in fig. 2, the data processing apparatus 100 includes: a data processor 110, a controller 120, a configuration interface 130, an external bus 140, and a local store 150; the data processor 110 includes: a data distributor 111, a data collector 112 and a data comparator 113.
According to the technical scheme, the data distributor, the data collector and the data comparator are arranged in the data processing device, and various types of data processing can be realized through the combined use of the configuration information and the task descriptor; during data processing, a controller such as a host is asynchronous with bus configuration, so that configuration delay of the controller is reduced, data processing efficiency is improved, and occupation of resources of the controller in data processing is avoided; moreover, based on the setting of the task descriptor, the problem of repeated setting of the sharable content in the data processing can be avoided, and the configuration flexibility of the data processing is improved; the register configuration and the data handling work can be parallelized through the task descriptor, and the processing efficiency is improved.
Based on the above embodiment, optionally, when the data processor is a data distributor (Scatter), the task descriptor includes: a data distribution address, distribution data, a burst transfer length threshold, and at least one of: an offset address flag, a data replacement flag, and a template data change flag; correspondingly, the configuration information comprises: the storage address of the task descriptor, the task return information, and at least one of: an offset address value, data replacement content, and the storage address of a sub-class descriptor.
In an alternative implementation of the embodiment of the present invention, the data distributor is specifically configured for at least one of the following: acquiring the corresponding offset address value from the configuration information according to the offset address flag, determining a target distribution address according to the offset address value and the data distribution address, and distributing the distribution data to the target distribution address through the local storage or the external bus; acquiring the corresponding data replacement content from the configuration information according to the data replacement flag, replacing the distribution data with the data replacement content, and distributing the replaced distribution data, through the local storage or the external bus, to the data distribution address or to the target distribution address determined according to the offset address flag; acquiring the storage address of the corresponding sub-class descriptor from the configuration information according to the template data change flag, obtaining the sub-class descriptor information from the local storage according to that storage address, replacing the corresponding information in the task descriptor with the sub-class descriptor information, and carrying out data distribution according to the replaced task descriptor.
Fig. 3 is a schematic layout diagram of the task descriptor corresponding to the data distributor according to an embodiment of the present invention. The fields of the task descriptor shown in Fig. 3 have the following meanings. DW denotes a doubleword, which occupies 32 bits. addr denotes the data distribution address; in an embodiment of the present invention it may be 48 bits wide, i.e. addr[47:0], occupying all of DW0 and bits 0 to 15 of DW1. data denotes the distribution data; multiple groups of distribution data can be set according to the needs of the data distribution task, so data may occupy multiple DWs. BL denotes the burst transfer length threshold and may occupy 4 bits, i.e. BL[3:0], for example bits 16 to 19 of DW1. The burst length is at most 16, and a burst cannot cross a 4 KB boundary. O denotes the offset address flag and may occupy 1 bit, i.e. O[0:0], for example bit 29 of DW1. RD denotes the data replacement flag and may occupy 1 bit, i.e. RD[0:0], for example bit 30 of DW1. TDC denotes the template data change flag and may occupy 1 bit, i.e. TDC[0:0], for example bit 28 of DW1. Rsvd denotes reserved bits, which may occupy bits 20 to 27 of DW1.
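To make the example bit positions concrete, the following Python sketch packs and unpacks the two header doublewords (DW0 and DW1) using the positions given for Fig. 3. The function names are hypothetical, and the BL field is stored as a raw 4-bit value because the text does not say how a burst length of up to 16 is encoded in 4 bits.

```python
def pack_header(addr, bl, tdc=0, o=0, rd=0):
    """Return (DW0, DW1) for the example layout: addr[47:0], BL[3:0], TDC, O, RD."""
    if not 0 <= addr < (1 << 48):
        raise ValueError("addr must fit in 48 bits")
    if not 0 <= bl < 16:
        raise ValueError("BL field must fit in 4 bits")
    dw0 = addr & 0xFFFFFFFF          # addr[31:0]  -> all of DW0
    dw1 = (addr >> 32) & 0xFFFF      # addr[47:32] -> DW1 bits 0..15
    dw1 |= (bl & 0xF) << 16          # BL          -> DW1 bits 16..19
    # DW1 bits 20..27 are reserved (Rsvd) and left as zero.
    dw1 |= (tdc & 1) << 28           # TDC         -> DW1 bit 28
    dw1 |= (o & 1) << 29             # O           -> DW1 bit 29
    dw1 |= (rd & 1) << 30            # RD          -> DW1 bit 30
    return dw0, dw1


def unpack_header(dw0, dw1):
    """Inverse of pack_header: recover the fields from DW0 and DW1."""
    return {
        "addr": ((dw1 & 0xFFFF) << 32) | dw0,
        "bl": (dw1 >> 16) & 0xF,
        "tdc": (dw1 >> 28) & 1,
        "o": (dw1 >> 29) & 1,
        "rd": (dw1 >> 30) & 1,
    }


dw0, dw1 = pack_header(0x123456789ABC, bl=3, o=1)
fields = unpack_header(dw0, dw1)
```

With these inputs `dw0` is `0x56789ABC` and `dw1` is `0x20031234`: the upper 16 address bits, BL = 3 at bits 16 to 19, and the O flag at bit 29.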
In practical applications, the specific setting of task descriptors provided in the embodiment of the present invention is not limited to the layout shown in fig. 3. Next, a specific application of the task descriptor in the data distribution task will be described with reference to fig. 3 as an example.
In particular, the task descriptor may be written by the controller into the local storage in the manner described above. A large number of task descriptors can be stored in the local storage, but the task descriptors corresponding to one operation must be stored contiguously in the local storage. Each Scatter task descriptor corresponds to one burst transfer (burst transaction) on the bus. Because the burst length is configurable, the number of data items in a task descriptor is not fixed, and neither is the length of the task descriptor. Specifically, a Scatter task descriptor may be 3 to 18 DWs long.
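The 3 to 18 DW range follows from simple arithmetic, sketched below in Python. The sketch assumes a two-DW header (DW0 and DW1, as in the Fig. 3 example) followed by one DW per data item, with 1 to 16 data items per burst; the function name is hypothetical.

```python
HEADER_DWS = 2        # DW0 and DW1 carry addr, BL and the flags
MAX_DATA_DWS = 16     # a burst carries at most 16 data items


def descriptor_length_dws(num_data_dws):
    """Length of one Scatter task descriptor in doublewords (assumed layout)."""
    if not 1 <= num_data_dws <= MAX_DATA_DWS:
        raise ValueError("a descriptor carries 1 to 16 data doublewords")
    return HEADER_DWS + num_data_dws


shortest = descriptor_length_dws(1)
longest = descriptor_length_dws(MAX_DATA_DWS)
```

Under these assumptions the shortest descriptor is 3 DWs and the longest is 18 DWs, which matches the stated range.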
When a data distribution task needs to be executed, the controller can send a data distribution request through the configuration interface. The controller may configure the data distribution request by writing registers. The data distribution request includes the configuration information. Optionally, the configuration information includes priority information. The data distributor may obtain the configuration information at the configuration interface and determine the processing order of the data distribution tasks according to the priority information. Optionally, an arbiter is provided in the data distributor. For data distribution tasks of the same priority, the arbiter may arbitrate according to a round-robin mechanism. That is, the data distributor can determine a target configuration interface among a plurality of configuration interfaces by means of the priority information and the arbiter, acquire the corresponding target configuration information, and carry out the data distribution processing.
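A minimal Python sketch of this selection policy is given below: the highest priority wins, and ties are broken round-robin starting after the interface granted last. The class name, the convention that a larger number means higher priority, and the per-call `requests` dictionary are assumptions for illustration; the patent does not describe the arbiter at this level of detail.

```python
class Arbiter:
    """Pick a target configuration interface by priority, round-robin among ties."""

    def __init__(self, num_interfaces):
        self.num_interfaces = num_interfaces
        self.last = -1  # index of the interface granted most recently

    def grant(self, requests):
        """requests maps interface index -> priority (larger = higher, an assumption)."""
        if not requests:
            return None
        top = max(requests.values())
        candidates = {i for i, prio in requests.items() if prio == top}
        # Scan the interfaces in circular order, starting just after the last grant.
        for step in range(1, self.num_interfaces + 1):
            idx = (self.last + step) % self.num_interfaces
            if idx in candidates:
                self.last = idx
                return idx
        return None


arb = Arbiter(3)
grants = [arb.grant({0: 1, 1: 1, 2: 0}) for _ in range(3)]
urgent = arb.grant({0: 1, 2: 5})
```

Interfaces 0 and 1 share the top priority in the first three requests, so they alternate; the last request is won by interface 2 because its priority is higher.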
When the data distributor processes the data distribution task, the task descriptor can be analyzed to obtain the data distribution address and distribution data. The data distributor distributes the distribution data to the data distribution addresses through an external bus or local storage.
In a data distribution scenario, there may be a case where data is distributed to a plurality of isomorphic units, and a distribution task of each isomorphic unit corresponds to a set of task descriptors. The number of task descriptors required to configure these isomorphic units, the specification inside the corresponding task descriptors, and even the distribution data configured are the same. In order to reduce the configuration amount of the data distribution task and improve the data distribution efficiency when configuring a plurality of isomorphic units, task descriptors can be multiplexed. In particular, task descriptor multiplexing may be implemented by modifying only the address of the data distribution.
Specifically, when the O flag in the task descriptor shown in Fig. 3 enables the offset address, the data distributor may obtain the corresponding offset address value from the configuration information and superimpose it on the data distribution address to obtain the target distribution address of each of the isomorphic units. The distribution data is then distributed to the target distribution address through the local storage or the external bus. This saves the time and resource overhead of repeatedly generating task descriptors, particularly when such data distribution operations have to be generated at short notice. Enabling the offset address may mean setting the O flag to 1.
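The address reuse described above can be illustrated with a few lines of Python. The sketch assumes that "superimposing" means adding the offset to the base address and that the result wraps to the 48-bit address width of the Fig. 3 example; the function name and the dictionary keys are hypothetical.

```python
ADDR_MASK = (1 << 48) - 1  # 48-bit address width from the Fig. 3 example


def target_address(descriptor, offset_value):
    """Return the address a descriptor distributes to for a given offset value."""
    if descriptor["o"] == 1:
        # O flag enabled: base address plus the offset from the configuration information.
        return (descriptor["addr"] + offset_value) & ADDR_MASK
    # O flag disabled: the offset is ignored.
    return descriptor["addr"]


shared = {"addr": 0x1000, "o": 1, "data": [0xA5]}
# One stored descriptor serves three isomorphic units that sit 0x10000 apart.
targets = [target_address(shared, unit * 0x10000) for unit in range(3)]
fixed = target_address({"addr": 0x1000, "o": 0, "data": [0xA5]}, 0x10000)
```

Only the offset value in the configuration information changes between the three units; the descriptor in local storage is written once.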
In a data distribution scenario, the distribution data may be undetermined when the task descriptors are generated, while the data distribution address and the other header information of the data distribution are already determined (the parts of the descriptor other than the distribution data, i.e. DW0 and DW1, can be regarded as header information). To avoid generating task descriptors on the fly in this scenario, the data in the task descriptor can be temporarily replaced when the system triggers the data distribution operation.
Specifically, when a task descriptor is generated whose distribution data is not yet known or may change later, an RD flag, that is, a data replacement flag, may be set in that task descriptor. When the system generates a data distribution task, the updated data replacement content provided through the configuration interface is obtained and replaces the data in the task descriptors whose RD flag is enabled. Enabling the RD flag may mean setting the RD flag to 1.
Only the data in all the task descriptors that enable the RD flag can be replaced by the same data sent by the configuration interface in one operation by the RD flag. In order to enhance the data replacement capability, a TDC flag is set in the task descriptor provided by the embodiment of the invention.
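The RD behaviour, including the limitation just noted, can be sketched in Python as follows. The sketch assumes the data replacement content is a list of doublewords that replaces a descriptor's data wholesale; how hardware maps the replacement content onto descriptors of different lengths is not specified in the text, and the names are hypothetical.

```python
def apply_rd(descriptors, replacement):
    """Return the descriptors of one operation with RD-enabled data replaced."""
    result = []
    for desc in descriptors:
        if desc["rd"] == 1:
            # Every RD-enabled descriptor receives the same replacement content.
            desc = {**desc, "data": list(replacement)}
        result.append(desc)
    return result


operation = [
    {"addr": 0x1000, "rd": 1, "data": [0]},
    {"addr": 0x2000, "rd": 0, "data": [7]},
    {"addr": 0x3000, "rd": 1, "data": [0]},
]
replaced = apply_rd(operation, [0xCAFE])
```

Both RD-enabled descriptors end up carrying the same value, which is exactly the restriction that motivates a finer-grained mechanism.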
The TDC flag allows data to be replaced in batches, in one-to-one correspondence at DW granularity. When the system uses scatter for data distribution tasks, the individual pieces of configuration information are not updated at the same frequency. For example, a certain address may be configured many times during data distribution, with information such as the address and the data amount unchanged while the data itself is dynamic. If the unchanging information were regenerated each time a task descriptor is generated, resources would be wasted. If the task descriptor were generated only after the dynamic information is determined, a large system delay could result and degrade performance.
In the embodiment of the invention, the TDC flag allows the content of the task descriptors to be split into two parts according to update frequency. Specifically, task descriptors can be divided into template descriptors and subclass descriptors. A template descriptor includes all descriptor information, whereas a subclass descriptor contains only the frequently changing information. The subclass descriptor may be, but is not limited to, distribution data; the following description takes distribution data as the example. One set of template descriptors may correspond to multiple sets of subclass descriptors. The number of data items in each set of template descriptors equals the number of descriptors in each set of subclass descriptors, and the two correspond one-to-one. The specific distribution data may not be known when the system configures the task descriptors and may be determined only when a data distribution operation is generated. The distribution data stored in advance in the multiple sets of subclass descriptors therefore serves as a set of alternatives. In use, the storage address of the target subclass descriptors is sent to the data distribution operation unit through the configuration interface to make the selection. With the TDC flag enabled, the data distribution operation reads a set of template descriptors and a set of subclass descriptors at the same time, splices them by replacing the distribution data in the template descriptors with the distribution data of the subclass descriptors, and generates the final distribution operation.
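The template and subclass splicing can be illustrated with a short sketch. It assumes that local storage is a dictionary keyed by storage address, that each subclass set is a plain list of distribution data, and that template descriptors are dictionaries; the names `splice` and `local_storage` and the addresses used are hypothetical.

```python
def splice(template, subclass):
    """Merge one template set with one subclass set, entry by entry."""
    if len(template) != len(subclass):
        raise ValueError("template and subclass sets must correspond one-to-one")
    # Only the frequently changing field (here: the distribution data) is replaced.
    return [{**t, "data": s} for t, s in zip(template, subclass)]


# One template set and two alternative subclass sets prepared in advance.
local_storage = {
    0x8000: [{"addr": 0x100, "data": 0}, {"addr": 0x104, "data": 0}],  # template
    0x9000: [0x11, 0x22],                                              # subclass set A
    0x9100: [0x33, 0x44],                                              # subclass set B
}

# The configuration interface selects a subclass set by its storage address.
selected_subclass_addr = 0x9100
final = splice(local_storage[0x8000], local_storage[selected_subclass_addr])
```

Unlike the RD flag, each descriptor here receives its own replacement value, and switching between prepared alternatives only requires sending a different storage address.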
It should be noted that the data distributor provided by the embodiment of the present invention is capable of parallel operation. Based on the device structure shown in fig. 2, once the write requests of all task descriptors have been sent, the data distributor can process the next data distribution task request without waiting for all write response signals to return. The hardware unit of the data distributor can automatically allocate and record the identity information of all bus transfers according to the order of the different operation tasks, and automatically count the number of write requests and the number of returned write response signals, so as to determine whether a given data distribution task has fully completed. Once a data distribution task has fully completed, the data distributor hardware resources occupied by the task can be released automatically.
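The request and response counting can be modeled as below. This is a software sketch of the bookkeeping only, with a hypothetical class `WriteTracker`; how the hardware assigns identity information to bus transfers is not modeled.

```python
class WriteTracker:
    """Counts write requests and write responses per distribution task."""

    def __init__(self):
        self._open = {}  # task id -> {"sent": int, "acked": int, "sealed": bool}

    def request(self, task):
        """Record one write request issued for `task`."""
        entry = self._open.setdefault(task, {"sent": 0, "acked": 0, "sealed": False})
        entry["sent"] += 1

    def seal(self, task):
        """Mark that all write requests of `task` have been issued."""
        self._open[task]["sealed"] = True
        return self._release_if_done(task)

    def response(self, task):
        """Record one returned write response; True if the task is now complete."""
        self._open[task]["acked"] += 1
        return self._release_if_done(task)

    def _release_if_done(self, task):
        entry = self._open[task]
        if entry["sealed"] and entry["acked"] == entry["sent"]:
            del self._open[task]  # release the resources held by the task
            return True
        return False

    def busy(self):
        """Return the tasks that still hold resources."""
        return sorted(self._open)
```

A second task can call `request` while the first still has outstanding responses, which mirrors the ability to start the next task before all write responses have returned.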
The configuration information contains task return information. When the data distribution operation has fully completed, the data distributor can additionally generate an end notification according to the task return information and send a task return signal to the designated address. The task return signal can be sent as multiple transfers, and the designated address and the task return data can be flexibly configured and used for various purposes.
In an alternative implementation of the embodiment of the present invention, when the data processor is a data collector (Gather), the task descriptor includes: a data collection address, a burst transfer length threshold, and at least one of: an offset address flag, an external bus ID change flag, an invalid flag, and a last bit flag; correspondingly, the configuration information comprises: a deposit address for the task descriptor, a deposit address for the collected data, task return information, and an offset address value.
In an alternative implementation of the embodiment of the present invention, the data collector is specifically configured for at least one of the following: acquiring a corresponding offset address value from the configuration information according to the offset address flag; determining a target collection address according to the offset address value and the data collection address, collecting data at the target collection address through the local storage or the external bus, and storing the collected data at the storage address of the collected data; allocating a new ID according to the external bus ID change flag, and collecting data by using the new ID, wherein, while the data collector collects data, transfers with the same ID return data in order and transfers with different IDs may return data interleaved and out of order; dynamically enabling or disabling the task descriptor according to the invalid flag; and ending the data collection task according to the last bit flag.
Fig. 4 is a schematic layout diagram of task descriptors corresponding to a data collector according to an embodiment of the present invention. The fields of the task descriptor shown in fig. 4 have the following meanings. Addr represents the data collection address. For example, in an embodiment of the present invention, the data collection address may be set to 48 bits; Addr[47:0] indicates that the data collection address occupies 48 bits in total, specifically all of DW0 and bits 0 to 15 of DW1. Here DW represents a doubleword, which occupies 32 bits. BL represents the burst transfer length threshold and may occupy 4 bits, namely BL[3:0]. For example, BL may occupy bits 16 to 19 of DW1. BL is at most 16, and a burst cannot cross a 4KB boundary. O represents the offset address flag, which may occupy 1 bit, i.e., O[0:0]. For example, O may occupy bit 28 of DW1. IDC represents the external bus ID change flag, which may occupy 1 bit, i.e., IDC[0:0]. For example, IDC may occupy bit 29 of DW1. I represents the invalid flag and may occupy 1 bit, i.e., I[0:0]. For example, I may occupy bit 30 of DW1. L represents the last bit flag, which may occupy 1 bit, i.e., L[0:0]. For example, L may occupy bit 31 of DW1. Rsvd represents reserved bits, which may occupy bits 20 to 27 of DW1.
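The example layout can be expressed as a pair of pack and unpack helpers. The sketch below follows the bit positions given for fig. 4; it assumes that DW0 holds the low 32 bits of Addr and DW1[15:0] the high 16 bits, which the text does not state explicitly, and it stores BL as a raw 4-bit field without interpreting its encoding. The function names are hypothetical.

```python
def pack_gather(addr, bl, o=0, idc=0, i=0, l=0):
    """Return (DW0, DW1) for a gather descriptor in the example layout."""
    if not 0 <= addr < (1 << 48):
        raise ValueError("Addr must fit in 48 bits")
    if not 0 <= bl < (1 << 4):
        raise ValueError("BL field must fit in 4 bits")
    dw0 = addr & 0xFFFFFFFF                      # Addr[31:0]
    dw1 = (
        ((addr >> 32) & 0xFFFF)                  # Addr[47:32] in bits 0..15
        | (bl << 16)                             # BL in bits 16..19
        | ((o & 1) << 28)                        # offset address flag
        | ((idc & 1) << 29)                      # external bus ID change flag
        | ((i & 1) << 30)                        # invalid flag
        | ((l & 1) << 31)                        # last bit flag
    )                                            # bits 20..27 stay reserved (0)
    return dw0, dw1


def unpack_gather(dw0, dw1):
    """Decode (DW0, DW1) back into named fields."""
    return {
        "addr": dw0 | ((dw1 & 0xFFFF) << 32),
        "bl": (dw1 >> 16) & 0xF,
        "o": (dw1 >> 28) & 1,
        "idc": (dw1 >> 29) & 1,
        "i": (dw1 >> 30) & 1,
        "l": (dw1 >> 31) & 1,
    }


dw0, dw1 = pack_gather(0x1234_5678_9ABC, bl=7, o=1, l=1)
fields = unpack_gather(dw0, dw1)
```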
In practical applications, the specific setting of task descriptors provided in the embodiment of the present invention is not limited to the layout shown in fig. 4. Next, a specific application of the task descriptor in the data collection task will be described with reference to fig. 4.
Specifically, each task descriptor of the gather corresponds to a burst read request on the bus. The length of a gather task descriptor is fixed at 2 DWs. The controller may send a data collection operation request through the configuration interface. The data collection operation request includes configuration information. The configuration information in the configuration interface is configured by HOST by writing registers. When a request for a gather operation is generated, in addition to the request signal, some configuration information of the gather operation must be sent directly to the gather operation unit by the configuration interface.
Optionally, the configuration information includes priority information. The gather operation unit can perform arbitration according to the priority, using the arbiter arranged inside the data collector. The specific process is the same as for the data distributor and is not described again here. The request that wins arbitration triggers the state machine to begin working. The data collector reads the task descriptor from the local storage according to the storage address of the task descriptor in the configuration information, and parses the task descriptor to obtain the data collection address. A read request is then sent to the data collection address via the local storage or the external bus, and the returned read data is written to the storage address of the collected data in the order of the task descriptors.
In a data collection scenario, to avoid repeatedly generating task descriptors for isomorphic units and to reduce resource overhead, an offset address flag can be set in the task descriptors of the gather, as with the scatter, so as to perform address replacement. The address replacement capability of the gather is similar to that of the scatter, except that the address replaced by the gather is the data collection address. The details are not repeated here.
In a data collection scenario, the objects from which data is collected are mostly at the far end of the bus. To improve access efficiency, the data collection operation supports dynamic allocation of the IDs of data transfers on the bus, in accordance with the characteristics of the bus, so that access behavior balances efficiency against order preservation. The dynamic allocation of IDs is controlled by the IDC flag (ID change flag) that software marks in the task descriptors of the data collection. When the gather operation starts to parse task descriptors and generate read requests, an ID is allocated for data transfers on the bus. This ID is used until the result of parsing a task descriptor contains the IDC flag (e.g., the IDC flag is set to 1), at which point a brand-new ID is allocated for the subsequent data transfers of the gather operation. Data with the same ID is returned in the same order in which the requests were sent, whereas data transfers with different IDs may be returned interleaved and out of order, which improves efficiency. Generally, accesses to the same gather access object on the bus use the same ID so that its access requests are returned in order, and no order preservation is needed between different access objects. In addition, so that data returned out of order is still stored at the storage address of the collected data in the order of the task descriptors, the gather also records internally the issue order of each transfer when its read request is generated, and finally determines from this information the address at which the data is to be stored.
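The ID allocation and the in-order placement of out-of-order returns can be sketched as follows. The sketch makes several assumptions that the text leaves open: IDs are modeled as increasing integers, the descriptor that carries the IDC flag is the first one to use the new ID, and each return is tagged with the index of the descriptor that issued it. The names `assign_ids`, `destination_offsets`, and `place` are hypothetical.

```python
def assign_ids(descriptors):
    """Return the bus ID used by each descriptor; IDC = 1 starts a fresh ID."""
    ids, current = [], 0          # ID 0 models the ID allocated when parsing starts
    for desc in descriptors:
        if desc["idc"]:
            current += 1          # assumption: a new ID is the next integer
        ids.append(current)
    return ids


def destination_offsets(lengths):
    """Offset of each descriptor's data in the destination buffer, in issue order."""
    offsets, total = [], 0
    for n in lengths:
        offsets.append(total)
        total += n
    return offsets


def place(returns, lengths, buffer):
    """Store (descriptor_index, payload) returns at their in-order positions."""
    offsets = destination_offsets(lengths)
    for index, payload in returns:
        start = offsets[index]
        buffer[start:start + lengths[index]] = payload


lengths = [2, 1, 3]                               # bytes collected per descriptor
collected = bytearray(sum(lengths))
# Returns arrive out of order, yet land in descriptor order.
place([(2, b"efg"), (0, b"ab"), (1, b"c")], lengths, collected)
```

Recording the issue order (here the descriptor index) at request time is what allows the final address to be computed independently of the arrival order.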
To spare software from clearing the task descriptors when a gather operation is temporarily shut down and rewriting them when it is enabled again, an invalid flag, i.e., an I flag, may be set in the gather task descriptor. The upper-layer software may disable or enable a gather task descriptor by dynamically adjusting the I flag (invalid flag) in the task descriptor. For example, when the I flag is 1, the task descriptor is skipped. The gather hardware operation unit processes only valid descriptors (I flag is 0). The I flag in the task descriptor thus saves the resource overhead of handling task descriptors that have become invalid.
In the gather operation, when the last bit flag is enabled, it may be determined that the operation is complete once the current task descriptor has been processed. Enabling the last bit flag may consist of setting the L flag to 1. After the task is completed, an end notification may be issued according to the configuration information. The gather supports an end notification as a receipt. For example, the end notification may include three pieces of information: enable, address, and data. In addition, the identity of the current operation within the system's tasks can be marked according to the task return information. For example, the identity information may include the identity of higher-level hardware, threads, users, or task flows, which helps the higher-level structure sort out the dependencies between tasks.
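The effect of the I and L flags on descriptor processing can be sketched as a simple loop. Descriptors are modeled as dictionaries with hypothetical keys `addr`, `i`, and `l`; the sketch assumes that the L flag of a descriptor is honored even when that descriptor is marked invalid, which the text does not specify.

```python
def run_gather(descriptors):
    """Return the collection addresses that would be read, in order."""
    processed = []
    for desc in descriptors:
        if desc["i"] == 0:            # only valid descriptors are processed
            processed.append(desc["addr"])
        if desc["l"] == 1:            # last bit flag: the task ends here
            break
    return processed


table = [
    {"addr": 0x10, "i": 0, "l": 0},
    {"addr": 0x20, "i": 1, "l": 0},   # disabled by software, left in place
    {"addr": 0x30, "i": 0, "l": 1},   # last descriptor of the task
    {"addr": 0x40, "i": 0, "l": 0},   # never reached
]
reads = run_gather(table)
```

Re-enabling the second descriptor only requires clearing its I flag; the descriptor table itself does not have to be rewritten.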
In an optional implementation manner of the embodiment of the present invention, when the data processor is a data comparator (Polling), the task descriptor includes: a comparison data address, preset comparison data, comparison bits, a comparison mode, a match flag, a signaling address, signaling data, and a last bit flag; correspondingly, the configuration information includes: the storage address of the task descriptor, and the task return information.
In an optional implementation manner of the embodiment of the present invention, the data comparator is specifically configured to: when the match flag indicates that matching has not yet succeeded, obtain the data to be compared according to the comparison data address; compare the data to be compared with the preset comparison data according to the comparison bits and the comparison mode; update the match flag, the signaling address, and the signaling data according to the comparison result; and end the data comparison task according to the last bit flag.
Fig. 5 is a schematic layout diagram of task descriptors corresponding to a data comparator according to an embodiment of the present invention. The fields of the task descriptor shown in fig. 5 have the following meanings. poll_addr represents the comparison data address. For example, in an embodiment of the present invention, the comparison data address may be set to 48 bits; poll_addr[47:0] indicates that the comparison data address occupies 48 bits in total, specifically all of DW0 and bits 16 to 31 of DW2. Here DW represents a doubleword, which occupies 32 bits. cmp_data represents the preset comparison data and may occupy 32 bits, namely cmp_data[31:0]. For example, cmp_data may occupy all of DW4. cmp_mask represents the comparison bits, which may occupy 32 bits, i.e., cmp_mask[31:0]. For example, cmp_mask may occupy all of DW5. mode represents the comparison mode and may occupy 8 bits, i.e., mode[7:0]. For example, mode may occupy bits 0 to 7 of DW6. match represents the match flag, which may occupy 8 bits, i.e., match[7:0]. For example, match may occupy bits 16 to 23 of DW6. signal_addr represents the signaling address and may occupy 48 bits, i.e., signal_addr[47:0]. For example, signal_addr may occupy all of DW1 and bits 0 to 15 of DW2. signal_data represents the signaling data and may occupy 32 bits, i.e., signal_data[31:0]. For example, signal_data may occupy all of DW3. last represents the last bit flag, which may occupy 8 bits, namely last[7:0]. For example, last may occupy bits 24 to 31 of DW6. Rsvd represents reserved bits, which may occupy all of DW7 and bits 8 to 15 of DW6, reserved for different needs.
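The 8-DW example layout can be written down as pack and unpack helpers. The sketch follows the bit positions given for fig. 5 and assumes that the low 32 bits of each 48-bit address sit in the full doubleword (DW0 or DW1) and the high 16 bits in the DW2 half named in the text; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def pack_polling(poll_addr, signal_addr, signal_data, cmp_data, cmp_mask,
                 mode, match=0, last=0):
    """Return the eight doublewords DW0..DW7 of a Polling descriptor."""
    return [
        poll_addr & MASK32,                                  # DW0: poll_addr[31:0]
        signal_addr & MASK32,                                # DW1: signal_addr[31:0]
        ((signal_addr >> 32) & 0xFFFF)                       # DW2[15:0]
        | (((poll_addr >> 32) & 0xFFFF) << 16),              # DW2[31:16]
        signal_data & MASK32,                                # DW3
        cmp_data & MASK32,                                   # DW4
        cmp_mask & MASK32,                                   # DW5
        (mode & 0xFF) | ((match & 0xFF) << 16) | ((last & 0xFF) << 24),  # DW6
        0,                                                   # DW7: reserved
    ]


def unpack_polling(dw):
    """Decode DW0..DW7 back into named fields."""
    return {
        "poll_addr": dw[0] | (((dw[2] >> 16) & 0xFFFF) << 32),
        "signal_addr": dw[1] | ((dw[2] & 0xFFFF) << 32),
        "signal_data": dw[3],
        "cmp_data": dw[4],
        "cmp_mask": dw[5],
        "mode": dw[6] & 0xFF,
        "match": (dw[6] >> 16) & 0xFF,
        "last": (dw[6] >> 24) & 0xFF,
    }


words = pack_polling(0xAAAA_1111_2222, 0xBBBB_3333_4444, signal_data=5,
                     cmp_data=6, cmp_mask=0xFF, mode=2, match=1, last=1)
decoded = unpack_polling(words)
```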
In practical applications, the specific setting of task descriptors provided in the embodiment of the present invention is not limited to the layout shown in fig. 5. Next, a specific application of the task descriptor in the data comparison task will be described, taking the layout shown in fig. 5 as an example.
The data comparator can sample and check the data at multiple comparison data addresses in the address space of the system on chip. The data at these comparison data addresses may be continuously updated. A comparison data address may be the location where the result of a certain task or certain system monitoring information is stored. Polling can take the place of software in continuously polling the data at these addresses and comparing the results.
As shown in fig. 5, each task descriptor of Polling corresponds to a read request on the bus. The length of the Polling task descriptor is a fixed 8 DWs.
Specifically, the controller may send a data comparison operation request through the configuration interface. The data comparison operation request includes configuration information. The configuration information in the configuration interface is configured by HOST by writing registers. When a request for a Polling operation is generated, in addition to the request signal, some configuration information of the Polling operation as a whole must be sent directly to the Polling operation unit by the configuration interface.
Optionally, the configuration information includes priority information. Arbitration can be performed in the Polling operation unit according to the priority, using an arbiter provided inside the data comparator. The specific process is the same as for the data collector and the data distributor and is not described again here. The request that wins arbitration triggers the state machine to begin working. The data comparator reads the task descriptor from the local storage according to the storage address of the task descriptor in the configuration information, and parses the task descriptor to obtain the information of the polling operation. The information of the polling operation includes the comparison data address, the preset comparison data, the comparison bits, the comparison mode, the match flag, the signaling address, the signaling data, the last bit flag, and the like.
A plurality of programmable timers may be provided within Polling. The timers can determine the time interval and the number of polls of the data polling, and can repeatedly issue polling operation requests in place of software, so that processor and controller resources are not occupied.
The data comparator can read the data to be compared at the comparison data address and compare the data with preset comparison data in the polling operation unit. The preset comparison data is the target data or the expected result desired by the system. When the polling read data to be compared and the preset comparison data reach a preset relation, the result can be considered to be successfully matched.
Specifically, the preset relationship may be configured in the task descriptor. For example, the selectable relationships between the data to be compared and the preset comparison data include: equal, equal in some bits, unequal, greater than, less than, greater than or equal to, less than or equal to, number of 1 bits equal to the target value, number of 1 bits unequal to the target value, number of 1 bits less than or equal to the target value, number of 1 bits greater than or equal to the target value, and so on. The preset relationship may be represented by the comparison bits and the comparison mode.
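The listed relationships can be modeled with a small dispatch table. The patent does not give the numeric encoding of the comparison mode, so the string mode names below are hypothetical. The sketch further assumes that the comparison bits act as a mask on both operands, that "equal in some bits" is the `eq` mode with a partial mask, and that the bit-count modes compare the number of 1 bits in the masked data against the preset comparison data as the target value.

```python
import operator

# Relations applied to the masked values.
_RELATIONS = {
    "eq": operator.eq, "ne": operator.ne, "gt": operator.gt,
    "lt": operator.lt, "ge": operator.ge, "le": operator.le,
}
# Relations applied to the number of 1 bits in the masked data.
_ONES_RELATIONS = {
    "ones_eq": operator.eq, "ones_ne": operator.ne,
    "ones_le": operator.le, "ones_ge": operator.ge,
}


def matches(value, cmp_data, cmp_mask, mode):
    """Return True if `value` stands in the relation `mode` to `cmp_data`."""
    masked = value & cmp_mask
    if mode in _ONES_RELATIONS:
        return _ONES_RELATIONS[mode](bin(masked).count("1"), cmp_data)
    return _RELATIONS[mode](masked, cmp_data & cmp_mask)


# Only bits 4..7 take part in the comparison.
partial_equal = matches(0x12F4, 0x00F0, 0x00F0, "eq")
# Exactly two of the low two bits must be set.
two_bits_set = matches(0b1011, 2, 0b0011, "ones_eq")
```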
The matching result of a polling comparison can be automatically written back into the task descriptor. If a match has succeeded, the hardware may skip this task descriptor in the next polling round and fetch only the comparison data addresses that have not yet matched. Of course, the match flag of a task descriptor can also be flexibly modified by software. Furthermore, after all task descriptors have been consumed for one round, i.e., after all comparison data addresses have been polled once, the number of matching failures for that round as a whole can be recorded. Polling may return the number of matching failures to the configuration interface, which uploads it to a register so that it is visible to software. The number of matching failures can serve as a criterion for evaluating certain behaviors.
If the match flag in a task descriptor has a preset value, such as 1, the data at the corresponding comparison data address is considered to have matched successfully, the descriptor is skipped, and the next task descriptor is parsed directly. Otherwise, data is read from the parsed comparison data address and matched against the preset comparison data; after a successful match, the match flag of the task descriptor is updated and an end notification is sent. After all task descriptors have been parsed once, the Polling operation unit may send the number of matching failures to the configuration interface, and a timer inside Polling may initiate a polling request again after a certain delay.
A successful match can be recorded by means of the signaling address and the signaling data. For example, when the matching is successful, the comparison data address may be recorded at the signaling address, and the corresponding matching result may be recorded in the signaling data.
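One polling round can be sketched as follows. The sketch is a simplified software model: descriptors are dictionaries with hypothetical keys, the comparison is reduced to masked equality, a notification is modeled as a `(signal_addr, signal_data)` pair, and the timer-driven repetition is left to the caller. None of these names come from the patent.

```python
def poll_round(descriptors, read):
    """Process every descriptor once; return (notifications, failure_count)."""
    notifications, failures = [], 0
    for desc in descriptors:
        if desc["match"] != 1:                       # matched descriptors are skipped
            value = read(desc["poll_addr"])
            if (value & desc["cmp_mask"]) == (desc["cmp_data"] & desc["cmp_mask"]):
                desc["match"] = 1                    # write the result back
                notifications.append((desc["signal_addr"], desc["signal_data"]))
            else:
                failures += 1
        if desc["last"]:                             # last bit flag ends the round
            break
    return notifications, failures


memory = {0x10: 1, 0x20: 0}
descs = [
    {"poll_addr": 0x10, "cmp_data": 1, "cmp_mask": 0xFF, "match": 0,
     "signal_addr": 0x100, "signal_data": 0xA, "last": 0},
    {"poll_addr": 0x20, "cmp_data": 1, "cmp_mask": 0xFF, "match": 0,
     "signal_addr": 0x200, "signal_data": 0xB, "last": 1},
]
first = poll_round(descs, memory.__getitem__)    # one match, one failure
memory[0x20] = 1                                 # the polled data changes
second = poll_round(descs, memory.__getitem__)   # only the second descriptor is read
```

Because the match flag is written back, the second round reads only the address that had not yet matched, and the failure count drops to zero once every descriptor has matched.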
The last bit flag is applied similarly to the data collector and will not be described here again.
After the polling operation ends, the data comparator may issue an end notification. The end notification of Polling differs from that of the scatter and the gather: it is not generated after the whole polling operation has finished, but corresponds one-to-one to the task descriptors, that is, a result notification is generated after each successful match. The specific notification content of Polling may be configured in the same way as that of the scatter or the gather and is not described again here.
According to the technical scheme, the data distributor, the data collector and the data comparator are arranged in the system on chip, interaction between different data processors and the controller, the configuration interface, the external bus and the local storage is realized through the task descriptor, data processing is carried out, bus configuration and controller configuration can be asynchronous, and data processing efficiency is improved; especially, configuration multiplexing can be performed through specific descriptor setting, resource waste is avoided, and task configuration flexibility is improved.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (7)

1. A data processing apparatus for use in a system on a chip, the data processing apparatus comprising: the system comprises a data processor, a controller, a configuration interface, an external bus and a local storage;
the controller is respectively in communication connection with the external bus, the local storage and the configuration interface; the configuration interface is in communication connection with the data processor; the data processor is respectively in communication connection with the external bus and the local storage;
the controller is used for configuring data processing tasks, transmitting configuration information corresponding to the data processing tasks to the configuration interface, and transmitting task descriptors corresponding to the data processing tasks to the local storage;
the data processor is used for acquiring the configuration information through the configuration interface and acquiring a corresponding task descriptor in the local storage according to the configuration information; analyzing the task descriptor, and accessing a local storage or an external bus to perform data processing according to an analysis result;
wherein the data processor comprises at least one of the following: a data distributor, a data collector and a data comparator;
The task descriptor includes: a data distribution address, distribution data, a burst transfer length threshold, at least one of: an offset address flag, a data replacement flag, and a template data modification flag;
correspondingly, the configuration information comprises: the storage address of the task descriptor, the task return information, and at least one of the following: offset address value, data replacement content, and storage address of subclass descriptor;
the data distributor is specifically used for at least one of the following:
acquiring a corresponding offset address value from the configuration information according to the offset address mark; determining a target distribution address according to the offset address value and the data distribution address, and distributing the distribution data to the target distribution address through a local storage or an external bus;
acquiring corresponding data replacement content from the configuration information according to the data replacement mark; the distributed data is replaced by the data replacement content, and the distributed data is distributed to a data distribution address or a target distribution address determined according to the offset address mark through a local storage or an external bus;
Acquiring the storage address of the corresponding sub-category descriptor from the configuration information according to the template data change mark; obtaining sub-class descriptor information in local storage according to the storage address of the sub-class descriptor, and replacing corresponding information in the task descriptor by adopting the sub-class descriptor information; and carrying out data distribution according to the replaced task descriptor.
2. The apparatus of claim 1, wherein, when the data processor is a data collector, the task descriptor comprises: a data collection address, a burst transfer length threshold, and at least one of: an offset address flag, an external bus ID change flag, an invalid flag, and a last bit flag;
correspondingly, the configuration information comprises: the task descriptor's deposit address, the deposit address of the collected data, the task return information, and the offset address value.
3. The apparatus according to claim 2, wherein the data collector is specifically configured for at least one of:
acquiring a corresponding offset address value from the configuration information according to the offset address mark; determining a target collection address according to the offset address value and the data collection address, collecting data at the target collection address through a local storage or an external bus, and storing the collected data at a storage address of the collected data;
Allocating a new ID according to the external bus ID change mark, and collecting data by using the new ID; in the process of collecting data by the data collector, the task processing with the same ID returns data in sequence, and the task processing with different IDs interweaves the returned data in disorder;
dynamically starting or closing the task descriptor according to the invalid mark;
and ending the data collection task according to the last bit mark.
4. The apparatus of claim 1, wherein, when the data processor is a data comparator, the task descriptor comprises: a comparison data address, preset comparison data, comparison bits, a comparison mode, a matching mark, a signaling address, signaling data and a last bit mark;
correspondingly, the configuration information comprises: the storage address of the task descriptor and the task return information.
5. The apparatus of claim 4, wherein the data comparator is specifically configured to:
according to the matching mark, when the matching is not successful, the data to be compared is obtained according to the comparison data address; comparing the data to be compared with the preset comparison data according to the comparison bit and the comparison mode;
Updating the matching mark, the signaling address and the signaling data according to the comparison result;
and finishing the data comparison task according to the last bit mark.
6. The apparatus of any one of claims 1, 2, or 4, wherein the configuration information further comprises: priority information;
an arbiter is arranged in the data processor.
7. The apparatus of claim 6, wherein the data processor is configured to:
acquiring priority information in configuration information through the configuration interface, and establishing connection with a target configuration interface according to the priority information and an internally arranged arbiter;
acquiring target configuration information through the target configuration interface; acquiring corresponding task descriptors in local storage according to the target configuration information; and analyzing the task descriptor, and accessing a local storage or an external bus to perform data processing according to the analysis result.
CN202311266234.6A 2023-09-28 2023-09-28 Data processing device applied to system on chip Active CN117009265B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311266234.6A CN117009265B (en) 2023-09-28 2023-09-28 Data processing device applied to system on chip

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311266234.6A CN117009265B (en) 2023-09-28 2023-09-28 Data processing device applied to system on chip

Publications (2)

Publication Number Publication Date
CN117009265A CN117009265A (en) 2023-11-07
CN117009265B true CN117009265B (en) 2024-01-09

Family

ID=88576539

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311266234.6A Active CN117009265B (en) 2023-09-28 2023-09-28 Data processing device applied to system on chip

Country Status (1)

Country Link
CN (1) CN117009265B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1713164A (en) * 2005-07-21 2005-12-28 复旦大学 DMA controller and data transmission with multi-transaction discretionary process
US9053093B1 (en) * 2013-08-23 2015-06-09 Altera Corporation Modular direct memory access system
CN112650558A (en) * 2020-12-29 2021-04-13 优刻得科技股份有限公司 Data processing method and device, readable medium and electronic equipment

Also Published As

Publication number Publication date
CN117009265A (en) 2023-11-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant