TW200540644A - A single chip protocol converter - Google Patents

A single chip protocol converter

Info

Publication number
TW200540644A
Authority
TW
Taiwan
Prior art keywords
protocol
chip
conversion
processor
soc
Prior art date
Application number
TW94100086A
Other languages
Chinese (zh)
Other versions
TWI338231B (en)
Inventor
Christos J Georgiou
Victor L Gregurick
Indira Nair
Valentina Salapura
Original Assignee
IBM
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 10/768,828 (US7412588B2)
Application filed by IBM
Publication of TW200540644A
Application granted
Publication of TWI338231B

Landscapes

  • Bus Control (AREA)
  • Communication Control (AREA)

Abstract

A single chip protocol converter integrated circuit (IC) capable of receiving packets generated according to a first protocol type, processing said packets to implement protocol conversion, and generating converted packets of a second protocol type for output, the process of protocol conversion being performed entirely within the single integrated circuit chip. The single chip protocol converter can further be implemented as a macro core in a system-on-chip (SoC) implementation, wherein the process of protocol conversion is contained within an SoC protocol conversion macro core without requiring the processing resources of a host system. Packet conversion may additionally entail receiving packets generated according to a first protocol version level and processing said packets to generate converted packets according to a second protocol version level within the same protocol family type. The single chip protocol converter integrated circuit and SoC protocol conversion macro implementation include multiprocessing capability, including processor devices that are configurable to adapt and modify the operating functionality of the chip.

Description

IX. Description of the Invention:

[Technical Field of the Invention]

The present invention relates to network processor devices and storage area networks and, more specifically, to a system and method for spanning multiple network protocols by providing an architecture for protocol conversion. The protocol conversion is implemented within a single IC chip or as a sub-processor core component in a conventional SoC, DSP, FPGA, or similar integrated circuit subsystem.

[Prior Art]

As the market shifts toward storage area network (SAN) and network-attached storage (NAS) systems, and with the enormous expansion of the Internet, new demands on server and storage design have emerged. Storage attached via parallel SCSI connections is being replaced by Fibre Channel (FC) storage area networks (SANs) and other emerging networking architectures such as iSCSI and Fibre Channel over IP (FC-IP). iSCSI involves the transfer of block data over TCP/IP networks, typically built around Gigabit Ethernet, while FC-IP is an Internet Protocol (IP) based storage networking technology that enables the transmission of FC information by tunneling data between SAN facilities over IP networks.

General-purpose CPUs either cannot meet the computational requirements of network protocol conversion or are too expensive in terms of unit cost, space, and power. This has led to the offloading of many networking and protocol processing functions from the host processor to host bus adapters (HBAs) or network interface controllers (NICs). Initially, most HBAs and NICs were implemented in ASICs using hardwired logic. But as the need arose to implement complex network protocols such as TCP/IP or iSCSI, programmable solutions became more attractive because of the advantages they offer: they can accommodate different and evolving protocols; they are easily upgraded through program changes; and they offer a faster time to market.

Existing SANs are often physically remote, sometimes separated by large distances, and often use multiple network architectures. To consolidate existing SANs and to make use of existing WAN and LAN infrastructure, there is a need for network protocol conversion in both data communications and telecommunications. Protocol conversion would allow seamless integration and operation of all the different parts of the system.
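The tunneling idea mentioned above for FC-IP (carrying FC information across an IP network between SAN sites) can be pictured with a small sketch. It is only an illustration under stated assumptions: the 8-byte tunnel header, the `TUNNEL_MAGIC` value, and the function names are hypothetical and do not reproduce the real FC-IP encapsulation format.

```python
import struct

# Hypothetical tunnel header: 4-byte magic + 4-byte payload length (big-endian).
# This is NOT the real FC-IP encapsulation header; it only illustrates tunneling.
TUNNEL_MAGIC = 0x46434950  # ASCII "FCIP", chosen for illustration only
HEADER = struct.Struct(">II")


def encapsulate(fc_frame: bytes) -> bytes:
    """Wrap an opaque FC frame so it can travel as payload over an IP-based transport."""
    return HEADER.pack(TUNNEL_MAGIC, len(fc_frame)) + fc_frame


def decapsulate(tunneled: bytes) -> bytes:
    """Recover the original FC frame at the far end of the tunnel."""
    if len(tunneled) < HEADER.size:
        raise ValueError("too short to hold a tunnel header")
    magic, length = HEADER.unpack_from(tunneled)
    if magic != TUNNEL_MAGIC:
        raise ValueError("not a tunneled frame")
    payload = tunneled[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated tunneled frame")
    return payload
```

The FC frame is treated as an opaque byte string: the tunnel endpoints add and remove their own header without interpreting the frame, which is what distinguishes tunneling from the protocol conversion discussed in the rest of this description.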

Brocade Communications Systems has announced a system-level protocol converter product for a multiprotocol fabric routing service [http://biz.yahoo.com/prnews/031028/ sftulOO-l.html], which is planned to provide Fibre Channel-to-Fibre Channel (FC-FC), iSCSI-to-FC bridging, and Fibre Channel-to-FC-IP translation.

Existing protocol converters integrate multiple chips on a card to obtain the desired logic functions or, more commonly, take the form of a host bus adapter (HBA) card plugged into an existing host system, or a daughter card on a main host I/O card, resulting in products that are bulky and more costly in terms of unit cost, space, and power. In addition, existing protocol converters are not programmable, or have very limited programmability, and are not easily upgraded to accommodate different or new protocols. Furthermore, a physical layer access module or chip usually has its implementation and circuit technology optimized for one particular physical layer protocol, so that the entire host bus adapter (HBA) card, or several components, must be replaced when a newer physical layer protocol is needed on a port. Conversion within the same physical I/O card is not commonly done, and it is not done within a single-chip solution or as an embedded core within an SoC semiconductor device.

FIG. 1 illustrates a system-on-chip (SoC) design 20 according to the prior art. It includes a processing element such as a PPC 440 (PowerPC) 25, a processor local bus (PLB) 21, an on-chip peripheral bus (OPB) 24, and a number of components, such as SRAM 15, a DDR controller 18, a PCI-X bridge 22, DMA 26 and a DMA controller 28, an Ethernet media access control (MAC) protocol device 50 employed to provide the data link layer for an Ethernet LAN system, a processor core timer 33 and an interrupt controller 35, and an OPB bridge 29 that interfaces the OPB bus 24 with the PLB 21. The prior art implementation depicted in FIG. 1 uses IBM's embedded PowerPC 440 processor core and the CoreConnect local bus, but similar configurations can be found that use other embedded processor cores, such as ARM (see, for example, http://www.arni.com/products/70penDocument) and MIPS (see "MIPS32 4KP - Embedded MIPS Processor Core" at http://www.ce.chalmers.se/~thomasl/inlE/mips32 4Kp brief.pdf) processing cores, and so on. As shown in FIG. 1, other devices provided for interfacing with the on-chip peripheral bus 24 include one or more of the following: a RAM/ROM peripheral controller 45a, an external bus master 45b, a UART device 45c, an inter-IC bus (I2C) interface 45d, a general-purpose I/O interface 45e, and a gateway interface 45f.

Related references describing aspects of SoC processor and component design include:

U.S. Patent No. 6,331,977 describes a system-on-chip (SoC) that contains a crossbar switch between a number of functional I/Os internal to the chip and a number of external connection pins, where the number of pins is smaller than the number of internal I/Os.

U.S. Patent No. 6,262,594 describes an apparatus and method for implementing a crossbar switch for configurable use of a group of pads of a system-on-chip.

U.S. Patent No. 6,038,630 describes an apparatus and method for implementing a crossbar switch that provides a shared access control device for an integrated system having multiple functional units that access external structures over multiple data buses.

U.S. Patent Application No. US 2002/0184419 describes an ASIC that uses a common bus system to enable the use of different components of a system-on-chip, and describes wrappers that allow functional units with different speeds and data widths to achieve compatibility with the common bus.

U.S. Patent Application No. US 2002/0176402 describes an octagonal interconnection network for linking functional units on an SoC. The functional units on the interconnection network are organized as a ring and use several crossing data links coupling halfway components.
U.S. Patent Application No. US 2001/0042147 describes a system resource router for SoC interconnection, which includes two channel sockets connecting each data cache (D-cache) and instruction cache (I-cache). Also included are an external data transfer initiator, two internal M-channel buses, and an M-channel controller to provide the interconnection.

U.S. Patent Application No. US 2002/0172197 describes a communication system that connects multiple transmitting and receiving devices in a point-to-point manner via a crossbar switch embedded on a chip.

U.S. Patent Application No. US 2001/0047465 describes several variations of an invention that provides a scalable architecture (typically an SoC or ASIC) for a communication system by dividing a transmission into individual transmission tasks, determining the computational complexity of each transmission task, and basing that computational complexity on the number of MIPS per circuit in order to minimize the total gate count.

The reference by A. Brinkmann, J.C. Niemann, I. Hehemann, D. Langen, M. Porrmann, and U. Ruckert, entitled "On-Chip Interconnects for Next Generation System-on-Chips," Proceedings of ASIC2003, September 26-27, 2003, Rochester, New York, describes an SoC architecture that uses active switch boxes to connect processor units and enable packet network communication. That paper does not mention or describe processor cores with multi-threading capability.

The reference by Kyeong Keol Ryu, Eung Shin, and Vincent J. Mooney, entitled "A Comparison of Five Different Multiprocessor SoC Bus Architectures," Proceedings of the Euromicro Symposium on Digital System Design (DSS'01), September 4-6, 2001, Warsaw, Poland, describes multiprocessor SoC bus architectures, including the Global Bus I Architecture (GBIA), the Global Bus II Architecture (GBIIA), the Bi-FIFO Bus Architecture (BFBA), the Crossbar Switch Bus Architecture (CSBA), and the CoreConnect Bus Architecture (CCBA).

Approaches based on a single embedded processor provide cost-effective, integrated solutions for some applications, but they may lack the computing power required by more demanding applications, and the flexibility needed for protocol conversion or future increases in protocol speed, for example, from 2.5 Gbps Fibre Channel to 10 Gbps Fibre Channel.
In recent years, the computing power of the SoC of FIG. 1 has been enhanced in many networking applications by adding dedicated processor cores (accelerators) 39 attached to the common bus (PLB), operating in parallel with the processor core 25, as shown in FIG. 2. These additional dedicated processor cores 39a, 39b, and so on usually occupy a smaller silicon area because many features found in typical general-purpose processors (for example, a memory management unit to support virtual addressing) are left out. Examples of this approach are IBM's PowerNP (see, for example, the reference by M. Heddes entitled "IBM Power Network Processor architecture," Proceedings of Hot Chips 12, Palo Alto, CA, USA, August 2000, IEEE Computer Society) and NEC's TCP/IP offload engine (see, for example, the reference entitled "NEC's New TCP/IP Offload Engine Powered by 10 Tensilica Xtensa Processor Cores" at http://www.tensilica.com/html/pr_2003_05_12.html). Although these systems are programmable, and therefore more flexible than hardwired accelerators, they suffer from several drawbacks: a) they induce additional traffic on the SoC bus (for example, the PLB 21), since the bus must now support both instruction and data streams to the processor accelerators, which can cause bandwidth contention and limit system performance; b) in an SoC system, the SoC bus is usually optimized not for multiprocessor performance but for compatibility with standardized components and connection protocols; and c) the processor accelerators 39 usually implement only a very limited instruction set and are programmed in assembly language, which makes the development and maintenance of applications running on them very difficult and expensive.

A third type of SoC design 75 is an embedded processor core connected via a crossbar switch, such as Motorola's MPC 5554 Microcontroller (Design News, November 3, 2003, page 38), a block diagram of which is depicted in FIG. 3.
As illustrated in FIG. 3, Motorola's SoC design consists of many elements similar to those of the SoC designs of FIGS. 1 and 2, including a PowerPC processor core, memory, and bus interfaces; more significantly, however, a 3 x 5 crossbar switch 72 is implemented as a replacement for one of the local buses. By incorporating the crossbar switch 72 into the SoC design, processor core communication can occur faster, with three (3) lines working simultaneously, which addresses the bandwidth contention problem to some extent. However, the SoC is still not optimized for multiprocessor support, for more advanced functions such as protocol conversion within a single SoC chip, or for high-speed interfaces. I/O communication within the chip is limited by the crossbar switch and still requires communication with the external bus interface and the host system bus, which limits the performance and flexibility of the microcontroller (SoC chip) with respect to any future upgrade. Any protocol conversion would need to be performed off-chip, in several stages or chips. In addition, data packets cannot be decoupled from instructions placed on the host system bus. In the example of FIG. 3, one protocol, for example the FlexCan (CAN protocol: "Controller Area Network") data stream commonly used in automotive applications, and other protocols such as DSPI ("Serial Peripheral Interface") and eSCI ("Enhanced Serial Communication Interface"), are implemented in the Motorola MPC 5554 chip via an external I/O bridge 78, where each protocol or I/O-specific data stream passes through an I/O bridge, the crossbar switch, and usually an internal chip bus or the external bus interface to the system bus.

Currently, there is no protocol conversion that takes place within a single chip, and there is no protocol conversion method in which an embedded core attached to an internal chip bus converts from one independent protocol or protocol version level to an entirely new protocol or version level.

Current protocol conversion takes place only at the system or card level and involves multiple chips, as mentioned earlier; one example is the previously mentioned Brocade Silkworm Fabric Application Server for SAN networks (see, for example, http://www.brocade.com/san/extending_value_of_SANs.jsp), shown in FIG. 4.
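The bandwidth-contention point made above for the shared-bus and crossbar-switch designs can be made concrete with a toy model. This is a deliberately simplified sketch, not a model of the PLB or of the MPC 5554: it assumes every transfer takes exactly one cycle, and the port names used in the example are hypothetical.

```python
from typing import List, Tuple

Transfer = Tuple[str, str]  # (initiating port, target port); names are illustrative


def shared_bus_cycles(transfers: List[Transfer]) -> int:
    """A single shared bus carries one transfer per cycle, so transfers serialize."""
    return len(transfers)


def crossbar_cycles(transfers: List[Transfer]) -> int:
    """A crossbar carries transfers in the same cycle if they share no port."""
    pending = list(transfers)
    cycles = 0
    while pending:
        busy_initiators, busy_targets = set(), set()
        deferred = []
        for initiator, target in pending:
            if initiator in busy_initiators or target in busy_targets:
                deferred.append((initiator, target))  # port conflict: wait a cycle
            else:
                busy_initiators.add(initiator)
                busy_targets.add(target)
        pending = deferred
        cycles += 1
    return cycles
```

In this model, three transfers between disjoint port pairs need three cycles on a shared bus but only one on the crossbar, while two transfers aimed at the same target still serialize on the crossbar. That matches the observation that the crossbar relieves contention only to some extent.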

In the prior art Brocade system 100 depicted conceptually in FIG. 4, capabilities such as Fibre Channel-to-Fibre Channel (FC-to-FC) routing 102, iSCSI-to-FC bridging 104, and Fibre Channel-to-FC-IP translation 110 are provided. Brocade's design is an improvement over the technology existing today in that a single fiber I/O port card can support multiple protocols and can even migrate from one protocol to another on the same I/O card without disturbing traffic on the other ports in the system. This is accomplished by separating data and control frames in the packet processing function, and by means of several in-line RISC processor chips with local memory and frame buffers, a software preprocessor, and a translation engine within the processor card. This is an improvement over a standard single HBA card: it allows two network protocols within a single HBA card, the flexibility to change protocols without disturbing traffic on the main system bus, and relief from data transfer overhead and from memory contention on the main system processor memory.
The multiprocessors in Brocade's approach are fully pipelined and attached to local memory.

It would be desirable to incorporate this functionality within a single chip rather than within a single HBA card or bridge card, enabling correct protocol conversion within the single chip, with data and control frames processed during the conversion so that a completed packet is delivered to the local SoC bus or system bus. This would enable a further potential reduction in I/O cards, savings in hardware (number of chips), less bandwidth contention and memory contention, higher protocol speeds, more processors within an SoC chip (or attached to a local system bus), and higher throughput.

[Summary of the Invention]

An object of the present invention is to provide a self-contained protocol converter on a single chip, or as an embedded SoC macro, that performs protocol conversion processing entirely within the single chip or within the embedded macro implementation, without requiring host system resources.

According to one aspect of the invention, an efficient protocol converter is provided on a single semiconductor chip, or as a single-chip embedded protocol converter macro for an SoC-type design, where the single-chip or embedded SoC macro implementation is capable of converting one communication protocol into a separate, new communication protocol, and/or of converting one communication protocol version level into another communication protocol version level. For example, the SoC embedded protocol converter macro or single-chip protocol converter can be configured to convert packets, within the single chip or the embedded SoC macro, from one protocol version level (for example, Fibre Channel 2 Gb/s) to another protocol version level (for example, Fibre Channel 10 Gb/s), or from one protocol to an entirely different protocol (for example, Fibre Channel to Ethernet or iSCSI, and so on).

Whether implemented as a single chip or as an embedded macro, the protocol converter includes one or more processor core assemblies, where each processor core assembly includes: two or more microprocessor devices capable of performing operations to implement the protocol conversion capability; a local storage device, associated with the two or more microprocessor devices, for storing at least one of data and instructions in each processor core assembly; one or more configurable interface devices capable of receiving and transmitting communication packets according to one or more communication protocols; and an interconnect means for enabling communication between the two or more microprocessor devices and the interface devices. Advantageously, therefore, the single-chip protocol converter and the embedded macro design include a means for scaling SoC-type designs to much higher protocol speeds, and the ability to incorporate a larger number of processors within an SoC implementation.

The single-chip or embedded protocol converter function can be realized by using a fully pipelined, multi-threaded, multi-processor chip design in which local memory is incorporated into the chip (or as an SoC-attached macro) to handle all the functions of protocol conversion (resizing, reformatting, control, partitioning) so as to deliver a completed packet to the local bus.

Preferably, the single-chip protocol converter and embedded macro design performs most of the protocol processing without requiring host system bus resources (that is, the processing takes place in the SoC-attached macro); any protocol-converted packet is subsequently placed on the local SoC bus or system bus only when needed. The protocol processing instructions are executed entirely within the SoC protocol macro or, for stand-alone designs, within the protocol conversion chip. Improved bus performance, improved system bandwidth, an increased number of protocols within a system, and a significant reduction or elimination of host bus attachment cards can be achieved.
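The composition described above (one or more processor core assemblies, each with two or more microprocessors and local storage, plus configurable interfaces and an interconnect) can be summarized as a small data model. This is a sketch for orientation only: the class names, field names, memory sizes, and protocol labels such as "FC-2G" are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CoreAssembly:
    processors: int      # two or more microprocessor devices
    local_store_kb: int  # local storage for data and/or instructions (size is illustrative)

    def __post_init__(self) -> None:
        if self.processors < 2:
            raise ValueError("a core assembly needs two or more processors")


@dataclass
class ProtocolConverter:
    assemblies: List[CoreAssembly]
    interfaces: List[str]           # protocols the configurable interfaces are set up for
    interconnect: str = "crossbar"  # assumed label for the interconnect means

    def __post_init__(self) -> None:
        if not self.assemblies:
            raise ValueError("at least one core assembly is required")

    def total_processors(self) -> int:
        """Processor count grows with the number of assemblies, which is how the design scales."""
        return sum(a.processors for a in self.assemblies)

    def can_convert(self, source: str, target: str) -> bool:
        """A conversion is possible only between two distinct configured interfaces."""
        return source != target and source in self.interfaces and target in self.interfaces
```

A converter configured with the interfaces "FC-2G" and "FC-10G" would model the version-level conversion example, while "FC" and "GbE" would model conversion between entirely different protocols.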
The single-chip embedded macro eliminates the main system daughter cards that are commonly used in protocol conversion applications, thus reducing costs and increasing performance.

In addition, the SoC embedded protocol converter macro or single-chip protocol converter architecture can easily be reconfigured from one function (that is, protocol conversion) to a completely new function (TCP/IP offload, accelerator, firewall function, etc.). The operating function of the single-chip or embedded protocol converter macro can therefore be modified to a completely new operating function, separate and distinct from the first operating function for which the chip or embedded protocol converter macro was initially programmed. This change of operating function can be based on a number of factors, including (but not limited to): the number of processor core assemblies (processor clusters) in the chip, the number of processors within the clusters, the amount of local memory within the clusters (such as instruction cache), and the amount of local memory (SRAM, DRAM, etc.) associated with each cluster.

According to another embodiment, the single-chip protocol converter integrated circuit (IC) or SoC protocol conversion macro core embodiment uses, in a single-chip design, a multi-threaded, pipelined, multi-processor core with embedded local memory, control logic, collect and work queues, a crossbar switch or other switching subsystem, protocol control, interfaces, and bus bridge I/O functions. By incorporating a standard bus bridge I/O function to the System-on-Chip (SoC) local bus, the embedded protocol converter macro additionally achieves higher density and efficiency, improved host processor performance and bandwidth, reduced memory contention, and reduced overhead.
With a multi-threaded approach, pipelining, a small number of instructions, a simple processor structure, embedded memory, and contexts that do not run too deep in the processors, the protocol converter chip or embedded macro can be made highly adaptable and reconfigurable for multiple protocols, version levels, and even network functions separate from those for which the protocol converter chip or embedded SoC macro was originally intended.

Advantageously, the SoC embedded protocol converter macro or single-chip protocol converter of the present invention can be applied to many applications, from SAN networks, servers, home networks, automotive networks, industrial networks, and telecommunications to simple I/O protocol data streams.

[Embodiment]

As referred to herein, the term "protocol" refers to any specific input/output (I/O) communication data physical layer stream, usually specified by a standards body, or possibly a company's internal proprietary interface. Examples include (but are not limited to): Fibre Channel, Gigabit Ethernet, iSCSI, IP, TCP/IP, FC/IP, ESCON, FCON, CAN, SAMBA, DSL, VoIP, MPLS, GMPLS, and many more. In the described embodiments, the protocol is a communication protocol such as Fibre Channel, Ethernet, iSCSI, ESCON, FCON, or IP, or a layered or encapsulated protocol such as FC/IP, IP/MPLS, and so on. A data communication protocol usually has data bits arranged in bytes, words or sets, frames, and packets, with control characters such as start of frame, end of frame, source, destination, and so on, together with the actual data in the payload of the bit stream.

The protocol converter of the present invention uses a special processor and is constructed either as a stand-alone chip or integrated in an SoC (System-on-Chip) type design. FIG.
5 illustrates a block diagram of the basic protocol converter chip 350, which can be used as a macro in SoC embodiments. The basic structure and operation of this core are described in related U.S. Patent Application No. 10/604,491, entitled "Self-Contained Processor Subsystem as Component for System-on-Chip Design", and its operation is now described herein.

Briefly, as shown in FIG. 5, the protocol converter on a single chip (or as an SoC embedded macro core) is a self-contained, processor-based subsystem 350 that is dedicated to protocol conversion but can be reconfigured for other network functions. It comprises one or more processor clusters 200, one or more local memory banks 215 for storing data and/or instructions, and a local interconnect means 220 constructed as a crossbar switch (alternatively, a fabric switch or an MP bus may be used) or other similar switching means. The single-chip protocol converter design of the present invention includes many simple processor cores with a reduced general-purpose instruction set derived from the PowerPC architecture.

Each processor cluster 200 includes one or more processing cores 205, each of which is a single-issue architecture with a four-stage deep pipeline, and each processor core 205 has its own register file 226, arithmetic logic unit (ALU) 225, and instruction sequencer 227. In the embodiment of the single-chip protocol converter depicted in FIG. 5, and in the SoC embedded macro for protocol conversion depicted in FIG. 8, eight processor cores 205 are packaged in a processor cluster 200 together with an instruction cache 208. The size of the instruction cache is a design option, for example 32 kB, which is sufficient for network applications. A local SRAM memory unit 230, associated with at least two processor cores 205 via a local bus, is additionally provided.
The exact number of processor clusters 200 needed in the protocol converter 350 to provide sufficient computing power (for example, one, two, or even 16 processor clusters, the last comprising 128 cores) depends on the requirements of the application. For example, implementing the functions of the Fibre Channel network protocol requires less computing power than the more complex TCP/IP termination, IP, or iSCSI protocol conversion embodiments.

Another feature of the processor-based subsystem protocol converter 350 of the present invention is the use of embedded memory 215 for storing the application program, the current control information, and the data used by the application program. Enough memory for smooth operation under normal operating conditions is placed in the protocol converter without excessively increasing its size. A further advantage of embedded memory over conventional off-chip memory is that it provides short and predictable access times, which are accounted for accurately in the time budget estimates for packet processing.

All components in the protocol converter chip 350 are interconnected via the crossbar switch 220, which specifically interconnects the processor clusters 200, the shared memory blocks 215, and the network protocol layer hardware assist devices or embedded MAC interfaces 175, 185. When the converter is constructed as an embedded macro in an SoC (as described herein with respect to FIGS. 8 to 10), the crossbar switch 220 is in turn connected to the SoC processor local bus 210 or to an external system bus 223 (for example, PCI or PCI-X, etc.), either by means of a bridge macro (bus) 224 or by direct attachment. The bridge can be adapted to accommodate different speeds, bus widths, signals, and signaling protocols. In the macro SoC embodiment, a standard interface is used between the protocol converter macro 350 and the embedded processor local bus 210 (for example, the PLB in IBM CoreConnect, or the bus in ARM AMBA, MIP, etc.).
The advantage of this standard interface is that it allows the protocol converter to be integrated as a macro in the SoC component library.

For the lower levels of the network protocols, time-critical functions are further implemented as hardware accelerators that handle low-level protocol tasks such as data encoding/decoding, serialization/deserialization, link management, and CRC and checksum calculation. These tasks are performed on every byte of the transferred packets and would be extremely expensive computationally if implemented in software. Hardware implementations of these functions are therefore provided as hardware accelerators built into the Fibre Channel network interface 175 and the Gigabit Ethernet network interface 185, each of which requires only a small area and interfaces with the respective Fibre Channel and Gigabit Ethernet communication links 190 and 195.

Additional advantages result from the separation of the protocol converter core 350 from the processor buses (the SoC processor local bus, or the system bus in a single-chip embodiment): 1) the only traffic between the protocol core and the SoC system or system bus is the data flow traffic (data receive and send), so bandwidth contention is minimized; and 2) the subsystem interconnect fabric (that is, the switch) provides the protocol core with an optimized, high-performance solution without needing to accommodate the standard component interfaces and connection protocols of the overall SoC, other processors attached to the switch fabric, or the main system bus itself. This allows higher protocol conversion speeds, more protocols handled within a single SoC or host bus adapter card, and less contention on the main system bus.

The operation of the processor subsystem when constructed as a protocol converter (either as a stand-alone single chip or as an embedded SoC macro) is now described.
In one embodiment, the single-chip protocol converter 350 (or the embedded macro for SoC design) provides Fibre Channel (FC) to Gigabit Ethernet (GE) conversion. It should be understood that this design allows for many combinations, such as Fibre Channel to IP, Fibre Channel to iSCSI, Fibre Channel to Infiniband, TCP/IP to iSCSI, and any of the other protocols mentioned herein. In fact, this embodiment is not limited to data communication protocols; it can also be implemented in automotive network, home, or industrial environments, for example in a manner similar to the Motorola MPC5554 microcontroller used for automotive networks such as CAN, or in the SAMBA network used for home applications.

FIG. 6 is an illustrative depiction of the single-chip protocol core 350 of FIG. 5, configured as a Fibre Channel to Gigabit Ethernet single-chip protocol converter. The protocol core shown in FIG. 6 implements the required endpoint functions, as well as the resizing and reformatting of packets required for conversion between the two protocols. The basis of this implementation is the partitioning of the protocol operations so that they can be handled by different resources on the chip. Each protocol operation is handled by a processor (or a group of processors), except for some time-critical functions close to the physical network interface, which are implemented in hardware.

The packet and processing flow is now described with respect to FIG. 6. The received packet and some status information are transferred from the inbound FIFO buffer to the embedded memory by the DMA logic, which has already received a pointer to an empty memory area from a list of free buffers.
The packet header is then examined to determine the packet context and, if necessary, to switch the current context, either by obtaining control information from memory or, if the packet is the first one of a new exchange, by generating new control information. In addition, the received packet is validated to ensure that it complies with the class of service it belongs to. If an acknowledgment for the received packet must be sent back to the source (for example, for class 2 service in Fibre Channel), an acknowledgment packet is generated: the corresponding header information for the acknowledgment packet is assembled, and the packet is sent to the outbound Fibre Channel network interface. In this specification, a packet is defined as a set of data bits that contains at least destination information and, typically for a communication packet, a header as well.

Meanwhile, a Gigabit Ethernet packet header is generated for the received packet, and the packet is resized according to the Ethernet network protocol. The newly formed packet (or packets) is transferred to the outbound FIFO buffer in the Ethernet (EMAC) network interface hardware module. Similar tasks take place to perform the opposite protocol conversion, that is, to transfer packets from the Ethernet network to the Fibre Channel network.

FIG. 6 illustrates this prototype single-chip Fibre Channel/Ethernet protocol converter. This embodiment uses 14 processors. The processors that operate on the Fibre Channel (FC) to Ethernet conversion are depicted in one processing block shown in FIG. 6, in which the FC input packets are received at processor P1, while the reverse conversion processing is depicted in processing block 270 of FIG. 6. In accordance with the processing flow depicted in FIG.
6, protocol tasks are assigned to hardware resources as follows: processor P1 handles the Fibre Channel inbound DMA setup and the target memory area assignment; processor P2 dispatches packets, based on the packet header information, to one of the four processors P3 to P6, which perform context switching, packet validation, and acknowledgment packet generation as required; processor P7 performs Ethernet header generation, sets up the data transfer to the Ethernet outbound network interface, and returns memory area blocks that are no longer needed to the linked list of free buffers. Similarly, the packet flow from the Ethernet network to the Fibre Channel network is handled by processors P8 to P14, as depicted in FIG. 6. Pointers to the packets to be transferred to the Ethernet network are placed in work queue 249, and pointers to the packets to be sent on Fibre Channel are placed in the Fibre Channel outbound work queue 259.

Other network protocols or protocol conversions can easily be implemented in a similar manner. For example, in implementing an iSCSI or TCP/IP protocol stack, the existing code of a single-processor implementation can be reused; only a modest effort is needed to adapt it to the architecture. More specifically, the tasks of packet dispatching and collecting must be implemented (on the processors labeled P2 and P7, and P9 and P14, for the receive path and the transmit path respectively), but

the network protocol can run almost unchanged on the processors labeled P3 to P6 and P10 to P13. The number of processors running protocol tasks in parallel must be scaled according to task complexity in order to meet the timing requirements. For example, in the example depicted in FIG. 6, iSCSI protocol conversion might require more than 14 processors to perform single-chip protocol conversion.

Packet processing on multiple processor cores can be performed in one of two ways: by following a run-to-completion approach, in which a packet is assigned to a single processor that carries out all processing operations; or via pipelining, in which the packet processing operations are partitioned into multiple pipeline stages assigned to separate processors. In the embodiment described herein, the pipelined approach provides better utilization of hardware resources such as the I-cache. Examples of network operations that can be assigned to separate pipeline stages are header handling, packet validation, generation of an acknowledgment response, packet reordering and message assembly, and end-to-end control.

The scheduling of protocol tasks to processors is performed statically during initialization; that is, each processor 205 executes the same set of operations on various packets. Similarly, to avoid overhead associated with dynamic memory management, such as garbage collection, static memory management is used. All memory structures 230 that are used are initialized during system start-up. These structures include memory areas 275 for storing data packets, memory 280 for the control and status information of existing network connections, program code 285, and work queues. FIG. 7 illustrates the various memory structures used in this architecture.
It is the provision of these memory structures that allows all sub-processes of a packet protocol conversion to remain within the single chip, or within the embedded macro attached to an SoC type design, so that the final completed packet can be delivered to the system bus or to the internal SoC bus.

As shown in FIG. 7, the memory area 215 that stores data packets is organized as a linked list 275. An incoming packet is stored in the next free buffer obtained from the linked list. During packet processing, only the pointer to the memory area in which the packet is stored is passed between processors. When packet processing is completed, the packet is transferred to the outbound network interface, and the buffer is returned to the list of free buffers.

As further depicted in FIG. 7, processor synchronization is performed in the form of message passing via work queues 290. In this approach, each processor P1 to P4 is associated with a work queue 290 that stores the pointers to the packets waiting to be processed by that thread. When a processor is ready to work on the next packet, it gets the pointer to the next pending packet from its work queue. When it finishes processing, it places the pointer to the packet into the work queue of the next thread in the pipeline. Locking is used to ensure correct operation when memory access conflicts occur.

An important consideration in processor scheduling is that the pipeline stages may not all require the same packet processing time and, furthermore, that the processing time at each pipeline stage may vary with the packet context. For example, in Fibre Channel, the packet processing time depends on whether the packet is the first, middle, or last packet in a sequence of packets, whether it carries link control information, and whether it belongs to a solicited or unsolicited message. If one of the pipeline stages is significantly slower than the others, its work queue becomes overloaded and the stage becomes a bottleneck.
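The buffer and work-queue mechanism described above can be sketched in a few lines. This is a minimal single-threaded illustration under stated assumptions, not the patent's implementation: the names `BufferPool` and `run_pipeline` and the stage signature are hypothetical, buffers are identified by index rather than by hardware pointer, and locking is omitted because nothing runs concurrently here.

```python
from collections import deque


class BufferPool:
    """Packet buffers allocated once at start-up and tracked by a free list."""

    def __init__(self, count, size):
        self.buffers = [bytearray(size) for _ in range(count)]  # static allocation
        self.free = deque(range(count))  # indices of currently free buffers

    def acquire(self):
        return self.free.popleft()  # raises IndexError if no buffer is free

    def release(self, index):
        self.free.append(index)


def run_pipeline(packets, stages, pool):
    """Pass buffer indices (never packet data) through one work queue per stage."""
    queues = [deque() for _ in stages]
    sent = []
    for payload in packets:
        index = pool.acquire()
        pool.buffers[index][:len(payload)] = payload  # stands in for the inbound DMA
        queues[0].append((index, len(payload)))
        for i, stage in enumerate(stages):
            while queues[i]:
                index, length = queues[i].popleft()
                length = stage(pool.buffers[index], length)  # stage edits in place
                if i + 1 < len(stages):
                    queues[i + 1].append((index, length))  # hand over the index only
                else:
                    sent.append(bytes(pool.buffers[index][:length]))  # outbound side
                    pool.release(index)  # buffer goes back to the free list
    return sent
```

Each stage is a function `stage(buffer, length)` that returns the new packet length, so a header-generation stage can shift the payload and write a header in front of it without copying the packet between stages.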
Such a bottleneck is remedied by having several processors (for example, P1 to P4 in FIG. 7) work on the same pipeline stage in parallel.

Because the processing time in the pipeline stages varies from packet to packet, the processors cannot all be fully utilized. Instead, matching throughput is achieved between the pipeline stages.

Assigning multiple processors to work on the same task requires the introduction of a task-dispatching processor (depicted in FIG. 7 as processor P5 and labeled "MT"). The packet assignment method is kept simple so that it executes in a short processing time, and it must not cause performance degradation through resource contention and locking.

For example, in the Fibre Channel embodiment, a bin-sorting algorithm is used so that all packets belonging to the same context group are processed by a single processor. Information about the current context is cached in the processor's register file, which reduces resource contention and the average access time for fetching this information. The sorting overhead is kept low by using a simple hash function. In one embodiment, the packet sorting and processor assignment tasks introduce only a small number of instructions. However, it should be understood that this number can vary with design choice. In the depicted embodiment, as few instructions as possible are used, for example in the range of 35 to 50 instructions. Because typical network traffic may contain more active context groups than there are processors at any given time, several different contexts can be assigned to a single processor at the same time. In the worst case, all packets could be assigned to only one processor, overloading it.
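The context-based dispatch described above can be sketched as follows. This is a hedged illustration only: the text says that a simple hash function sorts packets into bins but does not specify it, so the XOR-and-modulo hash and the packet fields `src`, `dst`, and `exchange` used here are assumptions chosen for the example.

```python
from collections import defaultdict


def assign_worker(packet, num_workers):
    """Map a packet to a worker with a cheap hash of its context fields.

    The fields and the XOR/modulo hash are hypothetical; any inexpensive,
    deterministic function of the context identifier would serve.
    """
    return (packet["src"] ^ packet["dst"] ^ packet["exchange"]) % num_workers


def dispatch(packets, num_workers):
    """Bin-sort packets so that one context group always reaches one worker."""
    bins = defaultdict(list)
    for packet in packets:
        bins[assign_worker(packet, num_workers)].append(packet)
    return bins
```

Because the worker index depends only on the context fields, every packet of a given exchange reaches the same worker, which is what allows that worker to keep the context information cached locally. Several context groups can still collide on one worker, which is the overload case noted in the text.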
In practice, however, network traffic from real applications shows that such overload is not a problem, because the context groups are distributed evenly among the processors operating in parallel.

The architecture of the single-chip protocol converter is cellular, which allows the design to be custom scaled. In this design, the number of processor cores and embedded memory blocks can easily be adapted to the application requirements without significant design changes. For example, in the network applications listed below, the required computing power of a multiprocessor protocol converter operating at a line speed of 10 Gb/s varies. In this regard, it should be noted that the protocol converter design is in effect "adaptable" to other network processing functions, by virtue of its layout design, its embedded memory, and its partitioning of network and protocol resources, which are handled by different processes and delegated to the individual sub-processors. It is also "unconstrained"; that is, the processors are independent of any particular network function, unlike the prior art.

In the prior art, each processor has only one given potential function, such as a TCP/IP offload function, a frame classifier function, a preprocessing function, a hardware accelerator,

RISC, or data processing functions, and so on. In the single-chip protocol converter 350 of the present invention, or in the embedded macro core 550, the same processors and local memory can perform different network functions (that is, operations), provided that enough processing power is allocated to scale to the intended operation. Several examples are listed below:

Protocol conversion: 14 processors (that is, two 8-core processor clusters). A chip that includes 64 Kbytes of I-cache, 64 Kbytes of data SRAM, a PowerPC 440 (or other processor), and the other macros shown in FIG. 5 and FIG. 6 would require approximately 35 square millimeters in 0.13 µm ASIC technology.

TCP/IP offload engine: 32 processors, that is, four processor clusters. Assuming 128 Kbytes of I-cache and 128 Kbytes of SRAM, this would occupy 50 square millimeters in the above technology.

Integrated firewall: 128 processors (estimated), that is, 16 processor clusters. Assuming 512 Kbytes of I-cache and 512 Kbytes of SRAM, the resulting chip would be approximately 150 square millimeters.
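The sizing arithmetic behind these examples can be checked with a short sketch. It assumes eight processor cores per cluster, as in the embodiment described earlier; the helper names and the `CONFIGS` table are illustrative, and the memory and area figures are simply the ones quoted in the examples above.

```python
CORES_PER_CLUSTER = 8  # eight processor cores per cluster in the described embodiment


def clusters_needed(processors):
    """Smallest number of whole clusters that supplies the requested processors."""
    return -(-processors // CORES_PER_CLUSTER)  # ceiling division


def total_cores(clusters):
    """Number of processor cores provided by a given number of clusters."""
    return clusters * CORES_PER_CLUSTER


# (clusters, I-cache Kbytes, SRAM Kbytes, approximate area in square millimeters)
CONFIGS = [(2, 64, 64, 35), (4, 128, 128, 50), (16, 512, 512, 150)]

for clusters, icache_kb, sram_kb, area_mm2 in CONFIGS:
    print(f"{clusters:2d} clusters = {total_cores(clusters):3d} cores, "
          f"{icache_kb} KB I-cache, {sram_kb} KB SRAM, about {area_mm2} mm^2")
```

Under this assumption, a 14-processor protocol conversion design rounds up to two clusters (16 cores), and 16 clusters give the 128 cores mentioned for the largest configuration.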

When the network speed or market
conditions change, the same basic architecture remains applicable (more sub-processors can be built into the chip, or into an SoC embedded macro, as a given application requires). For example, the architecture can be adapted to reconfigure the chip as a "firewall processor" or a "protocol converter", or even for completely new designs or protocols that have not yet been implemented. One basic design can therefore be extended to many applications and potential functions. The network function of the chip or embedded macro SoC core can be changed without redesigning the chip, simply by selecting the number of processors and memory units and then applying the appropriate software code or version level. The number of processors and memory units selected for a new protocol function is determined by statistical simulation of the core performance for the intended new function.

As mentioned earlier, the protocol converter can be constructed as a stand-alone integrated circuit chip on a separate semiconductor substrate, or embedded as a macro in an SoC type design, FPGA, DSP, and so on. FIG. 8 depicts an example of a protocol converter constructed as an embedded macro core in an SoC (System-on-Chip) design 400 according to a second aspect of the present invention. It should be understood that the macro is not limited to SoC designs; it can also be implemented in standard DSPs, microcontrollers, FPGAs, ASICs, and microprocessors, because all that is required is a standard bus interface or bridge to deliver the completed data packet (converted or unconverted). The term "SoC" is used generically to define a system on a chip that has at least one processing element, memory elements, I/O interfaces, and cores attached to a local bus or to multiple on-chip buses.

As shown in FIG. 8, one embodiment of the SoC 400 including the embedded protocol converter macro core 550 (also depicted as a stand-alone chip design in FIG.
5) includes a CPU or MPU element 425 (shown here as IBM's PowerPC 440; however, it should be understood that SoC processor cores other than PowerPC, such as ARM, MIP, and the like, can be used); a local SoC bus 210 (illustrated in FIG. 8 as IBM's CoreConnect PLB 210 (Processor Local Bus)); an optional slower-speed bus (illustrated in FIG. 8 as IBM's On-chip Peripheral Bus, or OPB 240); and any number of SoC components (cores) such as those shown in FIG. 1, including an SRAM 415, a DDR controller 418, a PCI-X bridge 422, a DMA 426 and DMA controller 428, an OPB bridge 429, and so on. The OPB 240 connects other devices, including one or more of the following: a RAM/ROM peripheral controller 445a, an external bus master 445b, a UART device 445c, an inter-IC bus (I2C) interface 445d, a general-purpose I/O interface 445e, and a gateway interface 445f.

The embodiment depicted in FIG. 8 includes a self-contained, processor-based protocol converter 550, which is integrated as a single-chip protocol converter or as an embedded macro core in the SoC system 400, and which communicates with the processor core 425 bus via the bridge 224 and the PLB 210. As described, the processor-based protocol converter macro core 550 comprises one or more processor clusters 200; one or more local memory banks 215 for storing data and/or instructions; a local interconnect means, such as the crossbar switch 220 in the depicted embodiment or, equivalently, a fabric switch or N x X switch and the like; and at least two Media Access Control (MAC) interface units 175, 185 for at least two network protocols. As shown in FIG. 8, these MAC units 175, 185 interface with respective external MAC interface devices 475, 485, which may be physical layer chips (PHY), SoC embedded MAC or PHY functions, or external protocol chips separate from the SoC or host card.
That is, the MACs 475, 485 shown in FIG. 8 may comprise a Fibre Channel hardware assist core and an Ethernet EMAC; however, they may include interfaces for any protocol, and may be integrated off chip as separate MAC or PHY devices (physical layer chips), or may be external to the SoC chip on a local card. In today's slower-speed applications, such as automotive applications or home networking, this may be desirable.

FIG. 9 illustrates the process flow for the protocol conversion of a single packet within the embedded SoC to the external protocol interfaces. Because the local memory within the macro controls both the processing and the DMA transfer of the packets, a packet can, after protocol conversion, be transferred out of the macro from the first interface to the second I/O interface, or output to the SoC bus and finally to the host system bus 223 (for example, PCI-X 133 MHz as depicted in FIG. 9, or a similar equivalent). Preferably, communication is duplex, that is, it includes links that enable communication in both the send and receive directions. In the example depicted in FIG. 9, an A-to-B packet conversion is shown for a packet according to a first protocol: 1G Ethernet packets are received at the SoC external protocol chip, macro, or EMAC (external Ethernet I/O) interface 485, and are forwarded to the internal FIFO of the converter macro's EMAC 185 and, over the crossbar switch 220, into the macro's internal memory 215. The macro's internal memory (SRAM, DRAM, etc.), by means of work queues, collects the

乙太、’罔路封包,且晶片上控制器功能經由交錯式交換器而 將乙太網路封包傳送至如圖9中所示之”proc.i ”叢隼的子處 理器。應瞭解,如本文所述,由於並行、管線運= 程,協定轉換處理在用於協定轉換之嵌入式協定轉換器巨 集核心550内於若干子處理器中均等中止,且若干轉換處理 與一處理器相匹配。因此,即使僅有一輪,例如,描緣了 ”A"自"prod"至"proc.3”而行至"B",但實際上,封包^在 用於轉換之若干處理器中被分割。儘管圖9中描繪了協定A 至B封包轉換,但替代處理將包含在"B完成"之另一側上使 協定&封包進入並於協冑#1(A側)上退出。應瞭解,&及 A·處理流程將位於兩路雙工鏈接之另一側上。 在嵌入式協定轉換器巨集核心550中所包括之處理元件 上執行實際協定轉換碼。巨集具有若干並行運行之處理 P0、P1…Pn——組用於每一方向(意即,接收及傳輸卜此等 處理中每一者係映射成標記為Proc.〇、Pr〇c l、Pr〇c 2等等 之巨集處理元件中的一個。在所述實施例中提供三種不同 處理以使其運行於嵌入式SoC巨集之處理器上,該等處理包 括: i 1 ·免派:向處理器配置任務之處理 2·協定虛理:協定處理任務 3· iliA :設定DMA SoC控制器以將封包自核心之内部記 憶體傳送出來’以及在已將封包傳送之後執行一些記憶體 管理功能。 a 此等處理之間的通信係經由諸如圖7中胼扮w > 丁所4田繪之記憶體 98784.doc -30- 200540644 中之基本專用區域的工作佇列得以完成。閒置處理藉由週 期性輪詢其工作佇列而判定其是否具有申請令之工作。 協定巨集核心建構所有所需之特定協定任務,諸如將資 料分割成-系列ip封包、產生„>封包標頭、產生乙太網路 封包等等,且將封包移回至乙太網路MAC巨集。若存在再 次傳輸封包之需要,如協定所界定的,則此在無來自 區域處理器干擾之情況下發生,僅封包/f料傳送請求或實 際貝料傳送為外部DMA或DDR記憶體所需要。在封包,fA,, 至B之協定轉換之後,該封包被傳送回至區域晶片上巨集 記憶體,且資料之一端為訊號。自彼處,區域巨集記憶體 及嵌入式區域DMA控制器經由交錯式交換器、光纖通道介 面及最終外部I/O介面而傳送經轉換之封包。或者,光纖通 道介面可具有一嵌入式控制器以傳輸最終經轉換封包。 若需要,外部SoC DDR 41 8或DM A 42 6可額外地請求封包 經由匯流排橋接器而被傳送至區域SoC匯流排且最終到達 主機系統匯流排223上,其與自協定轉換器介面發送封包相 對。同樣地,主機匯流排223可向巨集發送用於協定轉換之 一封包或多個封包且接收一經轉換回的完成封包或視個別 協疋及封包類型而傳送至外部協定介面475、485。 圖10說明一接收自主機匯流排223且傳送至外部SoC介面 485以用於傳輸之單個封包之協定轉換的例示性處理流 程。例如在圖1〇中所說明之實例處理流程中,自主機系統 匯流排223發送(源自該主機系統匯流排223)—光纖通道協 定封包並將其發送至SoC協定轉換器巨集350以用於轉換及 98784.doc -31 - 200540644 傳輸至外部乙太網路介面1G EMAC介面485。如圖l〇中所 示,SoC主處理器(PowerPC 44〇)設定一資料處理請求且經 由匯流排橋接器224而將該請求及外部DDR記憶體中之資 料的指標發送至協定轉換器巨集核心55〇。在所描繪之實施 例中,產生中斷訊號,但此可藉由向專用暫存器或預先指 定之記憶體位置寫入資料來建構。 嵌入式協定轉換器巨集核心55〇識別該請求且啟動DMA 引擎以將資料自外部主機或s〇c區域記憶體傳送至巨集區 域纪憶體。經由區域s〇c匯流排及橋接器匯流排而將資料 (例如,封包#B)傳送至巨集之區域記憶體215。當傳輸了 所有資料時,SoC處理器被通告任務完成。此可藉由向 PowerPC440^送一中斷或寫入由p〇werpC44〇定期輪詢之 些預先界定的位置而得以建構。 藉由工作仔列、收集序列及一作為任務分派處理器 (MT)(如圖7中所示)之處理器,該光纖通道封包⑻藉由如本 文所述之封包分割而得以自巨集之區域記憶體傳送至多個 子處理器。一旦完成協定轉換,例如,自協定”B”(光纖通 道類型)至協定”A”(超高速乙太網路類型),則所完成之封包 由乂錯式又換器220而傳送回至區域巨集之記憶體。區域 DMA睛求將封包,,A,,自巨集之記憶體傳送至外部乙太網路 介面485以完成傳輸及轉換。 本文所述之方法考慮了數目經減少之1/〇卡及晶片、經極 大改良之可撓性、網路功能、較高密度(附著至一區域或主 機匯流排之更多處理器)、較高協定處理速度、改良之頻 98784.doc -32- 200540644 
寬、較少之記憶體競爭、終端系統用戶之可撓性、網路設 計/升級之簡易’及較之今日現存協㈣換而言經極大改^ 之協定轉換。 义 儘管已結合本發明之說明性及經執行之實施例而特定地 展示及描述了本發明,但熟知此項技術者應瞭解,在不脫 離僅受限於附加巾請專利範圍之㈣的本發明精神及範鳴Ethernet, ‘Broadway’ packets, and the on-chip controller function sends the Ethernet packets to the sub-processors of the “proc.i” cluster as shown in FIG. 9 via an interleaved switch. It should be understood that, as described herein, due to parallel and pipeline operations, the protocol conversion processing is suspended equally among several sub-processors within the embedded protocol converter macro core 550 used for protocol conversion, and several conversion processing and one The processors match. Therefore, even if there is only one round, for example, "A " from " prod " to " proc.3" and go to " B ", in fact, the packet ^ is in several processors used for conversion Divided. Although the protocol A to B packet conversion is depicted in Figure 9, the alternative process will involve getting the protocol & packets on and off on the other side of " B completion " on protocol # 1 (side A). It should be understood that & and A · processing flows will be on the other side of the two-way duplex link. The actual protocol conversion code is executed on the processing elements included in the embedded protocol converter macro core 550. The macro has a number of processes P0, P1 ... Pn running in parallel-groups for each direction (meaning, receiving and transmitting). Each of these processes is mapped to the labels Proc.〇, PrOcl, Pr One of the macro processing elements such as 0c and the like. In the described embodiment, three different processes are provided to run on the processor of the embedded SoC macro. These processes include: i 1 · Free: Configure task processing to the processor 2 · Protocol virtualization: Protocol processing task 3. 
iliA: Set the DMA SoC controller to transfer packets from the core's internal memory 'and perform some memory management functions after the packets have been transferred A Communication between these processes is accomplished via a work queue such as the basic writable area in Figure 7 and the memory 98784.doc -30-200540644 of Ding So Ding. The idle process borrows Periodically poll its job queue to determine whether it has the job of applying for an order. The agreement macro core constructs all the specific agreement tasks required, such as dividing the data into a series of IP packets, generating "> packet headers, Produce ether Network packets, etc., and move the packets back to the Ethernet MAC macro. If there is a need to retransmit the packets, as defined in the agreement, this will happen without interference from the regional processor, only the packets / f data transmission request or actual data transmission is required for external DMA or DDR memory. After the packet, fA, to B protocol conversion, the packet is transmitted back to the macro memory on the regional chip, and the data One end is a signal. From there, the area macro memory and the embedded area DMA controller transmit the converted packets through the interleaved switch, the Fibre Channel interface, and the final external I / O interface. Alternatively, the Fibre Channel interface may have An embedded controller to transmit the final converted packet. If required, the external SoC DDR 41 8 or DM A 42 6 can additionally request that the packet be transmitted to the regional SoC bus via the bus bridge and finally reach the host system bus On 223, it is opposite to sending packets from the protocol converter interface. Similarly, the host bus 223 can send a packet or multiple packets for protocol conversion to the macro and receive a packet. The converted completed packets are sent to the external protocol interfaces 475, 485 depending on the individual protocol and packet type. 
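By way of illustration only, the dispatch, protocol-processing, and collect processes and the work queues that connect them can be modeled in software as follows. This is a minimal sketch under stated assumptions, not the implementation of the described embodiment: the names WorkQueue, dispatch, protocol_process, collect, and run, the round-robin allocation policy, the sequence numbers used to restore packet order, and the placeholder convert function are all hypothetical and introduced here.

```python
from collections import deque


class WorkQueue:
    """Software stand-in for a dedicated memory area used as a work queue."""

    def __init__(self):
        self._items = deque()

    def post(self, item):
        self._items.append(item)

    def poll(self):
        """Return the next pending item, or None when no work is pending."""
        return self._items.popleft() if self._items else None


def dispatch(packets, worker_queues):
    """Dispatch: allocate tasks to processors (round-robin is an assumption)."""
    for seq, packet in enumerate(packets):
        worker_queues[seq % len(worker_queues)].post((seq, packet))


def protocol_process(in_queue, out_queue, convert):
    """Protocol processing: poll once and convert one pending packet, if any."""
    item = in_queue.poll()
    if item is None:
        return False  # idle: nothing was pending on this poll
    seq, packet = item
    out_queue.post((seq, convert(packet)))
    return True


def collect(out_queue):
    """Collect: drain converted packets (stands in for setting up the DMA)."""
    done = {}
    item = out_queue.poll()
    while item is not None:
        seq, packet = item
        done[seq] = packet
        item = out_queue.poll()
    # Restoring the original order is an assumption of this sketch.
    return [done[seq] for seq in sorted(done)]


def run(packets, n_workers=3, convert=lambda p: b"B:" + p):
    """Drive one direction of conversion with n_workers polling processes."""
    worker_queues = [WorkQueue() for _ in range(n_workers)]
    out_queue = WorkQueue()
    dispatch(packets, worker_queues)
    busy = True
    while busy:
        # Every worker polls its own queue once per pass; stop when all idle.
        results = [protocol_process(q, out_queue, convert) for q in worker_queues]
        busy = any(results)
    return collect(out_queue)


converted = run([b"pkt0", b"pkt1", b"pkt2", b"pkt3"])
```

In this model each worker sees only its own queue, which mirrors the polling of per-process work queues described above; the placeholder convert function merely tags the payload and performs no real protocol conversion.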
Figure 10 illustrates an exemplary process flow for protocol conversion of a single packet received from the host bus 223 and delivered to the external SoC interface 485 for transmission. In the example process flow illustrated in Figure 10, a Fibre Channel protocol packet is sent (originated) from the host system bus 223 to the SoC protocol converter macro 350 for conversion and transmission to the external Ethernet interface, the 1G EMAC interface 485. As shown in Figure 10, the SoC main processor (PowerPC 440) sets up a data processing request and sends the request, together with a pointer to the data in the external DDR memory, to the protocol converter macro core 550 via the bus bridge 224. In the depicted embodiment an interrupt signal is raised, but this can also be implemented by writing data to a dedicated register or a pre-specified memory location.

The embedded protocol converter macro core 550 recognizes the request and activates the DMA engine to transfer the data from the external host or SoC local memory to the macro's local memory. The data (for example, packet #B) is transferred to the macro's local memory 215 via the local SoC bus and the bridge bus. When all the data has been transferred, the SoC processor is notified that the task is complete. This can be implemented by sending an interrupt to the PowerPC 440 or by writing to a predefined location that the PowerPC 440 polls regularly.

By means of the work queues, the collect sequences, and one processor acting as the task-dispatching processor (MT) (as shown in Figure 7), the Fibre Channel packet (B) is transferred from the macro's local memory to multiple sub-processors by packet partitioning as described herein. Once the protocol conversion is complete, for example from protocol "B" (Fibre Channel type) to protocol "A" (Gigabit Ethernet type), the completed packet is transferred back to the macro's local memory via the crossbar switch 220.
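The request and completion signaling of Figure 10 can be sketched in software as follows. This is an illustrative model only: the class ConverterMacroModel, its submit method, the IDLE and DONE status values, and the use of a byte array in place of external DDR memory are assumptions introduced here, and the copy in submit merely stands in for the DMA transfer. The sketch shows both notification options mentioned in the text, an interrupt and a predefined location that the host polls.

```python
IDLE, DONE = 0, 1  # values of the polled status location (hypothetical)


class ConverterMacroModel:
    """Toy model of the host-to-macro request and completion handshake."""

    def __init__(self, external_memory, on_interrupt=None):
        self.external_memory = external_memory  # stands in for external DDR
        self.local_memory = bytearray()         # stands in for local memory 215
        self.status = IDLE                      # predefined location a host may poll
        self.on_interrupt = on_interrupt        # optional interrupt handler

    def submit(self, pointer, length):
        """Accept a request carrying a pointer to the data in external memory."""
        self.status = IDLE
        # "DMA" step: copy the packet into the macro's local memory.
        end = pointer + length
        self.local_memory = bytearray(self.external_memory[pointer:end])
        # Notify completion through the polled location and, if set, an interrupt.
        self.status = DONE
        if self.on_interrupt is not None:
            self.on_interrupt()


ddr = bytearray(b"..FCPACKET..")
interrupts = []
macro = ConverterMacroModel(ddr, on_interrupt=lambda: interrupts.append("done"))
macro.submit(pointer=2, length=8)
```

A host that prefers polling would construct the model without an interrupt handler and read the status field until it equals DONE.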
The local DMA then requests that packet "A" be transferred from the macro's memory to the external Ethernet interface 485, completing the transfer and conversion.

The methods described herein allow for a reduced number of I/O cards and chips, greatly improved flexibility, network functionality, higher density (more processors attached to a local or host bus), higher protocol processing speeds, improved bandwidth, less memory contention, flexibility for end-system customers, easier network design and upgrades, and greatly improved protocol conversion compared with what exists today.

Although the invention has been particularly shown and described with respect to illustrative and implemented embodiments thereof, those skilled in the art will understand that the foregoing and other changes in form and detail may be made therein without departing from the spirit and scope of the invention, which is limited only by the scope of the appended claims.

[Brief description of the drawings]
Figure 1 is a block diagram illustrating a typical SoC employing a single processor according to the prior art;
Figure 2 is a block diagram depicting a typical SoC employing processing accelerators, as found on the market today, according to the prior art;
Figure 3 is a block diagram depicting the Motorola MPC5554 microcontroller (an SoC incorporating a crossbar switch) according to the prior art;
Figure 4 is a block diagram depicting Brocade's Silkworm™ Fabric Application Server for SAN networks according to the prior art;
Figure 5 depicts an exemplary overview of a single-chip protocol converter core design according to one embodiment of the invention;
Figure 6 is an exemplary illustration of a protocol core configured as a Fibre Channel to Gigabit Ethernet single-chip protocol converter according to one embodiment of the invention;
Figure 7 depicts an exemplary memory allocation within the single-chip protocol converter according to one embodiment of the invention;
Figure 8 depicts a protocol converter chip configured as an embedded macro in an SoC design according to a second aspect of the invention;
Figure 9 depicts the SoC protocol converter packet data flow from within the SoC macro to the external I/O of the Figure 8 system according to the invention; and
Figure 10 depicts an exemplary SoC protocol converter packet data flow from the host bus to the external packet delivery interface of the SoC device according to the invention.

[Description of main component symbols]

15, 230, 415: local SRAM memory unit
18, 418: DDR controller
20: system-on-chip design
21: processor local bus (PLB)
22, 422: PCI-X bridge
24: on-chip peripheral bus (OPB)
25: PPC 440 (PowerPC) processor core
26, 426: DMA
28, 428: DMA controller
29, 429: OPB bridge
33: processor core timer
35: interrupt controller
39: dedicated processor core (accelerator)
39a, 39b: additional dedicated processor cores
45a, 445a: RAM/ROM peripheral controller
45b, 445b: external bus master
45c, 445c: UART device
45d, 445d: inter-IC bus (I2C) interface
45e, 445e: general-purpose I/O interface
45f, 445f: gateway interface
50: media access control (MAC) protocol device
72: crossbar switch
75: SoC design
78: external I/O bridge
100: Brocade system
102: Fibre Channel to Fibre Channel routing
104: iSCSI to FC bridging
110: Fibre Channel to FC-IP translation
175, 185: media access control (MAC) interface units
190: Fibre Channel communication link
195: Gigabit Ethernet communication link
200: processor cluster
205: processor core
208: instruction cache (I-cache)
210: SoC processor local bus
215: local memory bank
220: crossbar switch
223: external host system bus
224: bus bridge macro
225: arithmetic logic unit (ALU)
226: register file
227: instruction sequencer
240: OPB
249, 259, 260, 270, 275, 280, 290: work queue; outbound work queue; processing block; processing block; memory area; memory; program code; work queue
300: Fibre Channel to Gigabit Ethernet single-chip protocol converter
350: basic protocol converter chip
400: SoC (system-on-chip) design
425: CPU or MPU element
475, 485: external MAC interface devices
550: protocol converter macro core

Claims (1)

Scope of patent application:

1. A single-chip protocol converter integrated circuit (IC) capable of receiving packets generated according to a first protocol type and processing said packets to implement protocol conversion, and of generating converted packets of a second protocol type for output thereof, whereby the protocol conversion process is performed entirely within the single integrated circuit chip.

2. The single-chip protocol converter IC of claim 1, the chip comprising: a plurality of processor core assemblies, each comprising two or more microprocessor devices capable of performing operations to implement protocol conversion capability; a local storage device associated with said two or more microprocessor devices for storing at least one of data and instructions in each processor core assembly; one or more interface devices enabling receipt and transmission of communication packets according to one or more communication protocols; and interconnect means in the protocol converter for enabling communication between said two or more microprocessor devices and said interface devices.

3. The single-chip protocol converter IC of claim 2, wherein said one or more configurable interface devices include one or more from the group comprising: a programmable media access control interface device (MAC), and a protocol interface accelerator device for receiving packets of a particular protocol.

4. The single-chip protocol processor IC of claim 3, wherein said one or more processor core assemblies, storage device, interconnect means, and interface devices cooperatively enable the resizing and reformatting of packets required for conversion between said first protocol and said second protocol, the single-chip protocol converter design being additionally adapted to enable conversion of received packets between different version levels of a single protocol type.

5. The single-chip protocol converter IC of claim 2, whereby a received packet of a first type is partitioned among one or more microprocessor devices, wherein each processor device runs a same set of instructions and is paired with a particular protocol process.

6. The single-chip protocol converter IC of claim 2, wherein the instructions for protocol conversion are fully contained within a processor core assembly, the processing including partitioning the protocol operations to be handled by different resources on the single-chip protocol converter IC.

7. The single-chip protocol converter IC of claim 2, wherein the interconnect means comprises a crossbar switch.

8. The single-chip protocol converter IC of claim [illegible], further comprising means for adapting the single-chip protocol converter design to perform one or more functions related to protocol conversion.

9. The single-chip protocol converter IC of claim 2, wherein the single-chip protocol converter is implemented as a macro core in a system-on-chip (SoC) integrated circuit (IC), wherein the protocol conversion process is contained within the SoC protocol conversion macro core.

10. The single-chip protocol converter IC of claim 2, wherein the SoC IC comprises a plurality of components including a processor element, a memory storage element, a local communication bus, and an I/O interface, the single-chip protocol converter core further including a bus interface device for enabling communication between the single-chip protocol conversion core and the components of the SoC IC via the local communication bus.

11. The single-chip protocol converter IC of claim 2, wherein said one or more configurable interface devices are capable of receiving communications according to a network communication protocol, the network communication protocol including one or more from the group comprising: Fibre Channel, Gb Ethernet, InfiniBand, iSCSI, FC-IP, TCP/IP, IP, MPLS, VoDSL, CAN, and SAMBA.

12. A system-on-chip (SoC) integrated circuit (IC) device comprising: a processor element; a memory storage element; a local communication bus; interface means for receiving packets according to a protocol type; and an embedded protocol converter core device capable of receiving packets according to a first type, processing said packets to implement protocol conversion, and generating converted packets of a second protocol type for output thereof, whereby the protocol conversion process is performed entirely within the embedded single-chip protocol converter device.

13. The SoC IC device of claim 12, wherein the embedded protocol converter core device comprises: one or more processor core assemblies, each comprising two or more microprocessor devices capable of performing operations to implement protocol conversion capability; a local storage device associated with said two or more microprocessor devices for storing at least one of data and instructions in each processor core assembly; one or more configurable interface devices capable of receiving and transmitting communication packets according to one or more communication protocols; and interconnect means in the protocol converter for enabling communication between said two or more microprocessor devices and said interface devices.

14. The SoC IC device of claim [illegible], wherein the interface means comprises one or more from the group comprising: a programmable media access control interface device (MAC), and a protocol interface accelerator device, the protocol interface accelerator device being for receiving packets of a particular protocol from an external link and forwarding said packets to an interface device of the embedded protocol converter core device.

15. The SoC IC device of claim 14, wherein said one or more configurable interface devices include one or more from the group comprising: a programmable media access control interface device (MAC), and a protocol interface accelerator device for receiving packets of a particular protocol from the SoC IC.

16. The SoC IC device of claim 13, wherein said one or more processor core assemblies, storage device, interconnect means, and interface devices cooperatively enable the resizing and reformatting of packets required for conversion between said first protocol and said second protocol, the single-chip protocol converter core device being additionally adapted to enable conversion of received packets between different version levels of a single protocol type.

17. The SoC IC device of claim 13, whereby a received packet of a first type is partitioned among one or more microprocessor devices, wherein each processor device runs a same set of instructions and is paired with a particular protocol process, and wherein the set of instructions for protocol conversion is fully contained within a processor core assembly, the processing including partitioning the protocol operations to be handled by different resources on the single-chip protocol converter IC.

18. The SoC IC device of claim [illegible], wherein said one or more configurable interface devices are capable of receiving communications according to a network communication protocol, the network communication protocol including one or more from the group comprising: Fibre Channel, Gb Ethernet, InfiniBand, iSCSI, FC-IP, TCP/IP, IP, MPLS, VoDSL, CAN, and SAMBA.

19. The SoC IC device of claim 16, configured as one of a DSP, a coprocessor, a hybrid ASIC, or another network processing embodiment, the processor element comprising a network processing device, and the local communication bus device being for interconnecting the embedded protocol converter core device with the network processor device.

20. The SoC IC device of claim 19, wherein the components of the network processor embodiment include one or more selected from the group comprising: an SRAM, a DDR controller, a PCI-X bridge, a direct memory access (DMA) device, a DMA controller, and an on-chip peripheral bus (OPB) for interfacing with external components via one or more I/O interface devices.

21. A single-chip protocol converter integrated circuit (IC) capable of receiving packets generated according to a first protocol version level and processing said packets to implement protocol conversion, and of generating converted packets of a second protocol version level, within the same protocol type, for output thereof, whereby the protocol conversion process is performed entirely within the single integrated circuit chip.

22. The single-chip protocol converter IC of claim 21, the chip comprising: one or more processor core assemblies, each comprising two or more microprocessor devices capable of performing operations to implement protocol conversion capability; a local storage device associated with said two or more microprocessor devices for storing at least one of data and instructions in each processor core assembly; one or more interface devices capable of receiving and transmitting communication packets according to one or more communication protocols; and interconnect means in the protocol converter for enabling communication between said two or more microprocessor devices and said interface devices.

23. The single-chip protocol converter IC of claim 22, wherein the single-chip protocol converter is implemented as a macro core in a system-on-chip (SoC) integrated circuit (IC), wherein the protocol conversion process is contained within the SoC protocol conversion macro core.
TW94100086A 2004-01-30 2005-01-03 A single chip protocol converter TWI338231B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/768,828 US7412588B2 (en) 2003-07-25 2004-01-30 Network processor system on chip with bridge coupling protocol converting multiprocessor macro core local bus to peripheral interfaces coupled system bus

Publications (2)

Publication Number Publication Date
TW200540644A true TW200540644A (en) 2005-12-16
TWI338231B TWI338231B (en) 2011-03-01

Family

ID=34911294

Family Applications (1)

Application Number Title Priority Date Filing Date
TW94100086A TWI338231B (en) 2004-01-30 2005-01-03 A single chip protocol converter

Country Status (2)

Country Link
JP (1) JP4088611B2 (en)
TW (1) TWI338231B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4376862B2 (en) 2005-12-20 2009-12-02 富士通テン株式会社 Communication message conversion apparatus and communication message conversion method
JP4807502B2 (en) * 2006-03-10 2011-11-02 日本電気株式会社 I / O bridge circuit and interrupt signal control method
JP4869123B2 (en) * 2007-03-28 2012-02-08 株式会社日立製作所 Storage system
FR2925187B1 (en) * 2007-12-14 2011-04-08 Commissariat Energie Atomique SYSTEM COMPRISING A PLURALITY OF TREATMENT UNITS FOR EXECUTING PARALLEL STAINS BY MIXING THE CONTROL TYPE EXECUTION MODE AND THE DATA FLOW TYPE EXECUTION MODE
WO2009149383A2 (en) * 2008-06-07 2009-12-10 Coherent Logix Incorporated Transmitting and receiving control information for use with multimedia streams
US8700821B2 (en) * 2008-08-22 2014-04-15 Intel Corporation Unified multi-transport medium connector architecture
JP2010278897A (en) * 2009-05-29 2010-12-09 Renesas Electronics Corp Communication data processing circuit and communication data processing method
KR101101342B1 (en) * 2010-02-18 2012-01-02 한국외국어대학교 연구산학협력단 TMO Inter-Process Communication Method based on CAN
JP2015065507A (en) * 2013-09-24 2015-04-09 日本電気株式会社 Gateway device, communication network and gateway device control method
US9311044B2 (en) * 2013-12-04 2016-04-12 Oracle International Corporation System and method for supporting efficient buffer usage with a single external memory interface
CN115866081B (en) * 2022-11-09 2024-02-27 燕山大学 SOC-based industrial Ethernet protocol conversion method

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113821473A (en) * 2020-06-20 2021-12-21 科尔奇普投资公司 Method and system for solving UNI port information on single chip system or switch based on MAC-TABLE quick access
CN113821473B (en) * 2020-06-20 2024-02-20 美商光禾科技股份有限公司 Method and system for resolving UNI port information on a single chip system or switch based on MAC-TABLE caching

Also Published As

Publication number Publication date
JP4088611B2 (en) 2008-05-21
TWI338231B (en) 2011-03-01
JP2005216283A (en) 2005-08-11
