CN101420440A - String matching processing method and apparatus - Google Patents

String matching processing method and apparatus

Info

Publication number
CN101420440A
Authority
CN
China
Prior art keywords
module
submodule
character
suspicious
string matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2008102390552A
Other languages
Chinese (zh)
Other versions
CN101420440B (en)
Inventor
陈乃涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN2008102390552A priority Critical patent/CN101420440B/en
Publication of CN101420440A publication Critical patent/CN101420440A/en
Application granted granted Critical
Publication of CN101420440B publication Critical patent/CN101420440B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention provides a string matching processing method and apparatus. The method comprises: filtering a message to be matched through a multi-hash filtering module to obtain a plurality of suspicious data packets containing suspicious character strings, and storing the suspicious data packets respectively in a plurality of first-in first-out (FIFO) submodules of a FIFO module; and reading, by an AC engine module, the suspicious data packets in the FIFO submodules in turn, and performing string matching on them in a time-division multiplexed manner using a character selection submodule, an AC hash submodule, a table entry query submodule and a result comparison submodule, which are connected in sequence and independent of one another within the AC engine module. By splitting the AC engine module into four independent submodules that form a four-stage pipeline, the method and apparatus increase sharing of the AC state machine, allow a plurality of state "fifo"s to time-division multiplex the four submodules, raise the utilization of the SRAM read bandwidth, and improve logic processing performance.

Description

String matching processing method and device
Technical field
The embodiments of the invention relate to the field of communication technology, and in particular to a string matching processing method and device.
Background
With the rapid development of IP services, operators have a growing need for awareness of network security and content. Deep packet inspection (DPI) technology enables fine-grained management of network operation. String matching, that is, searching data for a target character string, is a key technique in DPI solutions. The AC (Aho-Corasick) algorithm is a typical multi-pattern matching algorithm and is widely used in string matching. The AC algorithm is implemented with a finite state machine: the current state of the AC state machine, the currently input character string, the next state of the state machine and other parameters are assembled into table entries and stored in external random access memory (RAM). Each time a character string is input for AC matching, one read operation must be issued to the external RAM to query the table entry, and the logic must wait for the returned data before analyzing it. Data read from external RAM does not return immediately but after a delay of several clock cycles, so the subsequent suspicious bytes in the message can only wait, and data processing is discontinuous. Moreover, because the logic must wait for the returned data, the interval between two consecutive read operations on the Quad Data Rate (QDR) memory is long, the QDR read data channel is underused, and processing performance is low.
One existing approach is the "MultiHash + multi-stride AC" method: the data stream is first filtered by "MultiHash", and the suspicious character strings that pass the filter are handed to a multi-stride AC stage for lookup. String matching is implemented mainly with the AC algorithm finite state machine. After "MultiHash" filtering the traffic is greatly reduced and consists essentially of suspicious character strings, which lightens the load on the AC stage and raises lookup speed. Fig. 1 is a schematic diagram of string searching with the existing AC algorithm, and Fig. 2 is a schematic diagram of the logic implementation of Fig. 1. The modules in Fig. 2 function as follows. The "MultiHash" module filters out suspicious character strings and places them in the next-stage module, "stat_fifo". The "stat_fifo" module consists of a state first-in first-out queue (fifo); the state "fifo" stores the filtered suspicious character strings packet by packet, in the order in which they appear in the message. The "Ac_engine" module performs exact matching on the suspicious character strings found; it consists mainly of four states: character selection, "ac_hash", table entry query and result comparison. In the table entry query state it reads a table entry from the "qdr sram"; in the result comparison state it decides whether the suspicious string is a true match and, if so, passes it to the result reporting module. Each state "fifo" is served by one corresponding "Ac_engine" module. As shown in Fig. 1, a message containing the suspicious character string "edonkey" is first filtered by the "MultiHash" module, which picks out "edonkey". Because hashing the message produces false positives, "abcde" is picked out as well. The suspicious character strings found are stored in "stat_fifo". The "Ac_engine" module takes the suspicious character strings from the "stat_fifo" module and matches them exactly with the AC algorithm, reading table entries from external SRAM (static RAM) for comparison during the lookup. Finally, the "edonkey" character string confirmed by the AC algorithm is reported to the result reporting module.
In the course of implementing the present invention, the inventor found at least the following problems in the prior art. Fig. 3 is a timing diagram of the AC state machine in the implementation of Fig. 2. As shown in Fig. 3, the state "fifo" can initiate a read operation only after the AC state machine has finished processing the previous data, so the "Ac_engine" module must run the AC state machine strictly in sequence and cannot start the next pass before the previous one has completed. When the AC algorithm is implemented with a single logic state machine, the state machine must analyze the returned result in the table entry query state; because reading SRAM incurs a read latency, the AC state machine waits a long time in that state. During the time spent waiting for data to return, only one read request is issued, which seriously wastes the read bandwidth of the SRAM.
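The two-stage scheme described in this background (a cheap hash filter that may report false positives, followed by exact confirmation) can be sketched in software as follows. This is only an illustrative model, not the patented logic: `weak_hash`, `build_filter`, `suspicious_offsets` and `confirm` are hypothetical names, the hash is deliberately coarse so that a false positive appears, and the second stage uses a plain comparison as a stand-in for the AC engine.

```python
def weak_hash(window: bytes) -> int:
    """Deliberately coarse hash so that false positives can occur."""
    return sum(window) % 64


def build_filter(patterns, k=4):
    """Record the hash of the first k bytes of every pattern."""
    return {weak_hash(p[:k]) for p in patterns}


def suspicious_offsets(payload: bytes, hashes, k=4):
    """Stage 1: cheap filter; may report offsets that are not real matches."""
    return [i for i in range(len(payload) - k + 1)
            if weak_hash(payload[i:i + k]) in hashes]


def confirm(payload: bytes, offsets, patterns):
    """Stage 2: exact comparison, standing in for the AC engine (assumption)."""
    return [(i, p) for i in offsets for p in patterns
            if payload.startswith(p, i)]


patterns = [b"edonkey"]
payload = b"node abcde edonkey"
suspects = suspicious_offsets(payload, build_filter(patterns))
hits = confirm(payload, suspects, patterns)
```

In this model the window "node" at offset 0 hashes to the same value as "edon" and is therefore flagged by the filter, but only offset 11 survives the exact check.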
Summary of the invention
The embodiments of the invention provide a string matching processing method and device that address defects of prior-art logic implementations of the AC algorithm, such as low utilization of the external SRAM read bandwidth, so that the external SRAM read bandwidth is fully used and AC matching performance is improved.
An embodiment of the invention provides a string matching processing method, comprising:
filtering a message to be matched through a multi-hash filtering module to obtain several suspicious data packets containing suspicious character strings, and storing the suspicious data packets respectively in several first-in first-out (FIFO) submodules of a FIFO module;
reading, by an AC engine module, the suspicious data packets in the FIFO submodules in turn, and performing string matching on the suspicious data packets in the FIFO submodules in a time-division multiplexed manner using a character selection submodule, an AC hash submodule, a table entry query submodule and a result comparison submodule that are connected in sequence and independent of one another within the AC engine module.
An embodiment of the invention provides a string matching processing device, comprising:
a multi-hash filtering module, configured to filter a message to be matched and obtain the suspicious data packets in the message that contain suspicious character strings;
a first-in first-out (FIFO) module comprising several FIFO submodules, configured to store the suspicious data packets obtained by the multi-hash filtering module in the FIFO submodules according to a given rule;
an AC engine module, configured to read the suspicious data packets in the FIFO submodules in turn and to perform string matching on the suspicious data packets in the FIFO submodules in a time-division multiplexed manner using a character selection submodule, an AC hash submodule, a table entry query submodule and a result comparison submodule that are connected in sequence and independent of one another within the AC engine module.
The string matching processing method and device provided by the embodiments of the invention avoid the waiting that arises when the AC algorithm is implemented with a single logic state machine, which must wait for read data to return before analyzing it. By splitting the AC engine module into four independent modules that form a four-stage pipeline, they increase sharing of the AC state machine and let a plurality of state "fifo"s time-division multiplex the four submodules of the AC engine module, reducing FPGA resource usage. Pipelining also raises SRAM read bandwidth utilization and improves logic processing performance.
Description of drawings
Fig. 1 is a schematic diagram of string searching with the existing AC algorithm;
Fig. 2 is a schematic diagram of the logic implementation of Fig. 1;
Fig. 3 is a timing diagram of the AC state machine in the implementation of Fig. 2;
Fig. 4 is a state diagram of the logic implementation of the AC algorithm according to the present invention;
Fig. 5 is a flow chart of an embodiment of the string matching processing method of the present invention;
Fig. 6 is a timing diagram of the AC state machine implemented in pipeline mode in an embodiment of the string matching processing method of the present invention;
Fig. 7 is a schematic structural diagram of the string matching processing device of the present invention.
Detailed description of the embodiments
The technical solutions of the embodiments of the invention are further described below with reference to the drawings and specific embodiments.
In string matching, the AC (Aho-Corasick) algorithm is often used. The AC algorithm is implemented with a finite state machine in which every state has three functions: the "Goto function" indicates which state to jump to next, the "Failure function" handles the case in which no jump to a next state is possible, and the "Output function" indicates that the state is an accepting state. When the AC algorithm is implemented in logic, the current state of the AC state machine, the currently input character string, the next state of the state machine and other parameters are assembled into table entries and stored in external SRAM. The structure of a table entry is shown in Table 1:
Table 1
Current state | Current input string | Next state | Other information
cur_stat0 | cur_string0 | next_stat0 | Index0
cur_stat1 | cur_string1 | next_stat1 | Index1
... | ... | ... | ...
cur_statn | cur_stringn | next_statn | Indexn
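As a software illustration of the three functions and of the table-entry layout above, the following is a minimal textbook Aho-Corasick sketch. It processes one character per step rather than the multi-byte strings used in the patent, and the names `build_ac`, `search` and `entries` are hypothetical; the `entries` dictionary merely mirrors the (current state, current input) to next state columns of Table 1.

```python
from collections import deque


def build_ac(patterns):
    """Build goto, failure and output tables for the given patterns."""
    goto, fail, out = [{}], [0], [set()]
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(p)                      # accepting state for pattern p
    queue = deque(goto[0].values())        # depth-1 states fail to the root
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] |= out[fail[t]]         # inherit outputs of the fallback state
    return goto, fail, out


def search(text, goto, fail, out):
    """Return (start offset, pattern) for every match in text."""
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]                    # failure function: fall back
        s = goto[s].get(ch, 0)             # goto function: advance
        for p in out[s]:                   # output function: report matches
            hits.append((i - len(p) + 1, p))
    return hits


goto, fail, out = build_ac(["he", "she", "his", "hers"])
# Flattened view resembling Table 1: (current state, input) -> next state
entries = {(s, ch): t for s, edges in enumerate(goto) for ch, t in edges.items()}
found = sorted(search("ushers", goto, fail, out))
```

A hardware implementation would store rows like those in `entries` in external SRAM and locate them by hashing, which is why each matching step costs one memory read.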
When the AC algorithm is implemented in logic, every suspicious string matching step requires a query to the external SRAM, so the efficiency of querying the external SRAM is the key to implementing the algorithm. For the AC algorithm to use the QDR bandwidth fully, the QDR scheduler must reach 100% utilization of the QDR read bandwidth, which requires a read command to be issued in every valid cycle. Fig. 4 is a state diagram of the logic implementation of the AC algorithm according to the present invention. The logic completes one string matching step with the AC algorithm in four stages: character selection, "ac_hash", table entry query and result comparison. The order of operations and the state transitions of each stage are shown in Fig. 4. Character selection performs stride selection: it determines the length of the character string currently being processed and the current AC state from the state read from the "fifo". "ac_hash" is the hash operation: a hash algorithm computes the QDR address to be looked up, that is, the entry address of the table entry in external SRAM. Table entry query (AC table lookup) performs the QDR lookup and waits for the query result. Result comparison compares the result against the data read back. The central idea of the embodiments is as follows. When a single AC logic state machine performs string matching, the state "fifo" can initiate the next read operation only after the previous data has passed through all four states of the logic state machine in turn, which seriously wastes the read bandwidth of the SRAM. To avoid this, the single hardware AC state machine is divided into four mutually independent hardware modules that form a four-stage pipeline, which increases sharing of the AC state machine module. In addition, the "stat_fifo" that stores the suspicious data packets is divided into a plurality of state "fifo"s, each storing suspicious character strings packet by packet, so that the state "fifo"s time-division multiplex the four modules of the AC state machine. This reduces field programmable gate array (FPGA) resource usage and raises the utilization of the SRAM read bandwidth.
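The four stages of one matching step can be modelled as four small functions. The sketch below is an assumption-laden software analogy: the SRAM is a Python dictionary, the hash formula and `TABLE_SIZE` are invented for illustration, and `add_entry` and `step` are hypothetical helpers. It shows why the comparison stage is needed: two different inputs can hash to the same address.

```python
TABLE_SIZE = 64  # assumed size of the modelled SRAM table


def char_select(payload: bytes, pos: int, stride: int) -> bytes:
    """Stage 1: pick the next `stride` bytes to process."""
    return payload[pos:pos + stride]


def ac_hash(state: int, chunk: bytes) -> int:
    """Stage 2: compute the table address from (state, chunk)."""
    return (state * 31 + sum(chunk)) % TABLE_SIZE


def query_entry(sram: dict, addr: int):
    """Stage 3: read the entry stored at addr (None if the slot is empty)."""
    return sram.get(addr)


def compare(entry, state: int, chunk: bytes):
    """Stage 4: accept the entry only if its stored key matches exactly."""
    if entry is not None and entry[:2] == (state, chunk):
        return entry[2]                    # next state
    return None


def add_entry(sram: dict, state: int, chunk: bytes, next_state: int):
    """Store (current state, input string, next state) at its hashed address."""
    sram[ac_hash(state, chunk)] = (state, chunk, next_state)


def step(sram, state, payload, pos, stride=4):
    """Run one matching step through the four stages in order."""
    chunk = char_select(payload, pos, stride)
    addr = ac_hash(state, chunk)
    entry = query_entry(sram, addr)
    return compare(entry, state, chunk)


sram = {}
add_entry(sram, 0, b"edon", 1)
```

Here `step(sram, 0, b"edonkey", 0)` yields next state 1, while `b"node"` lands on the same address as `b"edon"` and is rejected by the comparison stage.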
Fig. 5 is a flow chart of an embodiment of the string matching processing method of the present invention. As shown in Fig. 5, the method comprises:
Step 100: the message to be matched is filtered by the multi-hash filtering module to obtain several suspicious data packets containing suspicious character strings, and the suspicious data packets are stored respectively in several FIFO submodules of the FIFO module.
In a logic implementation of string matching, the message to be matched, that is, the message on which string matching is to be performed, is input to the string matching system. After the multi-hash filtering module of the system (the "Multihash" filtering module) receives the input message, it filters the message and finds all the suspicious character strings it contains. Taking the data packet in which a suspicious character string occurs as the unit, it stores the suspicious data packets in round-robin fashion in the several state "fifo"s of the FIFO module (the "stat_fifo" module). Each state "fifo" corresponds to one FIFO submodule. In other words, to let the four independent hardware modules of the AC engine module operate as a pipeline, this embodiment provides several submodules in the "stat_fifo" module, each storing suspicious data packets produced by the "Multihash" filtering module. While the later modules among the four hardware modules of the AC engine module are processing a suspicious data packet from one FIFO submodule, the earlier modules can already process a suspicious data packet from the next FIFO submodule, which yields pipelined operation.
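The round-robin placement of suspicious packets into the state FIFOs described in step 100 could look like the following sketch. The function name `distribute` and the packet labels are hypothetical, and a real design would do this in hardware rather than with Python queues.

```python
from collections import deque


def distribute(packets, n_fifos):
    """Place suspicious packets into n_fifos queues in round-robin order."""
    fifos = [deque() for _ in range(n_fifos)]
    for i, packet in enumerate(packets):
        fifos[i % n_fifos].append(packet)   # packet i goes to FIFO i mod n
    return fifos


fifos = distribute(["p0", "p1", "p2", "p3", "p4"], 3)
```

All suspicious strings of one packet stay together because the unit of distribution is the packet, not the individual string.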
Step 101: the AC engine module reads the suspicious data packets in the FIFO submodules in turn and, using the character selection submodule, AC hash submodule, table entry query submodule and result comparison submodule, which are connected in sequence and independent of one another within the AC engine module, performs string matching on the suspicious data packets in the FIFO submodules in a time-division multiplexed manner.
After the "Multihash" filtering module has stored the suspicious data packets in the FIFO submodules of the FIFO module, the AC engine module begins reading the suspicious data packets in each FIFO submodule in turn and performs string matching. To achieve pipelined processing and avoid unnecessary waiting during table lookup, the four states of the AC state machine are implemented as four independent hardware modules: the character selection submodule, the AC hash submodule, the table entry query submodule and the result comparison submodule, connected in sequence. The character selection submodule performs stride selection, determining the length of the character string currently being processed and the current AC state from the state read from the "fifo". The AC hash submodule uses a hash algorithm to compute the QDR address to be looked up, that is, the entry address of the table entry in external SRAM. The table entry query submodule performs the QDR lookup and waits for the query result. The result comparison submodule compares the result against the data read back. Once a data packet has passed in turn through the character selection, AC hash, table entry query and result comparison submodules, the string matching process for it is complete, and the comparison result can then be sent on.
The four independent submodules of the AC engine module perform string matching on the suspicious data packets stored in each FIFO submodule in a time-division multiplexed manner. Time-division multiplexing here means that at any given time all four submodules of the AC engine module are working, but each on a different object, that is, on a different string matching thread. When an earlier submodule finishes with a thread, the thread passes to the next submodule, and the earlier submodule goes on to process the following thread. As on a production line, the threads enter the AC engine module one after another and pass in turn through the character selection, AC hash, table entry query and result comparison submodules. The threads are independent of one another and separated by a fixed time interval, which ensures that the work of the submodules in the AC engine module does not conflict.
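The production-line behaviour described above can be visualised with a small occupancy model. It assumes, purely for illustration, that every stage takes exactly one cycle and that a new thread enters on every cycle; `STAGES` and `occupancy` are hypothetical names.

```python
STAGES = ["char_select", "ac_hash", "query_entry", "compare"]


def occupancy(n_threads, n_cycles):
    """For each cycle, map every stage to the thread it is working on (or None)."""
    timeline = []
    for cycle in range(n_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            thread = cycle - depth          # thread t reaches stage d at cycle t + d
            row[stage] = thread if 0 <= thread < n_threads else None
        timeline.append(row)
    return timeline


timeline = occupancy(n_threads=4, n_cycles=7)
```

At cycle 3 in this model all four stages are busy, each with a different thread, which is the time-division multiplexing the paragraph describes.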
Specifically, the AC engine module time-division multiplexes string matching over the suspicious data packets in the FIFO submodules as follows. The character selection submodule reads the earlier suspicious data packet from one FIFO submodule of the FIFO module, which starts a string matching thread for that packet through the character selection, AC hash, table entry query and result comparison submodules. After the character selection submodule has finished character selection for that packet and passed the result to the AC hash submodule, it reads the earlier suspicious data packet from the following FIFO submodule and starts a string matching thread for it in the same way. After the character selection submodule has finished character selection for the earlier suspicious data packet in the last FIFO submodule and passed the result to the AC hash submodule, it reads the later suspicious data packet from the first FIFO submodule and starts a string matching thread for that packet. In more detail, each FIFO submodule of the FIFO module can hold several suspicious data packets; these are the packets containing suspicious character strings that remain after the message to be matched has been filtered and the packets without suspicious character strings discarded. The character selection submodule first reads the first suspicious data packet in the first FIFO submodule and starts a string matching thread for it. After finishing character selection for that packet and passing the result to the AC hash submodule, it reads the first suspicious data packet in the second FIFO submodule and likewise starts a string matching thread for it, and so on. When it has read the first suspicious data packet in the last FIFO submodule, the first round-robin pass over all FIFO submodules is complete. Provided the clock cycles that the table entry query submodule needs to look up the external SRAM are accommodated, once the AC engine module has finished the matching thread for the first suspicious data packet in the first FIFO submodule, the character selection submodule reads the second suspicious data packet in the first FIFO submodule, and so on, completing the second round-robin pass over all FIFO submodules. When all suspicious data packets in all FIFO submodules have been matched, the string matching process ends.
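The order in which packets are read in successive round-robin passes can be expressed as a short generator. This sketch ignores timing and only shows the sequence; `read_order` and the packet labels are hypothetical.

```python
def read_order(fifos):
    """Yield (fifo_index, packet) in round-robin order until all FIFOs are empty."""
    queues = [list(f) for f in fifos]
    depth = 0                               # 0 = first pass, 1 = second pass, ...
    while any(depth < len(q) for q in queues):
        for idx, q in enumerate(queues):
            if depth < len(q):
                yield idx, q[depth]
        depth += 1


order = list(read_order([["a0", "a1"], ["b0", "b1"], ["c0"]]))
```

The first pass takes the first packet of every FIFO, and the second pass takes the second packet of each FIFO that still holds one.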
To improve SRAM read bandwidth utilization further, the number of submodules in the FIFO module can be set according to the number of clock cycles the table entry query submodule needs to read data from external SRAM. The number of FIFO submodules can be set equal to that number of clock cycles. For example, if the table entry query submodule needs 5 clock cycles to read external SRAM data, then, to keep every submodule of the AC engine module busy in each time slot, to issue a read command in every valid cycle, and to leave no suspicious data packet waiting to be matched, 5 string matching threads can be allowed to run in the AC engine module at the same time. Every submodule of the AC engine module then does useful work in each clock cycle, the timing matches exactly, and the QDR read bandwidth is 100% utilized. The number of FIFO submodules can of course also be set greater than the number of clock cycles needed to read external SRAM data, for example 6 FIFO submodules, which avoids wasting resources when a FIFO submodule holds no suspicious data packet.
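The relationship between the number of FIFO submodules and read bandwidth utilization can be checked with a toy cycle-level simulation. The model assumes that one read can be issued per cycle, that a thread cannot issue its next read until `read_latency` cycles after its previous one, and that every FIFO always has work; the time spent in the other pipeline stages is ignored. `read_utilization` is a hypothetical name.

```python
def read_utilization(n_fifos, read_latency, n_cycles=100):
    """Fraction of cycles in which a read is issued to the modelled SRAM."""
    ready_at = [0] * n_fifos        # cycle at which each thread may issue again
    issued = 0
    nxt = 0                         # round-robin pointer
    for cycle in range(n_cycles):
        for k in range(n_fifos):
            t = (nxt + k) % n_fifos
            if ready_at[t] <= cycle:
                ready_at[t] = cycle + read_latency
                issued += 1
                nxt = (t + 1) % n_fifos
                break               # at most one read per cycle
    return issued / n_cycles
```

Under these assumptions, a latency of 5 cycles gives 20% utilization with one FIFO, 60% with three, and 100% with five or more, in line with the example in the text.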
The string matching processing method provided by the embodiments of the invention avoids the waiting that arises when the AC algorithm is implemented with a single logic state machine, which must wait for read data to return before analyzing it. By splitting the AC engine module into four independent modules that form a four-stage pipeline, it increases sharing of the AC state machine and lets a plurality of state "fifo"s time-division multiplex the four submodules of the AC engine module, reducing FPGA resource usage. Pipelining also raises SRAM read bandwidth utilization and improves logic processing performance.
In the string matching process of this embodiment, if a conflict occurs while the result comparison submodule is comparing, the table entry query submodule can query the external SRAM data again and the comparison is repeated. Within one string matching thread, the suspicious character string is matched in units of several bytes by several matching flows at once; the number of matching flows equals the number of bytes in the matching unit, and within a matching flow the length of the matching unit can be adjusted to the length of the suspicious character string. Specifically, while executing one string matching thread, the method can match in units of a certain number of bytes along several paths simultaneously, with as many paths as there are bytes in the matching unit, which avoids missed matches and raises the matching success rate. This embodiment matches with a stride of 4 bytes: when the first four bytes of a suspicious character string all match, matching continues with the following bytes. For example, if the suspicious character string is "mnkedonkeyab", it is matched along 4 paths with a stride of 4. The first path first checks "mnke"; it does not match, so this path ends. The second path first checks "nked"; it does not match, so this path ends. The third path first checks "kedo"; it does not match, so this path ends. The last path first checks "edon"; because "edon" matches, control returns to the character selection submodule to check whether "keya" matches. "keya" does not match completely, but its "key" portion matches the target character string, so the length of the matching unit can be adjusted at this point, for example to a stride of 3 bytes. Because "key" matches, the suspicious character string "mnkedonkey" of this suspicious data packet is matched successfully, and the matching result is sent to the result reporting module.
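The multi-path stride matching in the "mnkedonkeyab" example can be sketched as below. This is a simplified software analogy: it compares directly against a single known pattern instead of looking up table entries, tries the paths one after another rather than in parallel, and `match_with_stride` is a hypothetical name.

```python
def match_with_stride(suspect: bytes, pattern: bytes, stride: int = 4):
    """Try `stride` start offsets; on each path consume the pattern in chunks.

    Returns (offset, chunk_lengths) for the first path that matches, else None.
    """
    for offset in range(stride):                    # one path per byte of the unit
        pos, done, steps = offset, 0, []
        while done < len(pattern):
            step = min(stride, len(pattern) - done)  # shorten the final unit
            if suspect[pos:pos + step] != pattern[done:done + step]:
                break                               # this path ends without a match
            steps.append(step)
            pos += step
            done += step
        else:
            return offset, steps                    # whole pattern consumed
    return None


result = match_with_stride(b"mnkedonkeyab", b"edonkey")
```

For the example string the paths starting at "mnke", "nked" and "kedo" end immediately, and the fourth path matches "edon" with a 4-byte unit and then "key" with a 3-byte unit.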
Fig. 6 is a timing diagram of the AC state machine implemented in pipeline mode in an embodiment of the string matching processing method of the present invention. As shown in Fig. 6, three state "fifo"s start three suspicious string matching threads at the same time, and the four independent state machine stages of the AC engine module perform string matching on the suspicious data packets in the state "fifo"s in a time-division multiplexed manner. With the four-stage pipeline, the number of consecutive read operations issued to the "qdr sram" increases. If the number of state "fifo"s, that is, the number of suspicious string matching threads running at the same time, is increased until it is greater than or equal to the total read latency, the read operations issued to the "qdr sram" become continuous and the read bandwidth of the external SRAM is 100% utilized.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments can be carried out by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk or an optical disc.
Fig. 7 is a string matching processing unit structural representation of the present invention, as shown in Figure 7, this device comprises that filtering module 1 is wished in the Doha, the first in first out module is that " stat_fifo " module 2 and AC engine modules are " Ac_engine " module 3, wherein the uncommon filtering module 1 in Doha is used to treat matching message and filters, and obtains to contain in the described message to be matched the suspicious data bag of suspicious character string; " stat_fifo " module 2 comprises several first in first out submodules, represent with " fifo " among the figure, drawing reference numeral represents with 21 that all " stat_fifo " module 2 is used for the suspicious data bag that the uncommon filtering module 1 in Doha obtains is arrived several " fifo " according to certain rale store; " Ac_engine " module 3 is used for reading successively the suspicious data bag of several " fifo ", and use and to connect successively in " Ac_engine " module 3 and separate character chooser module 31, AC Hash submodule 32, inquiry list item submodule 33 and comparison sub-module 34 as a result, adopt time-multiplexed mode that the suspicious data bag in several " fifo " is carried out string matching.
Particularly, must in " Ac_engine " module, finish coupling owing to require the data of a bag when AC algorithm is realized, promptly once can only initiate a suspicious string matching operation thread if only use a state " fifo ", so just cause " Ac_engine " module 3 must carry out the AC state machine in order, can not the preceding state machine do not finish and just start next state machine.If but there is the parallel work-flow of a plurality of suspicious string matching operation thread to share " Ac engine " module 3, then can make full use of flowing water.The number (each state " fifo " can start a suspicious string matching operation thread) of the string matching processing unit that present embodiment provides by increasing first in first out module 1 the inside state " fifo ", it is the number of first in first out submodule, a plurality of states " fifo " can be read to repeating query, thereby improve the read cycle number of " qdr sram ", utilize AC level Four flowing water, reach parallel purpose.Have again,, in " Ac engine " module 3 the AC state machine is divided into four independently hardware submodules, comprise character chooser module 31, AC Hash submodule 32, inquiry list item submodule 33 and comparison sub-module 34 as a result in order to realize level Four flowing water.When comparison sub-module 34 compares as a result,, then turn back to inquiry list item submodule 33 and carry out the inquiry of plug-in SRAM list item again if conflict occurs.
To make full use of the read bandwidth of the external SRAM, the number of first-in first-out submodules 21 in the "stat_fifo" module 2 can be set according to the number of clock cycles that the table entry query submodule 33 needs to read data from the external SRAM. In a preferred embodiment, the number of first-in first-out submodules 21 equals that number of clock cycles, so that the read bandwidth of the SRAM can reach 100% utilization. Because any given first-in first-out submodule 21 may at times be empty, the number of first-in first-out submodules 21 can instead be set greater than the number of clock cycles that the table entry query submodule 33 needs to read data from the external SRAM. The apparatus further comprises a result reporting module 4, which receives the comparison results sent by the "Ac_engine" module 3 and reports them.
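The sizing rule can be checked with a short calculation. The Python sketch below rests on a simplified model that the patent does not spell out: the SRAM can accept one read per clock cycle, each non-empty "fifo" has at most one read outstanding, and a read takes `read_cycles` clock cycles to return. The function name and its parameters are hypothetical.

```python
def sram_read_utilization(num_fifos: int, read_cycles: int,
                          empty_fifos: int = 0) -> float:
    """Fraction of SRAM read slots that are used under the model above."""
    if read_cycles <= 0:
        raise ValueError("read_cycles must be positive")
    # Only non-empty fifos can keep a read in flight.
    active = max(num_fifos - empty_fifos, 0)
    return min(active, read_cycles) / read_cycles
```

With a read that takes 4 cycles, one "fifo" gives 0.25, four give 1.0, and four with one empty drop to 0.75, whereas five with one empty stay at 1.0. That is the reason for allowing more first-in first-out submodules than read cycles.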
After a message enters the string matching processing apparatus and is filtered by the multi-hash filtering module 1, the suspicious character strings are placed, packet by packet and in round-robin fashion, into the several state "fifo"s; suspicious character strings from the same packet go into the same state "fifo". Each "fifo" corresponds to one first-in first-out submodule 21, and the number n of "fifo"s should be greater than or equal to the total number of clock cycles needed to read data from the external SRAM. The "Ac_engine" module 3 is implemented as a four-stage pipeline; in round-robin fashion it reads a suspicious character string from one "fifo" of the "stat_fifo" module 2 and starts one suspicious string matching thread. At most n suspicious string matching threads run in the "Ac_engine" module 3 at the same time and share it by time-division multiplexing. Running several suspicious string matching threads in parallel keeps read operations on the external SRAM going continuously, so the read bandwidth of the SRAM is fully used and the internal logic processing is pipelined, which greatly improves the performance of the logic.
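The round-robin storing and reading described above can be sketched in a few lines of Python. This is only a software analogy with hypothetical names (`dispatch`, `read_order`): a packet is treated as an opaque object, so all of its suspicious character strings land in the same "fifo", and the engine side simply visits the "fifo"s in a fixed cyclic order, skipping any that are empty.

```python
from collections import deque
from typing import Deque, Iterable, List


def dispatch(packets: Iterable[object], n: int) -> List[Deque[object]]:
    """Store whole packets into n fifos in round-robin order."""
    fifos: List[Deque[object]] = [deque() for _ in range(n)]
    for i, packet in enumerate(packets):
        fifos[i % n].append(packet)  # packet i goes to fifo i mod n
    return fifos


def read_order(fifos: List[Deque[object]]) -> List[object]:
    """Drain the fifos in round-robin order, one packet per visit."""
    order: List[object] = []
    while any(fifos):          # an empty deque is falsy
        for fifo in fifos:
            if fifo:           # skip fifos that are currently empty
                order.append(fifo.popleft())
    return order
```

For example, five packets spread over three "fifo"s end up as two, two and one packet per "fifo", and the engine starts one matching thread per "fifo" on each pass.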
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A string matching processing method, characterized by comprising:
filtering a message to be matched through a multi-hash filtering module to obtain several suspicious data packets containing suspicious character strings, and storing the several suspicious data packets respectively in several first-in first-out submodules of a first-in first-out module;
reading, by an AC engine module, the suspicious data packets in the several first-in first-out submodules in turn, and, using a character selection submodule, an AC hash submodule, a table entry query submodule and a result comparison submodule that are connected in sequence within the AC engine module and are independent of one another, performing string matching on the suspicious data packets in the several first-in first-out submodules in a time-division multiplexed manner.
2. The string matching processing method according to claim 1, characterized in that the reading, by the AC engine module, of the suspicious data packets in the several first-in first-out submodules in turn, and the performing of string matching on the suspicious data packets in the several first-in first-out submodules in a time-division multiplexed manner using the character selection submodule, AC hash submodule, table entry query submodule and result comparison submodule that are connected in sequence within the AC engine module and are independent of one another, comprise:
reading, by the character selection submodule, an earlier suspicious data packet in a preceding first-in first-out submodule of the first-in first-out module, and starting a string matching thread in which the character selection submodule, AC hash submodule, table entry query submodule and result comparison submodule process the earlier suspicious data packet in the preceding first-in first-out submodule;
after the character selection submodule has completed character selection processing on the earlier suspicious data packet in the preceding first-in first-out submodule and sent the processing result to the AC hash submodule, continuing to use the character selection submodule to read an earlier suspicious data packet in a following first-in first-out submodule of the first-in first-out module, and starting a string matching thread in which the character selection submodule, AC hash submodule, table entry query submodule and result comparison submodule process the earlier suspicious data packet in the following first-in first-out submodule;
after the character selection submodule has completed character selection processing on the earlier suspicious data packet in the last first-in first-out submodule of the first-in first-out module and sent the processing result to the AC hash submodule, continuing to use the character selection submodule to read a later suspicious data packet in the first first-in first-out submodule of the first-in first-out module, and starting a string matching thread in which the character selection submodule, AC hash submodule, table entry query submodule and result comparison submodule process the later suspicious data packet in the first first-in first-out submodule.
3. The string matching processing method according to claim 1 or 2, characterized in that the number of first-in first-out submodules in the first-in first-out module is greater than or equal to the number of clock cycles that the table entry query submodule needs to read data from the external SRAM.
4. The string matching processing method according to claim 1 or 2, characterized in that the storing of the several suspicious data packets respectively in several first-in first-out submodules of the first-in first-out module comprises:
storing the several suspicious data packets respectively in the several first-in first-out submodules of the first-in first-out module in round-robin fashion.
5. The string matching processing method according to claim 2, characterized in that, within the same string matching thread, the method further comprises:
if a conflict occurs while the result comparison submodule is comparing, using the table entry query submodule to query the external SRAM data again.
6. The string matching processing method according to claim 2, characterized in that, within the same string matching thread, the method further comprises:
taking several bytes as a matching unit and matching the suspicious character string simultaneously in several matching flows, the number of matching flows being equal to the number of bytes in the matching unit, wherein within one matching flow the length of the matching unit can be adjusted according to the length of the suspicious character string.
7. The string matching processing method according to claim 1 or 2, characterized in that the method further comprises: sending, by the AC engine module, the comparison result to a result reporting module.
8. A string matching processing apparatus, characterized by comprising:
a multi-hash filtering module, configured to filter a message to be matched and obtain the suspicious data packets that contain suspicious character strings in the message to be matched;
a first-in first-out module comprising several first-in first-out submodules, the first-in first-out module being configured to store the suspicious data packets obtained by the multi-hash filtering module into the several first-in first-out submodules according to a certain rule;
an AC engine module, configured to read the suspicious data packets in the several first-in first-out submodules in turn and, using a character selection submodule, an AC hash submodule, a table entry query submodule and a result comparison submodule that are connected in sequence within the AC engine module and are independent of one another, to perform string matching on the suspicious data packets in the several first-in first-out submodules in a time-division multiplexed manner.
9. The string matching processing apparatus according to claim 8, characterized by further comprising:
a result reporting module, configured to receive the comparison result sent by the AC engine module and report it.
10. The string matching processing apparatus according to claim 8 or 9, characterized in that the number of first-in first-out submodules is greater than or equal to the number of clock cycles that the table entry query submodule needs to read data from the external SRAM.
CN2008102390552A 2008-12-05 2008-12-05 String matching processing method and apparatus Expired - Fee Related CN101420440B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008102390552A CN101420440B (en) 2008-12-05 2008-12-05 String matching processing method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008102390552A CN101420440B (en) 2008-12-05 2008-12-05 String matching processing method and apparatus

Publications (2)

Publication Number Publication Date
CN101420440A true CN101420440A (en) 2009-04-29
CN101420440B CN101420440B (en) 2011-08-24

Family

ID=40631044

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008102390552A Expired - Fee Related CN101420440B (en) 2008-12-05 2008-12-05 String matching processing method and apparatus

Country Status (1)

Country Link
CN (1) CN101420440B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1392497A (en) * 2002-07-24 2003-01-22 彭泉 Matching method for large character string
CN100530182C (en) * 2006-10-17 2009-08-19 中兴通讯股份有限公司 Character string matching information processing method in communication system
CN100452055C (en) * 2007-04-13 2009-01-14 清华大学 Large-scale and multi-key word matching method for text or network content analysis

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101902461A (en) * 2010-04-07 2010-12-01 北京星网锐捷网络技术有限公司 Method and device for filtering data stream contents
CN101902461B (en) * 2010-04-07 2013-01-30 北京星网锐捷网络技术有限公司 Method and device for filtering data stream contents
CN101916262A (en) * 2010-07-29 2010-12-15 北京用友政务软件有限公司 Acceleration method of financial element matching
CN101916262B (en) * 2010-07-29 2012-07-04 北京用友政务软件有限公司 Acceleration method of financial element matching
CN103412858A (en) * 2012-07-02 2013-11-27 清华大学 Method for large-scale feature matching of text content or network content analyses
CN103412858B (en) * 2012-07-02 2016-09-21 清华大学 For text or the method for the extensive characteristic matching of network content analysis
CN105354150A (en) * 2015-10-31 2016-02-24 杭州华为数字技术有限公司 Content matching method and apparatus
CN105354150B (en) * 2015-10-31 2018-03-16 杭州华为数字技术有限公司 A kind of content matching method and apparatus
CN106649836A (en) * 2016-12-29 2017-05-10 武汉新芯集成电路制造有限公司 Hardware lookup table pattern character searching method
CN106649836B (en) * 2016-12-29 2019-11-29 武汉新芯集成电路制造有限公司 A kind of lookup method of the mode character based on hardware lookup table
CN113163387A (en) * 2021-05-21 2021-07-23 南通大学 Emergency communication service sensing method
CN113163387B (en) * 2021-05-21 2023-08-15 南通大学 Emergency communication service sensing method

Also Published As

Publication number Publication date
CN101420440B (en) 2011-08-24

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110824

Termination date: 20121205