CN110909523A - Data processing method and device - Google Patents

Publication number
CN110909523A
Authority
CN
China
Prior art keywords
file
json
hashmap
mapfile
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911212915.8A
Other languages
Chinese (zh)
Other versions
CN110909523B (en)
Inventor
朱晓峰
翁星晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Ltd
Priority to CN201911212915.8A
Publication of CN110909523A
Application granted
Publication of CN110909523B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data processing method and a data processing device. The method comprises: obtaining a JSON file of a data structure to be converted and a JSON file tag of the JSON file, wherein the JSON file tag indicates the type of the JSON file; obtaining a first HashMap file and a tag chain set file mapfile that are pre-stored in a configuration file and correspond to the JSON file type; and converting the JSON file of the data structure to be converted into a target structured file based on the first HashMap file and the mapfile. Because the conversion is driven by the first HashMap file and the tag chain set file mapfile pre-stored in the configuration file, JSON files in different formats can be converted quickly without modifying the code corresponding to each JSON file format, which improves the efficiency of converting unstructured files into structured files.

Description

Data processing method and device
Technical Field
The invention belongs to the field of data processing, and particularly relates to a data processing method and device.
Background
At present, an unstructured document can be converted into a structured document by a variety of methods, so that the converted document has a standardized form and is better suited to data processing.
In practice, a JSON file is treated as an unstructured document. Converting it into a structured document is strongly coupled to the JSON file format, so the handling of the format can only be changed by modifying code.
However, JSON files come in multiple formats, and each format corresponds to different code. When JSON files in different formats are encountered, the code for each format has to be modified separately, which reduces the efficiency of converting unstructured files into structured files.
Disclosure of Invention
In view of this, the present invention provides a data processing method and device to solve the problem that, when JSON files in different formats are encountered, the code corresponding to each JSON file format has to be modified separately, which reduces the efficiency of converting unstructured files into structured files. The technical solution is as follows:
An embodiment of the invention discloses a data processing method, which comprises the following steps:
acquiring a JSON file of a data structure to be converted and a JSON file tag of the JSON file, wherein the JSON file tag is used for indicating the type of the JSON file;
based on the JSON file type, acquiring a first HashMap file and a tag chain set file mapfile which are pre-stored in a configuration file and correspond to the JSON file type, wherein the mapfile is used for defining a mapping relation between a target structured file and the JSON file;
and converting the JSON file of the data structure to be converted into a target structured file based on the first HashMap file and the tag chain set file mapfile.
Optionally, the acquiring, based on the JSON file type, of a first HashMap file and a tag chain set file mapfile that are pre-stored in a configuration file and correspond to the JSON file type includes:
searching a directory indicating the type of the JSON file stored in the configuration file;
acquiring a first HashMap file and a tag chain set file mapfile which are pre-stored in the directory and correspond to the JSON file type;
the first HashMap file and the tag chain set file mapfile generated in advance based on JSON files of different JSON file types are stored in each directory of the configuration file;
the configuration file comprises an XML configuration file.
Optionally, the method further includes:
if the first HashMap file and the tag chain set file mapfile corresponding to the JSON file type are not stored in the configuration file, establishing a directory corresponding to the JSON file in the configuration file;
and generating the first HashMap file and the tag chain set file mapfile based on the JSON file corresponding to the JSON file type, and storing the first HashMap file and the tag chain set file mapfile in the directory.
Optionally, the process of generating the first HashMap file based on the JSON file includes:
parsing the JSON file to obtain node information of all nodes in the JSON file, wherein the node information comprises JsonSpot objects and the tag chains of the JsonSpot objects;
and taking the tag chain as a key and the JsonSpot object as a value, and storing the JsonSpot object and the tag chain in a first HashMap as key-value pairs to obtain a first HashMap file comprising all node information of the JSON file.
Optionally, the process of generating the tag chain set file mapfile based on the JSON file includes:
parsing the JSON file to obtain a mapping relation between the JSON file and a structured file;
setting the number of rows N of the mapfile based on the number of JsonSpot objects whose type is object array (OA), wherein the value of N is the number of OAs plus 1;
taking the tag chain set of the non-OA type in the JSON file as a first layer of the mapfile;
taking the tag chain set of each OA type in the JSON file as a second layer of the mapfile, wherein the tag chain set of each OA type generates one row in the second layer, and, if one or more OAs are nested in an OA-type node, the tag chain set of the row for that node is contained in the tag chain set of the row for every OA nested within it;
defining a structured file name for each row in the first layer and the second layer, and generating the mapfile containing the correspondence between the structured file names and the tag chain sets.
Optionally, the method further includes:
and modifying the structured file names, the arrangement sequence of the tag chain sets and the screening fields in the mapfile to obtain a new mapfile.
Optionally, converting the JSON file of the data structure to be converted into a target structured file based on the first HashMap file and the tag chain set file mapfile includes:
parsing the JSON file of the data structure to be converted to obtain a second HashMap file and a third HashMap file, wherein the second HashMap file stores the values of the tag chains of all JsonSpot objects in the JSON file of the data structure to be converted, and the third HashMap file stores the number of type objects of each OA in the JSON file of the data structure to be converted;
traversing each row of the tag chain set file mapfile, wherein each row of the tag chain set file mapfile corresponds to one structured file, and acquiring the tag chain set corresponding to that structured file;
judging the type of each tag chain in the tag chain set according to the first HashMap file;
if the tag chain is of a non-OA type, obtaining the value in the second HashMap file whose key equals the tag chain, and adding the value and the field separator to a character string to be written into the structured file;
if the tag chain is of the OA type, determining the number of type objects of each OA based on the third HashMap file, looking up the second HashMap file based on that number, and adding the obtained values and the field separator to the character string to be written into the structured file;
and after the traversal of each row of the tag chain set file mapfile is completed, writing the character string into the structured file, until the traversal of the whole tag chain set file mapfile is completed, thereby obtaining the target structured file corresponding to the JSON file of the data structure to be converted.
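The row-by-row conversion described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class `RowConverter`, the `"#i"` key suffix used to address the i-th object of an OA in the second HashMap, and the simplified first HashMap (tag chain to node type string) are all hypothetical, and only a single level of OA nesting is handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of converting one mapfile row into delimited records. */
final class RowConverter {
    private final Map<String, String> nodeTypes;  // simplified first HashMap: tag chain -> node type
    private final Map<String, String> values;     // second HashMap: tag chain (plus "#i" inside an OA) -> value
    private final Map<String, Integer> oaCounts;  // third HashMap: OA tag chain -> number of objects

    RowConverter(Map<String, String> nodeTypes, Map<String, String> values,
                 Map<String, Integer> oaCounts) {
        this.nodeTypes = nodeTypes;
        this.values = values;
        this.oaCounts = oaCounts;
    }

    /** Returns the outermost OA node on the chain's path, or null for a non-OA tag chain. */
    String enclosingOa(String chain) {
        int from = 0;
        while (true) {
            int cut = chain.indexOf("-->", from);
            String prefix = cut < 0 ? chain : chain.substring(0, cut);
            if ("OA".equals(nodeTypes.get(prefix))) {
                return prefix;
            }
            if (cut < 0) {
                return null;
            }
            from = cut + 3;
        }
    }

    /** Builds one output line per OA object, or a single line if the row has no OA chain. */
    List<String> convertRow(List<String> chains, String separator) {
        int records = 1;
        for (String chain : chains) {
            String oa = enclosingOa(chain);
            if (oa != null) {
                records = oaCounts.getOrDefault(oa, 0);
            }
        }
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < records; i++) {
            List<String> fields = new ArrayList<>();
            for (String chain : chains) {
                // Assumed key convention: values inside an OA are stored per object index.
                String key = enclosingOa(chain) == null ? chain : chain + "#" + i;
                fields.add(values.getOrDefault(key, ""));
            }
            lines.add(String.join(separator, fields));
        }
        return lines;
    }
}
```

In this sketch a non-OA value is repeated on every record of the row, which mirrors how the non-OA tag chains reappear in the OA rows of the mapfile. The sketch joins fields with the separator rather than appending a trailing separator after every value.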
An embodiment of the invention discloses a data processing device, which comprises:
a first acquisition module, configured to acquire a JSON file of a data structure to be converted and a JSON file tag of the JSON file, wherein the JSON file tag is used for indicating the type of the JSON file;
a second acquisition module, configured to acquire, based on the JSON file type, a first HashMap file and a tag chain set file mapfile which are pre-stored in a configuration file and correspond to the JSON file type, wherein the mapfile is used for defining the mapping relation between a target structured file and the JSON file;
and a conversion module, configured to convert the JSON file of the data structure to be converted into a target structured file based on the first HashMap file and the tag chain set file mapfile.
Optionally, the second acquisition module includes:
a searching unit, configured to search the configuration file for the directory that indicates the type of the JSON file;
a first acquisition unit, configured to acquire a first HashMap file and a tag chain set file mapfile which are pre-stored in the directory and correspond to the JSON file type; the first HashMap file and the tag chain set file mapfile generated in advance based on JSON files of different JSON file types are stored in each directory of the configuration file; the configuration file comprises an XML configuration file.
Optionally, the conversion module includes:
a parsing unit, configured to parse the JSON file of the data structure to be converted to obtain a second HashMap file and a third HashMap file, wherein the second HashMap file stores the values of the tag chains of all JsonSpot objects in the JSON file of the data structure to be converted, and the third HashMap file stores the number of type objects of each OA in the JSON file of the data structure to be converted;
a first traversal unit, configured to traverse each row of the tag chain set file mapfile, wherein each row of the tag chain set file mapfile corresponds to one structured file, and to acquire the tag chain set corresponding to that structured file;
a judging unit, configured to judge the type of each tag chain in the tag chain set according to the first HashMap file;
a second acquisition unit, configured to, if the tag chain is of a non-OA type, obtain the value in the second HashMap file whose key equals the tag chain, and add the value and the field separator to a character string to be written into the structured file;
a third acquisition unit, configured to, if the tag chain is of the OA type, determine the number of type objects of each OA based on the third HashMap file, look up the second HashMap file based on that number, and add the obtained values and the field separator to the character string to be written into the structured file;
and a fourth acquisition unit, configured to write the character string into the structured file after the traversal of each row of the tag chain set file mapfile is completed, until the traversal of the whole tag chain set file mapfile is completed, thereby obtaining the target structured file corresponding to the JSON file of the data structure to be converted.
Compared with the prior art, the technical scheme provided by the invention has the following advantages:
The method obtains a JSON file of a data structure to be converted and a JSON file tag of the JSON file, wherein the JSON file tag indicates the type of the JSON file; obtains a first HashMap file and a tag chain set file mapfile that are pre-stored in a configuration file and correspond to the JSON file type; and converts the JSON file of the data structure to be converted into a target structured file based on the first HashMap file and the mapfile. Because the conversion is driven by the first HashMap file and the tag chain set file mapfile pre-stored in the configuration file, JSON files in different formats can be converted quickly without modifying the code corresponding to each JSON file format, which improves the efficiency of converting unstructured files into structured files.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of a data processing method according to an embodiment of the present invention;
Fig. 2 is a flowchart of acquiring a first HashMap file and a tag chain set file mapfile that are pre-stored in a configuration file and correspond to a JSON file type according to an embodiment of the present invention;
Fig. 3 is a flowchart of another data processing method according to an embodiment of the present invention;
Fig. 4 is a flowchart of generating a first HashMap file based on a JSON file according to an embodiment of the present invention;
Fig. 5 is a flowchart of generating a tag chain set file mapfile based on a JSON file according to an embodiment of the present invention;
Fig. 6 is a flowchart of converting a JSON file of a data structure to be converted into a target structured file according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a data processing device according to an embodiment of the present invention.
Detailed Description
The invention provides a data processing method and device, which are used for solving the problems that when JSON files with different formats are encountered, codes corresponding to the JSON file formats need to be modified respectively, and the efficiency of converting unstructured files into structured files is reduced.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
For a better understanding, structured data and unstructured data are explained as follows:
Structured data: data stored in a database that can be expressed logically as a two-dimensional table; its structure definition rarely changes and the data has a fixed length.
Unstructured data: data that is not conveniently represented in a two-dimensional logical database table; its field lengths are variable, and the record for each field may in turn consist of sub-fields, which may or may not be repeatable.
As can be seen from the background, in the prior art a JSON file is treated as an unstructured document, and converting it into a structured document is strongly coupled to the JSON file format, so the handling of the format can only be changed by modifying code.
Therefore, the invention provides a data processing method and device, which can quickly complete the conversion of JSON files in different formats without modifying codes corresponding to the JSON file format on the basis of the first HashMap file and the tag chain set file mapfile pre-stored in the configuration file, and achieve the purpose of improving the efficiency of converting unstructured files into structured files.
Fig. 1 shows a flowchart of a data processing method provided by an embodiment of the present invention. The method includes the following steps:
S101, acquiring a JSON file of a data structure to be converted and a JSON file tag of the JSON file.
In S101, JSON (JavaScript Object Notation) is a lightweight data-interchange format.
In the specific implementation of S101, acquiring the JSON file of the data to be converted makes the subsequent parsing of the JSON file possible. In addition to the JSON file itself, the JSON file tag of the JSON file must also be acquired; the JSON file tag is used to indicate the type of the JSON file.
It should be noted that JSON files have different file types, that is, different JSON file formats.
S102, acquiring a first HashMap file and a tag chain set file mapfile which are pre-stored in a configuration file and correspond to the JSON file type based on the JSON file type.
In S102, a configuration file is, in computer science, a file used to configure parameters and initial settings for a computer program. The JSON file type is used to distinguish JSON files of different formats.
In the specific implementation of S102, the configuration file records the correspondence between each JSON file type and its first HashMap file and tag chain set file mapfile, so the first HashMap file and the tag chain set file mapfile can be acquired from the configuration file based on the file type of the JSON file.
Specifically, the first HashMap file consists of <key: value> pairs, so a value can be retrieved quickly by its key. Here the tag chain is the key and the JsonSpot object is the value.
The tag chain set file mapfile is composed of a structured file name and a tag chain set and is used for defining the mapping relation between a target structured file and a JSON file.
S103, converting the JSON file of the data structure to be converted into a target structured file based on the first HashMap file and the tag chain set file mapfile.
In the specific implementation of S103, the JSON file of the data structure to be converted is converted into the target structured file based on the first HashMap file and the tag chain set file mapfile, and the structure configuration of the generated target structured file is as follows: xxx xxxx.
According to the data processing method disclosed by this embodiment of the invention, a JSON file of a data structure to be converted and a JSON file tag indicating the JSON file type are obtained, a first HashMap file and a tag chain set file mapfile that are pre-stored in the configuration file and correspond to the JSON file type are obtained, and the JSON file of the data structure to be converted is converted into the target structured file based on the first HashMap file and the mapfile. Because the conversion is driven by the first HashMap file and the tag chain set file mapfile pre-stored in the configuration file, JSON files in different formats can be converted quickly without modifying the code corresponding to each JSON file format, which improves the efficiency of converting unstructured files into structured files.
Based on the data processing method disclosed in Fig. 1, the specific implementation of S102, in which the first HashMap file and the tag chain set file mapfile that are pre-stored in the configuration file and correspond to the JSON file type are acquired based on the JSON file type, is shown in Fig. 2 and mainly includes:
S201, searching the configuration file for the directory that indicates the JSON file type.
In the specific implementation of S201, the configuration file stores directories for a plurality of JSON file types, so the information corresponding to the required JSON file type can be found through these directories.
For example: the configuration file contains directories for JSON file type 1, JSON file type 2 and JSON file type 3. If the JSON file type of the acquired JSON file is JSON file type 2, the information corresponding to JSON file type 2 is obtained simply by looking up that directory in the configuration file.
It should be noted that the configuration file stores in advance the first HashMap file and the tag chain set file mapfile corresponding to JSON files of different JSON file types, so when JSON files of several different JSON file types are encountered, multiple threads can be started to parse them at the same time.
It should be noted that the first HashMap file and the tag chain set file mapfile for each JSON file type are generated in advance by parsing a JSON file of that type and are then stored in the corresponding directory of the configuration file, where the configuration file includes, but is not limited to, an XML configuration file. In this way, the first HashMap file and the tag chain set file mapfile corresponding to a JSON file can be obtained efficiently, which ensures that the JSON file can subsequently be converted into a structured file efficiently.
It should be noted that how the JSON file type corresponding to a JSON file is stored in the configuration file can be decided according to the actual situation and is not described here.
S202, acquiring the first HashMap file and the tag chain set file mapfile that are pre-stored in the directory and correspond to the JSON file type.
In the specific implementation of S202, the required JSON file type is located in the configuration file through the directory, and the first HashMap file and the tag chain set file mapfile stored in that directory for JSON files of this type are then acquired.
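The lookup in S201 and S202 can be pictured as a keyed directory. The following Java sketch is only an illustration: it keeps the directories in memory instead of in an XML configuration file, and the names `TypeEntry` and `ConfigurationDirectory`, as well as the simplified first HashMap (tag chain to node type string), are assumptions introduced here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical content of one directory: the assets stored for a single JSON file type. */
final class TypeEntry {
    final Map<String, String> firstHashMap; // simplified: tag chain -> node type
    final List<String> mapfileRows;         // rows of the tag chain set file mapfile

    TypeEntry(Map<String, String> firstHashMap, List<String> mapfileRows) {
        this.firstHashMap = firstHashMap;
        this.mapfileRows = mapfileRows;
    }
}

/** Hypothetical in-memory stand-in for the configuration file. */
final class ConfigurationDirectory {
    private final Map<String, TypeEntry> directories = new HashMap<>();

    /** Stores the pre-generated assets under the directory of the given JSON file type. */
    void store(String jsonFileType, TypeEntry entry) {
        directories.put(jsonFileType, entry);
    }

    /** S201 + S202: find the directory for the type and return its assets, if any. */
    Optional<TypeEntry> find(String jsonFileType) {
        return Optional.ofNullable(directories.get(jsonFileType));
    }
}
```

Returning an `Optional` makes the "type not yet stored" case explicit for the caller.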
According to the data processing method disclosed by this embodiment of the invention, a JSON file of a data structure to be converted and a JSON file tag indicating the JSON file type are obtained, a first HashMap file and a tag chain set file mapfile that are pre-stored in the configuration file and correspond to the JSON file type are obtained, and the JSON file of the data structure to be converted is converted into the target structured file based on the first HashMap file and the mapfile. Because the conversion is driven by the first HashMap file and the tag chain set file mapfile pre-stored in the configuration file, JSON files in different formats can be converted quickly without modifying the code corresponding to each JSON file format, which improves the efficiency of converting unstructured files into structured files.
Fig. 3 shows a flowchart of another data processing method provided in an embodiment of the present invention, which mainly includes:
S301, acquiring a JSON file of a data structure to be converted and a JSON file tag of the JSON file.
The execution principle of S301 is the same as that of S101 and is not described here again.
S302, judging whether the first HashMap file and the tag chain set file mapfile corresponding to the JSON file type are stored in the configuration file; if so, executing S303, and if not, executing S304.
In the specific implementation of S302, after the JSON file of the data structure to be converted and its JSON file tag are obtained, it is determined whether a first HashMap file and a tag chain set file mapfile corresponding to the JSON file are already stored in the configuration file. If they are, the data structure conversion is performed on the JSON file based on them. If they are not, the JSON file is first parsed to generate the corresponding first HashMap file and tag chain set file mapfile, and the data structure of the JSON file is then converted.
S303, acquiring a first HashMap file and a tag chain set file mapfile which are pre-stored in a configuration file and correspond to the JSON file type based on the JSON file type.
The execution principle of S303 is the same as that of S102, and is not described herein again.
S304, establishing a directory corresponding to the JSON file in the configuration file.
In the specific implementation of S304, if the configuration file does not already store the first HashMap file and the tag chain set file mapfile corresponding to the JSON file of the data structure to be converted, a directory corresponding to the JSON file type of that JSON file is first established in the configuration file, so that the first HashMap file and the tag chain set file mapfile generated by parsing the JSON file can be stored in it.
It should be noted that the directory established in the configuration file for the JSON file type may be kept permanently or deleted when it is no longer needed.
S305, generating a first HashMap file and a tag chain set file mapfile based on the JSON file corresponding to the JSON file type, and storing the first HashMap file and the tag chain set file mapfile in the directory.
In the specific implementation of S305, after the directory corresponding to the JSON file of the data structure to be converted is established in the configuration file, the JSON file is parsed to generate a first HashMap file and a tag chain set file mapfile, which are then stored in the established directory.
S306, converting the JSON file of the data structure to be converted into a target structured file based on the first HashMap file and the tag chain set file mapfile.
The execution principle of S306 is the same as that of S103, and is not described herein again.
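The branch in S302 to S305 amounts to a "reuse if stored, otherwise generate once and store" pattern. A minimal Java sketch follows, assuming an in-memory map in place of the configuration file; `ConversionAssets`, `AssetRegistry` and the injected generator function are hypothetical names, and the actual generation logic is left to the caller.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical bundle of the two artifacts stored per JSON file type. */
final class ConversionAssets {
    final Map<String, String> firstHashMap; // simplified: tag chain -> node type
    final String mapfile;                   // content of the tag chain set file

    ConversionAssets(Map<String, String> firstHashMap, String mapfile) {
        this.firstHashMap = firstHashMap;
        this.mapfile = mapfile;
    }
}

/** Hypothetical registry standing in for the directories of the configuration file. */
final class AssetRegistry {
    private final Map<String, ConversionAssets> directories = new HashMap<>();
    private final Function<String, ConversionAssets> generator;

    /** @param generator parses a JSON document and produces its assets (S305) */
    AssetRegistry(Function<String, ConversionAssets> generator) {
        this.generator = generator;
    }

    /** S302: is there already a directory with assets for this type? */
    boolean isStored(String jsonFileType) {
        return directories.containsKey(jsonFileType);
    }

    /** S303, or S304 + S305 when nothing is stored yet; the result feeds S306. */
    ConversionAssets assetsFor(String jsonFileType, String json) {
        return directories.computeIfAbsent(jsonFileType, type -> generator.apply(json));
    }
}
```

With `computeIfAbsent`, the generator runs only for the first JSON file of a given type; later files of that type reuse the stored assets.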
It should be noted that, as shown in fig. 4, the specific implementation process of generating the first HashMap file based on the JSON file mainly includes:
S401, parsing the JSON file to obtain node information of all nodes in the JSON file.
In S401, each node in the JSON file is one of the following types: array, object, or common element. Arrays are divided into common arrays and OA arrays; common elements are divided into numeric elements, string elements, boolean elements, and NULL elements.
In the specific implementation of S401, the JSON file is parsed by an org. The JsonSpot object comprises the following elements: the full path of the node (spotname), the node type (spottype), the node depth, the parent node, and the node value type (spotvalue type).
S402, taking the tag chain as a key and the JsonSpot object as a value, and storing the JsonSpot object and the tag chain in a first HashMap as key-value pairs to obtain a first HashMap file comprising all node information of the JSON file.
In S402, HashMap is a data structure of the Java language that consists of <key: value> pairs, so a value can be retrieved quickly by its key. That is, a JsonSpot object can be retrieved quickly through its tag chain.
In the specific implementation of S402, the tag chain is used as the key and the JsonSpot object as the value, and each JsonSpot object is stored with its tag chain in the first HashMap as a <key: value> pair, yielding a first HashMap file that comprises all node information of the JSON file.
In the specific implementation of S401 and S402, the JSON file is parsed and the mapping relation between the JSON file and the structured file is obtained, in which the key is the structured file name and the value is the JSON tag chain set.
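To make S401 and S402 concrete, the following Java sketch indexes every node of an already-parsed JSON tree by its tag chain. It is an illustration under assumptions: the parsed JSON is represented with plain `Map` (object) and `List` (array) values rather than a specific parser library, the `JsonSpot` class here is a reduced stand-in with only some of the listed fields, the type labels are invented for the sketch, and the `-->` separator is taken from the mapfile example in this document.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, reduced stand-in for the JsonSpot object described above. */
final class JsonSpot {
    final String spotName; // full path of the node, i.e. its tag chain
    final String spotType; // "OBJECT", "OA", "ARRAY" or "ELEMENT" (labels assumed)
    final int depth;       // nesting depth; top-level tags have depth 1
    final String parent;   // tag chain of the parent node, "" for top-level tags

    JsonSpot(String spotName, String spotType, int depth, String parent) {
        this.spotName = spotName;
        this.spotType = spotType;
        this.depth = depth;
        this.parent = parent;
    }
}

final class FirstHashMapBuilder {
    static final String SEP = "-->";

    /** Walks a parsed JSON tree and stores one JsonSpot per tag chain. */
    static Map<String, JsonSpot> build(Map<String, Object> root) {
        Map<String, JsonSpot> index = new LinkedHashMap<>();
        walk(root, "", 0, index);
        return index;
    }

    private static void walk(Map<String, Object> object, String parentChain,
                             int parentDepth, Map<String, JsonSpot> index) {
        for (Map.Entry<String, Object> e : object.entrySet()) {
            String chain = parentChain.isEmpty() ? e.getKey() : parentChain + SEP + e.getKey();
            Object value = e.getValue();
            int depth = parentDepth + 1;
            index.put(chain, new JsonSpot(chain, typeOf(value), depth, parentChain));
            if (value instanceof Map) {
                walk(asObject(value), chain, depth, index);
            } else if (value instanceof List) {
                // Objects inside an array share the tag chain of the array itself.
                for (Object item : (List<?>) value) {
                    if (item instanceof Map) {
                        walk(asObject(item), chain, depth, index);
                    }
                }
            }
        }
    }

    /** Classifies a value; a non-empty array containing only objects is treated as an OA. */
    private static String typeOf(Object value) {
        if (value instanceof Map) {
            return "OBJECT";
        }
        if (value instanceof List) {
            List<?> list = (List<?>) value;
            boolean allObjects = !list.isEmpty();
            for (Object item : list) {
                allObjects = allObjects && (item instanceof Map);
            }
            return allObjects ? "OA" : "ARRAY";
        }
        return "ELEMENT";
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> asObject(Object o) {
        return (Map<String, Object>) o;
    }
}
```

Looking up `index.get("ruleEngine-->decisionTreeId")` then returns the node's JsonSpot in constant expected time, which is the quick key-based retrieval the text refers to.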
It should be noted that, as shown in fig. 5, the specific implementation process of the tag chain set file mapfile generated based on the JSON file mainly includes:
s501, analyzing the JSON file to obtain the mapping relation between the JSON file and the structured file.
S502, setting the line number N of the mapfile based on the number of JsonSpot objects of the type object array OA, wherein the value of N is the line number of OA plus 1.
In the process of implementing S502 specifically, the number of mapfile lines is set by the number of JsonSpot objects in the type object array OA. Wherein, the added row is the row number corresponding to the non-OA type, and the row number contains the label chain set of the non-OA type.
And S503, taking the label chain set of the non-OA type in the JSON file as the first layer of the mapfile.
In S503, the set of non-OA type labelsigns means that no OA type labels are included in the labelsign chain.
And S504, taking the label chain set of each OA type in the JSON file as a second layer of the mapfile.
In S504, the set of tag chains of OA types means that the tag chains contain OA type tags.
In the specific implementation of S504, each line in the second layer of the mapfile is the tag chain set of one OA. If one or more OAs are nested in an OA-type node, the tag chain set of the line for that node is included in the tag chain set of the line for every OA nested in it. That is, the tag chain set of any OA appears in the tag chain sets of all its descendant OAs.
For example, if OA2 is nested in the OA-type node OA1 and OA3 is nested in OA2, then the tag chain set of OA2 includes that of OA1, and the tag chain set of OA3 includes that of OA2 and therefore also that of OA1.
S505, defining a structured file name for each line in the first layer and the second layer, and generating a mapfile containing the correspondence between the structured file names and the tag chain sets.
In the specific implementation of S505, one structured file name is defined for the non-OA line in the first layer and one for each OA line in the second layer, and a mapfile containing the correspondence between the structured file names and the tag chain sets is generated.
Specifically, the file structure of mapfile is < structured filename: set of tag chains >.
For example: the content of the tag chain set file mapfile obtained by analyzing the JSON file is as follows:
TABLE0:transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount
TABLE1:transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount|ruleEngine-->decisionTreeId|ruleEngine-->sugg-->actionCode|ruleEngine-->sugg-->_class
TABLE2:transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount|ruleEngine-->decisionTreeId|ruleEngine-->sugg-->actionCode|ruleEngine-->sugg-->_class|ruleEngine-->ruleEngineDetail-->ruleId|ruleEngine-->ruleEngineDetail-->result|ruleEngine-->ruleEngineDetail-->ruleName
where TABLE0 is the structured file name of the first layer.
transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount is the set of non-OA tag chains.
TABLE1 is the first structured file name of the second layer.
transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount|ruleEngine-->decisionTreeId|ruleEngine-->sugg-->actionCode|ruleEngine-->sugg-->_class is the tag chain set of the first OA type.
TABLE2 is the second structured file name of the second layer.
transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount|ruleEngine-->decisionTreeId|ruleEngine-->sugg-->actionCode|ruleEngine-->sugg-->_class|ruleEngine-->ruleEngineDetail-->ruleId|ruleEngine-->ruleEngineDetail-->result|ruleEngine-->ruleEngineDetail-->ruleName is the tag chain set of the second OA type.
TABLE0 represents the first layer, and TABLE1 and TABLE2 represent the second layer; the three are also the target structured file names. As the contents of the mapfile above show, TABLE1 contains the tag chain set of TABLE0, and TABLE2 contains the tag chain sets of TABLE0 and TABLE1. The tag chains are separated by "|".
Note that another OA is nested in the OA corresponding to TABLE1.
That is, ruleEngineDetail is nested in ruleEngine, and the tag chain set ruleEngine-->ruleEngineDetail-->ruleId|ruleEngine-->ruleEngineDetail-->result|ruleEngine-->ruleEngineDetail-->ruleName in TABLE2 is the tag chain set of the OA nested in the OA of TABLE1.
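The steps S501 to S505 can be sketched as follows. This is an illustrative Python sketch under stated assumptions rather than the disclosed implementation: the input `nodes` is a simplified stand-in for the first HashMap (tag chain mapped to parent chain and type code), the function name `build_mapfile` is hypothetical, and the default names TABLE0, TABLE1, ... simply follow the example above.

```python
SEP = "-->"  # tag separator used in the tag chains


def build_mapfile(nodes):
    """Return mapfile lines; `nodes` maps tag chain -> (parent chain, type code)."""

    def innermost_oa(chain):
        # Walk up the parent links until an OA ancestor is found.
        parent = nodes[chain][0]
        while parent:
            if nodes[parent][1] == "OA":
                return parent
            parent = nodes[parent][0]
        return ""  # no OA on the path: a non-OA tag chain

    # One bucket for the non-OA line plus one bucket per OA node (N = OA count + 1).
    own = {"": []}
    for chain, (_, vtype) in nodes.items():
        if vtype == "OA":
            own[chain] = []
    # Leaf tag chains are assigned to the innermost OA that encloses them.
    for chain, (_, vtype) in nodes.items():
        if vtype not in ("O", "OA"):
            own[innermost_oa(chain)].append(chain)

    # Outer OAs are processed first so that nested OAs can inherit their sets.
    order = sorted((oa for oa in own if oa), key=lambda oa: oa.count(SEP))
    rows = {"": own[""]}
    for oa in order:
        rows[oa] = rows[innermost_oa(oa)] + own[oa]
    return [f"TABLE{i}:" + "|".join(rows[oa]) for i, oa in enumerate([""] + order)]


nodes = {
    "transactionCode": ("", "S"),
    "finalDecision": ("", "O"),
    "finalDecision-->actionCode": ("finalDecision", "S"),
    "ruleEngine": ("", "OA"),
    "ruleEngine-->decisionTreeId": ("ruleEngine", "S"),
    "ruleEngine-->sugg": ("ruleEngine", "O"),
    "ruleEngine-->sugg-->actionCode": ("ruleEngine-->sugg", "S"),
    "ruleEngine-->ruleEngineDetail": ("ruleEngine", "OA"),
    "ruleEngine-->ruleEngineDetail-->ruleId": ("ruleEngine-->ruleEngineDetail", "S"),
}
mapfile_lines = build_mapfile(nodes)
for line in mapfile_lines:
    print(line)
```

In this sketch each nested OA line starts with a copy of the line of its enclosing OA, which reproduces the containment between TABLE0, TABLE1 and TABLE2 described above.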
If the mapfile needs to be modified, the structured file names, the order of the tag chains in a tag chain set and the selected fields in the mapfile can optionally be modified to obtain a new mapfile.
For example, the name TABLE1 can be changed, and the tag chains that a tag chain set inherits from the enclosing layer can also be modified to optimize the mapfile.
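A modification of this kind can be sketched as a small text transformation on one mapfile line. The function `modify_mapfile_line` and its parameters `new_name` and `select` are hypothetical names introduced only for this illustration; the sketch assumes that a line has the form name, colon, tag chains separated by "|", as in the example above.

```python
def modify_mapfile_line(line, new_name=None, select=None):
    """Rename a mapfile line and/or keep only `select`, in the given order."""
    name, _, rest = line.partition(":")
    chains = rest.split("|")
    if select is not None:
        unknown = [chain for chain in select if chain not in chains]
        if unknown:
            raise ValueError(f"tag chains not present in {name}: {unknown}")
        chains = list(select)  # screening and reordering in one step
    return f"{new_name or name}:" + "|".join(chains)


line = "TABLE1:transactionCode|bypassed|ruleEngine-->decisionTreeId"
new_line = modify_mapfile_line(
    line,
    new_name="RULE_ENGINE",
    select=["ruleEngine-->decisionTreeId", "transactionCode"],
)
print(new_line)
```

Rejecting unknown tag chains keeps a modified line consistent with the first HashMap, since every tag chain in the mapfile is later looked up there.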
According to the data processing method disclosed in the embodiment of the present invention, the JSON file of the data structure to be converted and its JSON file tag are obtained, the JSON file tag indicating the JSON file type; the first HashMap file and the tag chain set file mapfile that are pre-stored in the configuration file and correspond to the JSON file type are obtained based on the JSON file type, the mapfile defining the target structured file; and the JSON file of the data structure to be converted is converted into the target structured file based on the first HashMap file and the mapfile. Converting through the first HashMap file and the mapfile avoids the problem that, when JSON files of different formats are encountered, the code for each JSON file format has to be modified separately, which reduces the efficiency of converting unstructured files into structured files.
Based on the data processing method disclosed in fig. 1 of the embodiment of the present invention, the specific implementation of S103 shown in fig. 1, namely converting the JSON file of the data structure to be converted into the target structured file based on the first HashMap file and the tag chain set file mapfile, is shown in fig. 6 and mainly includes:
S601, parsing the JSON file of the data structure to be converted to obtain a second HashMap file and a third HashMap file.
In S601, the second HashMap file stores every tag chain in the JSON file of the data structure to be converted, with subscripts appended, together with its corresponding value. It should be noted that a non-OA tag chain has no subscript.
The third HashMap file stores the number of objects in each object array OA in the JSON file of the data structure to be converted.
In the specific implementation of S601, the JSON file of the data structure to be converted is parsed line by line: every tag chain in each line, with its subscripts appended, is stored with its value in the second HashMap file, and the number of iterations of each OA in each line is stored in the third HashMap file, thereby obtaining the second HashMap file and the third HashMap file.
Specifically, the second HashMap file is composed of < key: value > pairs and covers all values of the JSON message. Its < key: value > is < tag chain + subscript set: value in the JSON file >, where the subscript set contains one subscript for each OA that occurs in the tag chain.
The third HashMap file is also composed of < key: value > pairs, namely < OA tag chain + subscript set: number of objects contained in the OA array >, and concerns OA nodes only. The number of objects is the number of objects in the OA array and exists only for OA nodes; because an OA may be nested in another OA, subscripts are required here as well.
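The construction of the two maps in S601 can be sketched as a single recursive pass. The Python code below is an illustration under assumptions, not the disclosed implementation: two dicts stand in for the second and third HashMap, the function name `flatten` is hypothetical, subscripts are assumed to start at 0 and to be appended after "#" and joined with "|", and arrays of scalars are stored as ordinary values.

```python
import json

SEP = "-->"  # tag separator used in the tag chains


def flatten(node, chain="", subs=(), values=None, counts=None):
    """Fill `values` (second map) and `counts` (third map) from a JSON object."""
    values = {} if values is None else values
    counts = {} if counts is None else counts
    # One subscript per enclosing OA; empty outside any OA.
    suffix = "#" + "|".join(map(str, subs)) if subs else ""
    for tag, value in node.items():
        child = f"{chain}{SEP}{tag}" if chain else tag
        if isinstance(value, dict):
            flatten(value, child, subs, values, counts)
        elif isinstance(value, list) and value and isinstance(value[0], dict):
            counts[child + suffix] = len(value)  # number of objects in this OA
            for index, element in enumerate(value):
                flatten(element, child, subs + (index,), values, counts)
        else:
            values[child + suffix] = value
    return values, counts


doc = json.loads(
    '{"transactionCode": "0005", "ruleEngine": ['
    '{"decisionTreeId": "520", "ruleEngineDetail": [{"ruleId": "12876"}, {"ruleId": "12968"}]},'
    '{"decisionTreeId": "1510", "sugg": {"actionCode": "ALLOW"},'
    ' "ruleEngineDetail": [{"ruleId": "2876"}]}]}'
)
values, counts = flatten(doc)
print(counts)
```

Keeping the subscripts in the key lets a later lookup address one specific object of a nested array without re-reading the JSON file.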
S602, traversing each line of the tag chain set file mapfile, where each line corresponds to one structured file, and obtaining the tag chain set corresponding to that structured file.
S603, determining the type of each tag chain in the tag chain set according to the first HashMap file.
In S603, what is actually determined is whether each tag chain in the tag chain set contains an OA tag.
S604, if the tag chain is of the non-OA type, obtaining the value corresponding to the non-OA tag chain from the second HashMap file, and appending the value and the field separator to the character string to be written into the structured file. In S604, a non-OA tag chain is a tag chain that contains no OA tag.
S605, if the tag chain is of the OA type, determining the number of objects of each OA based on the third HashMap file, looking up the second HashMap file based on that number, and appending each obtained value and the field separator to the character string to be written into the structured file.
In S605, an OA-type tag chain is a tag chain that contains an OA tag.
In the specific implementation of S605, the value corresponding to the OA tag chain, that is, the number of objects contained in the OA array, is obtained from the third HashMap file; this number of objects corresponds to the number of lines of the structured file. The OA tag chain plus a subscript (running from 1 to the number of objects) is then used to look up the second HashMap in turn, and each value obtained is appended, together with the field separator, to the character string to be written into the structured file.
In the specific implementation of S602 to S605:
If the parent node is an OA node, the number of iterations of the OA node is found in the third HashMap and the loop is started; the values corresponding to the tag chains of the same level and the same depth in the tag chain set are read from the second HashMap, and if an O object is nested in this layer or the tag chain set has not been fully processed, processing continues downward recursively.
Here, the same level means having a common parent node.
If the parent node is an O node or is empty, the values corresponding to the tag chains of the same level and the same depth under the parent node are read from the second HashMap. If the tag chain set has not been fully processed, processing continues recursively.
S606, after the traversal of each line of the tag chain set file mapfile is completed, writing the character string into the structured file, until the whole mapfile has been traversed, thereby obtaining the target structured file corresponding to the JSON file of the data structure to be converted.
In S606, the character string is the character string that has been built up for writing into the structured file.
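Steps S602 to S606 can be sketched as follows. This Python code is an illustration under assumptions and not the disclosed implementation: `oa_chains` stands in for the OA entries of the first HashMap, `values` and `counts` stand in for the second and third HashMap with keys of the form tag chain, "#", subscripts joined by "|", subscripts are assumed to start at 0, and the OAs used by one mapfile line are assumed to be nested in one another. All function names are hypothetical.

```python
SEP = "-->"  # tag separator used in the tag chains


def oa_path(chains, oa_chains):
    """OA tag chains that enclose any chain of the line, outermost first."""
    used = {oa for oa in oa_chains for c in chains if c.startswith(oa + SEP)}
    return sorted(used, key=lambda oa: oa.count(SEP))


def subscripts(path, counts, subs=()):
    """Yield every subscript tuple for the nested OAs in `path`."""
    if len(subs) == len(path):
        yield subs
        return
    suffix = "#" + "|".join(map(str, subs)) if subs else ""
    for index in range(counts.get(path[len(subs)] + suffix, 0)):
        yield from subscripts(path, counts, subs + (index,))


def convert_line(line, oa_chains, values, counts, sep="|"):
    """Turn one mapfile line into (structured file name, list of records)."""
    name, _, rest = line.partition(":")
    chains = rest.split("|")
    path = oa_path(chains, oa_chains)
    records = []
    for subs in subscripts(path, counts):
        fields = []
        for chain in chains:
            # A chain carries one subscript per OA that encloses it.
            depth = sum(chain.startswith(oa + SEP) for oa in path)
            key = chain + ("#" + "|".join(map(str, subs[:depth])) if depth else "")
            fields.append(str(values.get(key, "")))
        records.append(sep.join(fields) + sep)  # value plus field separator
    return name, records


oa_chains = {"ruleEngine", "ruleEngine-->ruleEngineDetail"}
values = {
    "transactionCode": "0005",
    "ruleEngine-->decisionTreeId#0": "520",
    "ruleEngine-->decisionTreeId#1": "1510",
    "ruleEngine-->ruleEngineDetail-->ruleId#0|0": "12876",
    "ruleEngine-->ruleEngineDetail-->ruleId#0|1": "12968",
    "ruleEngine-->ruleEngineDetail-->ruleId#1|0": "2876",
}
counts = {"ruleEngine": 2, "ruleEngine-->ruleEngineDetail#0": 2,
          "ruleEngine-->ruleEngineDetail#1": 1}
mapfile = [
    "TABLE0:transactionCode",
    "TABLE1:transactionCode|ruleEngine-->decisionTreeId",
    "TABLE2:transactionCode|ruleEngine-->decisionTreeId|ruleEngine-->ruleEngineDetail-->ruleId",
]
tables = dict(convert_line(line, oa_chains, values, counts) for line in mapfile)
for name, records in tables.items():
    print(name, records)
```

A line that contains only non-OA tag chains yields exactly one record, while a line for a nested OA yields one record per object of the innermost array, with the values of the enclosing layers repeated in each record.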
According to the data processing method disclosed in the embodiment of the present invention, the JSON file of the data structure to be converted and its JSON file tag are obtained, the JSON file tag indicating the JSON file type; the first HashMap file and the tag chain set file mapfile that are pre-stored in the configuration file and correspond to the JSON file type are obtained; and the JSON file of the data structure to be converted is converted into the target structured file based on the first HashMap file and the mapfile. In this way, JSON files of different formats can be converted quickly on the basis of the pre-stored first HashMap file and mapfile without modifying the code corresponding to each JSON file format, which improves the efficiency of converting unstructured files into structured files.
Based on the data processing method disclosed in the embodiment of the present invention, the following JSON file is taken as an example.
The content of the JSON file is as follows:
{"_id":"CREDIT_CARD_CHN:4807172220181118025222","_class":"com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog","serviceCode":"MONI","transactionCode":"0005","requestSeq":"4807172220181118025222","ruleEngine":[{"decisionTreeId":"520","sugg":{"_class":"com.bocsoft.ruleProcess.dto.MonitorResponseInfo","actionCode":"ALLOW"},"ruleEngineDetail":[{"ruleId":"12876","result":"false","ruleName":"combine_cond_2876"},{"ruleId":"12968","result":"false","ruleName":"combine_cond_2968"}]},{"decisionTreeId":"1510","sugg":{"_class":"com.bocsoft.ruleProcess.dto.MonitorResponseInfo","actionCode":"ALLOW"},"ruleEngineDetail":[{"ruleId":"2876","result":"false","ruleName":"combine_cond_2876"},{"ruleId":"AAAA","result":"false","ruleName":"combine_cond_2876"},{"ruleId":"BBBBB","result":"false","ruleName":"combine_cond_2876"},{"ruleId":"2968","result":"false","ruleName":"combine_cond_2968"}]}],"finalDecision":{"_class":"com.bocsoft.ruleProcess.dto.MonitorResponseInfo","actionCode":"ALLOW"},"bypassed":"false","isCheck":"false","extension":{"financeCommon":{"transAmount":"2100.0","opponentInstitution":"48021240"}}}
First, the JSON file type is determined based on the content of the JSON file, and the first HashMap file and the tag chain set file mapfile corresponding to that JSON file type are retrieved from the configuration file.
Specifically, the file structure of the first HashMap file is < tag chain, JsonSpot object >.
The contents of the first HashMap file are:
ruleEngine-->ruleEngineDetail-->ruleId:
JsonSpot[spotname=ruleEngine-->ruleEngineDetail-->ruleId,spottype=L,parent=ruleEngine-->ruleEngineDetail,degree=3,spotvaluetype=S]
extension-->financeCommon-->opponentInstitution:
JsonSpot[spotname=extension-->financeCommon-->opponentInstitution,spottype=L,parent=extension-->financeCommon,degree=3,spotvaluetype=S]
finalDecision-->actionCode:
JsonSpot[spotname=finalDecision-->actionCode,spottype=L,parent=finalDecision,degree=2,spotvaluetype=S]
extension-->financeCommon-->transAmount:
JsonSpot[spotname=extension-->financeCommon-->transAmount,spottype=L,parent=extension-->financeCommon,degree=3,spotvaluetype=S]
transactionCode:
JsonSpot[spotname=transactionCode,spottype=L,parent=,degree=1,spotvaluetype=S]
ruleEngine-->sugg:
JsonSpot[spotname=ruleEngine-->sugg,spottype=M,parent=ruleEngine,degree=2,spotvaluetype=O]
extension-->financeCommon:
JsonSpot[spotname=extension-->financeCommon,spottype=M,parent=extension,degree=2,spotvaluetype=O]
ruleEngine-->ruleEngineDetail-->result:
JsonSpot[spotname=ruleEngine-->ruleEngineDetail-->result,spottype=L,parent=ruleEngine-->ruleEngineDetail,degree=3,spotvaluetype=S]
bypassed:
JsonSpot[spotname=bypassed,spottype=L,parent=,degree=1,spotvaluetype=S]
isCheck:
JsonSpot[spotname=isCheck,spottype=L,parent=,degree=1,spotvaluetype=S]
ruleEngine-->sugg-->actionCode:
JsonSpot[spotname=ruleEngine-->sugg-->actionCode,spottype=L,parent=ruleEngine-->sugg,degree=3,spotvaluetype=S]
ruleEngine-->ruleEngineDetail:
JsonSpot[spotname=ruleEngine-->ruleEngineDetail,spottype=M,parent=ruleEngine,degree=2,spotvaluetype=OA]
finalDecision:
JsonSpot[spotname=finalDecision,spottype=M,parent=,degree=1,spotvaluetype=O]
ruleEngine-->decisionTreeId:
JsonSpot[spotname=ruleEngine-->decisionTreeId,spottype=L,parent=ruleEngine,degree=2,spotvaluetype=S]
ruleEngine-->sugg-->_class:
JsonSpot[spotname=ruleEngine-->sugg-->_class,spottype=L,parent=ruleEngine-->sugg,degree=3,spotvaluetype=S]
extension:
JsonSpot[spotname=extension,spottype=M,parent=,degree=1,spotvaluetype=O]
requestSeq:
JsonSpot[spotname=requestSeq,spottype=L,parent=,degree=1,spotvaluetype=S]
_class:
JsonSpot[spotname=_class,spottype=L,parent=,degree=1,spotvaluetype=S]
finalDecision-->_class:
JsonSpot[spotname=finalDecision-->_class,spottype=L,parent=finalDecision,degree=2,spotvaluetype=S]
_id:
JsonSpot[spotname=_id,spottype=L,parent=,degree=1,spotvaluetype=S]
serviceCode:
JsonSpot[spotname=serviceCode,spottype=L,parent=,degree=1,spotvaluetype=S]
ruleEngine:
JsonSpot[spotname=ruleEngine,spottype=M,parent=,degree=1,spotvaluetype=OA]
ruleEngine-->ruleEngineDetail-->ruleName:
JsonSpot[spotname=ruleEngine-->ruleEngineDetail-->ruleName,spottype=L,parent=ruleEngine-->ruleEngineDetail,degree=3,spotvaluetype=S]
where spotname is the tag chain name; spottype is the node type: M is a middle node and L is a leaf node; parent is the tag chain name of the parent node of the tag chain; degree is the depth, i.e. the number of tags on the tag chain; spotvaluetype is the tag type: S is a character, N is a numerical value, B is a Boolean, O is an object, SA is a character array, NA is a numerical array, and OA is an object array.
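For readers who want to check entries of the listing above mechanically, the following Python sketch parses one JsonSpot line into its fields and verifies that degree equals the number of tags on the tag chain. The textual form `JsonSpot[...]` is taken from the listing; the function name `parse_jsonspot` is hypothetical, and the sketch assumes that tag chains contain neither "," nor "=".

```python
import re

SEP = "-->"  # tag separator used in the tag chains


def parse_jsonspot(text):
    """Parse 'JsonSpot[k=v,...]' into a dict, with degree converted to int."""
    match = re.fullmatch(r"JsonSpot\[(.*)\]", text.strip())
    if match is None:
        raise ValueError(f"not a JsonSpot entry: {text!r}")
    fields = dict(item.split("=", 1) for item in match.group(1).split(","))
    fields["degree"] = int(fields["degree"])
    return fields


spot = parse_jsonspot(
    "JsonSpot[spotname=ruleEngine-->sugg,spottype=M,"
    "parent=ruleEngine,degree=2,spotvaluetype=O]"
)
top = parse_jsonspot(
    "JsonSpot[spotname=_id,spottype=L,parent=,degree=1,spotvaluetype=S]"
)
# degree is the number of tags, i.e. one more than the number of separators.
consistent = spot["spotname"].count(SEP) + 1 == spot["degree"]
print(spot, top["parent"] == "", consistent)
```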
Specifically, the file structure of mapfile is < structured filename: set of tag chains >.
The contents of the mapfile are as follows:
TABLE0:transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount
TABLE1:transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount|ruleEngine-->decisionTreeId|ruleEngine-->sugg-->actionCode|ruleEngine-->sugg-->_class
TABLE2:transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount|ruleEngine-->decisionTreeId|ruleEngine-->sugg-->actionCode|ruleEngine-->sugg-->_class|ruleEngine-->ruleEngineDetail-->ruleId|ruleEngine-->ruleEngineDetail-->result|ruleEngine-->ruleEngineDetail-->ruleName
Then, the JSON file of the data structure to be converted is parsed to obtain the second HashMap file and the third HashMap file.
The specific contents of the second HashMap file are as follows: maplebellvalue: < tag chain + subscripts of the OA tags: value >.
ruleEngine-->ruleEngineDetail-->result#1|1:false
ruleEngine-->ruleEngineDetail-->result#1|0:false
transactionCode:0005
bypassed:false
isCheck:false
finalDecision-->_class:com.bocsoft.ruleProcess.dto.MonitorResponseInfo
ruleEngine-->sugg-->_class#1:com.bocsoft.ruleProcess.dto.MonitorResponseInfo
ruleEngine-->ruleEngineDetail-->result#0|1:false
_id:CREDIT_CARD_CHN:4807172220181118025222
ruleEngine-->sugg-->_class#0:com.bocsoft.ruleProcess.dto.MonitorResponseInfo
ruleEngine-->ruleEngineDetail-->result#0|0:false
ruleEngine-->ruleEngineDetail-->result#1|2:false
ruleEngine-->ruleEngineDetail-->result#1|3:false
extension-->financeCommon-->opponentInstitution:48021240
finalDecision-->actionCode:ALLOW
extension-->financeCommon-->transAmount:2100.0
ruleEngine-->ruleEngineDetail-->ruleId#1|1:AAAA
ruleEngine-->ruleEngineDetail-->ruleId#1|0:2876
ruleEngine-->ruleEngineDetail-->ruleId#1|3:2968
ruleEngine-->ruleEngineDetail-->ruleId#0|0:12876
ruleEngine-->ruleEngineDetail-->ruleId#1|2:BBBBB
ruleEngine-->ruleEngineDetail-->ruleId#0|1:12968
ruleEngine-->ruleEngineDetail-->ruleName#1|3:combine_cond_2968
ruleEngine-->ruleEngineDetail-->ruleName#0|0:combine_cond_2876
ruleEngine-->ruleEngineDetail-->ruleName#1|2:combine_cond_2876
ruleEngine-->ruleEngineDetail-->ruleName#1|1:combine_cond_2876
_class:com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog
requestSeq:4807172220181118025222
ruleEngine-->ruleEngineDetail-->ruleName#1|0:combine_cond_2876
ruleEngine-->decisionTreeId#1:1510
ruleEngine-->decisionTreeId#0:520
serviceCode:MONI
ruleEngine-->sugg-->actionCode#0:ALLOW
ruleEngine-->sugg-->actionCode#1:ALLOW
ruleEngine-->ruleEngineDetail-->ruleName#0|1:combine_cond_2968
The specific contents of the third HashMap file are as follows: mapoanum: < OA-type tag chain: number of iterations >.
ruleEngine:2
ruleEngine-->ruleEngineDetail#0:2
ruleEngine-->ruleEngineDetail#1:4
It should be noted that these are all OA-type tag chains.
ruleEngine:2 indicates that the object array ruleEngine contains two objects.
ruleEngine-->ruleEngineDetail#0:2 indicates that ruleEngineDetail is an object array nested under ruleEngine and contains two objects when the ruleEngine subscript is 0.
ruleEngine-->ruleEngineDetail#1:4 indicates that ruleEngineDetail is an object array nested under ruleEngine and contains four objects when the ruleEngine subscript is 1.
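The three entries above determine how many records each OA layer produces. As a small illustrative sketch, with a dict standing in for the third HashMap and the hypothetical function name `record_count`, the number of records for a chain of nested OAs can be computed as follows:

```python
def record_count(oa_path, counts, subs=()):
    """Count records for the nested OAs listed outermost first in `oa_path`."""
    if len(subs) == len(oa_path):
        return 1
    suffix = "#" + "|".join(map(str, subs)) if subs else ""
    size = counts.get(oa_path[len(subs)] + suffix, 0)
    return sum(record_count(oa_path, counts, subs + (i,)) for i in range(size))


mapoanum = {
    "ruleEngine": 2,
    "ruleEngine-->ruleEngineDetail#0": 2,
    "ruleEngine-->ruleEngineDetail#1": 4,
}
outer = record_count(["ruleEngine"], mapoanum)
nested = record_count(["ruleEngine", "ruleEngine-->ruleEngineDetail"], mapoanum)
print(outer, nested)
```

With the values of this example, the ruleEngine layer yields 2 records and the nested ruleEngineDetail layer yields 2 + 4 = 6 records.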
Finally, based on the second HashMap file and the third HashMap file obtained by parsing the JSON file of the data structure to be converted, the tag chain set file mapfile is traversed to obtain each tag chain, and the JsonSpot object corresponding to each tag chain is looked up in the first HashMap file.
The JSON file is then converted into the target structured files defined by the mapfile, whose contents are as follows:
TABLE0:
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|
TABLE1:
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|520|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|1510|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|
TABLE2:
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|520|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|12876|false|combine_cond_2876|
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|520|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|12968|false|combine_cond_2968|
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|1510|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|2876|false|combine_cond_2876|
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|1510|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|AAAA|false|combine_cond_2876|
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|1510|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|BBBBB|false|combine_cond_2876|
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|1510|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|2968|false|combine_cond_2968|
Based on the data processing method disclosed in the embodiment of the present invention, the tag chain set file mapfile is traversed, using the second HashMap file and the third HashMap file obtained by parsing the JSON file of the data structure to be converted, to obtain each tag chain, and the JsonSpot object corresponding to each tag chain is then looked up in the first HashMap file. The JSON file is converted into the target structured file defined by the mapfile, so that JSON files of different formats can be converted quickly without modifying code specific to each JSON file format, which improves the efficiency of converting unstructured files into structured files.
Based on the data processing method disclosed in the embodiment of the present invention, the embodiment of the present invention correspondingly further discloses a data processing apparatus. As shown in fig. 7, which is a schematic structural diagram of the data processing apparatus provided in the embodiment of the present invention, the data processing apparatus mainly includes: a first obtaining module 70, a second obtaining module 71 and a conversion module 72.
The first obtaining module 70 is configured to obtain a JSON file of the data structure to be converted and a JSON file tag of the JSON file, where the JSON file tag is used to indicate a JSON file type.
The second obtaining module 71 is configured to obtain, based on the JSON file type, a first HashMap file and a tag chain set file mapfile that are pre-stored in the configuration file and correspond to the JSON file type, where the mapfile is used to define a mapping relationship between the target structured file and the JSON file.
The conversion module 72 is configured to convert the JSON file of the data structure to be converted into the target structured file based on the first HashMap file and the tag chain set file mapfile.
An optional structure of the second obtaining module 71 in the embodiment of the present invention is as follows: the second obtaining module 71 includes a searching unit and a first obtaining unit.
The searching unit is configured to search the configuration file for the directory in which the indicated JSON file type is stored.
The first obtaining unit is configured to obtain the first HashMap file and the tag chain set file mapfile of the corresponding JSON file type that are pre-stored in the directory; each directory of the configuration file stores a first HashMap file and a tag chain set file mapfile generated in advance from JSON files of a different JSON file type; the configuration file comprises an XML configuration file.
Optionally, the first obtaining unit includes: a first generating subunit.
The first generating subunit is configured to parse the JSON file to obtain node information of all nodes in the JSON file, the node information comprising the JsonSpot objects and the tag chains of the JsonSpot objects; and to store each JsonSpot object and its tag chain in the first HashMap as a key-value pair, with the tag chain as the key and the JsonSpot object as the value, so as to obtain the first HashMap file containing the node information of all nodes of the JSON file.
Optionally, the first obtaining unit includes: a second generating subunit.
The second generating subunit is configured to parse the JSON file to obtain the mapping relation between the JSON file and the structured file; set the number of lines N of the mapfile based on the number of JsonSpot objects whose type is object array (OA), where N is the number of OA-type JsonSpot objects plus 1; take the set of non-OA tag chains in the JSON file as the first layer of the mapfile; take the tag chain set of each OA type in the JSON file as the second layer of the mapfile, generating one line in the second layer for the tag chain set of each OA type, where, if one or more OAs are nested in an OA-type node, the tag chain set of the line for that node is included in the tag chain set of the line for every OA nested in it; and define a structured file name for each line in the first layer and the second layer, and generate a mapfile containing the correspondence between the structured file names and the tag chain sets.
An optional structure of the conversion module 72 in the embodiment of the present invention is as follows: the conversion module 72 includes an analysis unit, a first traversal unit, a determination unit, a second obtaining unit, a third obtaining unit and a fourth obtaining unit.
The analysis unit is configured to parse the JSON file of the data structure to be converted to obtain a second HashMap file and a third HashMap file, where the second HashMap file stores the values of the tag chains of all JsonSpot objects in the JSON file of the data structure to be converted, and the third HashMap file stores the number of objects of each OA array in the JSON file of the data structure to be converted.
The first traversal unit is configured to traverse each line of the tag chain set file mapfile, where each line corresponds to one structured file, and to obtain the tag chain set corresponding to that structured file.
The determination unit is configured to determine the type of each tag chain in the tag chain set according to the first HashMap.
The second obtaining unit is configured to, if the tag chain is of the non-OA type, obtain the value corresponding to the non-OA tag chain from the second HashMap file, and append the value and the field separator to the character string to be written into the structured file.
The third obtaining unit is configured to, if the tag chain is of the OA type, determine the number of objects of each OA based on the third HashMap file, look up the second HashMap file based on that number, and append each obtained value and the field separator to the character string to be written into the structured file.
The fourth obtaining unit is configured to write the character string into the structured file after the traversal of each line of the tag chain set file mapfile is completed, until the whole mapfile has been traversed, thereby obtaining the target structured file corresponding to the JSON file of the data structure to be converted.
According to the data processing device disclosed in the embodiment of the present invention, the JSON file and the JSON file tag of the data structure to be converted are obtained, the JSON file tag is used to indicate the JSON file type, the first HashMap file and the tag chain set file mapfile corresponding to the JSON file type stored in the configuration file in advance are obtained, and the JSON file of the data structure to be converted is converted into the target structured file based on the first HashMap file and the mapfile. By means of the method, conversion of JSON files in different formats can be completed quickly on the basis of the first HashMap file and the tag chain set file mapfile stored in the configuration file in advance without modifying codes corresponding to the JSON file formats, and the purpose of improving the efficiency of converting unstructured files into structured files is achieved.
Based on the data processing device disclosed in the above embodiment of the present invention, the data processing device further includes: an establishing module and a storage module.
The establishing module is configured to establish a directory corresponding to the JSON file in the configuration file if the first HashMap file and the tag chain set file mapfile corresponding to the JSON file type are not stored in the configuration file.
The storage module is configured to generate a first HashMap file and a tag chain set file mapfile based on the JSON file corresponding to the JSON file type, and to store the first HashMap file and the tag chain set file mapfile in the directory.
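The behaviour of the searching unit together with the establishing module can be sketched as a look-up that creates the directory entry when none exists. The description only states that the configuration file comprises an XML configuration file; the element name `directory`, the attributes `type`, `hashmap` and `mapfile`, and the function name `find_or_create` are therefore assumptions made purely for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout of the XML configuration file: one <directory> per JSON type.
CONFIG = """<config>
  <directory type="ruleEngineLog" hashmap="first.map" mapfile="mapfile"/>
</config>"""


def find_or_create(root, json_type):
    """Return (directory element, created flag) for the given JSON file type."""
    for directory in root.iter("directory"):
        if directory.get("type") == json_type:
            return directory, False
    # No entry for this type yet: establish a new directory for it.
    return ET.SubElement(root, "directory", {"type": json_type}), True


root = ET.fromstring(CONFIG)
known, known_created = find_or_create(root, "ruleEngineLog")
fresh, fresh_created = find_or_create(root, "creditLog")
print(known.get("mapfile"), known_created, fresh_created)
```

In this sketch a newly created entry carries only the type; the first HashMap file and the mapfile would be generated from a JSON file of that type and recorded in the entry afterwards.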
Based on the data processing device disclosed in the above embodiment of the present invention, the data processing device further includes: a modification module.
The modification module is configured to modify the structured file names, the order of the tag chains in the tag chain sets and the selected fields in the mapfile to obtain a new mapfile.
According to the data processing device disclosed in the embodiment of the present invention, the JSON file and the JSON file tag of the data structure to be converted are obtained, the JSON file tag is used to indicate the JSON file type, the first HashMap file and the tag chain set file mapfile corresponding to the JSON file type stored in the configuration file in advance are obtained, and the JSON file of the data structure to be converted is converted into the target structured file based on the first HashMap file and the mapfile. By means of the method, conversion of JSON files in different formats can be completed quickly on the basis of the first HashMap file and the tag chain set file mapfile stored in the configuration file in advance without modifying codes corresponding to the JSON file formats, and the purpose of improving the efficiency of converting unstructured files into structured files is achieved.
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device embodiments are substantially similar to the method embodiments, they are described briefly; for relevant details, refer to the corresponding description of the method embodiments.
Finally, it should also be noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A method of data processing, the method comprising:
acquiring a JSON file of a data structure to be converted and a JSON file tag of the JSON file, wherein the JSON file tag is used for indicating the type of the JSON file;
based on the JSON file type, acquiring a first HashMap file and a tag chain set file mapfile which are pre-stored in a configuration file and correspond to the JSON file type, wherein the mapfile is used for defining a mapping relation between a target structured file and the JSON file;
and converting the JSON file of the data structure to be converted into a target structured file based on the first HashMap file and the tag chain set file mapfile.
2. The method of claim 1, wherein the acquiring, based on the JSON file type, a first HashMap file and a tag chain set file mapfile which are pre-stored in a configuration file and correspond to the JSON file type comprises:
searching a directory indicating the type of the JSON file stored in the configuration file;
acquiring a first HashMap file and a tag chain set file mapfile which are pre-stored in the directory and correspond to the JSON file type;
wherein each directory of the configuration file stores a first HashMap file and a tag chain set file mapfile generated in advance based on JSON files of a respective JSON file type;
and the configuration file comprises an XML configuration file.
3. The method of claim 1, further comprising:
if the first HashMap file and the tag chain set file mapfile corresponding to the JSON file type are not stored in the configuration file, establishing a directory corresponding to the JSON file in the configuration file;
and generating the first HashMap file and the tag chain set file mapfile based on the JSON file corresponding to the JSON file type, and storing the first HashMap file and the tag chain set file mapfile in the directory.
4. The method of claim 2, wherein the process of generating the first HashMap file based on the JSON file comprises:
parsing the JSON file to obtain node information of all nodes in the JSON file, wherein the node information comprises JsonSpot objects and tag chains of the JsonSpot objects;
and taking the tag chain as a key and the JsonSpot object as a value, and storing the JsonSpot object and the tag chain in a first HashMap as a key-value pair, to obtain a first HashMap file comprising all node information of the JSON file.
5. The method according to claim 2, wherein the process of generating the tag chain set file mapfile based on the JSON file comprises:
parsing the JSON file to obtain a mapping relation between the JSON file and a structured file;
setting the number N of rows of the mapfile based on the number of JsonSpot objects whose type is object array (OA), wherein N is the number of OAs plus 1;
taking the tag chain set of non-OA type in the JSON file as a first layer of the mapfile;
taking the tag chain set of each OA type in the JSON file as a second layer of the mapfile, wherein the tag chain set of each OA type generates one corresponding row in the second layer, and if one or more OAs are nested in a node of OA type, the tag chain set of the row in which that node is located is contained in the tag chain sets of the rows of all the OAs nested in it;
defining a structured file name for each row in the first layer and the second layer, and generating the mapfile containing the correspondence between the structured file names and the tag chain sets.
6. The method of claim 5, further comprising:
modifying the structured file names, the arrangement order of the tag chain sets, and the screening fields in the mapfile to obtain a new mapfile.
7. The method according to any one of claims 1 to 6, wherein the converting the JSON file of the data structure to be converted into a target structured file based on the first HashMap file and the tag chain set file mapfile comprises:
parsing the JSON file of the data structure to be converted to obtain a second HashMap file and a third HashMap file, wherein the second HashMap file stores the values of the tag chains of all JsonSpot objects in the JSON file of the data structure to be converted, and the third HashMap file stores the number of objects of each OA in the JSON file of the data structure to be converted;
traversing each row of the tag chain set file mapfile, wherein each row of the tag chain set file mapfile corresponds to one structured file, and acquiring the tag chain set corresponding to the structured file;
determining the type of each tag chain in the tag chain set according to the first HashMap file;
if the tag chain is of a non-OA type, obtaining the value in the second HashMap file whose key is equal to the tag chain of the non-OA type, and adding the value and a field separator to a character string to be written into the structured file;
if the tag chain is of the OA type, determining the number of objects of each OA based on the third HashMap file, searching the second HashMap file based on the number of objects of each OA, and adding the obtained value and the field separator to the character string to be written into the structured file;
and after the traversal of each row of the tag chain set file mapfile is completed, writing the character string into the structured file, until the traversal of the tag chain set file mapfile is completed, to obtain the target structured file corresponding to the JSON file of the data structure to be converted.
8. A data processing apparatus, characterized in that the apparatus comprises:
a first acquisition module, configured to acquire a JSON file of a data structure to be converted and a JSON file tag of the JSON file, wherein the JSON file tag is used for indicating the type of the JSON file;
a second acquisition module, configured to acquire, based on the JSON file type, a first HashMap file and a tag chain set file mapfile which are pre-stored in a configuration file and correspond to the JSON file type, wherein the mapfile is used for defining the mapping relation between a target structured file and the JSON file;
and a conversion module, configured to convert the JSON file of the data structure to be converted into a target structured file based on the first HashMap file and the tag chain set file mapfile.
9. The apparatus of claim 8, wherein the second acquisition module comprises:
the searching unit is used for searching a directory which indicates the type of the JSON file and is stored in the configuration file;
the first acquisition unit is used for acquiring a first HashMap file and a tag chain set file mapfile which are pre-stored in the directory and correspond to the JSON file type; the first HashMap file and the tag chain set file mapfile generated in advance based on JSON files of different JSON file types are stored in each directory of the configuration file; the configuration file comprises an XML configuration file.
10. The apparatus of claim 8, wherein the conversion module comprises:
an analysis unit, configured to parse the JSON file of the data structure to be converted to obtain a second HashMap file and a third HashMap file, wherein the second HashMap file stores the values of the tag chains of all JsonSpot objects in the JSON file of the data structure to be converted, and the third HashMap file stores the number of objects of each OA in the JSON file of the data structure to be converted;
a first traversal unit, configured to traverse each row of the tag chain set file mapfile, wherein each row of the tag chain set file mapfile corresponds to one structured file, and to acquire the tag chain set corresponding to the structured file;
a judging unit, configured to determine the type of each tag chain in the tag chain set according to the first HashMap file;
a second obtaining unit, configured to obtain, if the tag chain is of a non-OA type, the value in the second HashMap file whose key is equal to the tag chain of the non-OA type, and to add the value and a field separator to a character string to be written into the structured file;
a third obtaining unit, configured to determine, if the tag chain is of the OA type, the number of objects of each OA based on the third HashMap file, to search the second HashMap file based on the number of objects of each OA, and to add the obtained value and the field separator to the character string to be written into the structured file;
and a fourth obtaining unit, configured to write the character string into the structured file after the traversal of each row of the tag chain set file mapfile is completed, until the traversal of the tag chain set file mapfile is completed, to obtain the target structured file corresponding to the JSON file of the data structure to be converted.
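The conversion steps recited in claims 7 and 10 can be sketched as follows. This is a simplified illustration under several assumptions that are not in the source: the input document is flat with at most one level of object arrays, each mapfile row touches at most one OA, the first HashMap is reduced to a map from tag chain to a type code, indexed keys such as `items[0].sku` are used in the second map, the field separator is `|`, and the result is returned as text per structured file name instead of being written to disk.

```python
def parse_instance(doc, oa_chains):
    """Build the second map (indexed tag chain -> value) and third map (OA -> count)."""
    second, third = {}, {}
    for key, value in doc.items():
        if key in oa_chains:
            third[key] = len(value)
            for i, element in enumerate(value):
                for field, v in element.items():
                    second[f"{key}[{i}].{field}"] = v
        else:
            second[key] = value
    return second, third

def convert(doc, first_hashmap, mapfile, sep="|"):
    """Return {structured file name: delimited text}, one entry per mapfile row."""
    oa_chains = {c for c, t in first_hashmap.items() if t == "OA"}
    second, third = parse_instance(doc, oa_chains)
    outputs = {}
    for file_name, chains in mapfile:
        owners = {c.split(".")[0] for c in chains} & oa_chains
        if not owners:
            # Non-OA row: one record, values looked up directly by tag chain.
            lines = [sep.join(str(second[c]) for c in chains)]
        else:
            oa = owners.pop()  # sketch limitation: at most one OA per row
            lines = []
            for i in range(third[oa]):  # one record per object in the array
                cells = []
                for c in chains:
                    if c.startswith(oa + "."):
                        cells.append(second[f"{oa}[{i}].{c[len(oa) + 1:]}"])
                    else:
                        cells.append(second[c])  # non-OA value repeated per record
                lines.append(sep.join(str(x) for x in cells))
        outputs[file_name] = "\n".join(lines)
    return outputs
```

For example, converting `{"id": 7, "items": [{"sku": "a", "qty": 2}, {"sku": "b", "qty": 1}]}` with a row `("order_items.txt", ["id", "items.sku", "items.qty"])` yields the two records `7|a|2` and `7|b|1` for that file.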
CN201911212915.8A 2019-12-02 2019-12-02 Data processing method and device Active CN110909523B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911212915.8A CN110909523B (en) 2019-12-02 2019-12-02 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911212915.8A CN110909523B (en) 2019-12-02 2019-12-02 Data processing method and device

Publications (2)

Publication Number Publication Date
CN110909523A (en) 2020-03-24
CN110909523B (en) 2023-10-27

Family

ID=69821264

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911212915.8A Active CN110909523B (en) 2019-12-02 2019-12-02 Data processing method and device

Country Status (1)

Country Link
CN (1) CN110909523B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103389991A (en) * 2012-05-09 2013-11-13 阿里巴巴集团控股有限公司 Data interaction method, data interaction device, data conversion method and data conversion device
US20180018165A1 (en) * 2015-01-08 2018-01-18 Fasoo. Com Co., Ltd Source code transfer control method, computer program therefor, and recording medium therefor
CN105787128A (en) * 2016-03-29 2016-07-20 四川秘无痕信息安全技术有限责任公司 Method for recovering Java serialized file data
CN106934011A (en) * 2017-03-09 2017-07-07 济南浪潮高新科技投资发展有限公司 A kind of structuring analysis method and device of JSON data
CN108037915A (en) * 2017-11-07 2018-05-15 福建天泉教育科技有限公司 A kind of method and terminal of acquisition json configuration files

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113448923A (en) * 2020-04-17 2021-09-28 北京新氧科技有限公司 File generation method and device and terminal
CN113448923B (en) * 2020-04-17 2023-09-12 北京新氧科技有限公司 File generation method, device and terminal
CN114185855A (en) * 2022-02-15 2022-03-15 中博信息技术研究院有限公司 Simplified method and system for generating OFD file based on JSON
CN114185855B (en) * 2022-02-15 2022-05-24 中博信息技术研究院有限公司 Simplified method and system for generating OFD file based on JSON

Also Published As

Publication number Publication date
CN110909523B (en) 2023-10-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant