CN107704620A

CN107704620A - Archive management method, apparatus, device and storage medium

Info

Publication number: CN107704620A
Application number: CN201711021845.9A
Authority: CN
Inventors: 张立志; 万月亮; 王梅
Original assignee: Beijing Ruian Technology Co Ltd
Current assignee: Beijing Ruian Technology Co Ltd
Priority date: 2017-10-27
Filing date: 2017-10-27
Publication date: 2018-02-16
Anticipated expiration: 2037-10-27
Also published as: CN107704620B

Abstract

The invention discloses an archive management method, apparatus, device and storage medium. Raw archive data is obtained, where each raw archive data record contains two pieces of attribute information; after the raw archive data is processed, it is screened according to business requirements and formed into archive graph data, which serves as the archive; attribute information entered for a query is received; and the information corresponding to that attribute information is looked up in the archive to generate a query result. The technical solution of the embodiments of the present invention solves the problem in the prior art that retrieval of all attribute information cannot be guaranteed and that the need for repeated queries makes query time long. It achieves effective management of personnel archive information and ensures that all attribute information can be obtained quickly when personnel archive information is queried.

Description

An archive management method, apparatus, device and storage medium

Technical field

Embodiments of the present invention relate to big data analysis and graph computing technology, and in particular to an archive management method, apparatus, device and storage medium.

Background art

Today's society is developing at high speed: science and technology are advanced, information flows freely, exchanges between people are increasingly close, and life is ever more convenient. Big data is a product of this network era, and these data contain a large amount of information. Personnel archives are usually established to meet business requirements, and personnel archive information is queried and analyzed in the established archives. Personnel archives contain a large amount of related attribute information. To query personnel archive information effectively, these massive data need specialized management.

Fig. 1 is a schematic diagram of personnel archive data in the prior art. In the prior art, the attribute information in a personnel archive is stored discretely; only the original associations between pieces of attribute information are retained, and no additional association relationships are added. As shown in Fig. 1, the attribute information of a given person includes attribute information A, B, C, D, E, F, G and H. Attribute information A is associated with attribute information B and C; attribute information B is associated with attribute information D and E; attribute information C is associated with attribute information F; attribute information E is associated with attribute information G; and attribute information G is associated with attribute information H.

In a prior-art personnel archive, obtaining all the attribute information of the person shown in Fig. 1 requires first querying attribute information B and C through attribute information A; then querying attribute information D, E and F through attribute information B and C; then querying attribute information G through attribute information E; and finally querying attribute information H through attribute information G. When the data nodes in a personnel archive are uncertain and the data volume is very large, this way of querying essentially cannot guarantee that all the attribute information of a person is obtained. In addition, the need for repeated queries makes the query time long.
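The repeated querying described above can be illustrated with a minimal Python sketch. It assumes the direct associations of Fig. 1 exactly as listed in the text; the dictionary layout and the function name `collect_by_repeated_queries` are hypothetical and only serve to count how many rounds of lookups are needed.

```python
# Direct associations of Fig. 1 as listed in the text
# (the letters stand for pieces of attribute information).
DIRECT = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F"],
    "E": ["G"],
    "G": ["H"],
}


def collect_by_repeated_queries(start, direct):
    """Follow direct associations round by round.

    Returns the set of attributes reached and the number of query
    rounds that discovered at least one new attribute.
    """
    found = {start}
    frontier = [start]
    rounds = 0
    while frontier:
        next_frontier = []
        for attr in frontier:
            for neighbor in direct.get(attr, []):
                if neighbor not in found:
                    found.add(neighbor)
                    next_frontier.append(neighbor)
        if next_frontier:
            rounds += 1
        frontier = next_frontier
    return found, rounds


attrs, rounds = collect_by_repeated_queries("A", DIRECT)
print(sorted(attrs), rounds)
```

Starting from attribute information A, the sketch needs four successive rounds of lookups before H is reached, which matches the four query stages described in the text.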

Summary of the invention

In view of this, the present invention provides an archive management method, apparatus, device and storage medium, so as to manage personnel archive information effectively and ensure that all attribute information can be obtained quickly when personnel archive information is queried.

In a first aspect, an embodiment of the present invention provides an archive management method, including:

obtaining raw archive data, where each raw archive data record contains two pieces of attribute information;

after processing the raw archive data, screening the raw archive data according to business requirements and forming archive graph data as the archive;

receiving attribute information entered for a query;

looking up, in the archive, the information corresponding to the attribute information entered for the query, and generating a query result.

Further, processing the raw archive data and then screening it according to business requirements and forming archive graph data to generate the archive includes:

removing duplicate data from the raw archive data;

selecting bridge points according to business requirements, where a bridge point is an attribute information type of high business value;

screening the raw archive data according to the bridge points to select single-bridge data and double-bridge data;

forming archive graph data as the archive according to the bridge points, the single-bridge data and the double-bridge data.

Further, the single-bridge data is raw archive data in which only one of the two pieces of attribute information it contains is a bridge point;

the double-bridge data is raw archive data in which both of the two pieces of attribute information it contains are bridge points.

Further, forming archive graph data according to the bridge points, the single-bridge data and the double-bridge data to generate the archive includes:

extracting all attribute information that belongs to bridge points as the vertices of the archive graph data;

forming the edges of the archive graph data according to the single-bridge data and the double-bridge data, and connecting the attribute information in the single-bridge data that does not belong to a bridge point to the corresponding vertex.

Further, looking up, in the archive, the information corresponding to the attribute information entered for the query and generating a query result includes:

obtaining, according to the attribute information entered for the query, the archive graph data corresponding to that attribute information;

generating the query result from the attribute information in the archive graph data.

In a second aspect, an embodiment of the present invention further provides an archive management apparatus, including:

a data acquisition module, configured to obtain raw archive data, where each raw archive data record contains two pieces of attribute information;

a data processing module, configured to process the raw archive data and then screen it according to business requirements and form archive graph data as the archive;

a query information input module, configured to receive attribute information entered for a query;

a query result generation module, configured to look up, in the archive, the information corresponding to the attribute information entered for the query and generate a query result.

Further, the data processing module includes:

a data deduplication unit, configured to remove duplicate data from the raw archive data;

a bridge point selection unit, configured to select bridge points according to business requirements, where a bridge point is an attribute information type of high business value;

a data screening unit, configured to screen the raw archive data according to the bridge points to select single-bridge data and double-bridge data;

a graph data generation unit, configured to form archive graph data as the archive according to the bridge points, the single-bridge data and the double-bridge data.

In a third aspect, an embodiment of the present invention further provides an archive management device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the archive management method described in the embodiments of the present invention.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the archive management method described in the embodiments of the present invention.

The archive management method, apparatus, device and storage medium provided above obtain raw archive data; process the raw archive data and then screen it according to business requirements and form archive graph data as the archive; receive attribute information entered for a query; and look up, in the archive, the information corresponding to that attribute information to generate a query result. This solves the problem in the prior art that retrieval of all attribute information cannot be guaranteed and that the need for repeated queries makes query time long, achieves effective management of personnel archive information, and ensures that all attribute information can be obtained quickly when personnel archive information is queried.

Brief description of the drawings

Fig. 1 is a schematic diagram of personnel archive data in the prior art;

Fig. 2 is a flowchart of an archive management method provided by Embodiment one of the present invention;

Fig. 3 is a flowchart of an archive management method provided by Embodiment two of the present invention;

Fig. 4 is a schematic structural diagram of an archive management apparatus provided by Embodiment three of the present invention;

Fig. 5 is a schematic structural diagram of an archive management device provided by Embodiment four of the present invention.

Detailed description of the embodiments

The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.

Embodiment one

Fig. 2 is a flowchart of an archive management method provided by Embodiment one of the present invention. This embodiment is applicable to situations in which archives are managed. The method may be performed by an archive management apparatus, which is implemented in software and/or hardware and can generally be integrated into a data synchronization device. Data synchronization devices include, but are not limited to, computers. Referring to Fig. 2, the method specifically includes the following steps:

Step 110: obtain raw archive data, where each raw archive data record contains two pieces of attribute information.

The raw archive data is the associated data used to generate the archive. It may come from social networks, e-commerce websites or customer visit records, and may also have many other sources. Raw archive data is obtained from these sources by a raw archive data collection method. Such collection methods include system log collection, network data collection, and collection through specific means, such as dedicated system interfaces, in cooperation with enterprises or research institutions. Because the volume of raw archive data is large, the massive raw archive data collected is usually loaded into a big data platform according to business requirements for storage and data processing. A big data platform is a platform whose purpose is to store, compute on and present big data. It integrates functions such as data integration, data processing, data storage, data analysis and visualization, and is used to uncover the business logic behind the data, find the problems behind the data, and make timely adjustments.

Specifically, this embodiment uses the distributed computing platform Hadoop as the big data platform. The raw archive data is loaded into the Hadoop Distributed File System (HDFS). The compute engine Spark, which integrates seamlessly with Hadoop, then uses the file-reading instruction textFile to read the raw archive data stored in HDFS into a Resilient Distributed Dataset (RDD), and the raw archive data is processed with Spark.

Hadoop is a distributed computing platform that users can easily set up and use. Data is stored in Hadoop's HDFS, and users can easily develop and run applications that process massive data on Hadoop. Hadoop offers advantages such as high reliability, high efficiency, high fault tolerance and low cost.

The compute engine Spark is a fast, general-purpose compute engine designed for large-scale data processing. It integrates seamlessly with the Hadoop platform and carries out various computations by operating on RDDs, including Structured Query Language (SQL) queries, text processing, graph computing and machine learning. Spark is characterized by high efficiency, ease of use and generality.

Step 120: after processing the raw archive data, screen the raw archive data according to business requirements and form archive graph data as the archive.

In real business, raw archive data often contains duplicates, and deduplication is the step that processes the raw archive data. Specifically, the raw archive data is processed by using the RDD deduplication instruction distinct in Spark to remove the duplicate data in the raw archive data.
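The effect of the deduplication step can be sketched in plain Python, without Spark. The sketch assumes each record is a pair of attribute information items and that each item is a (type, value) tuple; this record layout, the attribute type names and the function name `deduplicate` are hypothetical, since the text does not specify a data format.

```python
def deduplicate(records):
    """Drop exact duplicate records, keeping the first occurrence."""
    seen = set()
    unique = []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique


# Hypothetical records: each holds two pieces of attribute information.
raw = [
    (("phone", "p1"), ("id_card", "i1")),
    (("phone", "p1"), ("address", "a1")),
    (("phone", "p1"), ("id_card", "i1")),  # exact duplicate of the first record
]
unique_records = deduplicate(raw)
print(len(raw), "->", len(unique_records))
```

Only exact duplicates are removed here. The sketch also keeps first-seen order, which a distributed implementation need not do.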

The volume of raw archive data is very large. The raw archive data is screened according to business requirements so that raw archive data of high business value is retained and raw archive data of low business value is filtered out, which improves computing efficiency while still meeting business needs. Specifically, after the duplicate data in the raw archive data is removed, attribute information types of high business value are selected and used to screen the raw archive data. If at least one of the two pieces of attribute information in a raw archive data record belongs to an attribute information type of high business value, the record proceeds to the next data processing step; if neither of the two pieces of attribute information belongs to such a type, the record does not proceed to the next data processing step.

Graph computing is used to associate the discrete raw archive data according to the associations within it, forming the corresponding archive graph data as the archive. Graph computing is based on graph theory: it is an abstract representation of the real world as a "graph" structure, together with a mode of computation on that data structure. In graph computing, the basic data structure is usually expressed as:

G = (V, E, D)

V = vertices (nodes)

E = edges

D = data (weights).

The data structure used in graph computing consists of vertices and edges. A vertex carries vertex attributes. An edge carries a weight and a direction; that is, an edge holds the data association between vertices. Edges connect associated vertices according to the associations in the data, forming the graph data structure. A graph data structure expresses the associations between data well. Association computation is the core of big data computing: by obtaining the associations in data, useful information can be extracted from massive data.
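The structure G = (V, E, D) can be made concrete with a small Python sketch. The class name `Graph`, its method names and the default weight are hypothetical; the sketch only mirrors the description above, in which vertices carry attributes and directed edges carry a weight.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    # V: vertex id -> vertex attributes
    vertices: dict = field(default_factory=dict)
    # E together with D: directed edge (source, target) -> weight
    edges: dict = field(default_factory=dict)

    def add_vertex(self, vid, **attrs):
        """Create the vertex if needed and merge in its attributes."""
        self.vertices.setdefault(vid, {}).update(attrs)

    def add_edge(self, src, dst, weight=1.0):
        """Add a directed, weighted edge; missing endpoints are created."""
        self.add_vertex(src)
        self.add_vertex(dst)
        self.edges[(src, dst)] = weight

    def neighbors(self, vid):
        """Vertices reachable from vid over one outgoing edge."""
        return [dst for (src, dst) in self.edges if src == vid]


g = Graph()
g.add_vertex("A", kind="attribute information")
g.add_edge("A", "B")
g.add_edge("A", "C", weight=2.0)
print(g.neighbors("A"))
```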

Specifically, Spark is used to extract the attribute information of the screened raw archive data as vertices, and the vertices are connected according to the association relationships between the attribute information in the raw archive data, forming archive graph data with the graph data structure as the archive. The archive graph data is saved using the RDD save instruction save in Spark. The vertices of archive graph data are connected by edges into a whole, and each vertex contains the corresponding attribute information; that is, the attribute information in archive graph data is connected into a whole.

Step 130: receive attribute information entered for a query.

Every piece of archive graph data in the archive contains the corresponding attribute information. When information is queried in the archive, entering attribute information for the query according to the query requirement makes it possible to retrieve all attribute information in the archive that is related to the entered attribute information.

Step 140: look up, in the archive, the information corresponding to the attribute information entered for the query, and generate a query result.

Because the attribute information in archive graph data is connected into a whole, a single query that obtains the archive graph data corresponding to the entered attribute information yields all attribute information related to the entered attribute information. According to the attribute information entered for the query, the archive graph data corresponding to it is found in the archive and a query result is generated. The query result includes all attribute information in the archive graph data obtained.
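One way to obtain this single-lookup behavior is to group connected attribute information when the archive is built, so that a query becomes one dictionary access. The following Python sketch uses a union-find structure for the grouping. This is an assumption chosen for illustration: the text does not state how connected archive graph data is located, and the names `build_archive_index` and `query` are hypothetical.

```python
def build_archive_index(records):
    """Map every attribute to the full set of attributes connected to it."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in records:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for attr in list(parent):
        groups.setdefault(find(attr), set()).add(attr)
    return {attr: members for members in groups.values() for attr in members}


def query(index, attr):
    """Single lookup: all attribute information connected to attr."""
    return index.get(attr, set())


# Associations of Fig. 1 plus one unrelated pair (X, Y).
records = [("A", "B"), ("A", "C"), ("B", "D"), ("B", "E"),
           ("C", "F"), ("E", "G"), ("G", "H"), ("X", "Y")]
index = build_archive_index(records)
print(sorted(query(index, "H")))
```

With this layout the work of following associations is done once at build time, and querying by any attribute, including H at the far end of the chain, returns the whole group in one step.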

The archive management method provided by this embodiment obtains raw archive data; processes the raw archive data and then screens it according to business requirements and forms archive graph data as the archive; receives attribute information entered for a query; and looks up, in the archive, the information corresponding to that attribute information to generate a query result. This solves the problem in the prior art that retrieval of all attribute information cannot be guaranteed and that the need for repeated queries makes query time long, achieves effective management of personnel archive information, and ensures that all attribute information can be obtained quickly when personnel archive information is queried.

Embodiment two

Fig. 3 is a flowchart of an archive management method provided by Embodiment two of the present invention. This embodiment is a more specific implementation based on the above embodiments. As shown in Fig. 3, the method specifically includes:

Step 210: obtain raw archive data, where each raw archive data record contains two pieces of attribute information.

Step 220: remove the duplicate data in the raw archive data.

The raw archive data is processed by using the RDD deduplication instruction distinct in Spark to remove the duplicate data in the raw archive data.

Step 230: select bridge points according to business requirements, where a bridge point is an attribute information type of high business value.

In the deduplicated raw archive data, attribute information types of high business value are selected as bridge points. Attribute information that does not belong to a bridge point is a non-bridge point.

Step 240: screen the raw archive data according to the bridge points to select single-bridge data and double-bridge data.

The single-bridge data is raw archive data in which only one of the two pieces of attribute information it contains is a bridge point; the double-bridge data is raw archive data in which both of the two pieces of attribute information it contains are bridge points.

The raw archive data is screened according to the selected bridge points, and the single-bridge data and double-bridge data selected proceed to the next data processing step. If only one of the two pieces of attribute information in a raw archive data record is a bridge point, the record is single-bridge data; if both pieces of attribute information in a record are bridge points, the record is double-bridge data; if neither piece of attribute information in a record belongs to a bridge point, that is, both are non-bridge points, the record does not proceed to the next data processing step. In this way, raw archive data of high business value is retained and raw archive data of low business value is filtered out, which improves computing efficiency while still meeting business needs.
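The screening rule of step 240 can be sketched in Python as follows. The sketch assumes each piece of attribute information is a (type, value) tuple and that bridge points are given as a set of type names; the type names "id_card", "phone", "address" and "nickname" and the function names are hypothetical examples, not taken from the text.

```python
def classify(record, bridge_types):
    """Return 'double', 'single' or None for one two-attribute record."""
    (type_a, _value_a), (type_b, _value_b) = record
    hits = int(type_a in bridge_types) + int(type_b in bridge_types)
    if hits == 2:
        return "double"
    if hits == 1:
        return "single"
    return None  # both attributes are non-bridge points


def screen(records, bridge_types):
    """Split records into single-bridge and double-bridge data."""
    single, double = [], []
    for record in records:
        kind = classify(record, bridge_types)
        if kind == "single":
            single.append(record)
        elif kind == "double":
            double.append(record)
    return single, double


BRIDGE_TYPES = {"id_card", "phone"}  # hypothetical high-value types

records = [
    (("id_card", "i1"), ("phone", "p1")),     # both are bridge points
    (("phone", "p1"), ("address", "a1")),     # exactly one bridge point
    (("address", "a1"), ("nickname", "n1")),  # no bridge point, dropped
]
single_bridge, double_bridge = screen(records, BRIDGE_TYPES)
print(len(single_bridge), len(double_bridge))
```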

Step 250: form archive graph data as the archive according to the bridge points, the single-bridge data and the double-bridge data.

Vertices are selected from the single-bridge data and double-bridge data according to the bridge points; then, according to the association relationships between the attribute information in the single-bridge data and double-bridge data, the attribute information in them is connected into a whole, forming archive graph data as the archive.

Preferably, forming archive graph data according to the bridge points, the single-bridge data and the double-bridge data to generate the archive includes:

Specifically, all attribute information belonging to bridge points is first extracted from the single-bridge data and double-bridge data as the vertices of the archive graph data. The edges of the archive graph data are then formed according to the single-bridge data and double-bridge data. Because both pieces of attribute information in a double-bridge data record are bridge points, a double-bridge data record contains two vertices; the association between the attribute information in double-bridge data therefore forms an edge of the archive graph data that connects vertex to vertex. Because only one of the two pieces of attribute information in a single-bridge data record is a bridge point and the other does not belong to a bridge point, a single-bridge data record contains one vertex and one non-bridge point; the association between the attribute information in single-bridge data therefore forms an edge of the archive graph data that connects the attribute information not belonging to a bridge point to the corresponding vertex.
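A minimal Python sketch of this construction is given below. It assumes the same hypothetical (type, value) layout for attribute information and represents the result as a dictionary from bridge-point vertex to its attached non-bridge attribute information, plus a set of vertex-to-vertex edges. The representation and the name `build_archive_graph` are illustrative choices, not the data model of the text.

```python
def build_archive_graph(single_bridge, double_bridge, bridge_types):
    """Build vertices (with attached non-bridge attributes) and edges."""
    vertices = {}  # bridge attribute -> set of attached non-bridge attributes
    edges = set()  # each edge is a frozenset of two bridge attributes

    # Double-bridge data: both attributes are vertices, joined by an edge.
    for a, b in double_bridge:
        vertices.setdefault(a, set())
        vertices.setdefault(b, set())
        edges.add(frozenset((a, b)))

    # Single-bridge data: the non-bridge attribute hangs off its vertex.
    for a, b in single_bridge:
        bridge, other = (a, b) if a[0] in bridge_types else (b, a)
        vertices.setdefault(bridge, set()).add(other)

    return vertices, edges


BRIDGE_TYPES = {"id_card", "phone"}  # hypothetical bridge point types
double_bridge = [(("id_card", "i1"), ("phone", "p1"))]
single_bridge = [
    (("phone", "p1"), ("address", "a1")),
    (("nickname", "n1"), ("id_card", "i1")),
]
vertices, edges = build_archive_graph(single_bridge, double_bridge, BRIDGE_TYPES)
print(vertices)
print(edges)
```

In this example the two bridge points become vertices joined by one edge, and the address and nickname are attached to the phone and ID card vertices respectively.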

Step 260: receive attribute information entered for a query.

Specifically, a web page for entering attribute information is provided so that the user can enter the attribute information for a query, which improves the user experience. After the user enters a piece of attribute information and confirms to start the query, the attribute information entered for the query is received.

Step 270: according to the attribute information entered for the query, obtain the archive graph data corresponding to that attribute information.

Because the attribute information in archive graph data is connected into a whole, a single query that obtains the archive graph data corresponding to the entered attribute information yields all attribute information related to the entered attribute information.

Step 280: generate the query result from the attribute information in the archive graph data.

The query result includes all attribute information in the archive graph data obtained. Specifically, the query result is displayed on a web page for the user to view.
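Steps 270 and 280 can be sketched in Python as a traversal over a graph of bridge-point vertices. The sketch assumes a hypothetical representation in which `vertices` maps each bridge-point vertex to its attached non-bridge attribute information and `edges` is a set of two-element frozensets; the string labels and the function name `query_archive` are illustrative only.

```python
from collections import deque


def query_archive(attr, vertices, edges):
    """Return all attribute information in the archive graph containing attr."""
    # The query attribute may be a vertex itself or attached to a vertex.
    if attr in vertices:
        start = [attr]
    else:
        start = [v for v, attached in vertices.items() if attr in attached]

    adjacency = {v: set() for v in vertices}
    for edge in edges:
        a, b = tuple(edge)  # assumes every edge joins two distinct vertices
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Breadth-first traversal over the connected bridge-point vertices.
    seen = set(start)
    queue = deque(start)
    while queue:
        v = queue.popleft()
        for n in adjacency[v]:
            if n not in seen:
                seen.add(n)
                queue.append(n)

    result = set(seen)
    for v in seen:
        result |= vertices[v]  # add the attached non-bridge attributes
    return result


vertices = {
    "phone:p1": {"address:a1"},
    "id_card:i1": {"nickname:n1"},
    "phone:p2": {"address:a2"},
}
edges = {frozenset(("phone:p1", "id_card:i1"))}
print(sorted(query_archive("address:a1", vertices, edges)))
```

The caller issues one query, here by a non-bridge attribute, and receives every piece of attribute information in the same connected archive graph; the unrelated vertex "phone:p2" is not returned.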

The archive management method provided by this embodiment obtains raw archive data; removes the duplicate data in the raw archive data; selects bridge points according to business requirements; screens the raw archive data according to the bridge points to select single-bridge data and double-bridge data; forms archive graph data as the archive according to the bridge points, the single-bridge data and the double-bridge data; receives attribute information entered for a query; obtains, according to that attribute information, the corresponding archive graph data; and generates a query result. This solves the problem in the prior art that retrieval of all attribute information cannot be guaranteed and that the need for repeated queries makes query time long, achieves effective management of personnel archive information, and ensures that all attribute information can be obtained quickly when personnel archive information is queried.

Embodiment three

Fig. 4 is a schematic structural diagram of an archive management apparatus provided by Embodiment three of the present invention. This embodiment is applicable to situations in which data is synchronized. As shown in Fig. 4, the apparatus includes:

a data acquisition module 310, a data processing module 320, a query information input module 330 and a query result generation module 340.

The data acquisition module 310 is configured to obtain raw archive data, where each raw archive data record contains two pieces of attribute information. The data processing module 320 is configured to process the raw archive data and then screen it according to business requirements and form archive graph data to generate the archive. The query information input module 330 is configured to receive attribute information entered for a query. The query result generation module 340 is configured to look up, in the archive, the information corresponding to the attribute information entered for the query and generate a query result.

The archive management apparatus provided by this embodiment obtains raw archive data; processes the raw archive data and then screens it according to business requirements and forms archive graph data as the archive; receives attribute information entered for a query; and looks up, in the archive, the information corresponding to that attribute information to generate a query result. This solves the problem in the prior art that retrieval of all attribute information cannot be guaranteed and that the need for repeated queries makes query time long, achieves effective management of personnel archive information, and ensures that all attribute information can be obtained quickly when personnel archive information is queried.

On the basis of the above embodiments, the data processing module 320 may include:

a bridge point selection unit, configured to select bridge points according to business requirements, where a bridge point is an attribute information type of high business value;

a graph data generation unit, configured to form archive graph data according to the bridge points, the single-bridge data and the double-bridge data to generate the archive.

On the basis of the above embodiments, the single-bridge data may be raw archive data in which only one of the two pieces of attribute information it contains is a bridge point;

the double-bridge data may be raw archive data in which both of the two pieces of attribute information it contains are bridge points.

On the basis of the above embodiments, the graph data generation unit may include:

a vertex generation subunit, configured to extract all attribute information that belongs to bridge points as the vertices of the archive graph data;

a graph connection subunit, configured to form the edges of the archive graph data according to the single-bridge data and the double-bridge data, and to connect the attribute information in the single-bridge data that does not belong to a bridge point to the corresponding vertex.

On the basis of the above embodiments, the query result generation module 340 may include:

a data acquisition subunit, configured to obtain, according to the attribute information entered for the query, the archive graph data corresponding to that attribute information;

a result generation subunit, configured to generate the query result from the attribute information in the archive graph data.

The apparatus described above can perform the method provided by any embodiment of the present invention and has the functional modules and beneficial effects corresponding to the method performed.

Embodiment four

Fig. 5 is a schematic structural diagram of an archive management device provided by Embodiment four of the present invention. As shown in Fig. 5, the device includes a processor 410, a memory 420, an input unit 430 and an output device 440. The device may contain one or more processors 410; Fig. 5 takes one processor 410 as an example. The processor 410, memory 420, input unit 430 and output device 440 of the device may be connected by a bus or in other ways; Fig. 5 takes connection by a bus as an example.

As a computer-readable storage medium, the memory 420 can be used to store software programs, computer-executable programs and modules, such as the program instructions/modules corresponding to the archive management method in the embodiments of the present invention (for example, the data acquisition module 310, data processing module 320, query information input module 330 and query result generation module 340 in the archive management apparatus). By running the software programs, instructions and modules stored in the memory 420, the processor 410 executes the various functional applications and data processing of the device, that is, implements the archive management method described above.

The memory 420 may mainly include a program storage area and a data storage area. The program storage area can store the operating system and the application programs required by at least one function; the data storage area can store data created through use of the terminal, and so on. In addition, the memory 420 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device or other non-volatile solid-state storage device. In some examples, the memory 420 may further include memory located remotely from the processor 410, and such remote memory may be connected to the device through a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks and combinations thereof.

Input unit 430 can be used to receive archives initial data input from outside. Output device 440 can be used to output and display the query result.

The above equipment for file administration can execute the method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method.

Embodiment five

Embodiment five of the present invention also provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform a method of file administration, the method including:

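obtaining archives initial data, wherein each piece of archives initial data includes two pieces of attribute information;

after processing the archives initial data, screening the archives initial data according to business demand and forming archives graph data as the archives;

receiving input attribute information to be used for a query;

searching the archives for information corresponding to the input attribute information used for the query, and generating a query result.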

Of course, for the storage medium containing computer-executable instructions provided by the embodiment of the present invention, the computer-executable instructions are not limited to the method operations described above, and can also perform related operations in the method of file administration provided by any embodiment of the present invention.

From the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software plus the necessary general-purpose hardware, and of course can also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), flash memory (FLASH), hard disk or optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform the methods described in the embodiments of the present invention.

It is worth noting that, in the above embodiment of the device of file administration, the units and modules included are divided only according to functional logic, but are not limited to the above division, as long as the corresponding functions can be realized. In addition, the specific names of the functional units are only for ease of distinguishing them from one another and are not intended to limit the protection scope of the present invention.

Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments and substitutions can be made by those skilled in the art without departing from the protection scope of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to those embodiments; it may include other equivalent embodiments without departing from the inventive concept, and its scope is determined by the scope of the appended claims.

Claims

1. A method of file administration, characterized by comprising:

obtaining archives initial data, wherein each piece of archives initial data includes two pieces of attribute information;

after processing the archives initial data, screening the archives initial data according to business demand and forming archives graph data as the archives;

receiving input attribute information to be used for a query;

searching the archives for information corresponding to the input attribute information used for the query, and generating a query result.
2. The method according to claim 1, characterized in that said screening the archives initial data according to business demand after processing the archives initial data, and forming archives graph data as the archives, comprises:

removing duplicate data from the archives initial data;

selecting bridge points according to business demand, a bridge point being an attribute information type of high business value;

screening the archives initial data according to the bridge points to obtain single bridge data and double bridge data;

forming archives graph data as the archives according to the bridge points, the single bridge data and the double bridge data.
3. The method according to claim 2, characterized in that the single bridge data is archives initial data in which only one of the two pieces of attribute information is a bridge point;

the double bridge data is archives initial data in which both pieces of attribute information are bridge points.
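As an illustration of the screening described in claims 2 and 3, the following Python sketch removes duplicate records and then sorts the remainder into single bridge data and double bridge data. The `Attr` type, the function names `dedupe` and `screen`, the example attribute types, the choice to treat a record and its mirror image as duplicates, and the choice to leave out records with no bridge point are all assumptions made for this sketch; the claims do not prescribe a data layout.

```python
from typing import NamedTuple


class Attr(NamedTuple):
    kind: str   # attribute information type, e.g. "id_card" (hypothetical)
    value: str  # the attribute value itself


def dedupe(records):
    """Remove duplicates; (a, b) and (b, a) count as one record (assumption)."""
    seen, unique = set(), []
    for a, b in records:
        key = frozenset((a, b))
        if key not in seen:
            seen.add(key)
            unique.append((a, b))
    return unique


def screen(records, bridge_kinds):
    """Split records into single bridge data and double bridge data."""
    single, double = [], []
    for a, b in records:
        hits = (a.kind in bridge_kinds) + (b.kind in bridge_kinds)
        if hits == 2:
            double.append((a, b))
        elif hits == 1:
            single.append((a, b))
        # Records with no bridge point attribute are left out (assumption).
    return single, double


# Hypothetical archives initial data: each record holds two attributes.
records = [
    (Attr("id_card", "A"), Attr("phone", "1")),
    (Attr("phone", "1"), Attr("id_card", "A")),   # mirror duplicate
    (Attr("id_card", "A"), Attr("hobby", "chess")),
    (Attr("hobby", "chess"), Attr("city", "X")),  # no bridge point
]
single, double = screen(dedupe(records), bridge_kinds={"id_card", "phone"})
```

Here `bridge_kinds` stands in for the bridge points selected according to business demand; changing that set changes which records count as single or double bridge data without touching the rest of the code.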
4. The method according to claim 2, characterized in that said forming archives graph data as the archives according to the bridge points, the single bridge data and the double bridge data comprises:

extracting all attribute information belonging to bridge points as the vertices of the archives graph data;

forming the edges of the archives graph data according to the single bridge data and the double bridge data, and connecting the attribute information in the single bridge data that does not belong to a bridge point to the corresponding vertex.
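The graph construction in claim 4 can be sketched as follows. This is a minimal illustration, assuming each attribute is a `(kind, value)` tuple and that the graph is kept as two dictionaries: `edges` for links between bridge point vertices and `attached` for the non-bridge attributes hung on a vertex. The function name `build_archive_graph` and this two-dictionary layout are hypothetical, not taken from the patent.

```python
def build_archive_graph(single, double, bridge_kinds):
    """Build archives graph data from single and double bridge records.

    Each record is a pair of (kind, value) tuples. Returns (edges, attached):
    edges maps a bridge vertex to its neighbouring bridge vertices, and
    attached maps a bridge vertex to its non-bridge attributes.
    """
    edges, attached = {}, {}
    # Double bridge data: both attributes are vertices joined by an edge.
    for a, b in double:
        edges.setdefault(a, set()).add(b)
        edges.setdefault(b, set()).add(a)
    # Single bridge data: the bridge attribute is the vertex, and the other
    # attribute is connected to that vertex.
    for a, b in single:
        vertex, other = (a, b) if a[0] in bridge_kinds else (b, a)
        edges.setdefault(vertex, set())  # register the vertex even if isolated
        attached.setdefault(vertex, set()).add(other)
    return edges, attached


# Hypothetical input, already screened into single and double bridge data.
double = [(("id_card", "A"), ("phone", "1"))]
single = [(("id_card", "A"), ("hobby", "chess")), (("city", "X"), ("phone", "2"))]
edges, attached = build_archive_graph(single, double, {"id_card", "phone"})
```

Keeping non-bridge attributes in a separate map means only bridge point values become vertices, which matches the first step of the claim, while every attribute remains reachable from its vertex.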
5. The method according to claim 1, characterized in that said searching the archives for information corresponding to the input attribute information used for the query, and generating a query result, comprises:

obtaining, according to the input attribute information used for the query, the archives graph data corresponding to that attribute information;

generating the query result from the attribute information in the archives graph data.
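One way to read the query step in claim 5 is as a graph traversal: locate the vertex that matches the queried attribute, collect everything connected to it, and return all attribute information found. The sketch below takes that reading, which is an assumption; the claim does not say how the corresponding archives graph data is delimited. The function name `query_archive` and the `edges`/`attached` dictionaries are likewise hypothetical.

```python
from collections import deque


def query_archive(edges, attached, attr):
    """Return all attribute information linked to attr in the archives graph.

    edges maps a bridge vertex to neighbouring bridge vertices; attached maps
    a bridge vertex to its non-bridge attributes.
    """
    # Start from the vertex itself, or from the vertices a non-bridge
    # attribute is attached to.
    if attr in edges:
        starts = [attr]
    else:
        starts = [v for v, extra in attached.items() if attr in extra]
    seen = set(starts)
    queue = deque(starts)
    while queue:  # breadth-first search over bridge vertices
        vertex = queue.popleft()
        for neighbour in edges.get(vertex, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    result = set(seen)
    for vertex in seen:
        result |= attached.get(vertex, set())
    return result


# Hypothetical archives graph data with two unconnected archives.
A, P1, P2 = ("id_card", "A"), ("phone", "1"), ("phone", "2")
edges = {A: {P1}, P1: {A}, P2: set()}
attached = {A: {("hobby", "chess")}, P2: {("city", "X")}}
result = query_archive(edges, attached, P1)
```

Under this reading a single lookup returns every attribute in the same connected part of the graph, which is in line with the stated aim of obtaining all attribute information without repeated queries.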
6. A device of file administration, characterized by comprising:

a data acquisition module, configured to obtain archives initial data, wherein each piece of archives initial data includes two pieces of attribute information;

a data processing module, configured to, after processing the archives initial data, screen the archives initial data according to business demand and form archives graph data as the archives;

a query information input module, configured to receive input attribute information to be used for a query;

a query result generation module, configured to search the archives for information corresponding to the input attribute information used for the query, and to generate a query result.
7. The device according to claim 6, characterized in that the data processing module comprises:

a data deduplication unit, configured to remove duplicate data from the archives initial data;

a bridge point selection unit, configured to select bridge points according to business demand, a bridge point being an attribute information type of high business value;

a data screening unit, configured to screen the archives initial data according to the bridge points to obtain single bridge data and double bridge data;

a graph data generation unit, configured to form archives graph data as the archives according to the bridge points, the single bridge data and the double bridge data.
8. The device according to claim 7, characterized in that the single bridge data is archives initial data in which only one of the two pieces of attribute information is a bridge point;

the double bridge data is archives initial data in which both pieces of attribute information are bridge points.
9. Equipment for file administration, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the method of file administration according to any one of claims 1-5.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method of file administration according to any one of claims 1-5.