CN110191128A - HDFS-based tax shared file system and implementation method

Info

Publication number
CN110191128A
Authority
CN
China
Prior art keywords
file
server end
tax
user
hdfs
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910462960.2A
Other languages
Chinese (zh)
Inventor
苗坡
杨培强
程林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Inspur Business System Co Ltd
Original Assignee
Shandong Inspur Business System Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Inspur Business System Co Ltd
Priority to CN201910462960.2A
Publication of CN110191128A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/176 Support for shared access to files; File sharing support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/10 Tax strategies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/10 Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004 Server selection for load balancing
    • H04L67/1008 Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Computer Hardware Design (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Technology Law (AREA)
  • Human Computer Interaction (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an HDFS-based tax shared file system and an implementation method thereof, belonging to the field of tax technology. The technical problem to be solved is how to manage tax bureau files at a finer granularity, improve the security of the tax bureau's internal files, and improve the efficiency of file sharing among departments and tax staff. The technical solution is as follows: the system includes at least one Web client, at least one third-party application system, and at least one server. The functional modules of the Web client include a file management module, a system management module, and a share management module. The third-party application system includes at least one big data cluster system, at least one cluster management module, at least one interface management module, and at least one file conversion system. The server includes at least one database. The invention also discloses an implementation method of the HDFS-based tax shared file system.

Description

HDFS-based tax shared file system and implementation method
Technical field
The present invention relates to the field of tax technology, and specifically to an HDFS-based tax shared file system and an implementation method thereof.
Background
With the rapid development of cloud computing and mobile Internet technology, network disk technology based on cloud storage has been widely adopted. With a network disk, users can conveniently share their own files and can quickly back up and restore data, avoiding data disasters that might otherwise occur.
With the rollout of the Golden Tax Phase III project and the establishment of the tax big data platform, national tax collection and administration data standards and criteria have been unified and collection and administration data have been centralized nationwide. The Golden Tax Phase III project adopts the mode of "applications centralized at the provincial level, production data landing at the provincial bureaus and then centralized at the general administration", so tax authorities have a strong demand for an internal shared file system.
Since most current cloud storage services are commercial products, they have the following significant defects for national tax informatization:
(1) Existing cloud storage services rely on data center facilities provided by a third party: data is hosted with the third party and accessed on demand through a public, private, or hybrid cloud. This management model is not fully autonomous and carries a risk of data leakage;
(2) Storage service platforms provided by third parties are expensive, are fully transparent to upper-layer users, and lack reliability guarantees.
In summary, how to manage tax bureau files at a finer granularity, improve the security of the tax bureau's internal files, and improve the efficiency of file sharing among departments and tax staff has become an urgent problem.
Patent document CN108985915A discloses a finance and tax sharing system comprising a cashier client, a finance client, and a tax client. The cashier client provides bill upload, printing, and query services for the enterprise cashier; the enterprise cashier uploads photos of business bills to the service platform through the cashier client, prints a barcode, and scans the barcode-printed voucher with a barcode scanner. The finance client user handles accounting tasks in real time from orders on the finance client and transfers the results to the cashier client for confirmation. However, this technical solution cannot manage tax bureau files at a fine granularity, improve the security of the tax bureau's internal files, or improve the efficiency of file sharing among departments and tax staff.
Patent document CN107943958A discloses an Individual Income Tax master data sharing system and method. A lateral sharing module uses database replication to copy the legal-person master data produced by the core Individual Income Tax system to the Individual Income Tax system, and a sharing service shares the legal-person master data with each subsystem inside the Individual Income Tax system. Natural-person master data of the Individual Income Tax system is copied to the tax-payment service platform through the sharing service and database replication, and a master data interface is published to the core Individual Income Tax system through the sharing service. A head-branch sharing module synchronizes the master data to be shared from the head tax system to the target branch tax system by database replication; a unified distribution service for master data update records is then called, and master data change records are published through a message queue and cached in the target tax system. This technical solution uses the sharing service and database replication to keep lateral sharing of Individual Income Tax master data real-time, and uses the message queue and database replication to keep master data between the head tax system and the branch tax systems real-time. However, it cannot manage tax bureau files at a fine granularity, improve the security of the tax bureau's internal files, or improve the efficiency of file sharing among departments and tax staff.
Summary of the invention
The technical task of the invention is to provide an HDFS-based tax shared file system and an implementation method thereof, to solve the problem of how to manage tax bureau files at a finer granularity, improve the security of the tax bureau's internal files, and improve the efficiency of file sharing among departments and tax staff.
The technical task of the invention is achieved as follows. An HDFS-based tax shared file system includes:
At least one Web client, used to access through a browser the network disk system that is deployed on the server with a B/S (browser/server) architecture, and to display it through Web pages. The functional modules of the Web client include a file management module, a system management module, and a share management module;
At least one third-party application system, used to deploy the big data cluster system through the server. The third-party application system includes at least one big data cluster system, at least one cluster management module, at least one interface management module, and at least one file conversion system. The big data cluster system coordinates the multiple storage devices in the cloud storage so that they jointly provide a single service externally and deliver larger, stronger, and better data access performance. The cluster management module manages the clusters in the big data cluster system. The interface management module manages the cluster interfaces in the big data cluster system. The file conversion system converts Word, Excel, PPT, and TXT files to PDF files and sends the PDF files to the big data cluster system;
At least one server, used to deploy the network disk system and, through the server, the big data cluster system. The server includes at least one database. The database stores the data of the file management module, the system management module, and the share management module, which access the database through DAOs (data access objects). Data encryption for the database uses Kerberos authentication technology.
Preferably, the file management module is used to manage the files in the network disk system and includes a directory management submodule, a file management submodule, and a file preview submodule. The directory management submodule manages directories; the file management submodule manages files; the file preview submodule previews files;
The system management module is used to set system parameters and includes a user management submodule and a permission management submodule. The user management submodule sets user parameters and manages user information; the permission management submodule sets users' permission parameters and manages users' permissions;
The share management module is used to manage the sharing of tax files in the network disk system and includes a share management submodule, a share-with-network-disk-user submodule, and a public share submodule.
More preferably, the system is developed in Java and JavaScript; the B/S architecture uses the LouShang6, Spring MVC, Spring, and MyBatis frameworks; the big data cluster system uses HDFS and HBase to store files and logical directories; the database is Oracle; the server uses a mainstream middleware server such as Tomcat or WebLogic.
An implementation method of the HDFS-based tax shared file system, with the following specific steps:
S1. For the HDFS-based tax shared file system described above, deploy a Hadoop cluster through the server, so that the multiple storage devices in the cloud storage work together, jointly provide a single service externally, and deliver larger, stronger, and better data access performance;
S2. The database in the HDFS-based tax shared file system of step S1 uses Kerberos authentication technology to ensure that the database in the network disk system cannot be accessed by unauthorized users;
S3. Multi-replica data technology ensures that the data in the network disk system's database is not lost, keeping the cloud storage itself secure and stable;
S4. Dynamic load balancing assigns each user connection to the least-loaded server, providing efficient service;
S5. Users access the HDFS-based tax shared file system of step S1 through a browser on the Web client or a mobile client. A user uploads files to a personal network disk; renames, moves, copies, and deletes files; and shares their own files with other users. File management and file sharing are carried out in Web pages, covering file upload, file download, file sharing, online file preview, resumable (breakpoint) upload of large files, and MD5-based instant upload ("second pass") of duplicate files.
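Step S4 does not say how load is measured. As a minimal sketch, assuming load means the number of active connections per server (the `Server` class and `assign_connection` function are hypothetical names, not from the patent), least-load assignment could look like this:

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical server record; load is assumed to be the active connection count."""
    name: str
    active_connections: int = 0


def assign_connection(servers: list[Server]) -> Server:
    """Route a new user connection to the server with the fewest active connections."""
    if not servers:
        raise ValueError("no servers available")
    target = min(servers, key=lambda s: s.active_connections)
    target.active_connections += 1  # the chosen server now carries one more connection
    return target


pool = [Server("node-a", 3), Server("node-b", 1), Server("node-c", 2)]
chosen = assign_connection(pool)
```

A production balancer would more likely weigh CPU, memory, or bandwidth as well; the connection count is only the simplest stand-in for "load".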
Preferably, the file upload process is as follows:
(1) The Web client front end obtains the file size;
(2) The Web client front end sends the server a request to obtain the user quota;
(3) The server calculates the user quota in the big data cluster system (HDFS cluster system) and returns it to the Web client front end, which uses the returned quota to judge whether the user of the big data cluster system has exceeded the quota:
1) if so, return to step (2);
2) if not, proceed to step (4);
(4) The Web client front end uploads the file to the server, and the server uploads the file to the big data cluster system;
(5) After the upload completes, the big data cluster system updates the user quota and sends it to the server;
(6) The big data cluster system returns an upload-finished message to the server, and the server returns the upload-finished message to the Web client front end.
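The quota check in the upload steps can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: `InMemoryCluster` is a hypothetical in-memory stand-in for the HDFS cluster, the quota is counted in bytes, and an over-quota upload raises an error rather than looping back to step (2).

```python
class QuotaExceeded(Exception):
    """Raised when a file does not fit into the user's remaining quota."""


class InMemoryCluster:
    """Hypothetical stand-in for the big data cluster; keeps files in a dict."""

    def __init__(self, quota_bytes: int):
        self.quota_bytes = quota_bytes
        self.files: dict[str, bytes] = {}

    def remaining_quota(self) -> int:
        used = sum(len(data) for data in self.files.values())
        return self.quota_bytes - used

    def put(self, name: str, data: bytes) -> None:
        self.files[name] = data


def upload_with_quota_check(cluster: InMemoryCluster, name: str, data: bytes) -> int:
    size = len(data)                       # step (1): obtain the file size
    remaining = cluster.remaining_quota()  # steps (2)-(3): request and compute the quota
    if size > remaining:
        raise QuotaExceeded(f"{name}: needs {size} bytes, {remaining} left")
    cluster.put(name, data)                # step (4): upload to the cluster
    return cluster.remaining_quota()       # step (5): updated quota after upload


cluster = InMemoryCluster(quota_bytes=10)
left = upload_with_quota_check(cluster, "a.txt", b"12345")
```

Checking the quota before any bytes are transferred keeps an oversized upload from consuming bandwidth only to be rejected at the end.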
Preferably, the file download process is as follows:
(1) The Web client front end sends a file download request to the server;
(2) The server sends the file download request to the big data cluster system;
(3) The big data cluster system returns the file read/write stream to the server;
(4) The server returns the file read/write stream to the Web client front end;
(5) The Web client front end downloads the file from the big data cluster system;
(6) When the download completes, the big data cluster system returns a download-finished message to the Web client front end.
Preferably, the file sharing process is as follows:
(1) The Web client front end sends a share-file request to the server;
(2) The server obtains the share types and share objects and returns the share types to the Web client front end;
(3) The Web client front end sends the selected share objects to the server;
(4) The server binds the shared file to the share objects and returns a share-succeeded message to the Web client front end.
Preferably, the online file preview process is as follows:
(1) Through the OpenOffice tool of the third-party application system, the server converts Word, Excel, PPT, and TXT files to PDF files;
(2) The server converts the PDF file to an SWF file using SWFTools;
(3) The Web client front end displays the file in the Web page through the FlexPaper document component.
Preferably, the resumable (breakpoint) upload process for large files is as follows:
(1) Before upload, the large file is split into file blocks of equal size, and the blocks are numbered;
(2) Multiple threads are started to upload multiple file blocks to the server concurrently;
(3) Before each file block is sent, the server is first asked whether that block has already been uploaded:
1) if the block was uploaded successfully, skip it;
2) if the block has not been uploaded or was not uploaded completely, perform step (4);
(4) Upload the file block;
(5) After the Web client has uploaded all file blocks, it notifies the server to merge them.
Here, a breakpoint arises as follows: during upload, the file is divided into multiple parts that are uploaded by multiple concurrent threads; if the task is suspended at some point for any reason, the position at which the upload paused is the breakpoint. The parts already uploaded successfully are kept by the server. Resuming means that when the user continues uploading the unfinished file, the system does not re-upload the parts that already succeeded but starts directly from the part where the upload paused.
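The block-wise resume logic above can be sketched with an in-memory server. This is a minimal illustration under assumptions: `ChunkServer` and `resumable_upload` are hypothetical names, the block size is tiny for readability, and a real client would talk to the server over HTTP.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow


class ChunkServer:
    """Hypothetical in-memory server that remembers which blocks it has received."""

    def __init__(self):
        self._chunks = {}
        self._lock = threading.Lock()

    def has_chunk(self, file_id: str, index: int) -> bool:
        with self._lock:
            return (file_id, index) in self._chunks

    def put_chunk(self, file_id: str, index: int, data: bytes) -> None:
        with self._lock:
            self._chunks[(file_id, index)] = data

    def merge(self, file_id: str, total: int) -> bytes:
        with self._lock:
            return b"".join(self._chunks[(file_id, i)] for i in range(total))


def resumable_upload(server: ChunkServer, file_id: str, data: bytes, workers: int = 3):
    # step (1): split into numbered blocks (the last block may be shorter)
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    sent = []

    def send(index: int) -> None:
        if server.has_chunk(file_id, index):  # step (3): ask the server first
            return                            # already uploaded, skip it
        server.put_chunk(file_id, index, chunks[index])  # step (4)
        sent.append(index)

    with ThreadPoolExecutor(max_workers=workers) as pool:  # step (2): concurrent upload
        list(pool.map(send, range(len(chunks))))
    return sorted(sent), server.merge(file_id, len(chunks))  # step (5): merge


server = ChunkServer()
server.put_chunk("f", 0, b"abcd")  # simulate a block that survived an interrupted upload
sent, merged = resumable_upload(server, "f", b"abcdefghij")
```

Because the server is the source of truth for which blocks exist, the client needs no local state to resume after a crash or a browser restart.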
Preferably, the MD5-based instant upload ("second pass") of duplicate files proceeds as follows:
(1) When the upload starts, a hash is computed over the local file to obtain the file fingerprint;
(2) The file fingerprint is uploaded to the server;
(3) The server compares the fingerprint with the existing file fingerprints and returns the comparison result to the Web client;
(4) The Web client obtains the comparison result and checks whether a match was found:
1) if a match was found, the same file already exists on the server; proceed to step (5);
2) if no match was found, upload the file to the server by ordinary HTTP upload;
(5) The file name, file fingerprint, and file identifier are uploaded directly to the server; on receiving them, the server stores the file name under the client's name, maps the file to the path of the original file, and returns an instant-upload success message.
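The fingerprint comparison can be sketched as follows, using Python's standard `hashlib.md5`. The `DedupStore` class and its method names are hypothetical, and content is held in memory purely for illustration.

```python
import hashlib


class DedupStore:
    """Hypothetical server-side store that keeps each distinct content only once."""

    def __init__(self):
        self.blobs = {}  # fingerprint -> file content
        self.names = {}  # (user, filename) -> fingerprint

    def has_fingerprint(self, fingerprint: str) -> bool:
        return fingerprint in self.blobs

    def link(self, user: str, filename: str, fingerprint: str) -> None:
        # map the new name onto content that is already stored
        self.names[(user, filename)] = fingerprint

    def store(self, user: str, filename: str, data: bytes) -> None:
        fingerprint = hashlib.md5(data).hexdigest()
        self.blobs[fingerprint] = data
        self.names[(user, filename)] = fingerprint


def upload(store: DedupStore, user: str, filename: str, data: bytes) -> str:
    fingerprint = hashlib.md5(data).hexdigest()  # step (1): hash the local file
    if store.has_fingerprint(fingerprint):       # steps (2)-(4): compare on the server
        store.link(user, filename, fingerprint)  # step (5): send metadata only
        return "instant"
    store.store(user, filename, data)            # no match: ordinary full upload
    return "full"


store = DedupStore()
first = upload(store, "alice", "a.pdf", b"report")
second = upload(store, "bob", "b.pdf", b"report")
```

MD5 is fast but not collision-resistant, so a deployment that must rule out deliberately crafted collisions might pair it with the file size or use a stronger hash; that choice is outside what the patent text specifies.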
The HDFS-based tax shared file system and implementation method of the invention have the following advantages:
(1) The invention takes full advantage of the big data platform, using big data and big data security technology to provide technical support for file security and sharing in tax departments;
(2) The invention implements distributed storage of all kinds of files, such as text, pictures, music, and video; implements network disk file operations such as upload, download, move, copy, delete, and transfer; implements online preview of multiple file types; implements multiple file and folder sharing modes, including sharing by tax staff member, by tax authority, by post, and public sharing; and implements resumable upload of large files and instant upload of duplicate files;
(3) The invention builds on the tax big data base platform and develops a Web access layer on top of it to provide services externally. With a Hadoop cluster deployed through the server, the big data base platform lets the multiple storage devices in the cloud storage work together, jointly provide a single service externally, and deliver larger, stronger, and better data access performance. Data encryption uses Kerberos authentication technology to ensure that data in the cloud storage cannot be accessed by unauthorized users. Multi-replica data technology ensures that data in the cloud storage is not lost, keeping the cloud storage itself secure and stable. Dynamic load balancing assigns each user connection to the least-loaded server, so the system serves users efficiently. Any authorized user can log in to the system and enjoy an independent network disk service; through Web pages, users can manage files and upload, download, and share them;
(4) The invention provides a secure, reliable, and independently controllable big data file storage service, lets users complete file operations conveniently and quickly, offers simple and varied file sharing services, and improves file sharing efficiency within the tax bureau.
Description of the drawings
The invention is further described below with reference to the drawings.
Fig. 1 is a structural block diagram of the HDFS-based tax shared file system;
Fig. 2 is a structural block diagram of the Web client;
Fig. 3 is a flow diagram of file upload;
Fig. 4 is a flow diagram of file download;
Fig. 5 is a flow diagram of file sharing.
Specific embodiments
The HDFS-based tax shared file system and implementation method of the invention are described in detail below with reference to the drawings and specific embodiments.
Embodiment 1:
As shown in Fig. 1, the HDFS-based tax shared file system of the invention mainly includes a Web client, a third-party application system, and a server.
The Web client is used to access through a browser the network disk system that is deployed on the server with a B/S architecture, and to display it through Web pages. As shown in Fig. 2, the functional modules of the Web client include a file management module, a system management module, and a share management module. The file management module manages the files in the network disk system and includes a directory management submodule, a file management submodule, and a file preview submodule; the directory management submodule manages directories, the file management submodule manages files, and the file preview submodule previews files. The system management module sets system parameters and includes a user management submodule and a permission management submodule; the user management submodule sets user parameters and manages user information, and the permission management submodule sets users' permission parameters and manages users' permissions. The share management module manages the sharing of tax files in the network disk system and includes a share management submodule, a share-with-network-disk-user submodule, and a public share submodule.
The third-party application system is used to deploy the big data cluster system through the server. It includes a big data cluster system, a cluster management module, an interface management module, and a file conversion system. The big data cluster system coordinates the multiple storage devices in the cloud storage so that they jointly provide a single service externally and deliver larger, stronger, and better data access performance. The cluster management module manages the clusters in the big data cluster system. The interface management module manages the cluster interfaces in the big data cluster system. The file conversion system converts Word, Excel, PPT, and TXT files to PDF files and sends the PDF files to the big data cluster system.
The server is used to deploy the network disk system and, through the server, the big data cluster system. The server includes a database. The database stores the data of the file management module, the system management module, and the share management module, which access the database through DAOs (data access objects). Data encryption for the database uses Kerberos authentication technology.
The system is developed in Java and JavaScript; the B/S architecture uses the LouShang6, Spring MVC, Spring, and MyBatis frameworks; the big data cluster system uses HDFS and HBase to store files and logical directories; the database is Oracle; the server uses a mainstream middleware server such as Tomcat or WebLogic.
Embodiment 2:
The implementation method of the HDFS-based tax shared file system of the invention has the following specific steps:
S1. For the HDFS-based tax shared file system of Embodiment 1, deploy a Hadoop cluster through the server, so that the multiple storage devices in the cloud storage work together, jointly provide a single service externally, and deliver larger, stronger, and better data access performance;
S2. The database in the HDFS-based tax shared file system of step S1 uses Kerberos authentication technology to ensure that the database in the network disk system cannot be accessed by unauthorized users;
S3. Multi-replica data technology ensures that the data in the network disk system's database is not lost, keeping the cloud storage itself secure and stable;
S4. Dynamic load balancing assigns each user connection to the least-loaded server, providing efficient service;
S5. Users access the HDFS-based tax shared file system of step S1 through a browser on the Web client or a mobile client. A user uploads files to a personal network disk; renames, moves, copies, and deletes files; and shares their own files with other users. File management and file sharing are carried out in Web pages, covering file upload, file download, file sharing, online file preview, resumable (breakpoint) upload of large files, and MD5-based instant upload ("second pass") of duplicate files.
As shown in Fig. 3, the file upload process is as follows:
(1) The Web client front end obtains the file size;
(2) The Web client front end sends the server a request to obtain the user quota;
(3) The server calculates the user quota in the big data cluster system (HDFS cluster system) and returns it to the Web client front end, which uses the returned quota to judge whether the user of the big data cluster system has exceeded the quota:
1) if so, return to step (2);
2) if not, proceed to step (4);
(4) The Web client front end uploads the file to the server, and the server uploads the file to the big data cluster system;
(5) After the upload completes, the big data cluster system updates the user quota and sends it to the server;
(6) The big data cluster system returns an upload-finished message to the server, and the server returns the upload-finished message to the Web client front end.
As shown in Fig. 4, the file download process is as follows:
(1) The Web client front end sends a file download request to the server;
(2) The server sends the file download request to the big data cluster system;
(3) The big data cluster system returns the file read/write stream to the server;
(4) The server returns the file read/write stream to the Web client front end;
(5) The Web client front end downloads the file from the big data cluster system;
(6) When the download completes, the big data cluster system returns a download-finished message to the Web client front end.
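The stream hand-off in steps (3) to (5) can be illustrated by relaying a file in fixed-size blocks instead of loading it whole. This is only a sketch under assumptions: a plain dict stands in for the cluster, `io.BytesIO` stands in for the read stream it returns, and `relay_download` is a hypothetical name.

```python
import io
from typing import Iterator


def read_stream(source: io.BufferedIOBase, block_size: int = 8) -> Iterator[bytes]:
    """Yield the stream block by block until it is exhausted."""
    while True:
        block = source.read(block_size)
        if not block:
            break
        yield block


def relay_download(cluster_files: dict, path: str) -> Iterator[bytes]:
    """Server side: open the cluster's stream and pass blocks on to the client."""
    source = io.BytesIO(cluster_files[path])  # stand-in for the stream the cluster returns
    yield from read_stream(source)


cluster_files = {"/tax/report.pdf": b"0123456789abcdef-tail"}
blocks = list(relay_download(cluster_files, "/tax/report.pdf"))
downloaded = b"".join(blocks)
```

Relaying blocks keeps the server's memory use bounded by the block size rather than by the size of the file being downloaded.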
As shown in Fig. 5, the file sharing process is as follows:
(1) The Web client front end sends a file sharing request to the server end;
(2) The server end obtains the sharing types and sharing objects and returns the sharing types to the Web client front end;
(3) The Web client front end sends the selected sharing object to the server end;
(4) The server end binds the shared file to the sharing object and returns a sharing-success message to the Web client front end.
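The binding in step (4) can be sketched as a small registry that maps a file to the objects it is shared with. The two share types used here ("user" and "open") are an assumption modeled on the user-sharing and open-sharing submodules named in the claims; `ShareRegistry` and its methods are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumed share types: share with one named user, or share openly.
SHARE_TYPES = ("user", "open")


@dataclass
class ShareRegistry:
    # Maps file_id to a set of (share_type, target) pairs.
    bindings: dict = field(default_factory=dict)

    def share_types(self) -> list:
        # Step (2): the server end returns the available share types.
        return list(SHARE_TYPES)

    def bind(self, file_id: str, share_type: str, target: Optional[str] = None) -> str:
        # Step (4): bind the shared file to the selected sharing object.
        if share_type not in SHARE_TYPES:
            raise ValueError(f"unknown share type: {share_type}")
        if share_type == "user" and target is None:
            raise ValueError("a user share needs a target user")
        key_target = target if share_type == "user" else None
        self.bindings.setdefault(file_id, set()).add((share_type, key_target))
        return "share succeeded"

    def can_access(self, file_id: str, user: str) -> bool:
        targets = self.bindings.get(file_id, set())
        return ("open", None) in targets or ("user", user) in targets
```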
The online file preview process is as follows:
(1) The server end uses the OpenOffice tool of the third-party application system to convert Word, Excel, PPT and TXT files into PDF files;
(2) The server end converts the PDF file into an SWF-format file with SWFTools;
(3) The Web client front end displays the file in the Web page through the FlexPaper document component.
The process of breakpoint (resumable) transfer of large files is as follows:
(1) Before the upload starts, the large file is first divided into file blocks of equal size, and the blocks are numbered;
(2) Multiple threads are started to upload multiple file blocks to the server end simultaneously;
(3) Before each file block is sent, the server end is first asked whether this block has already been uploaded:
1. If the block has been uploaded successfully, the upload of this block is skipped;
2. If the block has not been uploaded, or has not been uploaded completely, step (4) is executed;
(4) The file block is uploaded;
(5) After the Web client has uploaded all file blocks, it notifies the server end to merge them.
Here, the breakpoint is defined as follows: during upload, the file to be uploaded is divided into multiple parts, and multiple concurrent threads upload these parts. If the task is suspended at some point for any reason, the position at which the upload paused is the breakpoint, and the parts that have already been uploaded successfully are kept by the server. Resuming means that when the user continues uploading the previously unfinished file, the system does not upload the successfully uploaded parts again, but starts directly from the part at which the upload paused.
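The block-level resume logic above can be sketched as follows. This is a minimal in-memory sketch under stated assumptions: `ChunkServer` is a hypothetical stand-in for the server end, a block counts as completely uploaded when its stored length matches the expected length, and the last block may be shorter than the others.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ChunkServer:
    """Hypothetical server end that keeps successfully uploaded blocks."""

    def __init__(self):
        self._chunks = {}
        self._lock = threading.Lock()
        self.receive_count = 0

    def has_chunk(self, index: int, size: int) -> bool:
        # Step (3): has this block already been uploaded completely?
        with self._lock:
            chunk = self._chunks.get(index)
        return chunk is not None and len(chunk) == size

    def put_chunk(self, index: int, data: bytes) -> None:
        # Step (4): receive one block.
        with self._lock:
            self._chunks[index] = data
            self.receive_count += 1

    def merge(self, total: int) -> bytes:
        # Step (5): merge all blocks in numbered order.
        with self._lock:
            return b"".join(self._chunks[i] for i in range(total))


def split(data: bytes, chunk_size: int) -> list:
    # Step (1): divide the file into numbered blocks of equal size.
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def resumable_upload(server: ChunkServer, data: bytes, chunk_size: int, workers: int = 4) -> bytes:
    chunks = split(data, chunk_size)

    def send(item):
        index, chunk = item
        if not server.has_chunk(index, len(chunk)):
            server.put_chunk(index, chunk)

    # Step (2): several threads upload blocks concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(send, enumerate(chunks)))
    return server.merge(len(chunks))
```

Because each block is checked individually, an interrupted upload only repeats the blocks that are missing or incomplete on the server end.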
The process of MD5 instant upload ("second transfer") of duplicate files is as follows:
(1) When a file upload begins, a hash is computed over the local file to obtain the file fingerprint;
(2) The file fingerprint data is uploaded to the server end;
(3) The server end compares the file fingerprint with the fingerprints of existing files and returns the comparison result to the Web client;
(4) The Web client obtains the comparison result and checks whether a match was found:
1. If a match was found, an identical file already exists on the server end, and step (5) is executed next;
2. If no match was found, the file is uploaded to the server end over HTTP as an ordinary upload;
(5) The file name, file fingerprint and file identifier are uploaded directly to the server end; after receiving them, the server end stores the file name under the client's name, maps the file to the path of the existing file, and returns an instant-upload success message.
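The fingerprint comparison above can be sketched with Python's standard `hashlib.md5`. This is an illustrative sketch rather than the patented implementation: `DedupStore`, the `/blobs/` path scheme and the return strings are hypothetical, and the fingerprint index is a plain in-memory dictionary.

```python
import hashlib


class DedupStore:
    """Hypothetical server end that indexes stored files by MD5 fingerprint."""

    def __init__(self):
        self.blobs = {}        # fingerprint -> path of the stored file
        self.user_files = {}   # (user, filename) -> path of the stored file
        self.bytes_received = 0

    def lookup(self, fingerprint: str) -> bool:
        # Step (3): compare against the fingerprints of existing files.
        return fingerprint in self.blobs

    def full_upload(self, user: str, filename: str, data: bytes) -> str:
        # Step (4), no match: ordinary upload of the file contents.
        fingerprint = hashlib.md5(data).hexdigest()
        path = f"/blobs/{fingerprint}"  # assumed storage layout
        self.blobs[fingerprint] = path
        self.user_files[(user, filename)] = path
        self.bytes_received += len(data)
        return "uploaded"

    def instant_upload(self, user: str, filename: str, fingerprint: str) -> str:
        # Step (5): record the name and map it to the existing file's path.
        self.user_files[(user, filename)] = self.blobs[fingerprint]
        return "instant upload succeeded"


def upload_with_dedup(store: DedupStore, user: str, filename: str, data: bytes) -> str:
    fingerprint = hashlib.md5(data).hexdigest()   # step (1)
    if store.lookup(fingerprint):                 # steps (2)-(4)
        return store.instant_upload(user, filename, fingerprint)
    return store.full_upload(user, filename, data)
```

Only the fingerprint and metadata cross the network when a match exists, which is why a duplicate upload appears to finish almost immediately.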
Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A tax shared file system based on HDFS, characterized in that the system comprises:
at least one Web client, used to access through a browser the Dropbox system of B/S architecture deployed on the server end and to display it through Web pages; the functional modules of the Web client include a file management module, a system management module and a sharing management module;
at least one third-party application system, used to deploy the big data cluster system through the server end; the third-party application system includes at least one big data cluster system, at least one cluster management module, at least one interface management module and at least one file conversion system; the big data cluster system is used to realize cooperative work among multiple storage devices in the cloud storage, so that the multiple storage devices can provide the same service externally and provide greater, stronger and better data access performance; the cluster management module is used to manage the clusters in the big data cluster system; the interface management module is used to manage the cluster interfaces in the big data cluster system; the file conversion system is used to convert Word, Excel, PPT and TXT files into PDF files and to send the PDF files to the big data cluster system;
at least one server end, used to deploy the Dropbox system and to deploy the big data cluster system; the server end includes at least one database; the database is used to store the data of the file management module, the system management module and the sharing management module, which access the database through DAO (data access objects); the data encryption of the database uses Kerberos authority verification technology.
2. The tax shared file system based on HDFS according to claim 1, characterized in that the file management module is used to manage the files in the Dropbox system and includes a directory management submodule, a file management submodule and a file preview submodule; the directory management submodule is used to manage directories; the file management submodule is used to manage files; the file preview submodule is used to preview files;
the system management module is used to set system parameters and includes a user management submodule and a permission management submodule; the user management submodule is used to set user parameters and manage user information; the permission management submodule is used to set users' permission parameters and manage users' permissions;
the sharing management module is used to manage the sharing of tax files in the Dropbox system and includes a sharing management submodule, a submodule for sharing with Dropbox users, and an open sharing submodule.
3. The tax shared file system based on HDFS according to claim 1 or 2, characterized in that the system is developed in the Java and JavaScript languages; the B/S architecture uses the LouShang6, Spring MVC, Spring and MyBatis frameworks; the big data cluster system uses HDFS and HBase to store files and logical directories; the database is Oracle; the server end uses a mainstream middleware server, Tomcat or WebLogic.
4. An implementation method of a tax shared file system based on HDFS, characterized in that the specific steps of the implementation method are as follows:
S1. The server end deploys a Hadoop cluster for the tax shared file system based on HDFS according to any one of claims 1-3, realizing cooperative work among multiple storage devices in the cloud storage, so that the multiple storage devices provide the same service externally and provide data access performance;
S2. The database in the tax shared file system based on HDFS of step S1 uses Kerberos authority verification technology to ensure that the database in the Dropbox system cannot be accessed by unauthorized users;
S3. Multi-replica data technology ensures that the database in the Dropbox system is not lost, guaranteeing the security and stability of the cloud storage itself;
S4. Dynamic load balancing technology assigns each user connection to the server end with the lowest load, realizing efficient service;
S5. The user accesses the tax shared file system based on HDFS of step S1 through a browser on a Web client or a mobile client. The user uploads files to a personal Dropbox; renames, moves, copies and deletes files; and shares his or her own files with other users. File management and file sharing are carried out in the Web page, which supports file upload, file download, file sharing, online file preview, breakpoint (resumable) transfer of large files, and MD5-based instant upload ("second transfer") of duplicate files.
5. The implementation method of the tax shared file system based on HDFS according to claim 4, characterized in that the file upload process is as follows:
(1) The Web client front end obtains the file size;
(2) The Web client front end sends the server end a request to obtain the user's quota;
(3) The server end calculates the user's quota on the big data cluster system and returns it to the Web client front end; based on the quota returned by the server end, the Web client front end judges whether the user of the big data cluster system has exceeded the quota:
1. If so, return to step (2);
2. If not, proceed to step (4);
(4) The Web client front end uploads the file to the server end, and the server end uploads the file to the big data cluster system;
(5) After the file upload completes, the big data cluster system updates the user's quota and sends it to the server end;
(6) The big data cluster system returns an upload-finished message to the server end, and the server end returns the upload-finished message to the Web client front end.
6. The implementation method of the tax shared file system based on HDFS according to claim 4, characterized in that the file download process is as follows:
(1) The Web client front end sends a file download request to the server end;
(2) The server end sends the file download request to the big data cluster system;
(3) The big data cluster system returns a file read/write stream to the server end;
(4) The server end returns the file read/write stream to the Web client front end;
(5) The Web client front end downloads the file from the big data cluster system;
(6) When the file download completes, the big data cluster system returns a download-finished message to the Web client front end.
7. The implementation method of the tax shared file system based on HDFS according to claim 4, characterized in that the file sharing process is as follows:
(1) The Web client front end sends a file sharing request to the server end;
(2) The server end obtains the sharing types and sharing objects and returns the sharing types to the Web client front end;
(3) The Web client front end sends the selected sharing object to the server end;
(4) The server end binds the shared file to the sharing object and returns a sharing-success message to the Web client front end.
8. The implementation method of the tax shared file system based on HDFS according to claim 4, characterized in that the online file preview process is as follows:
(1) The server end uses the OpenOffice tool of the third-party application system to convert Word, Excel, PPT and TXT files into PDF files;
(2) The server end converts the PDF file into an SWF-format file with SWFTools;
(3) The Web client front end displays the file in the Web page through the FlexPaper document component.
9. The implementation method of the tax shared file system based on HDFS according to claim 4, characterized in that the process of breakpoint (resumable) transfer of large files is as follows:
(1) Before the upload starts, the large file is first divided into file blocks of equal size, and the blocks are numbered;
(2) Multiple threads are started to upload multiple file blocks to the server end simultaneously;
(3) Before each file block is sent, the server end is first asked whether this block has already been uploaded:
1. If the block has been uploaded successfully, the upload of this block is skipped;
2. If the block has not been uploaded, or has not been uploaded completely, step (4) is executed;
(4) The file block is uploaded;
(5) After the Web client has uploaded all file blocks, it notifies the server end to merge them.
10. The implementation method of the tax shared file system based on HDFS according to claim 4, characterized in that the process of MD5 instant upload ("second transfer") of duplicate files is as follows:
(1) When a file upload begins, a hash is computed over the local file to obtain the file fingerprint;
(2) The file fingerprint data is uploaded to the server end;
(3) The server end compares the file fingerprint with the fingerprints of existing files and returns the comparison result to the Web client;
(4) The Web client obtains the comparison result and checks whether a match was found:
1. If a match was found, an identical file already exists on the server end, and step (5) is executed next;
2. If no match was found, the file is uploaded to the server end over HTTP as an ordinary upload;
(5) The file name, file fingerprint and file identifier are uploaded directly to the server end; after receiving them, the server end stores the file name under the client's name, maps the file to the path of the existing file, and returns an instant-upload success message.
CN201910462960.2A 2019-05-30 2019-05-30 A kind of tax shared file system and implementation method based on HDFS Pending CN110191128A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910462960.2A CN110191128A (en) 2019-05-30 2019-05-30 A kind of tax shared file system and implementation method based on HDFS

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910462960.2A CN110191128A (en) 2019-05-30 2019-05-30 A kind of tax shared file system and implementation method based on HDFS

Publications (1)

Publication Number Publication Date
CN110191128A true CN110191128A (en) 2019-08-30

Family

ID=67718802

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910462960.2A Pending CN110191128A (en) 2019-05-30 2019-05-30 A kind of tax shared file system and implementation method based on HDFS

Country Status (1)

Country Link
CN (1) CN110191128A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102571916A (en) * 2011-12-02 2012-07-11 曙光信息产业(北京)有限公司 Framework of leasing software of cloud storage space and operating method of framework
CN102761521A (en) * 2011-04-26 2012-10-31 上海格尔软件股份有限公司 Cloud security storage and sharing service platform
CN103442037A (en) * 2013-08-09 2013-12-11 华南理工大学 Method for achieving multithreading breakpoint upload of oversized file based on FTP
CN103729338A (en) * 2013-12-29 2014-04-16 国云科技股份有限公司 File on-line previewing method
CN104010016A (en) * 2013-02-27 2014-08-27 联想(北京)有限公司 Data management method, cloud server and terminal device
US9141814B1 (en) * 2014-06-03 2015-09-22 Zettaset, Inc. Methods and computer systems with provisions for high availability of cryptographic keys

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
高正九: ""基于HDFS的云存储***的设计与实现"", 《万方》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110888853A (en) * 2019-11-26 2020-03-17 廊坊新奥燃气有限公司 Data management system and method
CN111784296A (en) * 2020-07-01 2020-10-16 山东爱城市网信息技术有限公司 Government affair material management tool and business handling method thereof
CN112702380A (en) * 2020-08-20 2021-04-23 纬领(青岛)网络安全研究院有限公司 Private cloud disk mobile plate
CN112104740A (en) * 2020-09-21 2020-12-18 浪潮云信息技术股份公司 Software automatic pushing and upgrading system and method based on domestic CPU and OS
CN112104740B (en) * 2020-09-21 2023-03-28 浪潮云信息技术股份公司 Software automatic pushing and upgrading system and method based on domestic CPU and OS
CN112328566A (en) * 2020-11-10 2021-02-05 天元大数据信用管理有限公司 Shared file storage service assembly
CN112988166A (en) * 2021-03-10 2021-06-18 中国电建集团昆明勘测设计研究院有限公司 Model conversion service providing method based on user side
CN112988166B (en) * 2021-03-10 2022-12-02 中国电建集团昆明勘测设计研究院有限公司 Model conversion service providing method based on user side
CN113179230A (en) * 2021-03-18 2021-07-27 深圳微众信用科技股份有限公司 Data acquisition method and device
CN114185484A (en) * 2021-11-04 2022-03-15 福建升腾资讯有限公司 Method, device, equipment and medium for clustering document storage

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190830