CN110457696A - Archive-data-oriented system and method for intelligently matching talents with policies - Google Patents

Archive-data-oriented system and method for intelligently matching talents with policies

Info

Publication number
CN110457696A
Authority
CN
China
Prior art keywords
policy
talent
talents
information
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910701445.5A
Other languages
Chinese (zh)
Inventor
黄丽丽
游河仁
卢佩
石宝玉
姚智振
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou Institute Of Data Technology Co Ltd
Original Assignee
Fuzhou Institute Of Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou Institute Of Data Technology Co Ltd
Priority to CN201910701445.5A
Publication of CN110457696A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Tourism & Hospitality (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Primary Health Care (AREA)
  • Human Resources & Organizations (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • Multimedia (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention discloses an archive-data-oriented system and method for intelligently matching talents with policies. Paper personnel files are scanned and recognized into editable electronic documents, forming a personnel archive information base; a conditional random field model based on event extraction extracts the valuable information from the personnel archive information base and stores it in structured form as a talent information base; current talent policies are collected to build a talent policy database; a policy semantic-unit extraction method based on manually authored rules extracts the key information of the talent policies in the talent policy database to obtain quantifiable indicator information; the quantifiable indicator information in the talent policy database is matched against the fields of the talent information base, and talent-policy matching results are output for all talents in the talent information base; policy projects with high comprehensive matching scores are screened from the matching results, talents who have already applied are excluded, and the policies are pushed to talents who have not yet applied, together with the project information. The invention makes it possible to push policies precisely to talents who have not applied, attracts talents to apply for projects, and supports government talent recruitment.

Description

An archive-data-oriented system and method for intelligently matching talents with policies
Technical field
The present invention relates to data processing technology, and in particular to an archive-data-oriented system and method for intelligently matching talents with policies.
Background
Talent is the core element of a region's development; outstanding talent is the foundation for achieving regional development goals and the driving force of regional development. To develop, a region must first attend to gathering innovative talent, actively explore new measures, new approaches and new methods for talent development, and continually strengthen talent development so as to provide strong human-resource support for regional development.
Because policies differ in type, release time and issuing department, considerable time and effort is needed to locate policy sources, verify that a policy is still in force, and assess the feasibility of applying. This makes it hard for talents to learn about policies fully and promptly, and hard for them to pick out, from a mass of policies, those they are eligible to apply for. How to match massive policy data precisely to talents has therefore become an important research direction.
Talent markets, human resources and social security bureaus, and archives hold massive amounts of talent archive information and so form a natural talent pool, but two problems remain: how to digitize paper archives, and how to match talent policies automatically. Against this background, the present invention proposes an archive-data-oriented system and method for intelligently matching talents with policies.
Existing talent-policy matching systems mainly match policies against information that users fill in and upload themselves. For example, patent No. 201811287900.3 discloses "a policy intelligent matching system and method", in which the system obtains a talent user's basic information from the user's category selection and completed user information, and then performs policy matching. The main problem with this scheme is that matching relies on information uploaded voluntarily by users, so large-scale automated talent matching cannot be achieved.
Patent No. 201710934706.9 discloses "a matching and recommendation method and system based on specific urban groups and associated policies". It collects basic information on three groups, namely urban left-behind families, the urban floating population, and the elderly, weak, sick and disabled, obtains group-demand labels through data mining, and then decomposes and recommends policies on the basis of those labels. It pushes policies to groups guided by group demand and does not address the special kind of document that archives are; moreover, it requires on-site office staff to collect the relevant group information manually, so omissions are inevitable and the push coverage is incomplete.
Summary of the invention
The purpose of the present invention is to provide an archive-data-oriented system and method for intelligently matching talents with policies. Working from local archive data, it proposes a conditional random field (CRF) model based on event extraction to extract information from personnel files automatically; at the same time it collects talent policies automatically, parses them with a policy semantic-unit extraction method based on manually authored rules, intelligently matches them against the talent information, and pushes policies to talents, thereby realizing a large-scale, automated talent-policy intelligent matching system.
The technical solution adopted by the present invention is as follows:
An archive-data-oriented method for intelligently matching talents with policies, comprising the following steps:
Step 1, obtain the paper personnel files of local talents and scan them into images, then use image recognition technology to convert the scanned images into editable electronic documents, forming a personnel archive information base;
Step 2, based on the personnel archive information base, use the conditional random field model based on event extraction to extract valuable information and store it in structured form, forming a talent information base;
Step 3, collect current talent policies and build a talent policy database;
Step 4, use the policy semantic-unit extraction method based on manually authored rules to extract the key information of the talent policies in the talent policy database, obtaining quantifiable indicator information;
Step 5, match the quantifiable indicator information in the talent policy database against the fields of the talent information base to obtain talent-policy matching results for all talents in the talent information base;
Step 6, screen the policy projects with high comprehensive matching scores from the matching results, exclude talents who have already applied, and push the policies to talents who have not applied, together with the project information. For policy projects a talent has not applied for, the policy is pushed to the talent by email or SMS, showing the names of the projects the talent can apply for, the financial benefits of applying, and a link to the latest policy notice.
Further, the specific steps of step 1 are:
Step 1.1, enter the paper personnel files into a computer by scanning and save them as images;
Step 1.2, use OCR technology to segment each image into sub-images, each containing a single character;
Step 1.3, convert each sub-image from image format to binary format and pass the binary data to a BP neural network;
Step 1.4, through training, the BP neural network learns the association between character image data and their numeric values; the scanned images are converted into editable electronic documents, and the recognition results are stored in the system database.
Further, the specific steps by which step 2 performs event extraction on the personnel archive information with the conditional random field model are:
Step 2.1, segment the personnel archive text into words and represent each document as a vector of tf-idf weights;
Step 2.2, perform feature selection using a document-frequency-based method, filtering out words that contribute little to correct classification;
Step 2.3, set up an archive text training set, convert the training set into a feature set, and learn the feature classification from manually annotated labels;
Step 2.4, convert the archive text test set into a feature set of the same type, then use the classifier model to determine the corresponding class labels, store the results in structured form, and build the talent information base.
The archive text of step 2.1 generally includes basic personal information, educational background, work experience, and published papers and patents.
Further, the talent policies of step 3 are acquired in two ways, import and extraction. Import means that a government agency or third-party organization actively edits policy information and stores it in the talent policy database. Extraction means that a web-tracking program such as a robot, web crawler or web spider automatically collects all talent-related policy information from selected target government websites, downloads it to a local server, and stores it in the talent policy database after cleaning.
Further, step 4 comprises three sub-steps, policy dictionary construction, policy text preprocessing and policy information extraction, specifically:
Policy dictionary construction: the vocabulary of the talent policy database is collected under the categories of policy type, policy title, applicable conditions and keywords to form a talent policy dictionary; the textual features and descriptive conventions of the local talent policy corpus are analysed, and words that identify or mark a semantic unit are extracted as trigger words that trigger the extraction task. For example, descriptions of application conditions typically contain words such as "satisfy", "meet" and "must".
Policy text preprocessing: the text is segmented and part-of-speech tagged with the Chinese Academy of Sciences ICTCLAS word segmentation tool;
Policy information extraction: extraction rules are formulated on the basis of the trigger vocabulary and written as regular expressions to build a rule base; finally the semantic units of the local talent policies are extracted and stored in the talent policy database.
Further, step 5 uses a matching method based on semantic matching rules to match the quantifiable indicator information in the talent policy database against the fields of the talent information base, specifically comprising the following steps:
Step 5.1, sort the matching rules from easy to difficult, start matching with the easiest indicators, and return the rule-matching result for every quantifiable indicator;
Step 5.2, assign each matching rule a weight, with indicators that already exist weighted more heavily than indicators that have not been set up, and then compute the matching degree between the talent and the policy project. The matching degree has five grades: very good match (1 point), match (0.8 points), fair match (0.6 points), general match (0.4 points) and mismatch (0.2 points). If the talent's information essentially satisfies all the matching rules of a project, so that the comprehensive matching similarity is close to 1 point, the talent can apply for that project;
Step 5.3, retrieve all policy projects, return all results, and sort them by comprehensive matching degree to form the policy matching list.
Further, the policy matching result in step 5 includes the project name, project type, project application conditions, financial benefits of applying, and a link to the latest policy notice.
Further, the invention also discloses an archive-data-oriented system for intelligently matching talents with policies, comprising the following modules:
Personnel file recognition module: scans and recognizes paper personnel files into editable electronic documents and forms the personnel archive information base;
Personnel file feature extraction module: uses the conditional random field model based on event extraction to extract the valuable information from the personnel archive information base and stores it in structured form as the talent information base;
Policy automatic collection module: collects current talent policies and builds the talent policy database;
Policy segmentation and parsing module: uses the policy semantic-unit extraction method based on manually authored rules to extract the key information of the talent policies in the talent policy database, obtaining quantifiable indicator information;
Talent-policy matching module: matches the quantifiable indicator information in the talent policy database against the fields of the talent information base and outputs talent-policy matching results for all talents in the talent information base;
Policy push module: screens the policy projects with high comprehensive matching scores from the matching results, excludes talents who have already applied, and pushes the policies to talents who have not applied, together with the project information.
By adopting the above technical solution, the invention establishes an automatic policy matching system based on the local talent resource pool. Paper talent archive information is recognized by OCR technology based on a BP neural network, relevant policies are matched to local talents automatically, and policy announcements are pushed, which solves the problem that talent-policy matching is time-consuming and laborious. Using conditional random field (CRF) technology based on event extraction, the invention automatically and efficiently extracts valuable information from a variety of unstructured personnel files and stores it in structured form to build the talent information base, which makes it easy to search and reuse. The invention uses big-data integration to bring together policy data from separate "information islands", and proposes a policy semantic-unit extraction method based on manually authored rules to extract the key information of talent policies and build the talent policy database, which makes matching against talent archive information convenient. Based on the matching results between the talent policy database and the talent information base, the invention filters out talents who have already applied and pushes policies precisely to those who have not.
Description of the drawings
The present invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is a schematic diagram of the architecture of the archive-data-oriented talent-policy intelligent matching system of the present invention;
Fig. 2 is a flow chart of the policy semantic-unit extraction based on manually authored rules.
Detailed description of the embodiments
To make effective use of the paper talent data held in local archives, the present invention uses a conditional random field (CRF) model to extract large-scale personnel archive information automatically, uses a policy semantic-unit extraction method based on manually authored rules to extract the key information of talent policies, and establishes an automated talent-policy intelligent matching system that provides talents with an automatic policy push service. As shown in Fig. 1, the invention discloses an archive-data-oriented system for intelligently matching talents with policies, comprising the following modules:
Personnel file recognition module: scans and recognizes paper personnel files into editable electronic documents and forms the personnel archive information base;
Personnel file feature extraction module: uses the conditional random field model based on event extraction to extract the valuable information from the personnel archive information base and stores it in structured form as the talent information base;
Policy automatic collection module: collects current talent policies and builds the talent policy database;
Policy segmentation and parsing module: uses the policy semantic-unit extraction method based on manually authored rules to extract the key information of the talent policies in the talent policy database, obtaining quantifiable indicator information;
Talent-policy matching module: matches the quantifiable indicator information in the talent policy database against the fields of the talent information base and outputs talent-policy matching results for all talents in the talent information base;
Policy push module: screens the policy projects with high comprehensive matching scores from the matching results, excludes talents who have already applied, and pushes the policies to talents who have not applied, together with the project information.
As shown in Figs. 1 and 2, the invention discloses an archive-data-oriented method for intelligently matching talents with policies, comprising the following steps:
Step 1, obtain the paper personnel files of local talents and scan them into images, then use image recognition technology to convert the scanned images into editable electronic documents, forming a personnel archive information base;
Further, the specific steps of step 1 are:
Step 1.1, obtain the paper personnel files of local talents from talent markets, human resources and social security bureaus, archives and similar sources, enter them into a computer by scanning, and save them as images;
Step 1.2, use OCR technology to segment each image into sub-images, each containing a single character;
Step 1.3, convert each sub-image from image format to binary format and pass the binary data to a BP neural network;
Step 1.4, through training, the BP neural network learns the association between character image data and their numeric values; the scanned images are converted into editable electronic documents, and the recognition results are stored in the system database, which facilitates subsequent full-text retrieval.
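Steps 1.2 and 1.3 can be illustrated with a minimal sketch. It assumes a scanned text line is already available as a grid of grayscale pixel values and that characters are separated by blank columns; the function names, the threshold of 128 and the column-projection segmentation are illustrative assumptions and are not specified by the patent. The BP neural network classifier of step 1.4 is not shown.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, 0 (black) .. 255 (white)


def binarize(image: Image, threshold: int = 128) -> Image:
    """Map dark pixels (ink) to 1 and light pixels (paper) to 0."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def split_characters(binary: Image) -> List[Image]:
    """Cut a binarized text line into sub-images at blank columns."""
    if not binary:
        return []
    width = len(binary[0])
    # A column counts as inked if any pixel in it is set.
    inked = [any(row[x] for row in binary) for x in range(width)]
    glyphs: List[Image] = []
    start = None
    # The trailing False is a sentinel that closes a glyph touching the edge.
    for x, has_ink in enumerate(inked + [False]):
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            glyphs.append([row[start:x] for row in binary])
            start = None
    return glyphs


def to_input_vector(glyph: Image) -> List[int]:
    """Flatten a binary sub-image into the vector passed to the network."""
    return [px for row in glyph for px in row]


if __name__ == "__main__":
    line = [
        [255, 0, 255, 255, 0, 0],
        [255, 0, 255, 255, 0, 255],
    ]
    for glyph in split_characters(binarize(line)):
        print(to_input_vector(glyph))
```

In practice the sub-images would also be scaled to a fixed size so that every input vector has the same length before it reaches the network.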
Step 2, for the large volume of unstructured personnel archive information, use the conditional random field model based on event extraction to extract valuable information and store it in structured form as the talent information base, which facilitates the subsequent matching of talents and policies.
Further, the conditional random field (CRF) model is a classifier model that can be used for named entity recognition. The model treats the different kinds of information to be extracted as labels attached to features, thereby turning the information extraction task into a classification problem over a given text and its features. The features used by the conditional random field (CRF) model are listed in Table 1.
Table 1. Personnel archive features for the conditional random field (CRF) model
The specific steps by which the present invention performs event extraction on the personnel archive information with the conditional random field model are:
Step 2.1, segment the personnel archive text into words and represent each document as a vector of tf-idf weights;
Step 2.2, perform feature selection using a document-frequency-based method, filtering out words that contribute little to correct classification;
Step 2.3, set up an archive text training set, convert the training set into a feature set, and learn the feature classification from manually annotated labels;
Step 2.4, convert the archive text test set into a feature set of the same type, then use the classifier model to determine the corresponding class labels, store the results in structured form, and build the talent information base.
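Steps 2.1 and 2.2 can be sketched as follows. The sketch assumes the archive texts are already segmented into word lists, uses the common tf-idf form tf * log(N / df), and keeps a term only if it appears in at least `min_df` documents; the threshold and both function names are assumptions for illustration, since the patent does not fix them.

```python
import math
from collections import Counter
from typing import Dict, List


def select_by_document_frequency(docs: List[List[str]], min_df: int = 2) -> List[str]:
    """Keep terms that occur in at least `min_df` documents (step 2.2)."""
    df = Counter(term for doc in docs for term in set(doc))
    return sorted(term for term, n in df.items() if n >= min_df)


def tfidf_vectors(docs: List[List[str]], vocabulary: List[str]) -> List[Dict[str, float]]:
    """Represent each document as tf-idf weights over the vocabulary (step 2.1)."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = {}
        for term in vocabulary:
            if not doc or df[term] == 0:
                vec[term] = 0.0  # empty document or term unseen in the corpus
                continue
            tf = counts[term] / len(doc)
            idf = math.log(n_docs / df[term])
            vec[term] = tf * idf
        vectors.append(vec)
    return vectors


if __name__ == "__main__":
    docs = [
        ["phd", "paper", "paper"],
        ["phd", "patent"],
        ["master", "paper"],
    ]
    vocab = select_by_document_frequency(docs, min_df=2)
    print(vocab)
    print(tfidf_vectors(docs, vocab))
```

A term that appears in every document receives an idf of zero under this formula, which is one reason smoothed variants are often used instead.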
Specifically, the archive text generally includes basic personal information, educational background, work experience, and published papers and patents. Within different content blocks, the information items are listed in different ways: an educational background block may contain the time, school, major and so on; a paper publication block may contain the authors, paper title, the journal or conference proceedings in which it appeared, the publication date and so on. The content blocks of different archive texts may also appear in different orders; most archive texts begin with basic personal information, followed in turn by educational background and work experience, and end with published papers and patents. The information items within each content block are arranged in sequence, so they can be regarded as a series of events, and an event can be expressed as a five-tuple:
E = <who, when, where, what, how>
Text features and annotation labels are constructed on the basis of this event representation; the annotation labels of the invention are listed in Table 2. The basic features include the entity word, the part of speech of the entity word, the context words, and the parts of speech of the context words. The annotation labels include name, date of birth, address, telephone, position, and the event subject, time, place and result of an education experience, work experience or published paper. The personnel file event information processed by the conditional random field (CRF) model is stored in structured form to build the talent information base.
Table 2. Personnel file labels for the conditional random field (CRF) model
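The basic features and the five-tuple event representation described above can be sketched as below. The label names (NAME, TIME, PLACE, EVENT, RESULT), their mapping onto the five slots, and the boundary markers are hypothetical choices made for illustration; the patent's actual label set is the one in Table 2. The CRF sequence labeller itself is not shown, so the labelled tokens are supplied by hand.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Event(NamedTuple):
    """E = <who, when, where, what, how>."""
    who: Optional[str] = None
    when: Optional[str] = None
    where: Optional[str] = None
    what: Optional[str] = None
    how: Optional[str] = None


def token_features(tokens: List[Tuple[str, str]], i: int) -> Dict[str, str]:
    """Basic features for token i: word, part of speech, and its neighbours."""
    word, pos = tokens[i]
    prev_word, prev_pos = tokens[i - 1] if i > 0 else ("<BOS>", "<BOS>")
    next_word, next_pos = tokens[i + 1] if i + 1 < len(tokens) else ("<EOS>", "<EOS>")
    return {
        "word": word, "pos": pos,
        "prev_word": prev_word, "prev_pos": prev_pos,
        "next_word": next_word, "next_pos": next_pos,
    }


# Hypothetical mapping from sequence labels to event slots.
LABEL_TO_SLOT = {"NAME": "who", "TIME": "when", "PLACE": "where",
                 "EVENT": "what", "RESULT": "how"}


def build_event(labelled: List[Tuple[str, str]]) -> Event:
    """Collect labelled words into one event; unmapped labels are ignored."""
    slots: Dict[str, List[str]] = {}
    for word, label in labelled:
        slot = LABEL_TO_SLOT.get(label)
        if slot:
            slots.setdefault(slot, []).append(word)
    return Event(**{slot: " ".join(words) for slot, words in slots.items()})


if __name__ == "__main__":
    labelled = [("Zhang San", "NAME"), ("2012", "TIME"),
                ("Fuzhou University", "PLACE"), ("PhD", "EVENT"),
                ("in", "O"), ("computer science", "RESULT")]
    print(build_event(labelled))
```

Each resulting event maps naturally onto one structured record in the talent information base.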
Step 3, the policy automatic collection module collects current talent policies and builds the talent policy database. Talent policies are acquired in two ways, import and extraction. Import means that a government agency or third-party organization actively edits policy information and stores it in the talent policy database. Extraction means that a web-tracking program such as a robot, web crawler or web spider automatically collects all talent-related policy information from selected target government websites, downloads it to a local server, and stores it in the talent policy database after cleaning.
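The extraction mode can be illustrated with a small offline sketch that picks talent-related links out of an already downloaded listing page. It uses only the Python standard library; the keyword filter, the class name and the example URL are assumptions for illustration, and fetching pages, rate limiting and storage are left out.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple
from urllib.parse import urljoin

TALENT_KEYWORDS = ("talent", "人才")  # hypothetical filter terms


class PolicyLinkParser(HTMLParser):
    """Collect (title, absolute URL) for links whose text mentions talent."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[Tuple[str, str]] = []
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            if any(keyword in title.lower() for keyword in TALENT_KEYWORDS):
                self.links.append((title, urljoin(self.base_url, self._href)))
            self._href = None


def extract_policy_links(html: str, base_url: str) -> List[Tuple[str, str]]:
    parser = PolicyLinkParser(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    page = ('<ul><li><a href="/notice/1.html">High-level Talent Subsidy Notice</a></li>'
            '<li><a href="/notice/2.html">Road Maintenance Notice</a></li></ul>')
    print(extract_policy_links(page, "https://example.org/index.html"))
```

A real crawler would feed each collected URL back into a download queue and store the page text together with its source link, so that the policy notice link can later be shown to the talent.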
Step 4, the policy segmentation and parsing module uses the policy semantic-unit extraction method based on manually authored rules to extract the key information of the talent policies in the talent policy database, obtaining quantifiable indicator information;
Further, as shown in Fig. 2, step 4 comprises three sub-steps, policy dictionary construction, policy text preprocessing and policy information extraction, specifically:
Policy dictionary construction: to improve policy word segmentation, a talent policy dictionary and an information-extraction trigger vocabulary must be built. The talent policy dictionary is built from the local talent policy corpus: the vocabulary in the corpus is collected under categories such as policy type, policy title, applicable conditions and keywords. A trigger word is a word that identifies or marks a particular semantic unit for extraction and can trigger the extraction task. The trigger vocabulary is built by analysing the textual features and descriptive conventions of the local talent policy corpus. For example, descriptions of application conditions typically contain words such as "satisfy", "meet" and "must".
Policy text preprocessing: the text preprocessing module mainly performs sentence splitting, word segmentation and part-of-speech tagging. The titles and body text of the local talent policy corpus are used as the experimental corpus, which is segmented and part-of-speech tagged with the Chinese Academy of Sciences ICTCLAS word segmentation tool.
Policy information extraction: after analysing the corpus to be extracted, extraction rules are formulated on the basis of the trigger vocabulary and written as regular expressions to build a rule base, and the semantic units of the local talent policies are then extracted. All policy information is normalized by project name, project type, application conditions, financial benefits of applying, and policy link, and then imported into the talent policy database. Feature-word extraction methods from text mining are applied to the application conditions in the talent policy database for further semantic mining, decomposing them into indicators that can be measured quantitatively, which are stored in the talent policy database.
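The trigger-word and regular-expression approach described above can be sketched as follows. For readability the sketch works on English sentences; the trigger words, the three rules and the field names (`paper_count`, `age`, `degree`) are hypothetical, whereas the patent's rule base would be written for Chinese policy text after ICTCLAS segmentation.

```python
import re
from typing import List, NamedTuple, Union


class Indicator(NamedTuple):
    field: str
    op: str
    value: Union[int, str]


# Hypothetical trigger words that mark a sentence as an application condition.
TRIGGER = re.compile(r"\b(?:must|shall|should|meets?|satisf(?:y|ies))\b", re.I)

# Hypothetical rule base: (pattern, function building an indicator from a match).
RULES = [
    (re.compile(r"at least (\d+) papers?", re.I),
     lambda m: Indicator("paper_count", ">=", int(m.group(1)))),
    (re.compile(r"(?:under|below) (\d+) years", re.I),
     lambda m: Indicator("age", "<", int(m.group(1)))),
    (re.compile(r"(doctoral|master's|bachelor's) degree", re.I),
     lambda m: Indicator("degree", ">=", m.group(1).lower())),
]


def extract_indicators(text: str) -> List[Indicator]:
    """Apply the rule base to every sentence that contains a trigger word."""
    indicators: List[Indicator] = []
    for sentence in re.split(r"[.;。；]", text):
        if not TRIGGER.search(sentence):
            continue  # no trigger word, so not treated as a condition
        for pattern, build in RULES:
            for match in pattern.finditer(sentence):
                indicators.append(build(match))
    return indicators


if __name__ == "__main__":
    policy = ("Applicants must hold a doctoral degree and be under 40 years old; "
              "applicants must have published at least 5 papers. "
              "Contact the office for details.")
    for indicator in extract_indicators(policy):
        print(indicator)
```

Each extracted indicator is a (field, operator, value) triple, which is the form in which a condition such as "at least 5 papers" becomes directly comparable with a field of the talent information base.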
Step 5, the talent-policy matching module matches the quantifiable indicator information in the talent policy database against the fields of the talent information base to obtain talent-policy matching results for all talents in the talent information base, where the policy matching result includes the project name, project type, project application conditions, financial benefits of applying, and a link to the latest policy notice.
Further, step 5 uses a matching method based on semantic matching rules to match the quantifiable indicator information in the talent policy database against the fields of the talent information base; for example, "has published at least 5 papers" is equivalent to "paper count >= 5". It specifically comprises the following steps:
Step 5.1, sort the matching rules by matching difficulty and start with the easiest indicators: indicators that already exist, such as educational background, age and number of published papers, are matched first, while indicators that have not been set up are matched by keyword; the rule-matching result for every quantifiable indicator is returned.
Step 5.2, assign each matching rule a weight, with indicators that already exist weighted more heavily than indicators that have not been set up, and then construct a comprehensive matching degree formula from the rule-matching results of the quantifiable indicators to measure the matching degree between the talent and the policy project. The matching degree has five grades: very good match (1 point), match (0.8 points), fair match (0.6 points), general match (0.4 points) and mismatch (0.2 points). If the talent's information essentially satisfies all the matching rules of a project, so that the comprehensive matching similarity is close to 1 point, the talent can apply for that project.
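The patent does not give the comprehensive matching degree formula, so the following is only one possible reading: the score is the weighted share of rules a talent satisfies, and that score is then snapped to the nearest of the five grades. The weights, field names and the nearest-grade mapping are all assumptions for illustration.

```python
import operator
from typing import Dict, List, Tuple

OPS = {">=": operator.ge, "<=": operator.le, "<": operator.lt,
       ">": operator.gt, "==": operator.eq}

GRADES = [(1.0, "very good match"), (0.8, "match"), (0.6, "fair match"),
          (0.4, "general match"), (0.2, "mismatch")]

Rule = Tuple[str, str, float, float]  # (field, operator, threshold, weight)


def matching_degree(talent: Dict[str, float], rules: List[Rule]) -> float:
    """Weighted share of satisfied rules, in [0, 1] (assumed formula)."""
    total = sum(weight for *_, weight in rules)
    if total == 0:
        return 0.0
    met = 0.0
    for field, op, threshold, weight in rules:
        value = talent.get(field)
        # A missing field counts as an unsatisfied rule.
        if value is not None and OPS[op](value, threshold):
            met += weight
    return met / total


def grade(score: float) -> Tuple[float, str]:
    """Snap a score to the nearest of the five grades."""
    return min(GRADES, key=lambda g: abs(g[0] - score))


if __name__ == "__main__":
    talent = {"paper_count": 6, "age": 35}
    rules = [
        ("paper_count", ">=", 5, 2.0),   # existing field, higher weight
        ("age", "<", 40, 2.0),           # existing field, higher weight
        ("patent_count", ">=", 1, 1.0),  # not yet a field, lower weight
    ]
    score = matching_degree(talent, rules)
    print(score, grade(score))
```

Giving existing fields a larger weight follows the statement in step 5.2; how much larger is a tuning decision that the source leaves open.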
Step 5.3, auto-returned policy list of matches information after all policy categories is retrieved, comprehensive matching degree is pressed in list Sequence, tabulating result include relevant project name, type of subject, project application condition, declare economic welfare and newest policy Notice link.
Step 6: after the matching results for all talents in the talent information database are returned, the policy pushing module selects the policy projects with high comprehensive matching scores and compares them against the database of talents who have already applied. For policy projects that a talent has not yet applied for, the module pushes the policy to the talent by email or SMS, showing the name of the project the talent can apply for, the financial benefits of applying, and a link to the latest policy notice.
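The filtering in step 6 amounts to a threshold followed by a set difference. The sketch below assumes a cutoff of 0.8 for a "high" score and a set of `(talent_id, project)` pairs for existing applications; neither is specified in the source, and the actual email or SMS delivery is left out.

```python
def select_pushes(match_results, applied, threshold=0.8):
    """Pick, per talent, the high-scoring projects not yet applied for.

    match_results: {talent_id: [(project, score), ...]}
    applied: set of (talent_id, project) pairs already on record
    threshold: assumed cutoff for a "high" comprehensive matching score
    """
    pushes = {}
    for talent_id, results in match_results.items():
        best_first = sorted(results, key=lambda r: r[1], reverse=True)
        todo = [
            project
            for project, score in best_first
            if score >= threshold and (talent_id, project) not in applied
        ]
        if todo:
            pushes[talent_id] = todo
    return pushes
```

The returned mapping is what a notification component would consume to compose each message.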
By adopting the above technical scheme, the invention establishes an automatic policy matching system based on a local talent resource database. Paper talent files are recognized with OCR based on a BP neural network, relevant policies are matched to local talents automatically, and the policies are pushed to them precisely, which solves the problem that talent and policy matching is time-consuming and labor-intensive. Using conditional random field (CRF) technology based on event extraction, the invention automatically and efficiently extracts valuable information from various unstructured personnel files and stores it in structured form to build a talent information database that is easy to search and reuse. Using big-data integration, the invention integrates the policy data of separate "information islands" and proposes a policy semantic unit extraction method based on manual rules to extract key information from talent policies and build a talent policy database that can be matched against talent file information. Based on the matching results between the talent policy database and the talent information database, the invention filters out talents who have already applied and pushes policies precisely to talents who have not.
Bibliography
[1] Zhou Zhe. Research on an intelligent personnel resume extraction system based on conditional random fields [D]. Xi'an Polytechnic University, 2016.
[2] Zhu Pingli. Person-post matching degree calculation model in enterprises and its application [J]. Journal of Hubei University of Technology, 2009, 24(6): 58-59.
[3] Zhang Xian, Liu Shengyan, Wang Wenguang. Construction and application of a person-post matching model: taking the person-post matching practice of Company A as an example [J]. Human Resources Development of China, 2014(22): 54-60.

Claims (10)

1. A file-data-oriented intelligent talent and policy matching method, characterized by comprising the following steps:
Step 1: obtain the paper personnel files of local talents, scan them into images, and convert the scanned images into editable electronic documents using image recognition technology to form a personnel file information database;
Step 2: based on the personnel file information database, extract valuable information using a conditional random field model based on event extraction and store it in structured form to build a talent information database;
Step 3: collect current talent policies and build a talent policy database;
Step 4: extract the key talent policy information in the talent policy database using a policy semantic unit extraction method based on manual rules to obtain quantifiable indicator information;
Step 5: match the quantifiable indicator information in the talent policy database against the fields in the talent information database to obtain the talent and policy matching results for all talents in the talent information database;
Step 6: from the matching results, select the policy projects with high comprehensive matching scores, exclude talents who have already applied, and push the policies to talents who have not applied, showing the project information.
2. The file-data-oriented intelligent talent and policy matching method according to claim 1, characterized in that step 1 specifically comprises:
Step 1.1: scan the paper personnel file into a computer and save it as an image;
Step 1.2: divide the image into several sub-images using OCR technology, each sub-image containing a single character;
Step 1.3: convert each sub-image from image format to binary format and pass the binary data to a BP neural network;
Step 1.4: the BP neural network learns the association between character image data and numeric values through a training process, converts the scanned image into an editable electronic document, and stores the recognition result in the system database.
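As an illustration of steps 1.2 and 1.3, the sketch below thresholds a grayscale text-line image and splits it at blank columns, then flattens each sub-image into the 0/1 vector a network would receive. This is an assumption-laden simplification: the source does not state how the image is segmented, the sketch binarizes before splitting for convenience, the threshold of 128 and all function names are hypothetical, and the BP neural network itself is not shown.

```python
def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 0/1, where 1 marks ink."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def split_columns(binary):
    """Split a binary text-line image into sub-images at blank columns."""
    width = len(binary[0]) if binary else 0
    # True for every column that contains at least one ink pixel.
    ink = [any(row[x] for row in binary) for x in range(width)]
    segments, start = [], None
    for x, has_ink in enumerate(ink + [False]):  # sentinel closes the last run
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            segments.append([row[start:x] for row in binary])
            start = None
    return segments


def to_input_vector(sub_image):
    """Flatten a binary sub-image row by row into the network input vector."""
    return [px for row in sub_image for px in row]
```

Real scans would also need deskewing, line detection, and resizing of each sub-image to a fixed input size, none of which is covered here.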
3. The file-data-oriented intelligent talent and policy matching method according to claim 1, characterized in that in step 2 event extraction is performed on the personnel file information with the conditional random field model through the following steps:
Step 2.1: segment the file text of the personnel file information into words and represent each document as a TF-IDF weighted vector;
Step 2.2: perform feature extraction using a document-frequency-based method, filtering out words that contribute little to correct classification;
Step 2.3: set up a training set of file texts, convert the training set into a feature set, and learn the feature classification from manually annotated labels;
Step 2.4: convert the test set of file texts into a feature set of the same type, obtain the corresponding classification labels with the classifier model, and store the results in structured form to build the talent information database.
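Steps 2.1 and 2.2 can be sketched as follows, assuming the texts are already segmented into tokens. The minimum document frequency of 2, the `tf * log(N / df)` weighting variant, and the function names are illustrative assumptions; the conditional random field model and the classifier of steps 2.3 and 2.4 are not implemented here.

```python
import math
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return df


def select_vocabulary(docs, min_df=2):
    """Document-frequency feature selection: drop terms seen in too few documents."""
    df = document_frequency(docs)
    return sorted(term for term, n in df.items() if n >= min_df)


def tfidf_vector(tokens, vocab, df, n_docs):
    """Represent one tokenized document as a TF-IDF weighted vector over vocab."""
    counts = Counter(tokens)
    length = len(tokens) or 1  # avoid division by zero for empty documents
    return [(counts[term] / length) * math.log(n_docs / df[term]) for term in vocab]
```

Note that with this weighting a term occurring in every document receives weight 0, so smoothing may be preferable on small corpora.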
4. The file-data-oriented intelligent talent and policy matching method according to claim 3, characterized in that the file text of step 2.1 generally includes basic personal information, educational background, work experience, and published papers and patents.
5. The file-data-oriented intelligent talent and policy matching method according to claim 1, characterized in that the talent policy acquisition modes of step 3 include an import mode and an extraction mode. In the import mode, a government agency or third-party organization actively edits policy information and stores it in the talent policy database. In the extraction mode, a web robot, web crawler, web spider, or similar web-tracking program automatically collects all talent-related policy information from selected target government websites, downloads it to a local server, and then organizes and stores it in the talent policy database.
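One small part of the extraction mode, picking talent-related links out of an already downloaded page, might look like the sketch below. It uses Python's standard `html.parser`; the keyword list, the class name, and the idea of filtering by link text are assumptions, and fetching pages, following links, and storing results are deliberately left out.

```python
from html.parser import HTMLParser


class PolicyLinkParser(HTMLParser):
    """Collect (link text, href) pairs whose text contains a keyword."""

    def __init__(self, keywords):
        super().__init__()
        self.keywords = keywords
        self.links = []
        self._href = None  # href of the <a> element currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            if any(keyword in title for keyword in self.keywords):
                self.links.append((title, self._href))
            self._href = None


def extract_policy_links(html, keywords=("talent",)):
    """Return the talent-related links found in one HTML page."""
    parser = PolicyLinkParser(keywords)
    parser.feed(html)
    parser.close()
    return parser.links
```

A real crawler would also need to respect each site's access rules and handle relative URLs and pagination.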
6. The file-data-oriented intelligent talent and policy matching method according to claim 1, characterized in that step 4 specifically comprises three steps: policy dictionary construction, policy text preprocessing, and policy information extraction, specifically:
Policy dictionary construction: the dictionary comprises a talent policy dictionary and trigger words. The talent policy dictionary collects the vocabulary of the talent policy database, classified by policy type, policy name, applicable conditions, and keywords. Trigger words trigger extraction tasks; a trigger word is a word that identifies or marks a semantic unit, and the trigger words are obtained by analyzing the text features and descriptive conventions of the local talent policy corpus;
Policy text preprocessing: segment the text into words and tag parts of speech using the Chinese Academy of Sciences ICTCLAS word segmentation tool;
Policy information extraction: formulate extraction rules based on the trigger vocabulary, describe the extraction rules with regular expressions to build a rule base, and finally extract the semantic units of the local talent policies and store them in the talent policy database.
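The rule base described above can be pictured as a list of regular expressions, each anchored on a trigger phrase and mapped to a quantifiable indicator. The sketch below is only illustrative: the English trigger phrases, field names, and rule set are hypothetical, whereas the source works on Chinese policy text that has first been segmented with ICTCLAS.

```python
import re

# Hypothetical rule base: (pattern anchored on a trigger phrase, field, operator).
RULES = [
    (re.compile(r"at least (\d+) (?:papers|publications)"), "paper_count", ">="),
    (re.compile(r"under (?:the age of )?(\d+)"), "age", "<"),
    (re.compile(r"at least (\d+) patents"), "patent_count", ">="),
]


def extract_indicators(text):
    """Apply every rule to the policy text and collect (field, op, value) triples."""
    lowered = text.lower()
    indicators = []
    for pattern, field, op in RULES:
        for match in pattern.finditer(lowered):
            indicators.append((field, op, int(match.group(1))))
    return indicators
```

Applied to "Applicants must be under the age of 40 and have published at least 5 papers.", the rules yield a paper-count indicator and an age indicator, which is the kind of quantifiable condition that step 5 matches against talent fields.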
7. The file-data-oriented intelligent talent and policy matching method according to claim 1, characterized in that in step 5 a matching method based on semantic matching rules is used to match the quantifiable indicator information in the talent policy database against the fields in the talent information database, specifically comprising the following steps:
Step 5.1: sort the matching rules from easiest to hardest and start matching with the easiest indicator, returning the matching-rule result for each quantifiable indicator;
Step 5.2: assign each matching rule a weight, with existing indicators weighted higher than indicators that have not been established, and calculate the matching degree between the talent and each policy project;
Step 5.3: after all policy projects have been searched, return all results sorted by comprehensive matching degree to form the policy match list.
8. The file-data-oriented intelligent talent and policy matching method according to claim 1, characterized in that the policy matching result of step 5 includes the project name, project type, application conditions, financial benefits of applying, and a link to the latest policy notice.
9. The file-data-oriented intelligent talent and policy matching method according to claim 1, characterized in that in step 6, for policy projects the talent has not applied for, the policy is pushed to the talent by email or SMS, showing the name of the project that can be applied for, the financial benefits of applying, and a link to the latest policy notice.
10. A file-data-oriented intelligent talent and policy matching system, comprising the following modules:
Personnel file recognition module: scans and recognizes paper personnel files as editable electronic documents and forms a personnel file information database;
Personnel file feature extraction module: extracts valuable information from the personnel file information database using a conditional random field model based on event extraction and stores it in structured form to build a talent information database;
Automatic policy collection module: collects current talent policies and builds a talent policy database;
Policy segmentation and parsing module: extracts the key talent policy information in the talent policy database using a policy semantic unit extraction method based on manual rules to obtain quantifiable indicator information;
Talent and policy matching module: matches the quantifiable indicator information in the talent policy database against the fields in the talent information database and outputs the talent and policy matching results for all talents in the talent information database;
Policy pushing module: selects from the matching results the policy projects with high comprehensive matching scores, excludes talents who have already applied, pushes the policies to talents who have not applied, and shows the project information.
CN201910701445.5A 2019-07-31 2019-07-31 A kind of talent towards file data and policy intelligent Matching system and method Pending CN110457696A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910701445.5A CN110457696A (en) 2019-07-31 2019-07-31 A kind of talent towards file data and policy intelligent Matching system and method

Publications (1)

Publication Number Publication Date
CN110457696A true CN110457696A (en) 2019-11-15

Family

ID=68484267

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910701445.5A Pending CN110457696A (en) 2019-07-31 2019-07-31 A kind of talent towards file data and policy intelligent Matching system and method

Country Status (1)

Country Link
CN (1) CN110457696A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080162513A1 (en) * 2006-12-28 2008-07-03 Pitney Bowes Incorporated Universal address parsing system and method
WO2013013283A1 (en) * 2011-07-28 2013-01-31 Wairever Inc. Method and system for validation of claims against policy with contextualized semantic interoperability
CN106021553A (en) * 2016-05-30 2016-10-12 深圳市华傲数据技术有限公司 Structuralized data matching method and system
CN106447298A (en) * 2016-09-30 2017-02-22 深圳市华傲数据技术有限公司 Information processing system and method based on talent service system
CN108764835A (en) * 2018-05-24 2018-11-06 广州合摩计算机科技有限公司 Reverse talent's pushed information method and apparatus
CN109408683A (en) * 2018-10-31 2019-03-01 广州高企云信息科技有限公司 A kind of policy intelligent Matching system and method

Cited By (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110852722A (en) * 2019-11-18 2020-02-28 江苏苏伦大数据科技研究院有限公司 Information matching system for introducing high-level talents
CN111428037A (en) * 2020-03-24 2020-07-17 合肥科捷通科技信息服务有限公司 Method for analyzing matching performance of behavior policy
CN111428037B (en) * 2020-03-24 2022-09-20 合肥科捷通科技信息服务有限公司 Method for analyzing matching performance of behavior policy
CN111652524A (en) * 2020-06-11 2020-09-11 中力数创(重庆)科技有限公司 Method and device for intelligently matching policy and guiding improvement path
CN111931031A (en) * 2020-08-19 2020-11-13 太仓中科信息技术研究院 Method for calculating policy information matching degree
CN112036842B (en) * 2020-09-18 2023-08-08 重庆强大知识产权服务有限公司 Intelligent matching device for scientific and technological service
CN112036842A (en) * 2020-09-18 2020-12-04 重庆强大知识产权服务有限公司 Intelligent matching platform for scientific and technological services
CN112258144B (en) * 2020-09-27 2022-04-26 重庆生产力促进中心 Policy file information matching and pushing method based on automatic construction of target entity set
CN112258144A (en) * 2020-09-27 2021-01-22 重庆生产力促进中心 Policy file information matching and pushing method based on automatic construction of target entity set
CN112184525A (en) * 2020-09-28 2021-01-05 上海市浦东新区行政服务中心(上海市浦东新区市民中心) System and method for realizing intelligent matching recommendation through natural semantic analysis
CN112035653A (en) * 2020-11-05 2020-12-04 北京智源人工智能研究院 Policy key information extraction method and device, storage medium and electronic equipment
CN112380264A (en) * 2020-11-23 2021-02-19 政和科技股份有限公司 Policy analysis and matching method and device based on personal full life cycle
CN112765338A (en) * 2020-12-30 2021-05-07 江苏风云科技服务有限公司 Policy data pushing method, policy calculator and computer equipment
CN112989195B (en) * 2021-03-20 2023-09-05 重庆图强工程技术咨询有限公司 Whole-process consultation method and device based on big data, electronic equipment and storage medium
CN112989195A (en) * 2021-03-20 2021-06-18 重庆图强工程技术咨询有限公司 Big data based whole process consultation method and device, electronic equipment and storage medium
CN112765441B (en) * 2021-04-07 2021-11-02 北京零号窗网络信息技术有限公司 Enterprise policy information multiple dynamic intelligent matching recommendation method for digital government affairs
CN112765441A (en) * 2021-04-07 2021-05-07 北京零号窗网络信息技术有限公司 Enterprise policy information multiple dynamic intelligent matching recommendation method for digital government affairs
CN113191436A (en) * 2021-05-07 2021-07-30 广州博士信息技术研究院有限公司 Talent image tag identification method and system and cloud platform
CN113268573A (en) * 2021-05-19 2021-08-17 上海博亦信息科技有限公司 Extraction method of academic talent information
CN113537927A (en) * 2021-06-28 2021-10-22 北京航空航天大学 Scientific and technological resource service platform transaction coordination system and method
CN113537927B (en) * 2021-06-28 2024-06-07 北京航空航天大学 Transaction collaboration system and method for scientific and technological resource service platform
CN113590584A (en) * 2021-07-23 2021-11-02 无锡海创智慧谷科技有限公司 Talent base construction method based on big data
CN113609836A (en) * 2021-09-29 2021-11-05 深圳市指南针医疗科技有限公司 Medical policy full definition analysis system and method
CN113609836B (en) * 2021-09-29 2022-01-28 深圳市指南针医疗科技有限公司 Medical policy full definition analysis system and method
CN114495145A (en) * 2022-02-16 2022-05-13 平安国际智慧城市科技股份有限公司 Policy document number extraction method, device, equipment and storage medium
CN114495145B (en) * 2022-02-16 2024-05-28 平安国际智慧城市科技股份有限公司 Policy and document extraction method, device, equipment and storage medium
CN115587786A (en) * 2022-08-31 2023-01-10 广州市弋迦信息科技有限公司 Talent information management system and method and talent information management platform
CN115630080B (en) * 2022-10-26 2023-08-04 深圳市纵横云数信息科技有限公司 Guided talent policy welfare calculation method and device
CN115630080A (en) * 2022-10-26 2023-01-20 深圳市纵横云数信息科技有限公司 Guided talent policy welfare calculation method and device
CN116483940A (en) * 2023-04-26 2023-07-25 深圳市国房云数据技术服务有限公司 Method for extracting and structuring data of whole-flow type document
CN116956130A (en) * 2023-07-25 2023-10-27 北京安联通科技有限公司 Intelligent data processing method and system based on associated feature carding model
CN116681261A (en) * 2023-07-27 2023-09-01 山东创亿智慧信息科技发展有限责任公司 Intelligent archive management control system
CN116681261B (en) * 2023-07-27 2023-10-17 山东创亿智慧信息科技发展有限责任公司 Intelligent archive management control system
CN116992035A (en) * 2023-09-27 2023-11-03 湖南正宇软件技术开发有限公司 Intelligent classification method, device, computer equipment and medium
CN116992035B (en) * 2023-09-27 2023-12-08 湖南正宇软件技术开发有限公司 Intelligent classification method, device, computer equipment and medium

Similar Documents

Publication Publication Date Title
CN110457696A (en) A kind of talent towards file data and policy intelligent Matching system and method
CN110427623B (en) Semi-structured document knowledge extraction method and device, electronic equipment and storage medium
TWI424325B (en) Systems and methods for organizing collective social intelligence information using an organic object data model
CN111309936A (en) Method for constructing portrait of movie user
CN107315738A (en) A kind of innovation degree appraisal procedure of text message
CN112307351A (en) Model training and recommending method, device and equipment for user behavior
CN109299271A (en) Training sample generation, text data, public sentiment event category method and relevant device
CN114238573B (en) Text countercheck sample-based information pushing method and device
CN115526590B (en) Efficient person post matching and re-pushing method combining expert knowledge and algorithm
CN108681548A (en) A kind of lawyer's information processing method and system
CN108681977A (en) A kind of lawyer's information processing method and system
Oppong et al. Business decision support system based on sentiment analysis
CN116304035A (en) Multi-notice multi-crime name relation extraction method and device in complex case
Nawaz et al. Mining public opinion: a sentiment based forecasting for democratic elections of Pakistan
Sandhu et al. Enhanced Text Mining Approach for Better Ranking System of Customer Reviews
Tallapragada et al. Improved Resume Parsing based on Contextual Meaning Extraction using BERT
CN113515699A (en) Information recommendation method and device, computer-readable storage medium and processor
Gajanayake et al. Candidate selection for the interview using github profile and user analysis for the position of software engineer
Nguyen et al. Analyzing customer experience in hotel services using topic modeling
Aurnhammer et al. Manual Annotation of Unsupervised Models: Close and Distant Reading of Politics on Reddit.
CN116186422A (en) Disease-related public opinion analysis system based on social media and artificial intelligence
Jiang et al. ChouBERT: Pre-training french language model for crowdsensing with tweets in phytosanitary context
Tijare et al. A smart resume screening tool for customized shortlisting
CN118093881B (en) Audit object portrait modeling method and system based on knowledge graph
CN113220850B (en) Case image mining method for court trial and reading

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191115