CN106469200B - Method and system for predicting that an enterprise has changed its address location without timely industrial and commercial filing

Info

Publication number
CN106469200B
Authority
CN
China
Prior art keywords
information
enterprise
record
address
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610796065.0A
Other languages
Chinese (zh)
Other versions
CN106469200A (en)
Inventor
王晶
刘光辉
夏耘海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guoxin Youe Data Co Ltd
Original Assignee
Guoxin Youe Data Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guoxin Youe Data Co Ltd
Priority to CN201610796065.0A
Publication of CN106469200A
Application granted
Publication of CN106469200B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • Development Economics (AREA)
  • Quality & Reliability (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Remote Sensing (AREA)
  • Educational Administration (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a method for predicting that an enterprise has changed its address location without timely industrial and commercial filing, comprising: S1, crawling the latest enterprise record information to build an enterprise record information library; S2, crawling the latest recruitment data information of enterprises to build a recruitment information resource library; S3, determining an enterprise to be tested, and obtaining the address information of the enterprise to be tested from the enterprise record information library and the recruitment information resource library; S4, converting the address information into latitude and longitude coordinates; S5, calculating the coordinate difference between the latitude and longitude coordinates converted from the address information; S6, judging, according to the coordinate difference, whether the enterprise to be tested has changed its address location without timely industrial and commercial filing, and determining whether to issue an early warning. By comparing recruitment data information from the internet with the record information of the State Administration for Industry and Commerce, the invention determines whether an enterprise has changed its address without filing the change with the State Administration for Industry and Commerce, thereby verifying enterprise information and issuing early warnings for problem enterprises.

Description

A method and system for predicting that an enterprise has changed its address location without timely industrial and commercial filing
Technical field
The present invention relates to the technical field of data processing, and in particular to a method and system for predicting that an enterprise has changed its address location without timely industrial and commercial filing.
Background art
With the rise of the internet, network applications have become increasingly widespread. The network contains a large amount of information, which makes it convenient for people to look up information.
More and more enterprises have shifted their recruitment channels from purely offline recruitment to parallel online and offline recruitment, and some enterprises retain only their online recruitment channels and no longer recruit offline. As a result, a large amount of enterprise information is available on the internet.
As an enterprise develops, growth in staff and expansion of its units may lead it to change its work address. An enterprise may, however, change its work address without filing the change with the State Administration for Industry and Commerce, which introduces uncertainty into enterprise information.
Because information on the internet is updated quickly, it is well suited to obtaining up-to-date information. Verifying an enterprise's address information against the recruitment data information that the enterprise publishes on the internet therefore helps determine whether the enterprise has changed its address location without timely filing with the State Administration for Industry and Commerce.
Therefore, an urgent problem to be solved is how to provide a method and system that can effectively obtain, analyze, and extract online enterprise recruitment information and, based on the record information of the State Administration for Industry and Commerce, judge whether enterprise information has changed.
Summary of the invention
The present invention provides a method and system for predicting that an enterprise has changed its address location without timely industrial and commercial filing. Based on recruitment data information available on the internet, combined with the enterprise's record data at the State Administration for Industry and Commerce, the invention uses comparative analysis to judge whether an enterprise has changed its address location without timely filing with the State Administration for Industry and Commerce, and issues an early warning.
A first aspect of the present invention provides a method for predicting that an enterprise has changed its address location without timely industrial and commercial filing, comprising the following steps:
S1, using a static download mode, crawling the latest enterprise record information from the industrial and commercial bureau's record information resource system, and building an enterprise record information library;
S2, using a static download mode, crawling the latest recruitment data information of enterprises from the internet, and building a recruitment information resource library;
S3, determining an enterprise to be tested, obtaining the first address information of the enterprise to be tested from the enterprise record information library, and obtaining the second address information of the enterprise to be tested from the recruitment information resource library;
S4, converting the first address information and the second address information into latitude and longitude coordinates;
S5, calculating the coordinate difference between the latitude and longitude coordinates converted from the first address information and the second address information;
S6, judging, according to the coordinate difference, whether the enterprise to be tested has changed its address location without timely industrial and commercial filing, and determining whether to issue an early warning.
Preferably, in S1, crawling the latest enterprise record information comprises the following steps:
S11, crawling the first HTML information about enterprise record information from the industrial and commercial bureau's record information resource system;
S12, capturing the first link information in the first HTML information with Python;
S13, performing data cleansing on the first link information using the R language.
Preferably, in S2, crawling the latest recruitment data information comprises the following steps:
S21, crawling the second HTML information about enterprise recruitment information from the internet;
S22, capturing the second link information in the second HTML information with Python;
S23, performing data cleansing on the second link information using the R language.
Preferably, the latest enterprise record information includes the enterprise social credit code, the enterprise name, the enterprise's recorded residence, and the information crawl time; the latest recruitment data information includes the enterprise name, the residence published by the enterprise, and the information issue time.
Preferably, in S4, the first address information and the second address information are converted into latitude and longitude coordinates through the Baidu API.
Preferably, in S6, when the coordinate difference ≥ 10″, it is judged that the enterprise to be tested has changed its address location without timely industrial and commercial filing, and an early warning is issued;
when the coordinate difference < 10″, it is judged that the enterprise to be tested has not changed its address location without timely industrial and commercial filing, and no early warning is issued.
A second aspect of the present invention provides a system for predicting that an enterprise has changed its address location without timely industrial and commercial filing, comprising:
an enterprise record information library building server, which connects to the industrial and commercial bureau's record information resource system and crawls the latest enterprise record information from it to build the enterprise record information library;
a recruitment information resource library building server, which connects to the internet and crawls the latest recruitment data information of enterprises from the internet to build the recruitment information resource library;
a latitude and longitude conversion unit, which obtains the first address information and the second address information of the enterprise to be tested from the enterprise record information library building server and the recruitment information resource library building server respectively, and converts the first address information and the second address information into latitude and longitude coordinates;
a residence inspection server, which judges, according to the coordinate difference between the latitude and longitude coordinates converted from the first address information and the second address information, whether the enterprise to be tested has changed its address location without timely industrial and commercial filing;
an early warning unit, which determines whether to issue an early warning according to the judgment of whether the enterprise to be tested has changed its address location without timely industrial and commercial filing.
Preferably, the enterprise record information library building server includes:
an enterprise record information acquisition unit, used to crawl the first HTML information about enterprise record information from the industrial and commercial bureau's record information resource system;
a record information processing unit, used to capture the first link information in the first HTML information with Python;
a record information cleaning unit, used to perform data cleansing on the first link information using the R language.
Preferably, the recruitment information resource library building server includes:
a recruitment information acquisition unit, used to crawl the second HTML information about enterprise recruitment information from the internet;
a recruitment information processing unit, used to capture the second link information in the second HTML information with Python;
a recruitment information cleaning unit, used to perform data cleansing on the second link information using the R language.
Preferably, the latitude and longitude conversion unit uses a Baidu API conversion chip to convert the first address information and the second address information into latitude and longitude coordinates.
Preferably, the residence inspection server includes:
a coordinate difference calculation unit, used to calculate the coordinate difference between the latitude and longitude coordinates converted from the first address information and the second address information;
an address check unit, which judges, according to the coordinate difference, whether the enterprise to be tested has changed its address location without timely industrial and commercial filing.
Preferably, the address check unit judges as follows:
when the coordinate difference ≥ 10″, it is judged that the enterprise to be tested has changed its address location without timely industrial and commercial filing;
when the coordinate difference < 10″, it is judged that the enterprise to be tested has not changed its address location without timely industrial and commercial filing.
It is further preferred that the early warning unit judges as follows:
when the coordinate difference ≥ 10″, an early warning is issued;
when the coordinate difference < 10″, no early warning is issued.
The method and system of the present invention crawl the recruitment data information that enterprises publish on the internet and compare it with the crawled record information of the State Administration for Industry and Commerce. This makes it possible to determine whether an enterprise has changed its address without filing the change with the State Administration for Industry and Commerce, to verify enterprise information more effectively, and to issue early warnings for problem enterprises.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow chart of one embodiment of the method of the present invention for predicting that an enterprise has changed its address location without timely industrial and commercial filing.
Fig. 2 is a flow chart of crawling the latest enterprise record information in one embodiment of the method.
Fig. 3 is a flow chart of crawling the latest recruitment data information in one embodiment of the method.
Fig. 4 is a structure chart of one embodiment of the system of the present invention for predicting that an enterprise has changed its address location without timely industrial and commercial filing.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are some, but not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
Embodiment one
Fig. 1 is a flow chart of one embodiment of the method of the present invention for predicting that an enterprise has changed its address location without timely industrial and commercial filing. As shown in Fig. 1, in this embodiment the method comprises the following steps:
S1, using a static download mode, crawling the latest enterprise record information from the industrial and commercial bureau's record information resource system, and building an enterprise record information library;
S2, using a static download mode, crawling the latest recruitment data information of enterprises from the internet, and building a recruitment information resource library;
S3, determining an enterprise to be tested, obtaining the first address information of the enterprise to be tested from the enterprise record information library, and obtaining the second address information of the enterprise to be tested from the recruitment information resource library;
S4, converting the first address information and the second address information into latitude and longitude coordinates;
S5, calculating the coordinate difference between the latitude and longitude coordinates converted from the first address information and the second address information;
S6, judging, according to the coordinate difference, whether the enterprise to be tested has changed its address location without timely industrial and commercial filing, and determining whether to issue an early warning.
Specifically, in S1, the industrial and commercial bureau's record information resource system is connected using a static download mode, the latest enterprise record information is crawled from it, and the crawled information is integrated to build the enterprise record information library.
Further, as shown in the flow chart of Fig. 2, crawling the latest enterprise record information in S1 can be divided into the following steps:
S11, crawling the first HTML information about enterprise record information from the industrial and commercial bureau's record information resource system.
Using the static download mode, the HTML information about enterprise record information, i.e. the first HTML information, is crawled from the industrial and commercial bureau's record information resource system, so that it can subsequently be processed to obtain the first address information of the enterprise.
S12, capturing the first link information in the first HTML information with Python.
The link information in the crawled first HTML information is captured. Specifically, the urllib2 module or the request module in Python can be used to capture the links carried by tags with an href attribute in the first HTML information, i.e. the first link information.
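The link capture in S12 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the HTML has already been downloaded (for example with `urllib.request`, the Python 3 successor of `urllib2`) and uses the standard-library `html.parser` module to collect `href` values. The class name `HrefCollector`, the function `extract_links`, and the sample markup and URL are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag in an HTML document."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the crawled page.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html_text, base_url=""):
    """Return all link targets found in html_text, in document order."""
    parser = HrefCollector(base_url)
    parser.feed(html_text)
    return parser.links


# Hypothetical sample page; the markup of the real record system is not given in the source.
sample_html = (
    '<ul><li><a href="/ent/001">Enterprise A</a></li>'
    '<li><a href="detail?id=2">Enterprise B</a></li></ul>'
)
links = extract_links(sample_html, "http://example.com/list/")
```

The same function would apply unchanged to the second HTML information in S22, since both steps capture `href` links.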
S13, performing data cleansing on the first link information using the R language.
The captured first link information is cleansed using the R language, and the useful information in the first link information is filtered out to obtain the latest enterprise record information.
The latest enterprise record information obtained after data cleansing is then integrated, and the enterprise record information library is built from it.
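The source performs this cleansing in R and does not state the cleansing rules. Purely as an illustration of what such a step might do, the sketch below is written in Python and assumes three rules: normalise whitespace, drop rows missing a name or address, and keep only the most recently crawled row per enterprise. The field names `enterprise_name`, `address`, and `scraped_at` are hypothetical.

```python
def clean_records(raw_records):
    """Normalise whitespace, drop incomplete rows, keep the newest row per enterprise."""
    newest = {}
    for row in raw_records:
        name = " ".join(row.get("enterprise_name", "").split())
        address = " ".join(row.get("address", "").split())
        if not name or not address:
            continue  # incomplete rows cannot be compared later
        cleaned = {**row, "enterprise_name": name, "address": address}
        kept = newest.get(name)
        # ISO-8601 date strings compare correctly as plain strings.
        if kept is None or cleaned["scraped_at"] > kept["scraped_at"]:
            newest[name] = cleaned
    return list(newest.values())


# Hypothetical raw rows, as they might look after link capture and page parsing.
raw = [
    {"enterprise_name": " Acme  Co ", "address": "Road 1", "scraped_at": "2016-08-01"},
    {"enterprise_name": "Acme Co", "address": "Road 2", "scraped_at": "2016-08-20"},
    {"enterprise_name": "", "address": "Road 3", "scraped_at": "2016-08-20"},
]
cleaned = clean_records(raw)
```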
Further, the latest enterprise record information may specifically include the enterprise social credit code, the enterprise name, the enterprise's recorded residence, and the information crawl time, as shown in Table 1:
Table 1
As shown in Table 1 above, the enterprise social credit code and the enterprise name identify the enterprise, the recorded residence identifies the residence the enterprise has filed, and the information crawl time is used to confirm whether the latest enterprise record information has been updated, so as to ensure that the latest enterprise record information is timely.
Likewise, in S2, the internet is connected using a static download mode, the latest recruitment data information of enterprises is crawled from the internet, and the crawled information is integrated to build the recruitment information resource library.
Further, as shown in the flow chart of Fig. 3, crawling the latest recruitment data information in S2 can be divided into the following steps:
S21, crawling the second HTML information about enterprise recruitment information from the internet.
Most enterprises now publish recruitment information on the internet, and this recruitment information contains a large amount of data, such as the enterprise name, the enterprise's business address, vacant positions, job requirements, and contact details. Analyzing and processing the recruitment information published by enterprises therefore yields more enterprise data.
On this basis, using the static download mode, the HTML information about enterprise recruitment information, i.e. the second HTML information, is crawled from the internet, so that it can subsequently be processed to obtain the second address information of the enterprise.
S22, capturing the second link information in the second HTML information with Python.
As in S12 above, the link information in the crawled second HTML information is captured first. Specifically, the urllib2 module or the request module in Python can be used to capture the links carried by tags with an href attribute in the second HTML information, i.e. the second link information.
S23, performing data cleansing on the second link information using the R language.
As in S13 above, the captured second link information is cleansed using the R language, and the useful information in the second link information is filtered out to obtain the latest recruitment data information.
The latest recruitment data information obtained after data cleansing is then integrated, and the recruitment information resource library is built from it.
Further, the latest recruitment data information includes the enterprise name, the residence published by the enterprise, and the information issue time, as shown in Table 2:
Table 2
As shown in Table 2 above, the enterprise name identifies the enterprise, the published residence identifies the residence the enterprise has published, and the issue time is used to confirm whether the latest recruitment data information has been updated, so as to ensure that the latest recruitment data information is timely.
Further, as Tables 1 and 2 show, the latest enterprise record information and the latest recruitment data information both include the enterprise name, so the two can be matched to each other by enterprise name. This allows the enterprise's address information in the two sources to be compared later, and allows a judgment on whether the enterprise has changed its address location without timely industrial and commercial filing.
In S3, the enterprise whose address change needs to be checked, i.e. the enterprise to be tested, is determined first. The first address information of the enterprise to be tested is then obtained from the enterprise record information library built in S1, and the second address information is obtained from the recruitment information resource library built in S2.
Specifically, the enterprise name can be obtained from the determined enterprise to be tested. Because both the latest enterprise record information and the latest recruitment data information include the enterprise name, the latest enterprise record information and the latest recruitment data information corresponding to the enterprise to be tested can be found in the enterprise record information library and the recruitment information resource library.
Finally, from the latest enterprise record information and the latest recruitment data information found for the enterprise to be tested, the first address information and the second address information of that enterprise are obtained, so that they can later be analyzed and compared to determine whether the enterprise to be tested has changed its address.
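The lookup in S3 can be sketched as a join on enterprise name. The sketch assumes each library is held in memory as a dictionary keyed by enterprise name; the function name `find_addresses`, the field names `recorded_residence` and `published_residence`, and the enterprise name "Example Co" are hypothetical.

```python
def find_addresses(enterprise_name, record_library, recruitment_library):
    """Return (first_address, second_address) for one enterprise, or None if either is missing."""
    record = record_library.get(enterprise_name)
    posting = recruitment_library.get(enterprise_name)
    if record is None or posting is None:
        return None  # the enterprise cannot be checked without both sources
    return record["recorded_residence"], posting["published_residence"]


# Hypothetical libraries holding one enterprise each.
record_library = {
    "Example Co": {"recorded_residence": "No. XX, XXX Road, Fengtai District, Beijing"},
}
recruitment_library = {
    "Example Co": {"published_residence": "XX, XXX Road, Haidian District, Beijing"},
}
addresses = find_addresses("Example Co", record_library, recruitment_library)
```

Matching on the exact name string is the simplest choice; real data would likely need name normalisation before the lookup, which the source does not describe.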
That is, according to Tables 1 and 2, in this embodiment the first address information is: No. XX, XXX Road, Fengtai District, Beijing; the second address information is: XX, XXX Road, Haidian District, Beijing.
Further, the selection of the first address information and the second address information should meet the following requirement:
The information issue time of the latest recruitment data information corresponding to the second address information should be later than or equal to the information crawl time of the latest enterprise record information corresponding to the first address information.
This ensures that the inspection of changes in enterprise address location is timely.
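The timing requirement above amounts to a single comparison. The sketch below assumes both times are available as `datetime` values; the function name `is_comparable` and the sample dates are hypothetical.

```python
from datetime import datetime


def is_comparable(record_crawl_time, recruitment_issue_time):
    """True when the recruitment posting is no older than the crawled record."""
    return recruitment_issue_time >= record_crawl_time


crawl_time = datetime(2016, 8, 1)
ok = is_comparable(crawl_time, datetime(2016, 8, 15))     # posting issued after the crawl
stale = is_comparable(crawl_time, datetime(2016, 7, 15))  # posting predates the record
```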
In S4, the obtained first address information and second address information of the enterprise to be tested are converted into latitude and longitude coordinate form.
After conversion, the coordinate form of the first address information is: first longitude and latitude (X1, Y1);
and the coordinate form of the second address information is: second longitude and latitude (X2, Y2).
Further, the first address information and the second address information are converted into latitude and longitude coordinates through the Baidu API.
Specifically, the service parameters used when converting the first address information and the second address information into latitude and longitude coordinate form through the Baidu API are shown in Table 3:
Table 3
The values returned when converting the first address information and the second address information into latitude and longitude coordinate form through the Baidu API, as in Table 3 above, are shown in Table 4:
Table 4
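The conversion in S4 can be sketched as a request builder plus a response parser. The contents of Tables 3 and 4 are not reproduced in this text, so the endpoint URL, the query parameters (`address`, `output`, `ak`), and the response fields (`status`, `result.location.lng`, `result.location.lat`) below are assumptions modelled on Baidu's public geocoding web service and should be checked against its current documentation. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify against Baidu's current geocoding documentation.
GEOCODER_URL = "https://api.map.baidu.com/geocoding/v3/"


def build_geocode_url(address, api_key):
    """Build the request URL for one address (parameter names are assumptions)."""
    query = urlencode({"address": address, "output": "json", "ak": api_key})
    return f"{GEOCODER_URL}?{query}"


def parse_location(payload):
    """Return (longitude, latitude) from a decoded JSON response, or None on failure."""
    if payload.get("status") != 0:
        return None
    location = payload["result"]["location"]
    return location["lng"], location["lat"]


def geocode(address, api_key, fetch=urlopen):
    """Fetch and parse one address; `fetch` can be replaced for offline testing."""
    with fetch(build_geocode_url(address, api_key)) as response:
        return parse_location(json.load(response))


# Hypothetical response body in the assumed shape.
sample_payload = {"status": 0, "result": {"location": {"lng": 116.3, "lat": 39.9}}}
first_coordinates = parse_location(sample_payload)
```

Keeping the parser separate from the network call means the coordinate handling can be exercised without an API key.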
In S5, the coordinate difference is calculated from the latitude and longitude coordinate forms of the first address information and the second address information obtained by the conversion in S4.
Specifically, take the coordinate forms obtained above: first longitude and latitude (X1, Y1) for the first address information, and second longitude and latitude (X2, Y2) for the second address information.
The coordinate difference can then be calculated as follows:
The coordinate difference is expressed as: coordinate difference (ΔX, ΔY).
Here ΔX is the difference between the longitude value in the coordinate form of the first address information and the longitude value in the coordinate form of the second address information, so ΔX = |X1 - X2|.
Likewise, ΔY is the difference between the latitude value in the coordinate form of the first address information and the latitude value in the coordinate form of the second address information, so ΔY = |Y1 - Y2|.
With ΔX and ΔY determined as above,
the coordinate difference can also be expressed as: coordinate difference (|X1 - X2|, |Y1 - Y2|).
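The calculation in S5 can be sketched directly from these definitions. The sketch assumes coordinates are given as (longitude, latitude) pairs in decimal degrees; the conversion to arc-seconds (1 degree = 3600″) is added so the result can be compared with a threshold stated in arc-seconds. The function names and sample coordinates are hypothetical.

```python
def coordinate_difference(first, second):
    """Return (dX, dY): absolute longitude and latitude differences in decimal degrees."""
    x1, y1 = first
    x2, y2 = second
    return abs(x1 - x2), abs(y1 - y2)


def to_arcseconds(degrees):
    """Convert an angle in decimal degrees to arc-seconds."""
    return degrees * 3600.0


# Hypothetical coordinates chosen so the arithmetic is exact in floating point.
delta_x, delta_y = coordinate_difference((116.0, 39.0), (116.5, 39.25))
```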
According to coordinate difference calculated in S5 in S6, to judge that the enterprise to be tested changes with the presence or absence of enterprise address location But the case where industry and commerce is put on record not in time, and determined whether to carry out early warning processing according to judging result.
Specifically, when judging the enterprise to be tested according to coordinate difference, there are enterprise address location change but industry and commerce not in time The case where putting on record, then carries out early warning processing;And the enterprise to be tested is judged according to coordinate difference there is no the changes of enterprise address location The case where more but industry and commerce is put on record not in time, is then handled without early warning.
Further, in the present embodiment the coordinate difference threshold is set to 10''; that is, whether the calculated coordinate difference reaches this threshold determines whether the enterprise under test is judged to have changed its address location without timely filing with the industrial and commercial authority.
Specifically, when the coordinate difference ≥ 10'', the enterprise under test is judged to have changed its address location without timely filing, and an early warning is issued; when the coordinate difference < 10'', the enterprise under test is judged to have no such unfiled change, and no early warning is issued.
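The threshold test in S6 might look like the sketch below. Two points are assumptions of this sketch rather than statements of the source: the 10'' threshold is read as 10 arc-seconds, and, because the text compares the pair (ΔX, ΔY) with a single value without saying how, a warning is raised here when either component reaches the threshold. The name `needs_warning` is hypothetical.

```python
ARCSECONDS_PER_DEGREE = 3600.0
# The 10'' threshold of this embodiment, read here as arc-seconds (assumption).
THRESHOLD_ARCSEC = 10.0


def needs_warning(delta_x_deg: float, delta_y_deg: float,
                  threshold_arcsec: float = THRESHOLD_ARCSEC) -> bool:
    """Return True when the coordinate difference reaches the threshold.

    Assumption: either component of (ΔX, ΔY) reaching the threshold is
    enough to flag an unfiled address change.
    """
    dx_arcsec = delta_x_deg * ARCSECONDS_PER_DEGREE
    dy_arcsec = delta_y_deg * ARCSECONDS_PER_DEGREE
    return max(dx_arcsec, dy_arcsec) >= threshold_arcsec


print(needs_warning(0.25, 0.25))    # large difference between districts
print(needs_warning(0.001, 0.001))  # about 3.6 arc-seconds in each component
```

A different reduction of the pair, such as requiring both components to reach the threshold, would only change the `max` call.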
Based on the above method, the fast update frequency of data on the internet can be fully exploited to monitor whether an enterprise that needs to be inspected has changed its address location without timely filing with the industrial and commercial authority, and to decide from the inspection result whether to issue an early warning.
Embodiment two
Fig. 4 is a structural diagram of one embodiment of the system of the present invention for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time.
As shown in Fig. 4, the system of the present embodiment includes: an enterprise record information database construction server, a recruitment information resource database construction server, a latitude and longitude conversion unit, a residence inspection server and an early-warning unit.
The enterprise record information database construction server is connected to the industrial and commercial bureau record information resource system, so that it can crawl the latest enterprise record information from that system and construct the enterprise record information database from the crawled information.
Further, the enterprise record information database construction server may specifically include: an enterprise record information acquisition unit, a record information processing unit and a record information cleaning unit.
The enterprise record information acquisition unit is used to grab, from the industrial and commercial bureau record information resource system, the first HTML information about enterprise record information; the detailed process is as in S11 of Embodiment one above.
The record information processing unit is used to capture the first link information in the first HTML information with Python; the detailed process is as in S12 of Embodiment one above.
The record information cleaning unit is used to perform data cleansing on the first link information using the R language; the detailed process is as in S13 of Embodiment one above.
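The capture step performed by the record information processing unit can be sketched with Python's standard `html.parser` module. This sketch assumes that "link information" means the hyperlinks contained in the grabbed HTML; the class name `LinkCollector`, the function `extract_links` and the sample HTML are hypothetical and are not taken from the patent.

```python
from html.parser import HTMLParser
from typing import List


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case and attrs as (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html: str) -> List[str]:
    """Return all hyperlink targets found in the given HTML text."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up HTML standing in for a grabbed record page.
sample_html = (
    '<ul><li><a href="/record/1">Enterprise A</a></li>'
    '<li><a href="/record/2">Enterprise B</a></li></ul>'
)
print(extract_links(sample_html))
```

The same function would serve the recruitment side, since both pipelines capture link information from HTML before cleaning it.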
The recruitment information resource database construction server is connected to the internet, so that it can crawl the latest recruitment data information of enterprises from the internet and construct the recruitment information resource database from the crawled information.
Further, the recruitment information resource database construction server may specifically include: a recruitment information acquisition unit, a recruitment information processing unit and a recruitment information cleaning unit.
The recruitment information acquisition unit is used to grab the second HTML information about enterprise recruitment information from the internet; the detailed process is as in S21 of Embodiment one above.
The recruitment information processing unit is used to capture the second link information in the second HTML information with Python; the detailed process is as in S22 of Embodiment one above.
The recruitment information cleaning unit is used to perform data cleansing on the second link information using the R language; the detailed process is as in S23 of Embodiment one above.
The latitude and longitude conversion unit is connected to both the enterprise record information database construction server and the recruitment information resource database construction server. It obtains the first address information and the second address information of the enterprise under test from these two servers respectively, and converts each of them into latitude and longitude coordinates.
Further, the latitude and longitude conversion unit may specifically use a Baidu API conversion chip to convert the first address information and the second address information into latitude and longitude coordinates.
Further, the latitude and longitude conversion unit's selection of the first address information and the second address information should meet the following requirement:
The information issuing time of the latest recruitment data information corresponding to the second address information should be later than or equal to the information scraping time of the latest enterprise record information corresponding to the first address information.
This ensures that the inspection for enterprise address location changes is more timely.
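The timing requirement above can be illustrated with a short Python sketch. The record classes, their field names and the function `select_second_address` are hypothetical; choosing the most recent eligible posting is also an assumption of this sketch, since the text only requires that the issuing time be no earlier than the scraping time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class FilingRecord:
    enterprise_name: str
    filed_address: str       # first address information
    scraped_at: datetime     # information scraping time


@dataclass
class RecruitmentRecord:
    enterprise_name: str
    published_address: str   # second address information
    published_at: datetime   # information issuing time


def select_second_address(filing: FilingRecord,
                          postings: List[RecruitmentRecord]) -> Optional[RecruitmentRecord]:
    """Pick the newest posting of the same enterprise that was published
    at or after the scraping time of the filing record, or None."""
    eligible = [
        p for p in postings
        if p.enterprise_name == filing.enterprise_name
        and p.published_at >= filing.scraped_at
    ]
    return max(eligible, key=lambda p: p.published_at, default=None)


# Made-up records for illustration.
filing = FilingRecord("Acme", "old address", datetime(2016, 8, 1))
postings = [
    RecruitmentRecord("Acme", "stale address", datetime(2016, 7, 1)),
    RecruitmentRecord("Acme", "new address", datetime(2016, 8, 15)),
    RecruitmentRecord("Other", "elsewhere", datetime(2016, 8, 20)),
]
print(select_second_address(filing, postings))
```

Postings older than the filing snapshot are excluded because they cannot show a move that happened after the record was captured.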
For example, the first address information and the second address information of the enterprise under test obtained by the latitude and longitude conversion unit are as shown in Table 1 and Table 2 of Embodiment one, i.e., the first address information is: XX, XXX Road, Fengtai District, Beijing; the second address information is: XX, XXX Road, Haidian District, Beijing.
The latitude and longitude conversion unit then converts the first address information and the second address information into latitude and longitude coordinate form.
Specifically, the coordinate form of the first address information is: the first longitude and latitude (X1, Y1); and the coordinate form of the second address information is: the second longitude and latitude (X2, Y2).
The residence inspection server judges, from the coordinate difference between the latitude and longitude coordinates into which the latitude and longitude conversion unit converted the first address information and the second address information, whether the enterprise under test has changed its address location without timely filing with the industrial and commercial authority.
Further, the residence inspection server may specifically include: a coordinate difference calculation unit and an address check unit.
The coordinate difference calculation unit is used to calculate the coordinate difference between the latitude and longitude coordinates converted from the first address information and the second address information.
The coordinate difference calculation unit calculates the coordinate difference from the latitude and longitude coordinate forms produced by the latitude and longitude conversion unit, i.e., the first longitude and latitude (X1, Y1) and the second longitude and latitude (X2, Y2).
Specifically, the coordinate difference can be calculated as follows:
The coordinate difference is expressed as: coordinate difference (ΔX, ΔY).
Here, ΔX is the difference between the longitude value in the coordinate form of the first address information and the longitude value in the coordinate form of the second address information, so ΔX = |X1 - X2|.
Likewise, ΔY is the difference between the latitude value in the coordinate form of the first address information and the latitude value in the coordinate form of the second address information, so ΔY = |Y1 - Y2|.
From the values of ΔX and ΔY determined above, the coordinate difference can also be written as: coordinate difference (|X1 - X2|, |Y1 - Y2|).
The address check unit then judges from the coordinate difference whether the enterprise under test has changed its address location without timely filing with the industrial and commercial authority.
The address check unit makes this judgment by checking whether the coordinate difference reaches a preset coordinate difference threshold.
In the present embodiment, the coordinate difference threshold is set to 10''.
Specifically, when the coordinate difference ≥ 10'', the enterprise under test is judged to have changed its address location without timely filing; when the coordinate difference < 10'', the enterprise under test is judged to have no such unfiled change.
The early-warning unit decides whether early-warning processing is needed for the enterprise according to the residence inspection server's judgment of whether the enterprise under test has changed its address location without timely filing with the industrial and commercial authority.
Specifically, the early-warning unit is connected to the residence inspection server so as to receive its inspection result for the enterprise, and decides from the received result whether early-warning processing is needed.
According to the setting in the residence inspection server, an early warning is issued when the coordinate difference ≥ 10'', and no early warning is issued when the coordinate difference < 10''.
This setting of the early-warning unit corresponds to the residence inspection server, in which the enterprise under test is judged to have changed its address location without timely filing when the coordinate difference ≥ 10'', and is judged to have no such unfiled change when the coordinate difference < 10''.
In this way, when the coordinate difference ≥ 10'', the enterprise under test is judged to have changed its address location without timely filing, and an early warning is issued.
When the coordinate difference < 10'', the enterprise under test is judged to have no such unfiled change, and no early warning is issued.
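The division of work between the residence inspection server and the early-warning unit can be sketched as two small Python classes. Everything named here is hypothetical; the sketch also assumes decimal-degree coordinates, reads 10'' as 10 arc-seconds, and treats either component of the coordinate difference reaching the threshold as a change, none of which the source states explicitly.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coordinate = Tuple[float, float]  # (longitude, latitude), decimal degrees assumed


@dataclass
class InspectionResult:
    enterprise_name: str
    address_changed: bool


class ResidenceInspectionServer:
    """Judges whether the two coordinates differ by at least the threshold."""

    def __init__(self, threshold_arcsec: float = 10.0) -> None:
        # 10'' is read as arc-seconds and converted to degrees (assumption).
        self.threshold_deg = threshold_arcsec / 3600.0

    def inspect(self, name: str, first: Coordinate, second: Coordinate) -> InspectionResult:
        dx = abs(first[0] - second[0])
        dy = abs(first[1] - second[1])
        # Assumption: either component reaching the threshold counts as a change.
        return InspectionResult(name, max(dx, dy) >= self.threshold_deg)


class EarlyWarningUnit:
    """Receives inspection results and records a warning for flagged enterprises."""

    def __init__(self) -> None:
        self.warnings: List[str] = []

    def handle(self, result: InspectionResult) -> bool:
        if result.address_changed:
            self.warnings.append(result.enterprise_name)
        return result.address_changed


server = ResidenceInspectionServer()
unit = EarlyWarningUnit()
unit.handle(server.inspect("Enterprise A", (116.25, 39.75), (116.5, 40.0)))
unit.handle(server.inspect("Enterprise B", (116.25, 39.75), (116.25, 39.75)))
print(unit.warnings)
```

The early-warning unit only consumes the server's verdict, which matches the text: the threshold lives in the inspection server, and the warning decision follows from its result.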
The method and system of the present invention for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time crawl the recruitment data information that enterprises publish on the internet and compare it with the crawled record information of the State Administration for Industry and Commerce. This makes it possible to determine effectively whether an enterprise has changed its address without filing the change with the State Administration for Industry and Commerce, to identify and judge enterprise information more effectively, and to issue early warnings for enterprises where a problem exists.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (12)

1. A method for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time, characterized by comprising the following steps:
S1, using a static downloading mode, crawling the latest enterprise record information from an industrial and commercial bureau record information resource system, and constructing an enterprise record information database; the latest enterprise record information includes the enterprise social credit code, enterprise name, enterprise filed residence and information scraping time;
S2, using a static downloading mode, crawling the latest recruitment data information of enterprises from the internet, and constructing a recruitment information resource database; the latest recruitment data information includes the enterprise name, enterprise published residence and information issuing time;
S3, determining an enterprise under test, obtaining the first address information of the enterprise under test from the enterprise record information database, and obtaining the second address information of the enterprise under test from the recruitment information resource database; the information issuing time in the latest recruitment data information corresponding to the second address information should be later than or equal to the information scraping time of the latest enterprise record information corresponding to the first address information;
S4, converting the first address information and the second address information into latitude and longitude coordinates;
S5, calculating the coordinate difference between the latitude and longitude coordinates converted from the first address information and the second address information;
S6, judging from the coordinate difference whether the enterprise under test has changed its address location without timely filing with the industrial and commercial authority, and determining whether to issue an early warning.
2. The method according to claim 1 for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time, characterized in that,
in S1, crawling the latest enterprise record information comprises the following steps:
S11, grabbing the first HTML information about enterprise record information in the industrial and commercial bureau record information resource system;
S12, capturing the first link information in the first HTML information with Python;
S13, performing data cleansing on the first link information using the R language.
3. The method according to claim 1 for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time, characterized in that,
in S2, crawling the latest recruitment data information comprises the following steps:
S21, grabbing the second HTML information about enterprise recruitment information on the internet;
S22, capturing the second link information in the second HTML information with Python;
S23, performing data cleansing on the second link information using the R language.
4. The method according to claim 1 for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time, characterized in that,
in S4, the first address information and the second address information are converted into latitude and longitude coordinates through the Baidu API.
5. The method according to claim 1 for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time, characterized in that,
in S6, when the coordinate difference ≥ 10'', the enterprise under test is judged to have changed its address location without timely filing with the industrial and commercial authority, and an early warning is issued;
when the coordinate difference < 10'', the enterprise under test is judged to have no such unfiled change, and no early warning is issued.
6. A system for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time, characterized by comprising:
an enterprise record information database construction server, which is connected to an industrial and commercial bureau record information resource system and crawls the latest enterprise record information from the industrial and commercial bureau record information resource system to construct an enterprise record information database; the latest enterprise record information includes the enterprise social credit code, enterprise name, enterprise filed residence and information scraping time;
a recruitment information resource database construction server, which is connected to the internet and crawls the latest recruitment data information of enterprises from the internet to construct a recruitment information resource database; the latest recruitment data information includes the enterprise name, enterprise published residence and information issuing time;
a latitude and longitude conversion unit, which obtains the first address information and the second address information of an enterprise under test from the enterprise record information database construction server and the recruitment information resource database construction server respectively, and converts the first address information and the second address information into latitude and longitude coordinates respectively; the information issuing time in the latest recruitment data information corresponding to the second address information should be later than or equal to the information scraping time of the latest enterprise record information corresponding to the first address information;
a residence inspection server, which judges, from the coordinate difference between the latitude and longitude coordinates converted from the first address information and the second address information, whether the enterprise under test has changed its address location without timely filing with the industrial and commercial authority;
an early-warning unit, which determines whether to issue an early warning according to the result of judging whether the enterprise under test has changed its address location without timely filing with the industrial and commercial authority.
7. The system according to claim 6 for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time, characterized in that,
the enterprise record information database construction server includes:
an enterprise record information acquisition unit, used to grab the first HTML information about enterprise record information from the industrial and commercial bureau record information resource system;
a record information processing unit, used to capture the first link information in the first HTML information with Python;
a record information cleaning unit, used to perform data cleansing on the first link information using the R language.
8. The system according to claim 6 for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time, characterized in that,
the recruitment information resource database construction server includes:
a recruitment information acquisition unit, used to grab the second HTML information about enterprise recruitment information on the internet;
a recruitment information processing unit, used to capture the second link information in the second HTML information with Python;
a recruitment information cleaning unit, used to perform data cleansing on the second link information using the R language.
9. The system according to claim 6 for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time, characterized in that,
the latitude and longitude conversion unit uses a Baidu API conversion chip and converts the first address information and the second address information into latitude and longitude coordinates.
10. The system according to claim 6 for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time, characterized in that,
the residence inspection server includes:
a coordinate difference calculation unit, used to calculate the coordinate difference between the latitude and longitude coordinates converted from the first address information and the second address information;
an address check unit, which judges from the coordinate difference whether the enterprise under test has changed its address location without timely filing with the industrial and commercial authority.
11. The system according to claim 10 for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time, characterized in that,
the address check unit judges as follows:
when the coordinate difference ≥ 10'', the enterprise under test is judged to have changed its address location without timely filing with the industrial and commercial authority;
when the coordinate difference < 10'', the enterprise under test is judged to have no such unfiled change.
12. The system according to claim 11 for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time, characterized in that,
the early-warning unit judges as follows:
when the coordinate difference ≥ 10'', an early warning is issued;
when the coordinate difference < 10'', no early warning is issued.
CN201610796065.0A 2016-08-31 2016-08-31 Method and system for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time Active CN106469200B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610796065.0A CN106469200B (en) 2016-08-31 2016-08-31 Method and system for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610796065.0A CN106469200B (en) 2016-08-31 2016-08-31 Method and system for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time

Publications (2)

Publication Number Publication Date
CN106469200A CN106469200A (en) 2017-03-01
CN106469200B true CN106469200B (en) 2019-07-16

Family

ID=58230406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610796065.0A Active CN106469200B (en) 2016-08-31 2016-08-31 Method and system for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time

Country Status (1)

Country Link
CN (1) CN106469200B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107274144A (en) * 2017-05-10 2017-10-20 福建海峡中创网络信息技术股份有限公司 The method for automatically determining employer's authenticity
CN109002555B (en) * 2018-08-09 2022-05-17 郑州市景安网络科技股份有限公司 ICP recording method, device, equipment and readable storage medium
CN109388805A (en) * 2018-10-23 2019-02-26 重庆誉存大数据科技有限公司 A kind of industrial and commercial analysis on altered project method extracted based on entity
CN109885534A (en) * 2019-03-04 2019-06-14 深圳市众信电子商务交易保障促进中心 The management method and terminal of business license
CN110502706A (en) * 2019-08-06 2019-11-26 苏州星宇测绘科技有限公司 Company information inspection system and typing terminal based on Beidou
CN112396540A (en) * 2019-08-19 2021-02-23 上海著腾信息科技有限公司 Method, system, equipment and readable storage medium for judging trademark change
CN112416993A (en) * 2019-08-20 2021-02-26 上海著腾信息科技有限公司 Trademark change judgment method, system, equipment and readable storage medium
CN112446673A (en) * 2019-08-31 2021-03-05 上海著腾信息科技有限公司 Trademark change judgment method, system, equipment and readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101228524A (en) * 2005-05-27 2008-07-23 谷歌公司 Using boundaries associated with a map view for business location searching
CN104899243A (en) * 2015-03-31 2015-09-09 北京奇虎科技有限公司 Method and apparatus for detecting accuracy of POI (Point of Interest) data
CN105574208A (en) * 2016-01-29 2016-05-11 北京橙鑫数据科技有限公司 Calling card recommendation method and device

Also Published As

Publication number Publication date
CN106469200A (en) 2017-03-01

Similar Documents

Publication Publication Date Title
CN106469200B (en) Method and system for predicting that an enterprise has changed its address location but has not filed the change with the industrial and commercial authority in time
CN111949834B (en) Site selection method and site selection platform system
Dankovic et al. The scientific basis of uncertainty factors used in setting occupational exposure limits
Shahbaz et al. The impact of supply chain collaboration on operational performance: Empirical evidence from manufacturing of Malaysia
US20090070700A1 (en) Ranking content based on social network connection strengths
JP2007510986A (en) Techniques for analyzing website performance
Wattenberg et al. Assessment of the acute and chronic health hazards of hydraulic fracturing fluids
CN109858919A (en) Determination method and device, online ordering method and the device of abnormal account
JP2017097498A (en) Human resources matching device, human resources matching method, and human resources matching program
Mishra et al. Exploring the awareness regarding e-waste and its health hazards among the informal handlers in Musheerabad area of Hyderabad
CN111639253A (en) Data duplication judging method, device, equipment and storage medium
JP4639615B2 (en) Network path search program and device
Woodruff et al. A science-based agenda for health-protective chemical assessments and decisions: overview and consensus statement
CN106357835A (en) Method and device for determining subordinate region of target IP address
CN110825817B (en) Enterprise suspected association judgment method and system
CN106055591B (en) Weather pushing method and device
Mirabelli et al. Database of occupations and industrial activities that involve the risk of pulmonary tumors
CN106682146A (en) Method and system for retrieving evaluation of scenic spot according to keyword
JP2015053028A (en) Information management system, information management program, information management method, and information management device
JPWO2014064757A1 (en) Corporate information providing apparatus and server program
CN106157214A (en) The method and device of tracking of information
CN107181831B (en) Reverse IP positioning method
JP5902434B2 (en) Route search method for transportation, route search server, and computer program
CN110060080A (en) Sharing method, device and the client device of trade company's evaluation
CN114896629A (en) Network information safety online monitoring and early warning management system based on big data analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 100070, No. 101-8, building 1, 31, zone 188, South Fourth Ring Road, Beijing, Fengtai District

Patentee after: Guoxin Youyi Data Co., Ltd

Address before: 100070 Beijing city Fengtai District South Fourth Ring Road No. 188 (ABP) B headquarters mansion 9 floor

Patentee before: SIC YOUE DATA Co.,Ltd.