CN105389306A - Latent semantic analysis based intelligent parsing method for application form - Google Patents


Info

Publication number
CN105389306A
Authority
CN
China
Prior art keywords
model
matrix
word
request slip
statement
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510730573.4A
Other languages
Chinese (zh)
Inventor
夏圣峰
詹仁俊
陈宇星
葛清
田学刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinan True Technology Co Ltd
State Grid Corp of China SGCC
State Grid Fujian Electric Power Co Ltd
Fuzhou Power Supply Co of State Grid Fujian Electric Power Co Ltd
Original Assignee
Jinan True Technology Co Ltd
State Grid Corp of China SGCC
State Grid Fujian Electric Power Co Ltd
Fuzhou Power Supply Co of State Grid Fujian Electric Power Co Ltd
Application filed by Jinan True Technology Co Ltd, State Grid Corp of China SGCC, State Grid Fujian Electric Power Co Ltd, Fuzhou Power Supply Co of State Grid Fujian Electric Power Co Ltd
Priority to CN201510730573.4A
Publication of CN105389306A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The present invention discloses an intelligent parsing method for application forms based on latent semantic analysis. Latent semantic analysis is used to filter out irrelevant words in an application form and so reduce the size of the analysis space; this filtering does not depend on the power system and is purely an analysis of natural language. Within the reduced analysis space, maximum fuzzy matching is performed against a professional thesaurus of the power distribution network in order to parse the application form intelligently. With this method, traditional exact matching is no longer necessary, the parsing success rate is increased, and a technical basis is provided for higher-level applications.

Description

An intelligent parsing method for application forms based on latent semantic analysis
Technical field
The present invention relates to a method for entering application forms (request slips), and in particular to an intelligent parsing method for application forms based on latent semantic analysis.
Background
At present, application forms in power distribution networks are almost always entered manually, and the wording varies considerably from one entry to the next. Some intelligent applications, however, require the application form to be parsed intelligently so that the computer can accurately understand the operation object and the operation content it describes. Intelligent parsing of application form text in power distribution networks currently relies mostly on exact word matching in a vector space model, that is, the words entered by the user are matched exactly against the words present in the vector space. Because of polysemy (one word with several meanings) and synonymy (several words with one meaning), this model cannot offer the user retrieval at the semantic level.
Summary of the invention
The object of the present invention is to overcome the shortcomings of the prior art by providing an intelligent parsing method for application forms based on latent semantic analysis that moves away from the traditional exact matching mode, improves the parsing success rate, and lays a technical foundation for higher-level applications.
An intelligent parsing method for application forms based on latent semantic analysis, comprising:
(1) Establishing a basic model of application form content: samples of application forms are collected from historical data and analyzed manually to generate a word feature set and a semantic model set (a set of statement models) for application form content.
(2) Creating a matrix and performing singular value decomposition (SVD): a relation matrix between the word feature set and the semantic model set is generated automatically on a computer, in which each row gives the number of times a word occurs in each statement model and each column gives the words contained in one statement model. SVD is then applied to this matrix. Each row of the left matrix X represents the characteristics of a word, and each column of the right matrix Y represents the characteristics of a statement model. The middle singular value matrix gives the importance of the corresponding left and right singular vectors; the larger the value, the more important they are. The rows of X and the columns of Y express the latent correlation between words and statement models; the closer the values, the stronger the correlation.
(3) Precise parsing: the content of a given application form is first segmented with a word segmentation algorithm and its word features are extracted; the word features are then looked up in the matrix model to obtain the best-matching statement model according to correlation; finally, that statement model is used to perform precise semantic recognition of the application form content.
In summary, the present invention has the following advantages over the prior art:
Semantic parsing of power distribution network application forms currently relies almost entirely on exact matching of words and phrases, and its success rate is low. By adopting latent semantic analysis, the present invention first filters out irrelevant words in the application form and reduces the size of the analysis space; this filtering has nothing to do with the power system itself and is purely an analysis of natural language. Maximum fuzzy matching against the professional thesaurus of the power distribution network is then performed within the reduced analysis space, which yields the intelligent parsing of the application form. The method breaks away from traditional exact matching, improves the parsing success rate, and lays a technical foundation for higher-level applications.
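The fuzzy-matching stage described above can be illustrated with Python's standard `difflib` module. This is a minimal sketch only: the patent does not say which fuzzy-matching algorithm is used, and the English thesaurus entries, the helper names `best_thesaurus_match` and `filter_and_match`, and the 0.6 similarity cutoff are all assumptions made for illustration.

```python
import difflib

# Hypothetical excerpt of a distribution-network thesaurus (illustrative only).
THESAURUS = ["bus", "switch", "line", "transformer",
             "maintenance", "cold standby", "operation"]

def best_thesaurus_match(token, thesaurus=THESAURUS, cutoff=0.6):
    """Return the thesaurus entry most similar to `token`, or None if none reaches the cutoff."""
    matches = difflib.get_close_matches(token.lower(), thesaurus, n=1, cutoff=cutoff)
    return matches[0] if matches else None

def filter_and_match(tokens):
    """Drop tokens with no close thesaurus entry; map the rest to canonical terms."""
    result = []
    for token in tokens:
        match = best_thesaurus_match(token)
        if match is not None:
            result.append(match)
    return result
```

Calling `filter_and_match(["Swich", "xyzzy", "maintenence"])` keeps the two misspelled domain terms in canonical form and discards the unrelated token, which is the behavior exact matching cannot provide.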
Brief description of the drawings
Fig. 1 shows the semantic model set of the present invention.
Embodiments
The present invention is described in more detail below with reference to an embodiment.
Embodiment 1
An intelligent parsing method for application forms based on latent semantic analysis, comprising:
(1) Establishing a basic model of application form content: samples of application forms are collected from historical data and analyzed manually to generate a word feature set and a semantic model set (a set of statement models) for application form content.
(2) Creating a matrix and performing singular value decomposition (SVD): a relation matrix between the word feature set and the semantic model set is generated automatically on a computer, in which each row gives the number of times a word occurs in each statement model and each column gives the words contained in one statement model. SVD is then applied to this matrix. Each row of the left matrix X represents the characteristics of a word, and each column of the right matrix Y represents the characteristics of a statement model. The middle singular value matrix gives the importance of the corresponding left and right singular vectors; the larger the value, the more important they are. The rows of X and the columns of Y express the latent correlation between words and statement models; the closer the values, the stronger the correlation.
(3) Precise parsing: the content of a given application form is first segmented with a word segmentation algorithm and its word features are extracted; the word features are then looked up in the matrix model to obtain the best-matching statement model according to correlation; finally, that statement model is used to perform precise semantic recognition of the application form content.
The steps are applied as follows:
● Obtain historical application form data from the GPMS historical data, forming a sample of the safety measure descriptions found in application forms.
● Analyze the safety measure samples manually: break them into sentences by hand, abstract the words into features to form the word feature set, and match the word set against the actual statements to form the corresponding statement models.
● Automatically generate on a computer the relation matrix U between the word feature set and the semantic model set, in which each row gives the number of times a word occurs in each statement model and each column gives the words contained in one statement model.
● Apply singular value decomposition to the relation matrix U to obtain U = X Σ Y, where X and Y are orthogonal matrices and Σ is a diagonal matrix. Each row of the left matrix X represents the characteristics of a word, and each column of the right matrix Y represents the characteristics of a statement model. The middle diagonal matrix gives the importance of the corresponding left and right singular vectors; the larger the value, the more important they are. The rows of X and the columns of Y express the latent correlation between words and statement models; the closer the values, the stronger the correlation.
● Look up the word features in the matrix model to obtain the best-matching statement model according to correlation, then use that statement model to perform precise semantic recognition of the application form content.
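In matrix form, the decomposition and matching steps above can be written as follows. The dimensions m, n and r, the rank-k truncation, the query vector q and the cosine-based relevance measure are the usual latent semantic analysis formulation and are stated here as assumptions; the patent text only says that closer values indicate stronger relevance.

```latex
\[
\begin{aligned}
U &= X \Sigma Y, \qquad U \in \mathbb{R}^{m \times n},\; X \in \mathbb{R}^{m \times r},\;
     \Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_r),\; Y \in \mathbb{R}^{r \times n},\\
X^{\mathsf T} X &= I_r, \qquad Y Y^{\mathsf T} = I_r, \qquad
     \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0,\\
U_k &= X_k \Sigma_k Y_k \quad (k \le r),\\
\hat{q} &= \Sigma_k^{-1} X_k^{\mathsf T} q, \qquad
     \operatorname{sim}(q, j) = \frac{\hat{q}^{\mathsf T} y_j}{\lVert \hat{q} \rVert \, \lVert y_j \rVert}.
\end{aligned}
\]
```

Here m is the number of word features, n the number of statement models, r the rank of U, X_k and Y_k keep the first k columns of X and the first k rows of Y, q is the word-feature count vector of a new application form, and y_j is column j of Y_k. The statement model with the largest sim(q, j) would be taken as the best match.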
A worked illustration follows:
1. Establish the basic model of application form content
Samples of application form content are collected from historical data:
10kV turns maintenance to sage's stand side 611 switch of 31# of washing the sand
10kV becomes side 612 circuit to 4# and turns maintenance by running
Transformer 2# changes from operation to cold standby
10kV Section I bus PT changes to maintenance
10kV Section II bus changes to maintenance
10kV bus-tie 600 switch changes to cold standby
10kV upwards 5.131.67 side 602, wild goose village circuit turns maintenance
10kV turns maintenance to Shang Bian673 side 602, Pu circuit
Disconnect to the inner 10kV electric power incoming line switch of Ju Long road 6# looped network 604 side user and disconnecting link
Manual analysis of the samples yields the word feature set and the semantic model set of application form content.
Word feature set:
1 Voltage level
2 Substation
3 Distribution transformer
4 Switch
5 Equipment state
6 ……
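A word feature set of this kind can be represented as a small lexicon that maps surface text to feature names. The sketch below is illustrative only: the regular expressions, the feature identifiers and the English example sentence are assumptions, and a real system would work on the original Chinese text with a proper word segmenter, which the patent does not name.

```python
import re

# Hypothetical lexicon: each word feature is recognized by one pattern.
# Only three of the listed features are covered, for brevity.
FEATURE_PATTERNS = [
    ("voltage_level", re.compile(r"\d+(?:\.\d+)?kV", re.IGNORECASE)),
    ("switch", re.compile(r"\b\d{3} switch\b")),
    ("equipment_state", re.compile(r"\b(?:maintenance|cold standby|operation)\b")),
]

def extract_features(text):
    """Return (feature, matched_text) pairs in order of appearance in `text`."""
    found = []
    for feature, pattern in FEATURE_PATTERNS:
        for m in pattern.finditer(text):
            found.append((m.start(), feature, m.group(0)))
    found.sort()  # order by position in the sentence
    return [(feature, value) for _, feature, value in found]
```

Abstracting "10kV" to `voltage_level` and "600 switch" to `switch` is what lets many differently worded forms map onto the same statement model.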
The semantic model set is shown in Fig. 1.
2. Create the matrix and perform singular value decomposition
A relation matrix between the word feature set and the semantic model set is generated automatically on a computer, in which each row gives the number of times a word occurs in each statement model and each column gives the words contained in one statement model.
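Building the relation matrix amounts to counting word features per statement model. The following is a minimal sketch in plain Python; the three statement models and their names are hypothetical stand-ins, since the actual semantic model set is only given in Fig. 1.

```python
from collections import Counter

# Rows of the matrix: word features (identifiers are illustrative).
WORD_FEATURES = ["voltage_level", "substation", "distribution_transformer",
                 "switch", "equipment_state"]

# Columns of the matrix: hypothetical statement models, each written as the
# sequence of word features it contains.
STATEMENT_MODELS = {
    "switch_state_change": ["voltage_level", "substation", "switch", "equipment_state"],
    "transformer_state_change": ["distribution_transformer", "equipment_state", "equipment_state"],
    "bus_state_change": ["voltage_level", "equipment_state"],
}

def build_relation_matrix(word_features, statement_models):
    """Return a matrix whose entry [i][j] counts word feature i in statement model j."""
    counts = [Counter(features) for features in statement_models.values()]
    return [[c[w] for c in counts] for w in word_features]
```

With these inputs the `equipment_state` row is `[1, 2, 1]`, because the transformer model mentions a state twice (from one state to another).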
Singular value decomposition is then applied to this matrix. Each row of the left matrix X represents the characteristics of a word, and each column of the right matrix Y represents the characteristics of a statement model. The middle singular value matrix gives the importance of the corresponding left and right singular vectors; the larger the value, the more important they are. The rows of X and the columns of Y express the latent correlation between words and statement models; the closer the values, the stronger the correlation.
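The decomposition itself can be sketched with NumPy. The 5x3 matrix below is a hypothetical example (rows are word features, columns are statement models), and keeping only the k = 2 largest singular values is an assumption borrowed from standard latent semantic analysis; the patent does not state how many singular values are retained.

```python
import numpy as np

# Hypothetical relation matrix U: 5 word features x 3 statement models.
U = np.array([
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [1, 0, 0],
    [1, 2, 1],
], dtype=float)

# Thin SVD: U = X @ diag(sigma) @ Y, singular values in descending order.
X, sigma, Y = np.linalg.svd(U, full_matrices=False)

def truncate(X, sigma, Y, k):
    """Keep the k largest singular values with the matching columns of X and rows of Y."""
    return X[:, :k], sigma[:k], Y[:k, :]

X_k, sigma_k, Y_k = truncate(X, sigma, Y, k=2)
U_k = X_k @ np.diag(sigma_k) @ Y_k  # rank-2 approximation of U
```

Each row of `X_k` is then the latent representation of a word feature and each column of `Y_k` the latent representation of a statement model.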
3. Precise parsing
The content of a given application form is first segmented with a word segmentation algorithm and its word features are extracted; the word features are then looked up in the matrix model to obtain the best-matching statement model according to correlation; finally, that statement model is used to perform precise semantic recognition of the application form content.
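The lookup of the best-matching statement model can be sketched as a fold-in followed by a cosine comparison, which is the conventional latent semantic analysis approach and is assumed here rather than taken from the patent. The relation matrix, the feature and model names, and the function `best_statement_model` are hypothetical; the input is assumed to be the list of word features already extracted from a form.

```python
import numpy as np

WORD_FEATURES = ["voltage_level", "substation", "distribution_transformer",
                 "switch", "equipment_state"]
MODEL_NAMES = ["switch_state_change", "transformer_state_change", "bus_state_change"]

# Hypothetical relation matrix: rows follow WORD_FEATURES, columns follow MODEL_NAMES.
U = np.array([[1, 0, 1], [1, 0, 0], [0, 1, 0], [1, 0, 0], [1, 2, 1]], dtype=float)

def best_statement_model(features, k=2):
    """Rank statement models by cosine similarity to `features` in a k-dimensional latent space."""
    X, sigma, Y = np.linalg.svd(U, full_matrices=False)
    X_k, sigma_k, Y_k = X[:, :k], sigma[:k], Y[:k, :]
    # Count vector of the new form over the word features.
    q = np.array([features.count(w) for w in WORD_FEATURES], dtype=float)
    # Fold the form into the latent space: q_hat = Sigma_k^-1 X_k^T q.
    q_hat = (X_k.T @ q) / sigma_k
    # Cosine similarity against each statement model (column of Y_k).
    # Assumes `features` contains at least one known word feature (q_hat non-zero).
    sims = (q_hat @ Y_k) / (np.linalg.norm(q_hat) * np.linalg.norm(Y_k, axis=0))
    return MODEL_NAMES[int(np.argmax(sims))], sims
```

Once a statement model has been selected, its slots (for example which token is the switch and which is the target state) can drive the precise recognition step.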
Once the present invention is packaged as a plug-in, it can be applied directly in the main system to provide the intelligent parsing function for application forms.
Parts not described in this embodiment are the same as in the prior art.

Claims (1)

1. An intelligent parsing method for application forms based on latent semantic analysis, characterized in that the method comprises: (1) establishing a basic model of application form content: samples of application forms are collected from historical data and analyzed manually to generate a word feature set and a semantic model set for application form content; (2) creating a matrix and performing singular value decomposition: a relation matrix between the word feature set and the semantic model set is generated automatically on a computer, in which each row gives the number of times a word occurs in each statement model and each column gives the words contained in one statement model; singular value decomposition is then applied to this matrix, each row of the left matrix X representing the characteristics of a word, each column of the right matrix Y representing the characteristics of a statement model, and the middle singular value matrix giving the importance of the corresponding left and right singular vectors, a larger value being more important; the rows of X and the columns of Y express the latent correlation between words and statement models, closer values indicating stronger correlation; (3) first segmenting the content of a given application form with a word segmentation algorithm and extracting its word features, then looking up the word features in the matrix model to obtain the best-matching statement model according to correlation, and then using that statement model to perform precise semantic recognition of the application form content.
CN201510730573.4A 2015-11-02 2015-11-02 Latent semantic analysis based intelligent parsing method for application form Pending CN105389306A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510730573.4A CN105389306A (en) 2015-11-02 2015-11-02 Latent semantic analysis based intelligent parsing method for application form

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510730573.4A CN105389306A (en) 2015-11-02 2015-11-02 Latent semantic analysis based intelligent parsing method for application form

Publications (1)

Publication Number Publication Date
CN105389306A true CN105389306A (en) 2016-03-09

Family

ID=55421603

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510730573.4A Pending CN105389306A (en) 2015-11-02 2015-11-02 Latent semantic analysis based intelligent parsing method for application form

Country Status (1)

Country Link
CN (1) CN105389306A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH02166557A (en) * 1988-12-21 1990-06-27 Agency Of Ind Science & Technol Technical knowledge collector for comprehension of natural language
CN101710333A (en) * 2009-11-26 2010-05-19 西北工业大学 Network text segmenting method based on genetic algorithm
CN101727487A (en) * 2009-12-04 2010-06-09 中国人民解放军信息工程大学 Network criticism oriented viewpoint subject identifying method and system
CN104281567A (en) * 2014-10-13 2015-01-14 安徽华贞信息科技有限公司 Latent semantic analysis method and system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109359950A (en) * 2018-10-31 2019-02-19 国网河南省电力公司濮阳供电公司 A kind of method of power system monitor information overall process control
CN109359950B (en) * 2018-10-31 2021-07-02 国网河南省电力公司濮阳供电公司 Method for overall process control of power grid monitoring information

Similar Documents

Publication Publication Date Title
CN106250934B (en) Defect data classification method and device
CN110378809A (en) A kind of intelligent check method of supervisory control of substation information table information data
CN107153150A (en) A kind of power distribution network over-voltage fault type recognition method and device
CN106505731B (en) A kind of intelligent substation and scheduling are to test case intelligent generating system and a method
CN105930509B (en) Field concept based on statistics and template matching extracts refined method and system automatically
CN106326307A (en) Language interaction method
CN110888973B (en) Method for automatically structuring and carding monitoring information table
CN106372118B (en) Online semantic understanding search system and method towards mass media text data
CN107066541A (en) The processing method and system of customer service question and answer data
CN107153946A (en) Intelligent station is secondary to pacify automatic generation method and the system of arranging
CN110991188A (en) Ticket forming method applied to distribution network scheduling intelligent ticket forming system
CN113360641B (en) Deep learning-based power grid fault handling plan semantic modeling system and method
CN102403718A (en) Generating method for power grid topological relationship based on Arcgis
CN104091243A (en) Intelligent switching operation ticket mechanism designing and achieving method
CN105740218A (en) Post-editing processing method for mechanical translation
CN111832977A (en) Maintenance application automatic ticketing method based on natural language parsing
CN110399463A (en) The Similarity Match Method and device of work ticket
CN111126055A (en) Power grid equipment name matching method and system
CN105389306A (en) Latent semantic analysis based intelligent parsing method for application form
CN103713523A (en) Influencing analysis method of whole intelligent substation model
CN103279824A (en) Modeling method for relay protection setting calculation system
CN104731811A (en) Cluster information evolution analysis method for large-scale dynamic short texts
CN112420042A (en) Control method and device of power system
CN114676698A (en) Equipment fault key information extraction method and system based on knowledge graph
CN112036179B (en) Electric power plan information extraction method based on text classification and semantic frame

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160309