CN104994208B - Method and system for extracting contact information from a mobile terminal - Google Patents

Method and system for extracting contact information from a mobile terminal

Info

Publication number
CN104994208B
Authority
CN
China
Prior art keywords
information
alias
character
original name
address list
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510397401.XA
Other languages
Chinese (zh)
Other versions
CN104994208A (en)
Inventor
周伟达
梅微星
俞凯
曹迪
朱苏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sipic Technology Co Ltd
Original Assignee
Suzhou Speech Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2015-07-08
Filing date
2015-07-08
Publication date
2018-04-06
Application filed by Suzhou Speech Information Technology Co Ltd
Priority to CN201510397401.XA
Publication of CN104994208A
Application granted
Publication of CN104994208B

Landscapes

  • Machine Translation (AREA)
  • Telephone Function (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a method and system for extracting contact information from a mobile terminal. The method includes: for any original name string among the multiple original name strings in the address book, preprocessing the original name string by filtering out its sensitive characters, where the sensitive characters include at least characters that are not digits, letters, or Chinese characters; performing word segmentation on the preprocessed result; parsing the segmentation result to extract alias information and auxiliary information, and post-processing the alias information, where the alias information includes at least Chinese aliases and English aliases and the auxiliary information is modifier information; and, for each address book entry, generating structured information data from the alias information and auxiliary information to build an address book information database. The invention can filter out the useful information in the address book, improves the robustness and accuracy of semantic name parsing, and provides support for intelligent human-machine interaction.

Description

Method and system for extracting contact information from a mobile terminal
Technical field
The present invention relates to the field of communication technology, and in particular to a method and system for extracting contact information from a mobile terminal.
Background
The rapid development of science and technology, especially innovation in fields such as transportation, information, and communication, has greatly changed how people live and communicate, and the social networks formed between people have expanded enormously as a result. As these networks expand, storing contact information effectively and retrieving it quickly has become a problem that can no longer be ignored. Managing contacts purely from memory has long been unreliable, while paper address books are fragile, easy to lose, and hard to update. In recent years, the development of information and communication technology has brought electronic address books based on personal computers and mobile phones, which make up for the unreliability of memory and overcome the shortcomings of paper address books. However, as the pace of life quickens and social networks keep growing, collecting contact information has gradually become a burden, and incomplete contact information affects people's daily work and social activities.
In recent years, with the development of the mobile Internet, smartphones have come to be used in more and more scenarios. The address book, an essential phone function, holds important contact information. However, because of user habits, input errors, and similar problems, the original name information in an address book often contains non-name information. For example, an address book often contains name entries such as "Xiao Zhang Suzhou number" and "Li Si 2". Because such entries are not well-formed names, they hinder name recognition. User habits cause a further problem: for an address book entry such as "Ma Jian ge" ("Brother Ma Jian"), a user may want to search by the form of address "Ma ge" ("Brother Ma"), and such a search often fails to find the intended entry. In addition, an entry such as "Shanghai Jiao Tong University Teacher Ma" contains not only name information such as "Teacher Ma" but also auxiliary information such as "Shanghai Jiao Tong University", which can be very helpful for intelligent human-machine interaction. There is therefore an urgent need for a solution that can effectively extract name-related information from the original name information in an address book.
Summary of the invention
To address the defects of the prior art, the present invention provides a system for extracting contact information from a mobile terminal that can extract the alias and auxiliary information contained in the raw address book information, thereby providing support for intelligent human-machine interaction.
In a first aspect, the present invention provides a method for extracting contact information from a mobile terminal. The mobile terminal includes an address book containing multiple address book entries, and each entry records a contact's original name string and the corresponding telephone number. The method includes:
For any original name string among the multiple original name strings in the address book, preprocessing the original name string by filtering out its sensitive characters, where the sensitive characters include at least characters that are not digits, letters, or Chinese characters;
Performing word segmentation on the preprocessed result of the original name;
Parsing the segmentation result to extract alias information and auxiliary information, and post-processing the alias information, where the alias information includes at least Chinese aliases and English aliases and the auxiliary information is modifier information;
For each address book entry, generating structured information data from the alias information and auxiliary information, and building an address book information database.
Optionally, the alias information includes at least the complete Chinese name, the given-name part, the English name, relationship-based forms of address, and habitual forms of address.
Optionally, the auxiliary information includes at least city information, company information, school information, and job information.
Optionally, post-processing the alias information includes at least screening conflicting alias results.
Optionally, the auxiliary information is two or more characters long.
In a second aspect, the present invention also provides a system for extracting contact information from a mobile terminal. The mobile terminal includes an address book containing multiple address book entries, and each entry records a contact's original name string and the corresponding telephone number. The system includes:
A preprocessing module, configured to, for any original name string among the multiple original name strings in the address book, preprocess the original name string by filtering out its sensitive characters, where the sensitive characters include at least characters that are not digits, letters, or Chinese characters;
A word segmentation module, configured to perform word segmentation on the preprocessed result of the original name;
A parsing module, configured to parse the segmentation result to extract alias information and auxiliary information, and to post-process the alias information, where the alias information includes at least Chinese aliases and English aliases and the auxiliary information is modifier information;
An address book information database generation module, configured to, for each address book entry, generate structured information data from the alias information and auxiliary information and build an address book information database.
Optionally, the alias information includes at least the complete Chinese name, the given-name part, the English name, relationship-based forms of address, and habitual forms of address.
Optionally, the auxiliary information includes at least city information, company information, school information, and job information.
Optionally, post-processing the alias information includes at least screening conflicting alias results.
Optionally, the auxiliary information is two or more characters long.
As the above technical solution shows, the present invention proposes a method and system for extracting contact information from a mobile terminal. By removing the sensitive characters from the original name strings of address book contacts, parsing them to extract alias information and auxiliary information, and building a structured address book information database, the invention can filter out the useful information in the address book, improves the robustness and accuracy of semantic name parsing, makes the auxiliary information in the address book usable, and provides support for intelligent human-machine interaction.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and practiced according to the specification, and so that the above and other objects, features, and advantages of the invention are more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, identical parts are denoted by the same reference numerals. In the drawings:
Fig. 1 is a schematic flowchart of the method for extracting contact information from a mobile terminal provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the system for extracting contact information from a mobile terminal provided by an embodiment of the present invention.
Detailed description
To extract the alias and auxiliary information in the raw address book information, and thereby provide support for intelligent human-machine interaction, embodiments of the present invention provide a method and system for extracting contact information from a mobile terminal.
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
Embodiments of the present invention relate to a parsing system that intelligently analyzes address book information and extracts alias and auxiliary information. Applying this system can assist and support semantic parsing in dialing scenarios on mobile phones, in-vehicle devices, and the like.
Fig. 1 shows a schematic flowchart of the method for extracting contact information from a mobile terminal provided by an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
In this embodiment, the mobile terminal includes an address book containing multiple address book entries, and each entry records a contact's original name string and the corresponding telephone number.
101. For any original name string among the multiple original name strings in the address book, preprocess the original name string by filtering out its sensitive characters, where the sensitive characters include at least characters that are not digits, letters, or Chinese characters;
In this embodiment, an information preprocessor may preprocess all original name strings and filter out the sensitive characters in them.
The sensitive characters include, but are not limited to, characters that are not digits, letters, or Chinese characters.
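Step 101 can be sketched as a single regular-expression filter. This is a minimal illustration rather than the patent's implementation: the function name `preprocess_name` and the exact character ranges (ASCII digits, ASCII letters, and the CJK Unified Ideographs block U+4E00 to U+9FFF) are assumptions, since the text does not specify them.

```python
import re

# Assumed definition of "sensitive character": anything that is not an ASCII
# digit, an ASCII letter, or a CJK Unified Ideograph (U+4E00-U+9FFF).
# The patent does not list exact ranges; these are illustrative.
_SENSITIVE = re.compile(r"[^0-9A-Za-z\u4e00-\u9fff]")


def preprocess_name(raw_name: str) -> str:
    """Return raw_name with every sensitive character removed."""
    return _SENSITIVE.sub("", raw_name)
```

For instance, under these assumptions an entry such as `李四-2 (苏州)!` would be reduced to `李四2苏州` before segmentation.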
102. Perform word segmentation on the preprocessed result of the original name;
In this embodiment, word segmentation may proceed as follows: first cut the preprocessed result of the original name into a number of words, then filter out function words such as particles and conjunctions, yielding a number of tokens that each represent some specific feature.
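The patent does not name a segmentation algorithm. As one possible sketch of step 102, the code below uses dictionary-based forward maximum matching, a common baseline for Chinese word segmentation, followed by function-word filtering. The lexicon, the function-word list, and the name `segment` are all hypothetical and far smaller than anything a real system would need.

```python
import re

# Hypothetical toy lexicon and function-word list, for illustration only.
LEXICON = {"上海交大", "上海", "苏州", "老师", "马剑", "小张"}
FUNCTION_WORDS = {"的", "和", "与", "了"}
MAX_WORD_LEN = max(len(word) for word in LEXICON)


def segment(text: str) -> list:
    """Forward maximum matching; runs of digits or ASCII letters stay whole."""
    tokens = []
    i = 0
    while i < len(text):
        ascii_run = re.match(r"[0-9]+|[A-Za-z]+", text[i:])
        if ascii_run:
            tokens.append(ascii_run.group())
            i += ascii_run.end()
            continue
        # Try the longest dictionary word first; fall back to one character.
        for length in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in LEXICON:
                tokens.append(candidate)
                i += length
                break
    # Second pass described in the text: drop function words.
    return [tok for tok in tokens if tok not in FUNCTION_WORDS]
```

With this toy lexicon, `上海交大的马老师` is cut into `上海交大`, `马`, and `老师`, with the particle `的` filtered out.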
103. Parse the segmentation result to extract alias information and auxiliary information, and post-process the alias information, where the alias information includes at least Chinese aliases and English aliases and the auxiliary information is modifier information;
The alias information includes, but is not limited to, the complete Chinese name, the given-name part, the English name, relationship-based forms of address, and habitual forms of address. For example: "Brother Wang", "happy brothers", and so on.
The auxiliary information includes at least city information, company information, school information, and job information. For example: "Shanghai Jiaoda" (short for Shanghai Jiao Tong University), "Beijing", and so on.
Note that this embodiment does not restrict the order in which the alias information and the auxiliary information are extracted from the segmentation result. The alias information may be extracted first and the auxiliary information afterwards, or both may be extracted in the same parsing pass.
Preferably, post-processing the alias information includes, but is not limited to, screening conflicting alias results.
It will be understood that, in this embodiment, the auxiliary information is two or more characters long; that is, it is never a single character.
In this embodiment, steps 101-103 are performed for every original name string in the address book, so that each resulting group of alias information and auxiliary information corresponds to one entry in the address book.
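The text does not define what counts as a conflicting alias result. One plausible reading, assumed here purely for illustration, is that an alias derived for more than one address book entry is ambiguous and should be dropped. The function name `screen_conflicts` and the integer entry IDs are hypothetical.

```python
from collections import Counter


def screen_conflicts(aliases_by_entry: dict) -> dict:
    """Drop every alias that more than one entry claims.

    This is an assumed interpretation of "conflict"; the patent does not
    spell out the screening rule.
    """
    # Count in how many distinct entries each alias appears.
    owners = Counter(
        alias
        for aliases in aliases_by_entry.values()
        for alias in set(aliases)
    )
    return {
        entry_id: [alias for alias in aliases if owners[alias] == 1]
        for entry_id, aliases in aliases_by_entry.items()
    }
```

A gentler policy, such as keeping a shared alias but flagging it so the user is asked to disambiguate, would fit the same interface.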
104. For each address book entry, generate structured information data from the alias information and auxiliary information, and build an address book information database.
In this embodiment, after analysis the alias information and auxiliary information are decomposed into multiple interrelated components with a clear hierarchical structure among them. Their use and maintenance are managed through a database, which improves the robustness and accuracy of semantic name parsing.
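As a rough sketch of step 104, the structured data could be stored in three related tables, one for entries and one each for their aliases and auxiliary items. The schema, the use of SQLite, and the names `build_contact_db` and `find_by_alias` are assumptions; the patent only says that a database manages the hierarchy. The phone numbers in the usage note are made up.

```python
import sqlite3


def build_contact_db(entries):
    """Build an in-memory database.

    entries: iterable of (raw_name, phone, aliases, auxiliary) tuples, where
    aliases is a list of strings and auxiliary maps value -> category.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE entry (id INTEGER PRIMARY KEY, raw_name TEXT, phone TEXT);
        CREATE TABLE alias (entry_id INTEGER REFERENCES entry(id), alias TEXT);
        CREATE TABLE auxiliary (entry_id INTEGER REFERENCES entry(id),
                                category TEXT, value TEXT);
    """)
    for raw_name, phone, aliases, auxiliary in entries:
        cur = conn.execute(
            "INSERT INTO entry (raw_name, phone) VALUES (?, ?)",
            (raw_name, phone),
        )
        entry_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO alias VALUES (?, ?)",
            [(entry_id, alias) for alias in aliases],
        )
        conn.executemany(
            "INSERT INTO auxiliary VALUES (?, ?, ?)",
            [(entry_id, category, value) for value, category in auxiliary.items()],
        )
    conn.commit()
    return conn


def find_by_alias(conn, alias):
    """Return (raw_name, phone) for every entry that owns the given alias."""
    rows = conn.execute(
        "SELECT e.raw_name, e.phone FROM entry e "
        "JOIN alias a ON a.entry_id = e.id WHERE a.alias = ?",
        (alias,),
    )
    return rows.fetchall()
```

With an entry stored as `("马剑哥", "13800000000", ["马剑", "马剑哥", "马哥"], {})`, a lookup for `马哥` returns that entry even though the raw name never contains that exact string.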
The method provided by this embodiment removes the sensitive characters from the original name strings of address book contacts, parses them to extract alias information and auxiliary information, and builds a structured address book information database. It can filter out the useful information in the address book, improves the robustness and accuracy of semantic name parsing, makes the auxiliary information in the address book usable, and provides support for intelligent human-machine interaction.
Fig. 2 shows a system for extracting contact information from a mobile terminal provided by an embodiment of the present invention. The mobile terminal includes an address book containing multiple address book entries, and each entry records a contact's original name string and the corresponding telephone number. The system includes:
A preprocessing module 21, configured to, for any original name string among the multiple original name strings in the address book, preprocess the original name string by filtering out its sensitive characters, where the sensitive characters include at least characters that are not digits, letters, or Chinese characters;
A word segmentation module 22, configured to perform word segmentation on the preprocessed result of the original name;
A parsing module 23, configured to parse the segmentation result to extract alias information and auxiliary information, and to post-process the alias information, where the alias information includes at least Chinese aliases and English aliases and the auxiliary information is modifier information;
An address book information database generation module 24, configured to, for each address book entry, generate structured information data from the alias information and auxiliary information and build an address book information database.
Optionally, the alias information includes at least the complete Chinese name, the given-name part, the English name, relationship-based forms of address, and habitual forms of address.
Optionally, the auxiliary information includes at least city information, company information, school information, and job information.
Optionally, post-processing the alias information includes at least screening conflicting alias results.
Optionally, the auxiliary information is two or more characters long.
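The four modules can be pictured as pluggable stages wired in the order 21, 22, 23, 24. The sketch below only illustrates that wiring: the class name `ContactExtractionSystem`, the callable signatures, and the `run` method are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ContactExtractionSystem:
    preprocess: Callable  # module 21: raw name -> cleaned string
    segment: Callable     # module 22: cleaned string -> list of tokens
    parse: Callable       # module 23: tokens -> (aliases, auxiliary)
    store: Callable       # module 24: writes one structured record

    def run(self, address_book) -> None:
        """Process every (raw_name, phone) pair of the address book in turn."""
        for raw_name, phone in address_book:
            cleaned = self.preprocess(raw_name)
            tokens = self.segment(cleaned)
            aliases, auxiliary = self.parse(tokens)
            self.store(raw_name, phone, aliases, auxiliary)
```

Keeping each stage behind a plain callable means a different segmenter or storage backend can be swapped in without touching the other modules.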
The system provided by this embodiment removes the sensitive characters from the original name strings of address book contacts, parses them to extract alias information and auxiliary information, and builds a structured address book information database. It can filter out the useful information in the address book, improves the robustness and accuracy of semantic name parsing, makes the auxiliary information in the address book usable, and provides support for intelligent human-machine interaction.
It will be understood that the above system for extracting contact information from a mobile terminal corresponds one-to-one with the above method, so this embodiment does not describe the system in further detail.
Those skilled in the art will appreciate that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to its embodiments. It should be understood that computer program instructions can implement each flow and/or block in the flowcharts and/or block diagrams, as well as combinations of flows and/or blocks. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by that processor produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the application have been described, those skilled in the art can make further changes and modifications to these embodiments once they grasp the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the application.
Obviously, those skilled in the art can make various changes and variations to the application without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the application and their technical equivalents, the application is intended to include them as well.

Claims (10)

1. A method for extracting contact information from a mobile terminal, the mobile terminal including an address book, the address book containing multiple address book entries, each address book entry recording a contact's original name string and the corresponding telephone number, characterized in that the method includes:
For any original name string among the multiple original name strings in the address book, preprocessing the original name string by filtering out the sensitive characters in the original name string, the sensitive characters including at least characters that are not digits, letters, or Chinese characters;
Performing word segmentation on the preprocessed result of the original name string;
Parsing the segmentation result to extract alias information and auxiliary information, and post-processing the alias information, the alias information including at least Chinese aliases and English aliases, and the auxiliary information being modifier information;
For each address book entry, generating structured information data from the alias information and auxiliary information, and building an address book information database.
2. The method according to claim 1, characterized in that the alias information includes at least the complete Chinese name, the given-name part, the English name, relationship-based forms of address, and habitual forms of address.
3. The method according to claim 1, characterized in that the auxiliary information includes at least city information, company information, school information, and job information.
4. The method according to claim 1, characterized in that post-processing the alias information includes at least screening conflicting alias results.
5. The method according to claim 1 or 3, characterized in that the auxiliary information is two or more characters long.
6. A system for extracting contact information from a mobile terminal, the mobile terminal including an address book, the address book containing multiple address book entries, each address book entry recording a contact's original name string and the corresponding telephone number, characterized in that the system includes:
A preprocessing module, configured to, for any original name string among the multiple original name strings in the address book, preprocess the original name string by filtering out the sensitive characters in the original name string, the sensitive characters including at least characters that are not digits, letters, or Chinese characters;
A word segmentation module, configured to perform word segmentation on the preprocessed result of the original name string;
A parsing module, configured to parse the segmentation result to extract alias information and auxiliary information, and to post-process the alias information, the alias information including at least Chinese aliases and English aliases, and the auxiliary information being modifier information;
An address book information database generation module, configured to, for each address book entry, generate structured information data from the alias information and auxiliary information and build an address book information database.
7. The system according to claim 6, characterized in that the alias information includes at least the complete Chinese name, the given-name part, the English name, relationship-based forms of address, and habitual forms of address.
8. The system according to claim 6, characterized in that the auxiliary information includes at least city information, company information, school information, and job information.
9. The system according to claim 6, characterized in that post-processing the alias information includes at least screening conflicting alias results.
10. The system according to claim 6 or 8, characterized in that the auxiliary information is two or more characters long.
CN201510397401.XA 2015-07-08 2015-07-08 Method and system for extracting contact information from a mobile terminal Active CN104994208B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510397401.XA CN104994208B (en) 2015-07-08 2015-07-08 Method and system for extracting contact information from a mobile terminal

Publications (2)

Publication Number Publication Date
CN104994208A CN104994208A (en) 2015-10-21
CN104994208B true CN104994208B (en) 2018-04-06

Family

ID=54305959

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510397401.XA Method and system for extracting contact information from a mobile terminal Active CN104994208B (en) 2015-07-08 2015-07-08

Country Status (1)

Country Link
CN (1) CN104994208B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111814470A (en) * 2020-07-14 2020-10-23 混沌时代(北京)教育科技有限公司 Method and system for extracting name based on internet nickname

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853280A (en) * 2010-05-19 2010-10-06 北京友录在线科技发展有限公司 Method for searching for contacts in hand-held equipment
CN103294776A (en) * 2013-05-13 2013-09-11 浙江大学 Smartphone address book fuzzy search method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4244714B2 (en) * 2003-06-10 2009-03-25 日本電気株式会社 Mobile communication terminal and communication information selection method
CN101834928A (en) * 2010-04-21 2010-09-15 宇龙计算机通信科技(深圳)有限公司 Method, system and mobile terminal for searching contacts

Also Published As

Publication number Publication date
CN104994208A (en) 2015-10-21

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Patentee after: Sipic Technology Co.,Ltd.

Address before: 215123 room 703, building 9, 328 Xinghu street, Suzhou Industrial Park, Jiangsu Province

Patentee before: AI SPEECH Co.,Ltd.

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Method and System for Extracting Contact Information from Mobile Terminals

Effective date of registration: 20230726

Granted publication date: 20180406

Pledgee: CITIC Bank Limited by Share Ltd. Suzhou branch

Pledgor: Sipic Technology Co.,Ltd.

Registration number: Y2023980049433