CN109740150A - Address resolution method, device, computer equipment and computer readable storage medium - Google Patents

Publication number
CN109740150A
Authority
CN
China
Prior art keywords
address
word
address resolution
semantic slot
participle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811564845.8A
Other languages
Chinese (zh)
Inventor
张贺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Volkswagen China Investment Co Ltd
Mobvoi Innovation Technology Co Ltd
Original Assignee
Chumen Wenwen Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chumen Wenwen Information Technology Co Ltd filed Critical Chumen Wenwen Information Technology Co Ltd
Priority to CN201811564845.8A priority Critical patent/CN109740150A/en
Publication of CN109740150A publication Critical patent/CN109740150A/en
Pending legal-status Critical Current

Landscapes

  • Machine Translation (AREA)

Abstract

The present disclosure provides an address resolution method, comprising: performing word segmentation on an acquired corpus to obtain the words that serve as a segmented corpus; labeling each word with one semantic slot according to an address division scheme; performing feature extraction on each word, including extracting named-entity features through named entity recognition and extracting part-of-speech tag features through part-of-speech tagging; and training on the processed segmented corpus to obtain an address resolution model. The disclosure further provides an address resolution apparatus, a computer device, and a computer-readable storage medium.

Description

Address resolution method, device, computer equipment and computer readable storage medium
Technical Field
The present disclosure relates to an address resolution method, an address resolution apparatus, a computer device, and a computer-readable storage medium.
Background
Task-oriented human-machine dialogue systems have been successfully applied to many types of devices, for example mobile phones (e.g., the Chumen Wenwen mobile app), speakers (e.g., the Chumen Wenwen Tic Home smart speaker), televisions (e.g., Whaley TV), and wearable devices (e.g., the Chumen Wenwen Tic Pods Free smart earphones). Such a system generally includes one or more vertical domains, where a vertical domain indicates the domain to which a natural language text belongs, such as the music domain, the navigation domain, or the weather domain.
In current task-oriented human-machine dialogue systems, vertical domains such as navigation, restaurant, and hotel all perform their queries by calling an API (Application Programming Interface). When the API call fills only a coarse-grained address semantic slot, such as "7th floor, New Zhongguan Building, Haidian District", "intersection of Xinzhongguan Street and Haidian Street", "Pudong District, Shanghai", or "No. 3 Suzhou Street", the API query results are inaccurate. How to make the query results more accurate is therefore a technical problem to be solved. In addition, in the prior art each vertical domain has its own training corpus for training a vertical-domain classification model, which raises both the system maintenance cost and the cost of preparing training data.
Summary
To solve at least one of the above technical problems, the present disclosure provides an address resolution method, an address resolution apparatus, a computer device, and a computer-readable storage medium.
According to one aspect of the disclosure, an address resolution method comprises: performing word segmentation on an acquired corpus to obtain the words that serve as a segmented corpus; labeling each word with one semantic slot according to an address division scheme; performing feature extraction on each word, including extracting named-entity features through named entity recognition and extracting part-of-speech tag features through part-of-speech tagging; and training on the processed segmented corpus to obtain an address resolution model.
According to at least one embodiment of the disclosure, the method further comprises: after performing feature extraction on each word, converting the format of the resulting data so that training can be performed on the processed segmented corpus.
According to at least one embodiment of the disclosure, the method further comprises: performing natural language understanding on natural speech text; when an address-related semantic slot is present in the natural language understanding result, performing word segmentation on the text of that address-related semantic slot; for each word obtained by segmentation, extracting named-entity features through named entity recognition and extracting part-of-speech tag features through part-of-speech tagging; and performing address resolution using the address resolution model.
According to at least one embodiment of the disclosure, when natural language understanding is performed on the natural speech text, the vertical domain to which the natural language text belongs and its semantic slots are obtained; and the method further comprises, after address resolution has been performed using the address resolution model, performing subsequent processing in the corresponding vertical domain according to the address resolution result.
According to at least one embodiment of the disclosure, in the address division scheme, semantic slots are configured for the addresses at each level below the county-level-city or municipal-district address level.
According to at least one embodiment of the disclosure, for addresses below the county-level-city or municipal-district address level, semantic slots are configured for urban addresses for the business district, road, building number, place name, and more detailed address respectively, and for rural addresses for the township, village, and more detailed address respectively.
According to at least one embodiment of the disclosure, words that do not belong to any semantic slot in the address division scheme are configured separately.
According to another aspect of the disclosure, an address resolution apparatus comprises: a word segmentation module, which performs word segmentation on an acquired corpus to obtain the words that serve as a segmented corpus; a labeling module, which labels each word with one semantic slot according to an address division scheme; an extraction module, which performs feature extraction on each word, including extracting named-entity features through named entity recognition and extracting part-of-speech tag features through part-of-speech tagging; and a training module, which trains on the processed segmented corpus to obtain an address resolution model.
According to yet another aspect of the disclosure, a computer device comprises: a memory storing computer-executable instructions; and a processor that executes the computer-executable instructions stored in the memory, causing the processor to perform the above address resolution method.
According to a further embodiment of the disclosure, a computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the address resolution method of any of the above.
Brief Description of the Drawings
The accompanying drawings illustrate exemplary embodiments of the disclosure and, together with the description, serve to explain its principles. They are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification.
Fig. 1 is a schematic flowchart of an address resolution model generation method according to an embodiment of the disclosure.
Fig. 2 is an explanatory view of a word segmentation scheme according to an embodiment of the disclosure.
Fig. 3 is a schematic view of a fine-grained address division scheme for a dialogue system according to an embodiment of the disclosure.
Fig. 4 shows the address hierarchy of a fine-grained address division protocol according to an embodiment of the disclosure.
Fig. 5 is a schematic view of the address hierarchy of a fine-grained address division protocol according to an embodiment of the disclosure.
Fig. 6 is a schematic view of named entity recognition and part-of-speech tagging according to an embodiment of the disclosure.
Fig. 7 is a schematic flowchart of an address resolution model generation method according to an embodiment of the disclosure.
Fig. 8 is a schematic diagram of the data format obtained by format conversion according to an embodiment of the disclosure.
Fig. 9 is a schematic flowchart of an address resolution method according to an embodiment of the disclosure.
Fig. 10 is a schematic flowchart of a reusable address resolution method according to an embodiment of the disclosure.
Fig. 11 is an example diagram of a reusable address resolution method according to an embodiment of the disclosure.
Fig. 12 is a schematic block diagram of an address resolution model generation apparatus according to an embodiment of the disclosure.
Fig. 13 is a schematic block diagram of an address resolution apparatus according to an embodiment of the disclosure.
Fig. 14 is a schematic block diagram of a reusable address resolution apparatus according to an embodiment of the disclosure.
Fig. 15 is a schematic diagram of a computer device according to an embodiment of the disclosure.
Detailed Description
The disclosure is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant content and do not limit the disclosure. It should also be noted that, for ease of description, the drawings show only the parts relevant to the disclosure.
It should be noted that, where no conflict arises, the embodiments of the disclosure and the features of those embodiments may be combined with one another. The disclosure is described in detail below with reference to the drawings and in conjunction with the embodiments.
A dialogue system, such as a task-oriented human-machine dialogue system, may include a speech recognition module, a natural language understanding module, a dialogue management module, a natural language generation module, a speech synthesis module, and so on.
The natural language understanding module may be used to perform semantic parsing on the natural language text output by the speech recognition module, resolving unstructured natural language text into structured knowledge that conforms to a natural language understanding protocol.
The natural language understanding protocol may include three kinds of information: vertical domain, domain intent, and semantic slot.
A dialogue system may include one or more vertical domains. A vertical domain indicates the domain to which a natural language text belongs. For example, the natural language text "play Qi Li Xiang by Jay Chou (Zhou Jielun)" belongs to the music domain, "check tomorrow's weather in Beijing" belongs to the weather domain, and "navigate to Tiananmen" belongs to the navigation domain. Each vertical domain has a corresponding training corpus for training a vertical-domain classification model.
A vertical domain may include one or more domain intents. A domain intent indicates the specific intent of a natural language text within its vertical domain. For example, in the weather domain, "will it rain in Beijing tomorrow" belongs to the intent of asking whether it will rain, "how is the air quality today" belongs to the intent of asking about air quality, and "is it windy in Beijing" belongs to the intent of asking whether it is windy.
A vertical domain includes one or more semantic slots. A semantic slot indicates a specific condition that a natural language text imposes within its vertical domain. For example, the weather domain generally includes two semantic slots, "time" and "place": "will it rain in Beijing tomorrow" sets the "time" condition to "tomorrow" and the "place" condition to "Beijing"; "how is the air quality today" sets the "time" condition to "today"; and "is it windy in Beijing" sets the "place" condition to "Beijing".
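The three kinds of information in the natural language understanding protocol can be pictured as a small structured record. The following is a minimal sketch only: the class name `NluResult`, its field names, and the intent identifier `ask_rain` are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class NluResult:
    """Structured output of natural language understanding (hypothetical type)."""
    domain: str                                           # vertical domain
    intent: str                                           # domain intent
    slots: Dict[str, str] = field(default_factory=dict)   # semantic slot -> value


# "Will it rain in Beijing tomorrow", from the weather-domain example.
# The identifier "ask_rain" is an assumed name for the "asking whether it
# will rain" intent.
result = NluResult(
    domain="weather",
    intent="ask_rain",
    slots={"time": "tomorrow", "place": "Beijing"},
)
```

A text that constrains only one condition, such as "how is the air quality today", would simply carry a single entry in `slots`.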
Many vertical domains include address-related semantic slots, for example the navigation, restaurant, and hotel vertical domains. In the navigation vertical domain, "navigate to [No. 9 Zhongguancun Street, Haidian District]" contains the location semantic slot [No. 9 Zhongguancun Street, Haidian District]. In the restaurant vertical domain, "find a restaurant at [the intersection of Xinzhongguan Street and Haidian Street]" contains the location semantic slot [the intersection of Xinzhongguan Street and Haidian Street]. In the hotel vertical domain, "search for hotels near [7th floor, New Zhongguan Building, Haidian District]" contains the location semantic slot [7th floor, New Zhongguan Building, Haidian District].
According to a first embodiment of the disclosure, an address resolution method is provided which, as shown in Fig. 1, comprises a word segmentation step S11, a labeling step S12, an extraction step S13, and a training step S14.
In step S11, word segmentation is performed on an acquired corpus, such as a raw corpus, to obtain the words that serve as the segmented corpus. The acquired corpus may come from a corpus repository. For example, this step may be preceded by a corpus collection step in which raw corpus information in text format is obtained from the repository, such as "Duofu Street and Huxi Road intersection", "Chaoyang District, Beijing", and "Daheng Technology Building, No. 3 Suzhou Street, Haidian District, Beijing City".
Fig. 2 shows an example of segmentation for the above examples (the words are listed in their original Chinese order). In Fig. 2, the raw corpus "Duofu Street and Huxi Road intersection" is segmented into the four words "Duofu Street", "and", "Huxi Road", and "intersection"; the raw corpus "Chaoyang District, Beijing" is segmented into the two words "Beijing" and "Chaoyang District"; and the raw corpus "Daheng Technology Building, No. 3 Suzhou Street, Haidian District, Beijing City" is segmented into the six words "Beijing City", "Haidian District", "Suzhou Street", "3", "No." ("number"), and "Daheng Technology Building".
In an optional embodiment of the disclosure, the word segmentation tool may be Stanford CoreNLP. Of course, those skilled in the art will understand that other segmentation tools may be chosen; the disclosure places no limitation on this.
In step S12, each word is labeled with one semantic slot according to the address division scheme.
Fig. 3 shows an example of a fine-grained address division scheme suitable for dialogue systems, for instance task-oriented human-machine dialogue systems. Fig. 3 schematically divides addresses by level into 11 types: "country", "province/autonomous region/state (Western countries)", "ordinary city/municipality directly under the central government/special administrative region/Taiwan cities and counties", "county-level city/county seat/municipal district", "business district", "street/national highway/provincial highway", "street building number", "specific name of place/building/institution/shop/residential community", "detailed information such as door and building number, floor, and orientation", "township", and "village". Note that this division is only an example; those skilled in the art may adopt other division schemes as circumstances require. Fig. 3 also shows the semantic slot matched to each divided address type. For the division above, the matched semantic slots are, respectively, "country", "province", "city", "county", "business_district", "street", "street_number", "name", "detail", "town", and "village". Other representations of the semantic slots may of course be used.
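As an illustration, the correspondence between address levels and semantic slots described for Fig. 3 can be written down as a lookup table. This is a sketch only: the slot names are those listed above, while the dictionary keys are shortened English descriptions chosen here for readability, and the constant names are hypothetical.

```python
# Slot names follow the list given for Fig. 3. The keys are abbreviated
# descriptions of each address level and are illustrative only.
ADDRESS_LEVEL_TO_SLOT = {
    "country": "country",
    "province / autonomous region / state": "province",
    "ordinary city / municipality / special administrative region": "city",
    "county-level city / county seat / municipal district": "county",
    "business district": "business_district",
    "street / national highway / provincial highway": "street",
    "street building number": "street_number",
    "specific name of place / building / institution / shop / community": "name",
    "detail: door and building number, floor, orientation": "detail",
    "township": "town",
    "village": "village",
}

# The set of valid slot labels, useful for validating annotations.
ADDRESS_SLOTS = set(ADDRESS_LEVEL_TO_SLOT.values())
```

Keeping the slot inventory in one table makes it straightforward to swap in a different division scheme, which the disclosure explicitly allows.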
The address hierarchy of the fine-grained address division protocol corresponding to the above scheme is shown in Fig. 4, in which the address levels run from high to low, top to bottom. In keeping with the usage habits of dialogue-system users, the address levels below county-level city/county seat/municipal district are split into two protocol branches: "business_district (business district)" - "street (street/national highway/provincial highway)" - "street_number (street building number)" - "name (specific name of place/building/institution/shop/residential community)", and "town (township)" - "village (village)".
In an embodiment of the disclosure, each word may be labeled with one semantic slot according to the above address division protocol. As shown in Fig. 5, the semantic slot of "Duofu Street" is labeled "street". Words that do not belong to any semantic slot in the address division protocol are also labeled: for example, "intersection" in Fig. 5 does not belong to any of the semantic slots of Figs. 3 and 4. Such words are labeled separately, for instance as "other" (the word "No." ("number") in Fig. 5 is likewise labeled "other"). Labeling may be done manually or automatically.
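The labeling rule, including the fallback to "other", can be sketched as follows. The function name `label_words` and the annotation dictionary are hypothetical. The words are English renderings of the Fig. 5 example; the labels for "Duofu Street" and "intersection" come from the description, whereas the label shown for "Huxi Road" is an assumption made for illustration.

```python
ADDRESS_SLOTS = {
    "country", "province", "city", "county", "business_district",
    "street", "street_number", "name", "detail", "town", "village",
}
OTHER = "other"  # label for words outside the address division protocol


def label_words(words, annotations):
    """Pair each word with one semantic slot, defaulting to OTHER.

    `annotations` maps a word to its slot; it stands in for either manual
    or automatic labeling (an assumption of this sketch).
    """
    labelled = []
    for word in words:
        slot = annotations.get(word, OTHER)
        if slot != OTHER and slot not in ADDRESS_SLOTS:
            raise ValueError(f"unknown semantic slot: {slot}")
        labelled.append((word, slot))
    return labelled


words = ["Duofu Street", "and", "Huxi Road", "intersection"]
annotations = {"Duofu Street": "street", "Huxi Road": "street"}  # second entry assumed
labelled = label_words(words, annotations)
```

Every word receives exactly one label, so the output always has the same length as the input word list.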
In step S13, feature extraction is performed on each word, including extracting named-entity features through named entity recognition and extracting part-of-speech tag features through part-of-speech tagging.
According to an optional embodiment of the disclosure, the named entity recognition tool and the part-of-speech tagging tool may be Stanford CoreNLP. For example, when Stanford CoreNLP is used for named entity recognition, LOC (location), NUM (number), O (the word is not part of a named entity), and so on are labels in the Stanford CoreNLP named-entity label system; when Stanford CoreNLP is used for part-of-speech tagging, NR (proper noun), NN (other noun), LC (localizer), CC (coordinating conjunction), OD (ordinal number), and so on are labels in the Stanford CoreNLP part-of-speech label system. As shown in Fig. 6, the named-entity label of address-like words such as "Duofu Street", "Beijing", and "Beijing City" is LOC; the named-entity label of words that are not part of a named entity, such as "and", "intersection", "No.", and "Daheng Technology Building", is O; and the named-entity label of numeric words such as "3" is NUM. The part-of-speech label of proper nouns such as "Duofu Street", "Beijing", and "Beijing City" is NR; the part-of-speech label of coordinating conjunctions such as "and" is CC; the part-of-speech label of other nouns such as "intersection", "No.", and "Daheng Technology Building" is NN; and the part-of-speech label of ordinal words such as "3" is OD.
In step S14, training is performed on the processed segmented corpus to obtain the address resolution model. For example, in the disclosure the open-source tool CRF++ may be used to train on the corpus processed in step S13, thereby generating the address resolution model.
According to an embodiment of the disclosure, as shown in Fig. 7, the address resolution method comprises a word segmentation step S71, a labeling step S72, an extraction step S73, a conversion step S74, and a training step S75.
Compared with the address resolution method described above, this embodiment adds the conversion step S74. The remaining word segmentation step S71, labeling step S72, extraction step S73, and training step S75 may be the same as the word segmentation step S11, labeling step S12, extraction step S13, and training step S14 of the above method, respectively.
In some cases, the format of the corpus produced by the extraction step needs to be converted so that it matches the format required by the training step.
For example, as explained above, the disclosure may use the open-source tool CRF++ to train on the segmented corpus to obtain the address resolution model. In that case, in step S74 the data generated in step S73 may be converted into a format supported by CRF++. Fig. 8 shows the data format obtained by converting the format of Fig. 6. In Fig. 8, the first column is the segmented corpus, the second column is the named-entity label, the third column is the part-of-speech label, and the fourth column is the labeled semantic slot.
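The conversion of step S74 can be sketched as follows, assuming the standard CRF++ training-file layout: one token per line, whitespace-separated columns with the label in the last column, and an empty line between sentences. The function name `to_crfpp` is hypothetical. The rows use English renderings of the example words; the tags for "Huxi Road" are assumed by analogy with "Duofu Street". Because the English renderings contain spaces, which CRF++ would treat as column separators, this sketch replaces them with underscores; the original Chinese words would not need this.

```python
def to_crfpp(sentences):
    """Serialize (word, ner, pos, slot) rows into CRF++ training text.

    Columns follow the Fig. 8 order: word, named-entity label,
    part-of-speech label, semantic slot. Sentences are separated by an
    empty line.
    """
    lines = []
    for sentence in sentences:
        for word, ner, pos, slot in sentence:
            # CRF++ splits columns on whitespace, so a word must not contain spaces.
            lines.append("\t".join([word.replace(" ", "_"), ner, pos, slot]))
        lines.append("")  # empty line marks the end of a sentence
    return "\n".join(lines) + "\n"


sentence = [
    ("Duofu Street", "LOC", "NR", "street"),
    ("and", "O", "CC", "other"),
    ("Huxi Road", "LOC", "NR", "street"),   # tags assumed for illustration
    ("intersection", "O", "NN", "other"),
]
training_text = to_crfpp([sentence])
```

Writing `training_text` to a file yields input that the CRF++ training tool can read directly.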
The format-converted data is then trained in training step S75. For example, once the data has been converted into a format supported by CRF++, CRF++ is used for training.
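CRF++ training also requires a feature template file, which the description does not specify. The following is a minimal sketch of how such a template could be generated, assuming three feature columns (word, named-entity label, part-of-speech label) and a context window of one word on each side; the function name, the window size, and the choice of features are all assumptions. Training would then typically be run from the command line as `crf_learn template_file train_file model_file`.

```python
def build_template(n_feature_columns=3, window=(-1, 0, 1)):
    """Build CRF++ template text with one unigram feature per column and offset.

    In CRF++ template syntax, %x[row,col] refers to the value in column
    `col` of the token at relative position `row`. Column 0 is the word,
    column 1 the named-entity label, column 2 the part-of-speech label.
    The last column of the training file is the output label and is not
    referenced by templates.
    """
    lines = []
    index = 0
    for col in range(n_feature_columns):
        for offset in window:
            lines.append(f"U{index:02d}:%x[{offset},{col}]")
            index += 1
    lines.append("B")  # bigram feature over adjacent output labels
    return "\n".join(lines) + "\n"


template_text = build_template()
```

A wider window or combined features (for example word together with part-of-speech label) may work better in practice; this sketch only shows the mechanism.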
According to a second embodiment of the disclosure, an address resolution method is further provided. As shown in Fig. 9, the method may comprise an understanding step S91, a word segmentation step S92, an extraction step S93, and a resolution step S94.
In step S91, natural language understanding is performed on natural speech text.
It is then determined whether the natural language understanding result involves an address-related semantic slot, such as location. If it does, the subsequent address resolution processing is performed; if not, the subsequent address resolution processing is not performed. For example, if the natural language understanding result is "navigate to [No. 9 Zhongguancun Street, Haidian District]", with the bracketed text filling the location slot, an address-related semantic slot is considered to be involved, whereas if the natural language understanding result is "what time is it now", no address-related semantic slot is involved.
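This decision can be sketched as a simple check on the slots returned by natural language understanding. The representation of the result as a plain dictionary, the function name `address_slot_text`, and the treatment of `location` as the only address-related slot name are assumptions of this sketch.

```python
ADDRESS_SLOT_NAMES = {"location"}  # assumed set of address-related slot names


def address_slot_text(nlu_slots):
    """Return the text of the first address-related slot, or None if absent."""
    for name, value in nlu_slots.items():
        if name in ADDRESS_SLOT_NAMES:
            return value
    return None


# "navigate to [No. 9 Zhongguancun Street, Haidian District]"
navigation_slots = {"location": "No. 9 Zhongguancun Street, Haidian District"}
# "what time is it now" carries no address-related slot
time_query_slots = {}

needs_resolution = address_slot_text(navigation_slots) is not None
skips_resolution = address_slot_text(time_query_slots) is None
```

Only when the function returns text does the flow continue to segmentation and feature extraction.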
In step S92, word segmentation is performed on the text of the address-related semantic slot. The segmentation itself may be carried out in the manner of step S11 above.
In step S93, for each word obtained by segmentation, named-entity features are extracted through named entity recognition and part-of-speech tag features are extracted through part-of-speech tagging. The extraction of both kinds of feature may be carried out in the manner of step S13 above.
In step S94, address resolution is performed using the address resolution model, in which each word of the text of the address-related semantic slot is labeled with one semantic slot according to the address division scheme. The address resolution model may be the model generated according to the first embodiment of the disclosure, that is, an address resolution model generated according to a fine-grained address division protocol suited to task-oriented human-machine dialogue systems.
Finally, the address resolution result is output.
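The description does not define the layout of the address resolution result. One plausible form, sketched here as an assumption, groups the per-word labels produced by the model into a mapping from semantic slot to words and drops words labeled "other". The function name `collect_slots` is hypothetical, and apart from "No." being labeled "other", the labels in the example are assumed for illustration.

```python
def collect_slots(labelled_words, ignore="other"):
    """Group (word, slot) pairs into {slot: [words]}, skipping `ignore`."""
    result = {}
    for word, slot in labelled_words:
        if slot == ignore:
            continue
        result.setdefault(slot, []).append(word)
    return result


# Per-word output of the model for the Suzhou Street example
# (labels other than "other" are assumed here).
labelled_words = [
    ("Beijing City", "city"),
    ("Haidian District", "county"),
    ("Suzhou Street", "street"),
    ("3", "street_number"),
    ("No.", "other"),
    ("Daheng Technology Building", "name"),
]
parsed = collect_slots(labelled_words)
```

A fine-grained result of this kind could then fill the individual address parameters of a downstream API rather than a single coarse-grained location string.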
According to a third embodiment of the disclosure, an address resolution method is further provided. As shown in Fig. 10, the method comprises steps S101, S102, S103, S104, and S105.
In step S101, natural language understanding is performed on natural speech text to obtain the vertical domain to which the natural language text belongs and its semantic slots.
If an address-related semantic slot is present in the natural language understanding result, step S102 is performed, in which word segmentation is performed on the text of the address-related semantic slot. The segmentation itself may be carried out in the manner of step S11 above.
In step S103, for each word obtained by segmentation, named-entity features are extracted through named entity recognition and part-of-speech tag features are extracted through part-of-speech tagging. The extraction of both kinds of feature may be carried out in the manner of step S13 above.
In step S104, address resolution is performed using the address resolution model, in which each word of the text of the address-related semantic slot is labeled with one semantic slot according to the address division scheme. The address resolution model may be the model generated according to the first embodiment of the disclosure, that is, an address resolution model generated according to a fine-grained address division protocol suited to task-oriented human-machine dialogue systems.
In step S105, subsequent processing is performed in the corresponding vertical domain according to the address resolution result.
The third embodiment is described in detail below with reference to Fig. 11 and a specific example.
Referring first to the flow marked by solid arrows: the user says "navigate to No. 9 Zhongguancun Street, Haidian District", and speech recognition identifies the user's natural language text as "navigate to No. 9 Zhongguancun Street, Haidian District". Natural language understanding (NLU) is used to obtain the vertical domain of the natural language text and its address-related semantic slot. Here the vertical domain is [navigation vertical domain] and the address-related semantic slot is the location slot [No. 9 Zhongguancun Street, Haidian District]. The natural language understanding result is input to a general address resolution module, where it passes in turn through the word segmentation, named entity recognition, and part-of-speech tagging described above, and the processed result is input to a general address resolution model. The general address resolution model may be a model generated according to the first embodiment. The general address resolution model outputs a resolution result, which is then passed to the corresponding vertical domain, for example a navigation app.
As shown by the dashed lines in Fig. 11, when the user says "find a restaurant at the intersection of Xinzhongguan Street and Haidian Street", speech recognition identifies the user's natural language text as "find a restaurant at the intersection of Xinzhongguan Street and Haidian Street". NLU determines that the vertical domain of the natural language text is [restaurant vertical domain] and that the address-related semantic slot is the location slot [the intersection of Xinzhongguan Street and Haidian Street]. The natural language understanding result is input to the general address resolution module, where it passes in turn through the word segmentation, named entity recognition, and part-of-speech tagging described above, and the processed result is input to the general address resolution model. The general address resolution model outputs a resolution result, which is then passed to the corresponding vertical domain, for example a restaurant app.
As Fig. 11 makes clear, the navigation vertical domain and the restaurant vertical domain use the same general address resolution module. Fig. 11 is only an example: several vertical domains, or all of them, may use the same general address resolution module. This avoids maintaining a separate address resolution module for each vertical domain and so reduces the maintenance cost of the system.
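The reuse shown in Fig. 11 amounts to routing every vertical domain through one shared resolver before domain-specific processing. The sketch below illustrates only that structure: `resolve_address` is a stand-in that returns its input unchanged rather than running a trained model, and the handler table and all names are hypothetical.

```python
from typing import Callable, Dict


def resolve_address(text: str) -> dict:
    """Stand-in for the general address resolution module.

    A real implementation would segment the text, extract features, and
    apply the trained address resolution model; here the text is wrapped
    unchanged so the sketch stays self-contained.
    """
    return {"raw": text}


# One handler per vertical domain; each consumes the shared resolver's output.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "navigation": lambda parsed: "navigation app <- " + parsed["raw"],
    "restaurant": lambda parsed: "restaurant app <- " + parsed["raw"],
}


def handle(domain: str, address_text: str) -> str:
    """Resolve the address once, then dispatch to the domain's handler."""
    parsed = resolve_address(address_text)
    return HANDLERS[domain](parsed)


nav = handle("navigation", "No. 9 Zhongguancun Street, Haidian District")
food = handle("restaurant", "intersection of Xinzhongguan Street and Haidian Street")
```

Adding a new vertical domain in this arrangement means registering one more handler, with no change to the resolver itself.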
According to a fourth embodiment of the disclosure, an address resolution apparatus is provided. As shown in Fig. 12, the address resolution model generation apparatus 120 may include a word segmentation module 121, a labeling module 122, an extraction module 123, and a training module 124.
The word segmentation module 121 performs word segmentation on an acquired corpus to obtain the words that serve as the segmented corpus. The labeling module 122 labels each word with one semantic slot according to the address division scheme. The extraction module 123 performs feature extraction on each word, including extracting named-entity features through named entity recognition and extracting part-of-speech tag features through part-of-speech tagging. The training module 124 trains on the processed segmented corpus to obtain the address resolution model.
According to an optional embodiment of the disclosure, the apparatus may further include a format conversion module, which converts the format of the data output by the extraction module 123 to meet the format requirements of the training module 124.
The specific operations performed in each of the above modules may be the same as in the method described in the first embodiment. For brevity, they are not repeated here.
According to a fifth embodiment of the disclosure, an address resolution apparatus is provided. As shown in Fig. 13, the address resolution apparatus 130 may include:
an understanding module 131, which performs natural language understanding on natural speech text;
a word segmentation module 132, which performs word segmentation on the address-related semantic slot when an address-related semantic slot is present in the natural language understanding result;
an extraction module 133, which, for each word obtained by segmentation, extracts named-entity features through named entity recognition and extracts part-of-speech tag features through part-of-speech tagging; and
a resolution module 134, which performs address resolution using the address resolution model, in which each word of the text of the address-related semantic slot is labeled with one semantic slot according to the address division scheme.
The specific operations performed in each of the above modules may be the same as in the method described in the second embodiment. For brevity, they are not repeated here.
According to a sixth embodiment of the present disclosure, an address resolution device is provided. As shown in Figure 14, the reusable address resolution device 140 may include:
Understanding module 141, which performs natural language understanding on natural speech text to obtain the vertical domain to which the natural language text belongs and its semantic slots;
Word segmentation module 142, which, when an address-related semantic slot is present in the natural language understanding result, performs word segmentation on the text of that address-related semantic slot;
Extraction module 143, which, for each word obtained by segmentation, extracts named-entity features through named entity recognition and part-of-speech tag features through part-of-speech tagging;
Parsing module 144, which performs address resolution using the address resolution model, wherein the address resolution model labels each word of the text of the address-related semantic slot with a semantic slot according to the address division scheme; and
Processing module 145, which performs subsequent processing in the corresponding vertical domain according to the address resolution result.
The specific operations performed in each of the above modules may be the same as those of the method described in the third embodiment. For brevity, they are not repeated here.
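One way to picture processing module 145 is a small dispatcher that hands the same parsed address to whichever vertical domain the understanding step identified. The sketch below is hypothetical: the domain names, the handler functions and the slot label names are assumptions, and a real handler would call a search or routing service rather than return a string.

```python
from typing import Callable, Dict

ParsedAddress = Dict[str, str]
Handler = Callable[[ParsedAddress], str]


# Hypothetical per-domain handlers sharing one parsed-address format.
def handle_restaurant(addr: ParsedAddress) -> str:
    return f"search restaurants near {addr.get('road', '?')}"


def handle_navigation(addr: ParsedAddress) -> str:
    return f"navigate to {addr.get('road', '')}{addr.get('building_number', '')}"


HANDLERS: Dict[str, Handler] = {
    "restaurant": handle_restaurant,
    "navigation": handle_navigation,
}


def dispatch(domain: str, parsed: ParsedAddress) -> str:
    """Route one shared address resolution result to the matching vertical domain."""
    handler = HANDLERS.get(domain)
    if handler is None:
        raise KeyError(f"no handler registered for domain {domain!r}")
    return handler(parsed)


parsed = {"road": "中关村大街", "building_number": "19号"}
print(dispatch("navigation", parsed))
print(dispatch("restaurant", parsed))
```

Because every handler accepts the same dictionary, supporting a new vertical domain only means registering one more entry in `HANDLERS`; the address parser itself is untouched.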
In summary, the general address division convention of the dialogue system of the present disclosure is compatible with mainstream APIs that take address parameters. The reusable general address parser of the present disclosure, based on CRFs (Conditional Random Fields), avoids maintaining a separate address resolution module in each vertical domain and so reduces system maintenance cost.
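For reference, a linear-chain CRF in its standard textbook form (not a formula given in the disclosure) models the probability of a label sequence y, here the semantic slots, given an observed word sequence x of length T. The feature functions f_k and weights λ_k are generic symbols; in this setting the features would draw on the word, its part-of-speech tag and its named-entity tag.

```latex
p(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \Big),
\qquad
Z(x) = \sum_{y'} \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, x, t) \Big)
```

Training estimates the weights λ_k from the labeled segmented corpus, and resolution picks the label sequence y that maximizes p(y | x).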
In addition, it should be noted that, besides the vertical domains mentioned as examples in the present disclosure, such as restaurants, hotels and navigation, the disclosure is also applicable to other vertical domains that involve place semantic slots, such as map search, cinema search, public transport lookup, airline tickets and train tickets.
The present disclosure also provides a computer device. As shown in Figure 15, the device includes a communication interface 1000, a memory 2000 and a processor 3000. Communication interface 1000 communicates with external devices for data exchange. Memory 2000 stores a computer program that can run on processor 3000. Processor 3000 implements the methods of the above embodiments when executing the computer program. There may be one or more of each of memory 2000 and processor 3000.
Memory 2000 may include high-speed RAM and may further include non-volatile memory, for example at least one magnetic disk storage.
If communication interface 1000, memory 2000 and processor 3000 are implemented independently, they may be connected to one another through a bus and communicate with one another over it. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus and so on. For ease of representation, it is shown as a single thick line in the figure, which does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if communication interface 1000, memory 2000 and processor 3000 are integrated on a single chip, they may communicate with one another through internal interfaces.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present disclosure includes other implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present disclosure belong. The processor performs the methods and processing described above. For example, the method embodiments of the present disclosure may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as a memory. In some embodiments, part or all of the computer software program may be loaded and/or installed via the memory and/or the communication interface. When the computer software program is loaded into the memory and executed by the processor, one or more steps of the methods described above may be performed. Alternatively, in other embodiments, the processor may be configured to perform one of the above methods in any other suitable manner (for example, by means of firmware).
The logic and/or steps represented in a flowchart or otherwise described herein may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus or device and execute them).
For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transport a program for use by, or in connection with, an instruction execution system, apparatus or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or otherwise suitably processing it if necessary, and then stored in a computer memory.
It should be understood that each part of the present disclosure may be implemented in hardware, software or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software that is stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following technologies well known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or part of the steps of the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present disclosure may be integrated in one processing module, each unit may exist physically alone, or two or more units may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
In this specification, a description referring to the terms "one embodiment/mode", "some embodiments/modes", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment/mode or example is included in at least one embodiment/mode or example of the present application. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment/mode or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments/modes or examples. In addition, provided they do not conflict with one another, those skilled in the art may combine the different embodiments/modes or examples described in this specification and the features of those embodiments/modes or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In the description of the present application, "a plurality of" means at least two, for example two or three, unless specifically defined otherwise.
Those skilled in the art will understand that the above embodiments are intended only to illustrate the present disclosure clearly and do not limit its scope. Those skilled in the art may make other variations or modifications on the basis of the above disclosure, and such variations or modifications remain within the scope of the present disclosure.

Claims (10)

1. An address resolution method, characterized by comprising:
Performing word segmentation on an acquired corpus to obtain words as a segmented corpus;
Labeling each word with a semantic slot according to an address division scheme;
Performing feature extraction on each word, including extracting named-entity features through named entity recognition and extracting part-of-speech tag features through part-of-speech tagging; and
Training on the processed segmented corpus to obtain an address resolution model.
2. The method according to claim 1, characterized by further comprising: after performing feature extraction on each word, converting the format of the resulting data so that training can be performed on the processed segmented corpus.
3. The method according to claim 1 or 2, characterized by further comprising:
Performing natural language understanding on natural speech text;
When an address-related semantic slot is present in the natural language understanding result, performing word segmentation on the text of that address-related semantic slot;
For each word obtained by segmentation, extracting named-entity features through named entity recognition and extracting part-of-speech tag features through part-of-speech tagging; and
Performing address resolution using the address resolution model.
4. The method according to claim 3, characterized in that, when natural language understanding is performed on the natural speech text, the vertical domain to which the natural language text belongs and its semantic slots are obtained; and
The method further comprises, after performing address resolution using the address resolution model, performing subsequent processing in the corresponding vertical domain according to the address resolution result.
5. The method according to any one of claims 1 to 4, characterized in that, in the address division scheme, semantic slots are configured for addresses at each level below the level of county-level city or municipal district.
6. The method according to any one of claims 1 to 5, characterized in that, for addresses below the level of county-level city or municipal district, semantic slots are respectively configured for business district, road, building number, place name and more detailed address in the case of urban addresses, and semantic slots are respectively configured for township, village and more detailed address in the case of rural addresses.
7. The method according to claim 6, characterized in that words that do not belong to any semantic slot in the address division scheme are configured separately.
8. An address resolution device, characterized by comprising:
A word segmentation module, which performs word segmentation on an acquired corpus to obtain words as a segmented corpus;
A labeling module, which labels each word with a semantic slot according to an address division scheme;
An extraction module, which performs feature extraction on each word, including extracting named-entity features through named entity recognition and extracting part-of-speech tag features through part-of-speech tagging; and
A training module, which trains on the processed segmented corpus to obtain an address resolution model.
9. A computer device, characterized by comprising:
A memory, which stores computer-executable instructions; and
A processor, which executes the computer-executable instructions stored in the memory, so that the processor performs the address resolution method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that computer-executable instructions are stored in the computer-readable storage medium and, when executed by a processor, implement the address resolution method according to any one of claims 1 to 7.
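As an illustration of the address division scheme recited in claims 5 to 7, the slot inventory could be written down as a small enumeration with a fallback label for words that fit no slot. The identifier names, the string values and the `OTHER` label below are assumptions chosen for this sketch; the claims name the slot categories but not how they are encoded.

```python
from enum import Enum
from typing import Tuple


class AddressSlot(Enum):
    # Urban slots below the county-level city / municipal district level.
    BUSINESS_DISTRICT = "business_district"
    ROAD = "road"
    BUILDING_NUMBER = "building_number"
    PLACE_NAME = "place_name"
    # Shared by urban and rural addresses: the more detailed address.
    DETAIL = "detail"
    # Rural slots.
    TOWNSHIP = "township"
    VILLAGE = "village"
    # Assumed fallback for words outside the division scheme (claim 7).
    OTHER = "other"


URBAN_SLOTS: Tuple[AddressSlot, ...] = (
    AddressSlot.BUSINESS_DISTRICT,
    AddressSlot.ROAD,
    AddressSlot.BUILDING_NUMBER,
    AddressSlot.PLACE_NAME,
    AddressSlot.DETAIL,
)
RURAL_SLOTS: Tuple[AddressSlot, ...] = (
    AddressSlot.TOWNSHIP,
    AddressSlot.VILLAGE,
    AddressSlot.DETAIL,
)


def slot_for(label: str, rural: bool = False) -> AddressSlot:
    """Map a label string to a slot of the urban or rural scheme, else OTHER."""
    allowed = RURAL_SLOTS if rural else URBAN_SLOTS
    for slot in allowed:
        if slot.value == label:
            return slot
    return AddressSlot.OTHER


print(slot_for("road"))
print(slot_for("village"))              # not an urban slot, falls back to OTHER
print(slot_for("village", rural=True))
```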
CN201811564845.8A 2018-12-20 2018-12-20 Address resolution method, device, computer equipment and computer readable storage medium Pending CN109740150A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811564845.8A CN109740150A (en) 2018-12-20 2018-12-20 Address resolution method, device, computer equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN109740150A true CN109740150A (en) 2019-05-10

Family

ID=66360891

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811564845.8A Pending CN109740150A (en) 2018-12-20 2018-12-20 Address resolution method, device, computer equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN109740150A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20211112

Address after: 210034 floor 8, building D11, Hongfeng Science Park, Nanjing Economic and Technological Development Zone, Jiangsu Province

Applicant after: New Technology Co.,Ltd.

Applicant after: VOLKSWAGEN (CHINA) INVESTMENT Co.,Ltd.

Address before: 1001, floor 10, office building a, No. 19, Zhongguancun Street, Haidian District, Beijing 100094

Applicant before: MOBVOI INFORMATION TECHNOLOGY Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20190510