CN115544979A - Method, device and equipment for extracting administrative address and storage medium - Google Patents

Method, device and equipment for extracting administrative address and storage medium

Info

Publication number
CN115544979A
Authority
CN
China
Prior art keywords
administrative
address
administrative address
sequence
elements
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211210796.4A
Other languages
Chinese (zh)
Inventor
朱营军
霍玲玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Leap New Technology Co ltd
Original Assignee
Shenzhen Leap New Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Leap New Technology Co ltd
Priority to CN202211210796.4A
Publication of CN115544979A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/226 Validation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a method, a device and equipment for extracting an administrative address and a storage medium, wherein the method comprises the following steps: acquiring an administrative address element sequence obtained by matching preset address elements of a current address to be extracted, wherein the administrative address element sequence comprises a plurality of administrative address elements arranged according to a matching sequence, and the administrative address elements comprise administrative address names and administrative address codes; when determining to extract a plurality of or one administrative addresses, according to the administrative address code, performing administrative address element splitting or deleting processing on the administrative address element sequence to obtain a new administrative address element sequence; and determining the administrative four-level address corresponding to the new administrative address element sequence according to a preset address element extraction logic. The method disclosed by the invention can meet the requirements of extracting one or more administrative four-level addresses in different scenes, and improve the accuracy of extracting the administrative four-level addresses so as to ensure the timeliness and the accuracy of logistics.

Description

Method, device and equipment for extracting administrative address and storage medium
Technical Field
The invention relates to the technical field of address data processing, in particular to an administrative address extraction method, device and equipment and a storage medium.
Background
In the logistics industry, after an order is received from a user, delivery is carried out according to the address on the order. With the rapid development of the logistics industry, order volume keeps growing, and the requirements on the timeliness and accuracy of delivery keep rising. However, the addresses filled in by some users are non-standard, and existing administrative address extraction schemes cannot extract correct administrative four-level address information from them. This greatly increases the difficulty of address resolution, raises the operating cost of enterprises, and reduces the timeliness and accuracy of logistics dispatch.
Disclosure of Invention
The invention provides an administrative address extraction method, device and equipment and a storage medium, which are used for solving the technical problem that correct administrative four-level address information cannot be extracted in the prior art.
In order to solve the technical problem, in a first aspect, the present invention provides an administrative address extraction method, including:
acquiring an administrative address element sequence obtained by matching preset address elements of a current address to be extracted, wherein the administrative address element sequence comprises a plurality of administrative address elements arranged according to a matching sequence, and the administrative address elements comprise administrative address names and administrative address codes;
when determining to extract a plurality of or one administrative addresses, according to the administrative address code, performing administrative address element splitting or deleting processing on the administrative address element sequence to obtain a new administrative address element sequence;
and determining the administrative four-level address corresponding to the new administrative address element sequence according to a preset address element extraction logic.
Optionally, when it is determined that a plurality of administrative addresses are extracted, performing administrative address element splitting processing on the sequence of administrative address elements according to the administrative address code, including:
and judging the membership relationship among the administrative address elements, and splitting the administrative address elements which have the membership relationship and are adjacent in arrangement position in the administrative address element sequence into the same administrative address element sequence.
Optionally, when it is determined to extract an administrative address, according to the administrative address code, performing administrative address element deletion processing on the sequence of administrative address elements, including:
and judging the membership relationship among the administrative address elements, and deleting the administrative address elements which do not have the membership relationship with the administrative address element at the first arrangement position in the sequence of the administrative address elements.
Optionally, determining an administrative fourth-level address corresponding to the new administrative address element sequence according to a preset address element extraction logic, including:
according to the administrative address names, screening a plurality of administrative address elements which have character inclusion relations and are adjacent in arrangement positions in the new administrative address element sequence, and reserving the administrative address elements with the most characters of the administrative address names;
and determining a corresponding administrative fourth-level address according to the new administrative address element sequence after screening processing.
Optionally, determining an administrative fourth-level address corresponding to the new administrative address element sequence according to a preset address element extraction logic, including:
according to the administrative address level determined by the administrative address code, deleting from the new administrative address element sequence any administrative address element whose administrative address level is higher than that of the administrative address element arranged immediately before it;
and determining a corresponding administrative fourth-level address according to the deleted new administrative address element sequence.
Optionally, determining an administrative fourth-level address corresponding to the new administrative address element sequence according to a preset address element extraction logic, including:
according to the administrative address names, merging administrative address elements which are identical in administrative address name and adjacent in arrangement position in the new administrative address element sequence;
and determining a corresponding administrative fourth-level address according to the new administrative address element sequence after the merging processing.
Optionally, before determining the corresponding administrative four-level address, the method further includes:
the administrative address element further comprises an administrative address level;
and reserving the administrative address code of the administrative address element with the lowest administrative address level in the processed new administrative address element sequence, and supplementing, according to that administrative address code, the higher-level administrative address elements at the earlier arrangement positions.
In a second aspect, the present invention provides an administrative address extraction apparatus, including a sequence acquisition module, a sequence update module, and an address determination module;
the sequence acquisition module is used for acquiring an administrative address element sequence obtained by matching preset address elements on a current address to be extracted, wherein the administrative address element sequence comprises a plurality of administrative address elements arranged according to a matching sequence, and the administrative address elements comprise administrative address names and administrative address codes;
the sequence updating module is used for splitting or deleting the administrative address elements of the administrative address element sequence according to the administrative address code when determining to extract a plurality of or one administrative address, so as to obtain a new administrative address element sequence;
and the address determination module is used for determining the administrative four-level address corresponding to the new administrative address element sequence according to preset address element extraction logic.
In a third aspect, the present invention provides an administrative address extraction device, including a memory and a processor, wherein:
the memory is used for storing a computer program;
the processor is used for reading the program in the memory and executing the steps of the administrative address extraction method provided by the first aspect.
In a fourth aspect, the present invention provides a computer-readable storage medium, on which a readable computer program is stored, which when executed by a processor, implements the steps of the administrative address extraction method as provided in the first aspect above.
Compared with the prior art, the administrative address extraction method, the administrative address extraction device, the administrative address extraction equipment and the administrative address extraction storage medium have the following beneficial effects:
the requirements for extracting one or more administrative four-level addresses under different scenes can be met, the extraction accuracy of the administrative four-level addresses is improved, and the timeliness and the accuracy of logistics are guaranteed; by reserving the administrative address elements with the most characters of the administrative address name, the accuracy of extracting the administrative four-level address can be improved; by deleting the administrative address elements with higher administrative address levels than the administrative address elements arranged at the front in the sequence of the administrative address elements, the administrative address elements in the administrative four-level address are sequenced according to the administrative address levels, and the accuracy of logistics is ensured; by combining the administrative address elements which have the same administrative address name and are arranged at adjacent positions in the sequence of the administrative address elements, the repetition of the administrative address name in the administrative four-level address can be avoided.
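The three clean-up operations named above (screening contained names, deleting out-of-order levels, merging identical names) can be sketched as follows. This is only an illustrative sketch: the `(name, code)` tuple layout, the function names, and the rule that the level is read from the lowest non-zero segment of a 12-digit code (2 + 2 + 2 + 3 digits for province, city, district/county, town/street) are assumptions, not details stated in the text.

```python
from typing import List, Tuple

Element = Tuple[str, str]  # (administrative address name, administrative address code)

# Assumed digit ranges for province, city, district/county, town/street.
BOUNDS = ((0, 2), (2, 4), (4, 6), (6, 9))


def level_of(code: str) -> int:
    """Return 0 (province) to 3 (town/street) from the lowest non-zero segment."""
    for level in range(len(BOUNDS) - 1, -1, -1):
        start, end = BOUNDS[level]
        if code[start:end].strip("0"):
            return level
    raise ValueError(f"not an administrative address code: {code!r}")


def screen_contained(sequence: List[Element]) -> List[Element]:
    """For adjacent elements where one name contains the other, keep the longer name."""
    result: List[Element] = []
    for name, code in sequence:
        prev = result[-1][0] if result else None
        if prev is not None and name != prev and (name in prev or prev in name):
            if len(name) > len(prev):
                result[-1] = (name, code)
        else:
            result.append((name, code))
    return result


def drop_out_of_order(sequence: List[Element]) -> List[Element]:
    """Delete an element whose level is higher than that of the element kept before it."""
    result: List[Element] = []
    for name, code in sequence:
        if result and level_of(code) < level_of(result[-1][1]):
            continue
        result.append((name, code))
    return result


def merge_identical(sequence: List[Element]) -> List[Element]:
    """Merge adjacent elements that carry exactly the same name."""
    result: List[Element] = []
    for element in sequence:
        if not result or result[-1][0] != element[0]:
            result.append(element)
    return result
```

Comparing each element with the previously kept element, rather than with its original neighbour, is one possible reading of "arranged at the previous position".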
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only a part of the embodiments of the present invention, but not all embodiments, and other drawings obtained from these drawings will belong to the protection scope of the present application without creative efforts for those skilled in the art.
Fig. 1 is a schematic flowchart of an administrative address extraction method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an administrative address extraction apparatus according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an administrative address extraction device according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
In order to make the description of the present disclosure more thorough and complete, the following description is given for illustrative purposes with respect to the embodiments and examples of the present invention; it is not intended to be the only form in which the embodiments of the invention may be practiced or utilized. The embodiments are intended to cover the features of the various embodiments as well as the method steps and sequences for constructing and operating the embodiments. However, other embodiments may be utilized to achieve the same or equivalent functions and step sequences. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein.
Example 1
As shown in fig. 1, a schematic flow chart of an administrative address extraction method according to an embodiment of the present invention includes the following steps:
step S101, acquiring an administrative address element sequence obtained by matching preset address elements of a current address to be extracted, wherein the administrative address element sequence comprises a plurality of administrative address elements arranged according to a matching sequence, and the administrative address elements comprise administrative address names and administrative address codes;
the address to be extracted is an address input by a user, and for the specific implementation of the address to be extracted, such as format, integrity, and number of included addresses, the embodiment of the present invention is not limited at all.
The administrative address name and the administrative address code are the administrative division name and the administrative division code that conform to the national standard. For example, the administrative address names may be "Guangdong Province" and "Bao'an District"; the administrative address code corresponding to the administrative address name "Bao'an District" is "440306000000".
The administrative address element includes an administrative address name and an administrative address code. The specific implementation of the administrative address element may be set according to the implementation scenario and is not limited in this embodiment of the present invention. For example, the administrative address element may take the form "(administrative address name 1, administrative address code 1)", such as "(Bao'an District, 440306000000)", or the form "administrative address name 1 administrative address code 1", such as "Bao'an District 440306000000".
The specific embodiment of the sequence of the administrative address elements may be specifically configured according to the specific embodiment of the administrative address elements, for example, the sequence of the administrative address elements may be in the form of "(administrative address name 1, administrative address code 1), (administrative address name 2, administrative address code 2)", where "(administrative address name 1, administrative address code 1)" is an administrative address element.
It should be noted that, besides the administrative address elements related to the administrative four-level address, the administrative address element sequence may also include other address names unrelated to the administrative four-level address, but these other address names do not have corresponding address codes. For example, the sequence of the administrative address elements may be "(administrative address name 1, administrative address code 1), (administrative address name 2, administrative address code 2), (administrative address name 3, administrative address code 3), (administrative address name 4, administrative address code 4), (other address name)".
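One way to picture the element and sequence forms described above is the minimal sketch below. The class name `AddressElement`, its field names, and the sample entry "Building 5" are hypothetical and chosen only for illustration; the codes for Guangdong Province and Shenzhen City are the standard division codes padded to twelve digits in the same way as the Bao'an District code given in the text.

```python
from typing import List, NamedTuple, Optional


class AddressElement(NamedTuple):
    name: str                   # administrative address name, or another matched address name
    code: Optional[str] = None  # administrative address code; None for non-administrative names


# Elements are kept in matching order. The last entry has no code because it
# is an "other address name" unrelated to the administrative four-level address.
sequence: List[AddressElement] = [
    AddressElement("Guangdong Province", "440000000000"),
    AddressElement("Shenzhen City", "440300000000"),
    AddressElement("Bao'an District", "440306000000"),
    AddressElement("Building 5"),
]

# Only elements that carry a code take part in the later code-based processing.
administrative_only = [element for element in sequence if element.code is not None]
```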
Step S102, when determining to extract a plurality of or one administrative address, according to the administrative address code, performing administrative address element splitting or deleting processing on the administrative address element sequence to obtain a new administrative address element sequence;
the embodiment of the present invention does not limit the specific implementation manner for determining to extract multiple or one administrative addresses, and any implementation manner that can determine to extract multiple or one administrative addresses may be applied to the embodiment of the present invention, for example, it may be determined to extract multiple or one administrative addresses by responding to an instruction, or by text recognition whether multiple administrative addresses exist, or according to a preset requirement.
The administrative address elements can form one or more administrative addresses.
Embodiment 1: when it is determined that a plurality of administrative addresses are to be extracted, administrative address element splitting is performed on the administrative address element sequence according to the administrative address codes to obtain one or more new administrative address element sequences. For example, when the plurality of administrative address elements include "Guangdong Province, Bao'an District", splitting the sequence yields a single new administrative address element sequence "Guangdong Province, Bao'an District". When the administrative address elements include "Guangdong Province, Bao'an District, Beijing City, Chaoyang District", splitting the sequence yields two new administrative address element sequences: "Guangdong Province, Bao'an District" and "Beijing City, Chaoyang District".
The determining to extract multiple administrative addresses refers to processing the sequence of the administrative address elements for the purpose of extracting multiple administrative addresses, and it cannot be guaranteed that multiple new sequences of the administrative address elements can be obtained.
In the description of the above embodiments, the administrative address code of all the administrative address elements is omitted, and only the administrative address element is replaced by the administrative address name, and the same alternative may appear in the following description, which is only an abbreviation for convenience of description and does not represent the limitation of the administrative address element, and will not be described again.
It should be noted that the above-mentioned splitting of the administrative address elements is to split the administrative address elements from the original sequence of the administrative address elements into a new sequence of the administrative address elements, rather than splitting the internal components of the administrative address elements.
Embodiment 2: when it is determined that one administrative address is to be extracted, administrative address element deletion is performed on the administrative address element sequence according to the administrative address code to obtain a new administrative address element sequence. For example, when the administrative address elements include "Guangdong Province, Bao'an District, Beijing City, Chaoyang District", the elements "Guangdong Province, Bao'an District" may be deleted to obtain the new administrative address element sequence "Beijing City, Chaoyang District"; alternatively, the elements "Beijing City, Chaoyang District" may be deleted to obtain the new administrative address element sequence "Guangdong Province, Bao'an District".
Step S103, according to preset address element extraction logic, determining an administrative four-level address corresponding to the new administrative address element sequence.
The embodiment of the present invention does not limit the specific implementation of the above address element extraction logic, and any implementation of the address element extraction logic that can extract the administrative four-level address may be applied to the embodiment of the present invention, for example, a deletion processing logic, a screening processing logic, a merging processing logic, an adjustment logic for a ranking position of the administrative address elements in the sequence of the administrative address elements, and the like.
As an optional implementation manner, determining the administrative fourth-level address corresponding to the new sequence of administrative address elements includes:
outputting the administrative address names of the administrative address elements in the new administrative address element sequence according to the sorting positions, and respectively corresponding to the administrative division levels;
and reserving the administrative address code of the administrative address element at the last sequencing position in the new administrative address element sequence.
The administrative four-level address conforms to the national administrative division standard and comprises the administrative address name of at least one of the four administrative division levels (province, city, district/county, town/township/street), together with information such as the administrative address code corresponding to the administrative address name with the lowest administrative division level. For example, an administrative four-level address may be {'province': 'Guangdong Province', 'city': 'Shenzhen City', 'region': "Bao'an District", 'opcode': '440306000000'}, where opcode is the administrative address code.
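The optional output step described above (names emitted by level, code kept from the last element) might look like the following sketch. It assumes that each element is a `(name, code)` tuple and that the level can be read from a 12-digit code split into 2 + 2 + 2 + 3 digits for province, city, district/county and town/street. The key `town` and the helper names are hypothetical.

```python
from typing import Dict, List, Tuple

Element = Tuple[str, str]  # (administrative address name, administrative address code)

# Assumed digit ranges and output keys for the four levels.
BOUNDS = ((0, 2), (2, 4), (4, 6), (6, 9))
LEVEL_KEYS = ("province", "city", "region", "town")


def level_of(code: str) -> int:
    """Return 0 (province) to 3 (town/street) from the lowest non-zero segment."""
    for level in range(len(BOUNDS) - 1, -1, -1):
        start, end = BOUNDS[level]
        if code[start:end].strip("0"):
            return level
    raise ValueError(f"not an administrative address code: {code!r}")


def to_four_level_address(sequence: List[Element]) -> Dict[str, str]:
    """Map each name to its level key and keep the code of the last element."""
    result: Dict[str, str] = {}
    for name, code in sequence:
        result[LEVEL_KEYS[level_of(code)]] = name
    if sequence:
        result["opcode"] = sequence[-1][1]
    return result


address = to_four_level_address([
    ("Guangdong Province", "440000000000"),
    ("Shenzhen City", "440300000000"),
    ("Bao'an District", "440306000000"),
])
```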
It should be noted that when determining to extract a plurality of administrative addresses, the number of the finally obtained administrative four-level addresses is consistent with the number of the obtained new administrative address element sequences; and finally obtaining an administrative four-level address when determining to extract an administrative address.
The embodiment of the invention can meet the requirements of extracting one or more administrative four-level addresses in different scenes, and improve the accuracy of extracting the administrative four-level addresses so as to ensure the timeliness and the accuracy of logistics.
It should be noted that, in the embodiment of the present invention, a specific implementation manner of performing preset address element matching on the current address to be extracted to obtain the administrative address element sequence is not limited, and any implementation manner that can obtain the administrative address element sequence may be applied to the embodiment of the present invention, for example, dictionary tree matching, administrative address database matching, and the like.
As an optional implementation manner, the performing preset address element matching on the current address to be extracted to obtain an administrative address element sequence includes:
constructing an address dictionary to enable the address dictionary to contain administrative four-level address information; the administrative four-level address information comprises an administrative address code, an administrative address name and address longitude and latitude information;
in some implementation scenarios, information such as an administrative address name, an administrative address attribution, an administrative address level, and/or an administrative address code of an administrative level address is changed. By constructing the address dictionary of the administrative four-level address, the address dictionary can be directly changed under the implementation scene that the information of the administrative four-level address changes, so that the information of the administrative four-level address can be conveniently and quickly maintained, and errors in extraction of the administrative four-level address caused by the fact that the update of the administrative four-level address is not responded in time are avoided.
According to the address dictionary, an address information object (addressInfo) is constructed, so that the address information object comprises the administrative four-level address information, and an AC automata dictionary tree is constructed;
when the AC automaton dictionary tree is constructed, the abbreviated form and the full form of each administrative address name are identified; the abbreviated administrative address name and the full administrative address name are each inserted as words into the AC automaton dictionary tree, and the words corresponding to the abbreviated name and the full name of the same administrative address point to the same address information object.
For example, the full name of Beijing is "Beijing City" and its abbreviation is "Beijing"; the full name of Xinjiang is "Xinjiang Uygur Autonomous Region" and its abbreviation is "Xinjiang". Registering both forms can improve the accuracy of administrative four-level address matching.
The administrative four-level address information in the AC automaton dictionary tree is used as the preset address elements, and preset address element matching is performed on the current address to be extracted. If the text input into the AC automaton dictionary tree matches a word, the matched word and the attributes of the corresponding address information object are returned, where the attributes comprise the administrative address code, the administrative address name, the address longitude, the address latitude and the administrative address level. The administrative address name and the administrative address code are then parsed from the returned words and address information objects and returned as the administrative address element sequence.
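The matching step can be illustrated with a compact Aho-Corasick automaton in which the abbreviated and full names of one division point to the same information object. This is a sketch under assumptions: the class and attribute names are hypothetical, the information object carries only name and code (longitude, latitude and level are omitted), and the three dictionary entries are sample data.

```python
from collections import deque
from typing import Dict, List, NamedTuple, Tuple


class AddressInfo(NamedTuple):
    name: str  # full administrative address name
    code: str  # administrative address code


class ACAutomaton:
    def __init__(self) -> None:
        self.goto: List[Dict[str, int]] = [{}]  # trie transitions per node
        self.fail: List[int] = [0]              # failure links
        self.out: List[List[Tuple[str, AddressInfo]]] = [[]]  # words ending at node

    def add(self, word: str, info: AddressInfo) -> None:
        node = 0
        for ch in word:
            if ch not in self.goto[node]:
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
                self.goto[node][ch] = len(self.goto) - 1
            node = self.goto[node][ch]
        self.out[node].append((word, info))

    def build(self) -> None:
        """Compute failure links breadth-first; depth-1 nodes fail to the root."""
        queue = deque(self.goto[0].values())
        while queue:
            node = queue.popleft()
            for ch, child in self.goto[node].items():
                f = self.fail[node]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[child] = self.goto[f].get(ch, 0)
                self.out[child] = self.out[child] + self.out[self.fail[child]]
                queue.append(child)

    def search(self, text: str) -> List[Tuple[int, str, AddressInfo]]:
        """Return (start index, matched word, info) for every dictionary hit."""
        node, hits = 0, []
        for i, ch in enumerate(text):
            while node and ch not in self.goto[node]:
                node = self.fail[node]
            node = self.goto[node].get(ch, 0)
            for word, info in self.out[node]:
                hits.append((i - len(word) + 1, word, info))
        return hits


beijing = AddressInfo("北京市", "110000000000")   # Beijing City
chaoyang = AddressInfo("朝阳区", "110105000000")  # Chaoyang District
automaton = ACAutomaton()
automaton.add("北京市", beijing)  # full name
automaton.add("北京", beijing)    # abbreviation, same information object
automaton.add("朝阳区", chaoyang)
automaton.build()
hits = automaton.search("北京朝阳区")
```

When an input contains the full name, both the abbreviation and the full name are reported at the same start position, so a later screening step still has to choose between them.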
In some embodiments, when determining to extract a plurality of administrative addresses, performing an administrative address element splitting process on the sequence of administrative address elements according to the administrative address code includes:
and judging the membership relationship among the administrative address elements, and splitting the administrative address elements which have the membership relationship and are arranged adjacently into the same administrative address element sequence in the administrative address element sequence.
In one embodiment, when determining to extract a plurality of administrative addresses, determining a membership relationship between the plurality of administrative address elements, that is, determining a membership relationship between administrative address names in the plurality of administrative address elements. If the administrative address elements have membership and the sequencing positions of the administrative address elements are adjacent, splitting the administrative address elements into the same administrative address element sequence; if the administrative address elements comprise a plurality of administrative address elements which are not subordinate to each other, splitting the administrative address elements which are not subordinate to each other into different administrative address element sequences respectively.
A membership relationship exists between a plurality of administrative address elements if they are related by affiliation in the administrative division, that is, if a first administrative address element belongs to, or contains, a second administrative address element (for example, Shenzhen City belongs to Guangdong Province, and Guangdong Province contains Shenzhen City), or if the administrative address elements are the same (for example, two administrative address elements are both Shenzhen City). This definition is not repeated below.
Specifically, the digits at different positions of the administrative address code represent administrative divisions at different administrative division levels. If, for a plurality of administrative address elements, the digits at the positions representing a shared administrative division level are identical, the administrative address elements are determined to have a membership relationship. This is not repeated below.
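The code-based membership test can be sketched as a prefix comparison. The sketch assumes a 12-digit code split into 2 + 2 + 2 + 3 digits for province, city, district/county and town/street, with all-zero lower segments meaning "not specified"; the function names are hypothetical.

```python
# Assumed digit ranges for province, city, district/county, town/street.
BOUNDS = ((0, 2), (2, 4), (4, 6), (6, 9))


def significant_prefix(code: str) -> str:
    """Return the leading digits up to the lowest non-zero level segment."""
    for start, end in reversed(BOUNDS):
        if code[start:end].strip("0"):
            return code[:end]
    return ""


def has_membership(code_a: str, code_b: str) -> bool:
    """True if the divisions are the same or one belongs to the other."""
    a = significant_prefix(code_a)
    b = significant_prefix(code_b)
    return a.startswith(b) or b.startswith(a)


guangdong = "440000000000"
baoan = "440306000000"
beijing = "110000000000"
```

With these sample codes, `has_membership(guangdong, baoan)` holds because the province digits "44" agree, while `has_membership(beijing, baoan)` does not.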
As an implementation manner, when it is determined that a plurality of administrative addresses are to be extracted and the plurality of administrative address elements include "Guangdong Province, Bao'an District, Beijing City, Chaoyang District", the administrative address element sequence is split to obtain a first new administrative address element sequence including "Guangdong Province, Bao'an District" and a second new administrative address element sequence including "Beijing City, Chaoyang District".
As another embodiment, when it is determined that a plurality of administrative addresses are to be extracted, splitting the administrative address element sequence may yield three new sequences: a third new administrative address element sequence including "Beijing City", a fourth new administrative address element sequence including "Guangdong Province, Shenzhen City, Bao'an District, Hangcheng Street", and a fifth new administrative address element sequence including "Beijing City, Chaoyang District".
According to this embodiment of the invention, administrative address elements that have a membership relationship and are adjacent in the administrative address element sequence are split into the same administrative address element sequence, so that a plurality of administrative addresses can be extracted accurately.
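The splitting step can be sketched as follows. This is an illustrative reading of the embodiment under stated assumptions: elements are `(name, code)` tuples, codes are 12 digits with 2/4/6-digit province/city/district prefixes, and a new element joins the current group only if it has a membership relationship with every element already in that group. The function names are hypothetical.

```python
from typing import List, Tuple

Element = Tuple[str, str]  # (administrative address name, administrative address code)

PREFIX_LEN = {0: 2, 1: 4, 2: 6, 3: 12}  # assumed meaningful digits per level


def level(code: str) -> int:
    """Level implied by trailing zeros: 0 province, 1 city, 2 district, 3 town."""
    for lvl, zeros in enumerate((10, 8, 6)):
        if code.endswith("0" * zeros):
            return lvl
    return 3


def has_membership(code_a: str, code_b: str) -> bool:
    n = PREFIX_LEN[min(level(code_a), level(code_b))]
    return code_a[:n] == code_b[:n]


def split_sequence(seq: List[Element]) -> List[List[Element]]:
    """Split a sequence into runs of adjacent, mutually related elements."""
    groups: List[List[Element]] = []
    for elem in seq:
        if groups and all(has_membership(elem[1], other[1]) for other in groups[-1]):
            groups[-1].append(elem)   # related to the current run: extend it
        else:
            groups.append([elem])     # unrelated: start a new sequence
    return groups


sequence = [
    ("Guangdong Province", "440000000000"),
    ("Baoan District", "440306000000"),
    ("Beijing City", "110000000000"),
    ("Chaoyang District", "110105000000"),
]
for group in split_sequence(sequence):
    print([name for name, _ in group])
```

Because grouping only looks at the run currently being built, two related runs separated by an unrelated element remain separate sequences, which matches the adjacency requirement.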
In some embodiments, when it is determined that one administrative address is to be extracted, performing administrative address element deletion processing on the administrative address element sequence according to the administrative address code includes:
determining the membership relationship between the administrative address elements, and deleting from the administrative address element sequence the administrative address elements that have no membership relationship with the administrative address element arranged first in the sequence.
In one embodiment, when it is determined that one administrative address is to be extracted, the membership relationship between the administrative address elements is the membership relationship between the administrative address names in those elements. If the administrative address element sequence contains administrative address elements that have no membership relationship with the administrative address element at the first position, those elements are deleted.
As an embodiment, the administrative address element sequence includes "Guangdong Province, Beijing City, Baoan District". When it is determined that one administrative address is to be extracted, deletion processing is performed on the sequence: the first administrative address name is Guangdong Province, the second is Beijing City, and Beijing City does not belong to Guangdong Province, so the Beijing City element is deleted, giving a new administrative address element sequence that includes "Guangdong Province, Baoan District".
In this embodiment of the invention, the administrative address elements that have no membership relationship with the administrative address element at the first position of the sequence are deleted, so that interfering administrative address elements are excluded from the administrative address element sequence.
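The deletion step for the single-address case can be sketched as follows. The sketch assumes `(name, code)` tuples and 12-digit codes with 2/4/6-digit province/city/district prefixes; the helper names are hypothetical and not taken from the patent.

```python
from typing import List, Tuple

Element = Tuple[str, str]  # (administrative address name, administrative address code)

PREFIX_LEN = {0: 2, 1: 4, 2: 6, 3: 12}  # assumed meaningful digits per level


def level(code: str) -> int:
    """Level implied by trailing zeros: 0 province, 1 city, 2 district, 3 town."""
    for lvl, zeros in enumerate((10, 8, 6)):
        if code.endswith("0" * zeros):
            return lvl
    return 3


def has_membership(code_a: str, code_b: str) -> bool:
    n = PREFIX_LEN[min(level(code_a), level(code_b))]
    return code_a[:n] == code_b[:n]


def delete_unrelated(seq: List[Element]) -> List[Element]:
    """Keep only the elements related to the element at the first position."""
    if not seq:
        return []
    first_code = seq[0][1]
    return [elem for elem in seq if has_membership(elem[1], first_code)]


sequence = [
    ("Guangdong Province", "440000000000"),
    ("Beijing City", "110000000000"),
    ("Baoan District", "440306000000"),
]
print([name for name, _ in delete_unrelated(sequence)])
```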
In some embodiments, determining the administrative four-level address corresponding to the new administrative address element sequence according to preset address element extraction logic includes:
screening, according to the administrative address names, the administrative address elements in the new administrative address element sequence whose names have a character inclusion relationship and whose positions are adjacent, and retaining the administrative address element whose administrative address name has the most characters;
and determining the corresponding administrative four-level address according to the screened new administrative address element sequence.
Note that the character inclusion relationship may disregard the administrative division level suffix. For example, "Hongjiang Management Area" does not contain the level suffix "City" of "Hongjiang City", yet the two names still have a character inclusion relationship.
As an embodiment, suppose the new administrative address element sequence includes "Hunan Province, Huaihua City, Hongjiang City, Hongjiang Management Area, Xinjie Street". "Hongjiang City" and "Hongjiang Management Area" have a character inclusion relationship and are adjacent, so the element whose administrative address name has the most characters, "Hongjiang Management Area", is retained. The corresponding administrative four-level address may then include 'province': 'Hunan Province', 'city': 'Huaihua City', 'district': 'Hongjiang Management Area', 'town': 'Xinjie Street'.
According to this embodiment of the invention, retaining the administrative address element whose administrative address name has the most characters improves the accuracy of extracting the administrative four-level address.
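The screening step can be sketched as follows. The sketch works on names only and makes two assumptions that are not in the source: the level suffix is stripped using a hypothetical English suffix list (`LEVEL_SUFFIXES`), and character inclusion is tested as plain substring containment. The original method operates on Chinese names, so this is only an analogy of the logic.

```python
from typing import List

# Hypothetical level suffixes that are ignored when testing character inclusion.
LEVEL_SUFFIXES = (" Province", " City", " District", " County", " Town", " Street")


def stem(name: str) -> str:
    """Remove a trailing level suffix, e.g. 'Hongjiang City' -> 'Hongjiang'."""
    for suffix in LEVEL_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def includes(a: str, b: str) -> bool:
    """True if one name, without its level suffix, is contained in the other."""
    return stem(a) in b or stem(b) in a


def screen(names: List[str]) -> List[str]:
    """Among adjacent names with an inclusion relationship, keep the longest."""
    kept: List[str] = []
    for name in names:
        if kept and includes(kept[-1], name):
            if len(name) > len(kept[-1]):
                kept[-1] = name  # the new name has more characters: replace
        else:
            kept.append(name)
    return kept


print(screen([
    "Hunan Province",
    "Huaihua City",
    "Hongjiang City",
    "Hongjiang Management Area",
    "Xinjie Street",
]))
```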
In some embodiments, determining the administrative four-level address corresponding to the new administrative address element sequence according to preset address element extraction logic includes:
deleting, according to the administrative address level determined from the administrative address code, any administrative address element in the new administrative address element sequence whose administrative address level is higher than that of the administrative address element arranged before it;
and determining the corresponding administrative four-level address according to the new administrative address element sequence after deletion.
As an alternative embodiment, the administrative address level is calculated from the administrative address code. The digits at different positions of the administrative address code represent administrative addresses at different administrative division levels, and a position filled with 0 indicates that no administrative address exists at that administrative division level. For example, if the administrative address code ends with ten 0s, the administrative address level is 0 and the administrative division level is a province or a municipality directly under the central government; if it ends with eight 0s, the administrative address level is 1 and the administrative division level is a prefecture-level city; if it ends with six 0s, the administrative address level is 2 and the administrative division level is a county-level city, county, or district; if it ends in any other way, the administrative address level is 3 and the administrative division level is a town, township, or street.
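The level calculation described above amounts to counting trailing zeros. A minimal sketch, assuming codes are 12-digit strings as in the examples of this description (the function name `address_level` is hypothetical):

```python
def address_level(code: str) -> int:
    """Map an administrative address code to its administrative address level.

    10 trailing zeros -> 0 (province or municipality)
     8 trailing zeros -> 1 (prefecture-level city)
     6 trailing zeros -> 2 (county-level city, county, or district)
     anything else    -> 3 (town, township, or street)
    """
    if code.endswith("0" * 10):
        return 0
    if code.endswith("0" * 8):
        return 1
    if code.endswith("0" * 6):
        return 2
    return 3


for code in ("440000000000", "440300000000", "440306000000", "110105026000"):
    print(code, address_level(code))
```

The checks run from the longest zero suffix to the shortest, because a code that ends with ten 0s also ends with eight and six 0s.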
As an embodiment, suppose the new administrative address element sequence includes "Guangdong Province, Baoan District, Shenzhen City", so that the first administrative address name is Guangdong Province, the second is Baoan District, and the third is Shenzhen City. Shenzhen City is at a higher administrative address level than Baoan District, which precedes it, so Shenzhen City is deleted, and the corresponding administrative four-level address may include 'province': 'Guangdong Province', 'district': 'Baoan District'.
According to this embodiment of the invention, deleting the administrative address elements whose administrative address level is higher than that of the element arranged before them leaves the elements of the resulting administrative four-level address ordered by administrative address level, which improves the accuracy of logistics dispatching.
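The level-ordering rule can be sketched as follows. The sketch assumes `(name, code)` tuples with 12-digit codes, and compares each element with the most recently kept element; a numerically smaller level means a higher administrative division. The function names are hypothetical.

```python
from typing import List, Tuple

Element = Tuple[str, str]  # (administrative address name, administrative address code)


def level(code: str) -> int:
    """Level implied by trailing zeros: 0 province, 1 city, 2 district, 3 town."""
    for lvl, zeros in enumerate((10, 8, 6)):
        if code.endswith("0" * zeros):
            return lvl
    return 3


def drop_higher_levels(seq: List[Element]) -> List[Element]:
    """Delete elements whose level is higher than that of the preceding kept element."""
    kept: List[Element] = []
    for elem in seq:
        if kept and level(elem[1]) < level(kept[-1][1]):
            continue  # e.g. a city that appears after a district is discarded
        kept.append(elem)
    return kept


sequence = [
    ("Guangdong Province", "440000000000"),
    ("Baoan District", "440306000000"),
    ("Shenzhen City", "440300000000"),
]
print([name for name, _ in drop_higher_levels(sequence)])
```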
In some embodiments, determining the administrative four-level address corresponding to the new administrative address element sequence according to preset address element extraction logic includes:
merging, according to the administrative address names, the administrative address elements in the new administrative address element sequence that have the same administrative address name and are adjacent in position;
and determining the corresponding administrative four-level address according to the new administrative address element sequence after the merging processing.
It should be noted that, in this embodiment of the invention, the number of adjacent administrative address elements with the same name is not limited, and may be two or more.
As an embodiment, suppose the new administrative address element sequence includes "Beijing City", "Chaoyang District", and "Beijing Street", with "Beijing City" and "Chaoyang District" each appearing several times in a row. The administrative address elements with the same name and adjacent positions are merged: the several "Beijing City" elements are merged into one until the next different name, "Chaoyang District", appears; similarly, the several "Chaoyang District" elements are merged into one until the next different administrative address name, "Beijing Street", appears. The corresponding administrative four-level address may include 'province': 'Beijing', 'city': 'Beijing City', 'district': 'Chaoyang District', 'town': 'Beijing Street'.
In the embodiment of the invention, the administrative address elements which have the same administrative address name and are adjacent in arrangement position in the new administrative address element sequence are merged, so that the administrative address name in the administrative four-level address can be prevented from being repeated.
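The merging step is a run-length collapse keyed on the administrative address name. A minimal sketch using `itertools.groupby` from the Python standard library, assuming `(name, code)` tuples (the function name `merge_adjacent` is hypothetical):

```python
from itertools import groupby
from typing import List, Tuple

Element = Tuple[str, str]  # (administrative address name, administrative address code)


def merge_adjacent(seq: List[Element]) -> List[Element]:
    """Collapse each run of adjacent elements with the same name into one element."""
    # groupby only groups consecutive items, so non-adjacent repeats are preserved.
    return [next(run) for _, run in groupby(seq, key=lambda elem: elem[0])]


sequence = [
    ("Beijing City", "110000000000"),
    ("Beijing City", "110000000000"),
    ("Chaoyang District", "110105000000"),
    ("Chaoyang District", "110105000000"),
    ("Chaoyang District", "110105000000"),
]
print([name for name, _ in merge_adjacent(sequence)])
```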
It should be noted that, in the foregoing several embodiments, the address element screening logic, the address element deleting logic, and the address element merging logic may be specifically combined according to specific implementation situations to obtain further embodiments, and the number and the order of the combined logic are not limited in any way in the present invention.
As another embodiment, suppose the administrative address element sequence includes "Guangdong Province, Shenzhen City, Baoan District, Beijing City, Beijing City, Chaoyang District". When a plurality of administrative addresses are to be extracted from this sequence, the membership relationship between the administrative address elements is determined, and the elements that have a membership relationship and are adjacent in position are split into the same administrative address element sequence, giving one new administrative address element sequence "Guangdong Province, Shenzhen City, Baoan District" and another new administrative address element sequence "Beijing City, Beijing City, Chaoyang District". In the latter, the administrative address elements with the same name and adjacent positions are merged, so the two "Beijing City" elements become one and the new administrative address element sequence is "Beijing City, Chaoyang District"; the corresponding administrative four-level address may include 'province': 'Beijing', 'city': 'Beijing City', 'district': 'Chaoyang District'.
In some embodiments, the administrative address element further includes an administrative address level, and before determining the corresponding administrative four-level address, the method further includes:
retaining the administrative address code of the administrative address element with the lowest administrative address level in the processed new administrative address element sequence, and completing the administrative address elements arranged before that element according to the retained administrative address code.
As an alternative embodiment, the administrative address names of the administrative address elements arranged before that element are completed according to the administrative address code.
Because the digits at different positions of the administrative address code represent administrative addresses at different administrative division levels, the administrative address elements arranged earlier can be completed from the administrative address code.
The new sequence of administrative address elements after being processed may be the new sequence of administrative address elements after being screened, the new sequence of administrative address elements after being deleted, the new sequence of administrative address elements after being merged, the new sequence of administrative address elements after being processed by any two kinds of processing, or the new sequence of administrative address elements after being processed by the three kinds of processing.
As an embodiment, if the processed new administrative address element sequence includes "(Guangdong Province, 440000000000), (Baoan District, 440306000000)", retaining only the administrative address code of the element with the lowest administrative address level changes the sequence to "Guangdong Province, Baoan District, 440306000000". Completing the preceding administrative address elements according to the code 440306000000 of Baoan District then changes the sequence to "Guangdong Province, Shenzhen City, Baoan District, 440306000000", and the corresponding administrative four-level address includes 'province': 'Guangdong Province', 'city': 'Shenzhen City', 'district': 'Baoan District', 'code': '440306000000'.
As another embodiment, the administrative address names that are missing from the administrative four-level address are filled with empty characters. That is, an administrative four-level address that would otherwise include 'province': 'Guangdong Province', 'city': 'Shenzhen City', 'district': 'Baoan District', 'code': '440306000000' may instead be 'province': 'Guangdong Province', 'city': 'Shenzhen City', 'district': 'Baoan District', 'town': '', 'code': '440306000000'.
As another embodiment, the part of the address that is unrelated to the administrative four-level address may be retained or deleted. For example, if the address is "Beijing City Chaoyang District Wangjing Street Wangjing SOHO Tower 3", the corresponding administrative four-level address may include 'province': 'Beijing', 'city': 'Beijing City', 'district': 'Chaoyang District', 'town': 'Wangjing Street', the retained remainder 'Wangjing SOHO Tower 3', and 'code': '110105026000'; it may also include only 'province': 'Beijing', 'city': 'Beijing City', 'district': 'Chaoyang District', 'town': 'Wangjing Street', 'code': '110105026000'.
In this embodiment of the invention, the administrative address code of the administrative address element with the lowest administrative address level in the processed new administrative address element sequence is retained, and the administrative address elements arranged before that element are completed according to the code, so that the administrative four-level address is more complete.
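The completion step can be sketched as follows. This is an illustration under assumptions, not the patented implementation: codes are 12 digits with 2/4/6-digit province/city/district prefixes, the code-to-name table `NAMES` is a tiny hypothetical stand-in for a full administrative division dictionary, and levels with no known name are filled with an empty string.

```python
from typing import Dict, List

LEVEL_KEYS = ["province", "city", "district", "town"]

# Hypothetical lookup table; a real system would load the full division list.
NAMES: Dict[str, str] = {
    "440000000000": "Guangdong Province",
    "440300000000": "Shenzhen City",
    "440306000000": "Baoan District",
}


def level(code: str) -> int:
    """Level implied by trailing zeros: 0 province, 1 city, 2 district, 3 town."""
    for lvl, zeros in enumerate((10, 8, 6)):
        if code.endswith("0" * zeros):
            return lvl
    return 3


def ancestor_codes(code: str) -> List[str]:
    """Codes of the province, city, and district implied by a code, then the code itself."""
    return [code[:2] + "0" * 10, code[:4] + "0" * 8, code[:6] + "0" * 6, code]


def complete(code: str, names: Dict[str, str]) -> Dict[str, str]:
    """Build a four-level address from the code of the lowest-level element."""
    result = {key: "" for key in LEVEL_KEYS}  # missing levels stay empty
    for i, ancestor in enumerate(ancestor_codes(code)[: level(code) + 1]):
        result[LEVEL_KEYS[i]] = names.get(ancestor, "")
    result["code"] = code
    return result


print(complete("440306000000", NAMES))
```

Only the lowest-level code is needed as input, because every higher-level code can be derived from it by zeroing the trailing digits.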
In practice, the addresses that some users fill in are not well standardized: the province, city, and district in an address may conflict, the address may contain several provinces, cities, and districts, or the address may use abbreviated names, and existing schemes for extracting the administrative four-level address cannot extract correct administrative four-level information from such addresses. In addition, specific business requirements may call not only for extracting a single address but also for extracting multiple addresses from an address that contains several provinces. The embodiment of the invention provides an administrative address extraction method, which includes: acquiring an administrative address element sequence obtained by matching preset address elements against a current address to be extracted; when it is determined that a plurality of administrative addresses or one administrative address is to be extracted, performing administrative address element splitting or deletion processing on the administrative address element sequence according to the administrative address codes to obtain a new administrative address element sequence; and determining the administrative four-level address corresponding to the new administrative address element sequence according to preset address element extraction logic. Correct administrative four-level address information can thus be extracted, which improves the timeliness and accuracy of logistics dispatching.
Example 2
Based on the above-mentioned method for extracting an administrative address, as shown in fig. 2, a schematic structural diagram of an administrative address extracting apparatus according to an embodiment of the present invention is shown, where the administrative address extracting apparatus 20 includes a sequence obtaining module 21, a sequence updating module 22, and an address determining module 23;
the sequence obtaining module 21 is configured to obtain an administrative address element sequence obtained by matching preset address elements with a current address to be extracted, where the administrative address element sequence includes a plurality of administrative address elements arranged according to a matching sequence, and the administrative address elements include an administrative address name and an administrative address code;
the sequence updating module 22 is configured to, when it is determined that multiple or one administrative address is to be extracted, split or delete an administrative address element of the sequence of administrative address elements according to the administrative address code, so as to obtain a new sequence of administrative address elements;
the address determining module 23 is configured to determine, according to preset address element extraction logic, an administrative level four address corresponding to the new administrative address element sequence.
For other details of the technical solution implemented by each module in the administrative address extracting apparatus, reference may be made to the description of the administrative address extracting method provided in the embodiment of the present invention, and details are not repeated here.
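The three-module structure of the apparatus can be sketched as a simple composition of callables. This is only a structural illustration: the class name `AdministrativeAddressExtractor`, the field names, and the stub functions in the usage example are hypothetical, and the actual matching and extraction logic is whatever the method embodiments provide.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Element = Tuple[str, str]  # (administrative address name, administrative address code)


@dataclass
class AdministrativeAddressExtractor:
    # Sequence obtaining module 21: address text -> administrative address element sequence.
    obtain_sequence: Callable[[str], List[Element]]
    # Sequence updating module 22: one sequence -> one or more new sequences.
    update_sequence: Callable[[List[Element]], List[List[Element]]]
    # Address determining module 23: new sequence -> administrative four-level address.
    determine_address: Callable[[List[Element]], Dict[str, str]]

    def extract(self, address: str) -> List[Dict[str, str]]:
        sequence = self.obtain_sequence(address)
        return [self.determine_address(new_seq) for new_seq in self.update_sequence(sequence)]


# Usage with trivial stand-in modules (placeholders, not the real logic).
extractor = AdministrativeAddressExtractor(
    obtain_sequence=lambda text: [("Guangdong Province", "440000000000")],
    update_sequence=lambda seq: [seq],
    determine_address=lambda seq: {"province": seq[0][0], "code": seq[-1][1]},
)
print(extractor.extract("Guangdong Province"))
```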
Example 3
Based on the above-mentioned method for extracting an administrative address, as shown in fig. 3, a schematic structural diagram of an administrative address extracting apparatus according to an embodiment of the present invention is provided, where the administrative address extracting apparatus 30 includes a processor 31 and a memory 32 coupled to the processor 31. The memory 32 stores a computer program, and the computer program, when executed by the processor 31, causes the processor 31 to execute the steps of the administrative address extracting method in the above embodiment.
For other details of the implementation of the technical solution by the processor 31 in the administrative address extraction device, reference may be made to the description of the administrative address extraction method provided in the embodiment of the present invention, and details are not repeated here.
The processor 31 may also be referred to as a CPU (Central Processing Unit). The processor 31 may be an integrated circuit chip with signal processing capability; it may also be a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor or any conventional processor.
Example 4
As shown in fig. 4, a schematic structural diagram of a computer-readable storage medium provided in an embodiment of the present invention is shown, where the computer-readable storage medium 40 has a readable computer program 41 stored thereon; the computer program 41 may be stored in the storage medium in the form of a software product, and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a hard disk drive, a magnetic or optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), etc., or terminal devices, such as a computer, a server, a mobile phone, a tablet, etc.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and in actual implementation, there may be other divisions, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may be stored in a computer-readable storage medium.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a Solid State Disk (SSD)), among others.
The technical solutions provided by the present application are introduced in detail, and the present application applies specific examples to explain the principles and embodiments of the present application, and the descriptions of the above examples are only used to help understand the method and the core ideas of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, the specific implementation manner and the application scope may be changed, and in summary, the content of the present specification should not be construed as a limitation to the present application.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. An administrative address extraction method, comprising:
acquiring an administrative address element sequence obtained by matching preset address elements of a current address to be extracted, wherein the administrative address element sequence comprises a plurality of administrative address elements arranged according to a matching sequence, and the administrative address elements comprise administrative address names and administrative address codes;
when determining to extract a plurality of or one administrative addresses, according to the administrative address code, performing administrative address element splitting or deleting processing on the administrative address element sequence to obtain a new administrative address element sequence;
and determining the administrative four-level address corresponding to the new administrative address element sequence according to a preset address element extraction logic.
2. The method for extracting the administrative addresses according to claim 1, wherein when determining to extract a plurality of administrative addresses, performing an administrative address element splitting process on the sequence of the administrative address elements according to the administrative address codes, includes:
and judging the membership relationship among the administrative address elements, and splitting the administrative address elements which have the membership relationship and are adjacent in arrangement position in the administrative address element sequence into the same administrative address element sequence.
3. The method for extracting administrative addresses according to claim 1, wherein when determining to extract one administrative address, performing administrative address element deletion processing on the sequence of administrative address elements according to the administrative address code comprises:
and judging the membership relationship among the administrative address elements, and deleting the administrative address elements which do not have the membership relationship with the administrative address element with the first arrangement position in the sequence of the administrative address elements.
4. The method for extracting administrative addresses according to claim 1, wherein determining the administrative four-level addresses corresponding to the new sequence of administrative address elements according to a preset address element extraction logic comprises:
according to the administrative address names, screening a plurality of administrative address elements in the new administrative address element sequence, wherein the administrative address names have character inclusion relations and are adjacent in arrangement positions, and reserving the administrative address elements with the most characters of the administrative address names;
and determining a corresponding administrative fourth-level address according to the new administrative address element sequence after screening processing.
5. The method according to claim 1, wherein determining the administrative level four address corresponding to the new sequence of administrative address elements according to a preset address element extraction logic comprises:
according to the administrative address level determined by the administrative address code, deleting the administrative address elements in the new administrative address element sequence which are higher than the administrative address level of the administrative address elements with the previous arrangement position;
and determining a corresponding administrative fourth-level address according to the deleted new administrative address element sequence.
6. The method according to claim 1, wherein determining the administrative level four address corresponding to the new sequence of administrative address elements according to a preset address element extraction logic comprises:
according to the administrative address names, merging administrative address elements which are identical in administrative address name and adjacent in arrangement position in the new administrative address element sequence;
and determining a corresponding administrative fourth-level address according to the new administrative address element sequence after the merging processing.
7. The method for extracting administrative addresses according to any one of claims 4 to 6, wherein the administrative address element further comprises an administrative address level, and before determining the corresponding administrative four-level address, the method further comprises:
retaining the administrative address code of the administrative address element with the lowest administrative address level in the processed new administrative address element sequence, and completing the administrative address elements arranged before that element according to the administrative address code.
8. An administrative address extraction device is characterized by comprising a sequence acquisition module, a sequence updating module and an address determination module;
the sequence acquisition module is used for acquiring an administrative address element sequence obtained by matching preset address elements with a current address to be extracted, wherein the administrative address element sequence comprises a plurality of administrative address elements arranged according to a matching sequence, and the administrative address elements comprise an administrative address name and an administrative address code;
the sequence updating module is used for splitting or deleting the administrative address elements of the administrative address element sequence according to the administrative address code when determining to extract a plurality of or one administrative address, so as to obtain a new administrative address element sequence;
and the address determination module is used for determining the administrative four-level address corresponding to the new administrative address element sequence according to preset address element extraction logic.
9. An administrative address extraction device, comprising a memory and a processor, wherein:
the memory is used for storing a computer program;
the processor is used for reading the computer program in the memory and executing the steps of the administrative address extraction method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that it has stored thereon a readable computer program which, when executed by a processor, carries out the steps of the method for extracting an administrative address according to any one of claims 1 to 7.
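
For illustration only, the merging step of claim 6 and the supplementing step of claim 7 might be sketched as follows. The claims do not specify a code format or a merge policy, so several details here are assumptions: the 12-digit code whose first 2, 4, 6 and 9 digits identify the province, city, county and township levels; the choice to keep the first element of a run of identically named neighbours; and all names (`AddressElement`, `merge_adjacent`, `supplement`, the `table` lookup and its sample entries), which are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressElement:
    name: str   # administrative address name
    code: str   # administrative address code (assumed 12 digits)
    level: int  # 1 = highest level ... 4 = lowest level (assumption)


# Assumed prefix length of the code that identifies each level.
PREFIX_LEN = {1: 2, 2: 4, 3: 6, 4: 9}


def merge_adjacent(elements):
    """Collapse runs of adjacent elements that share the same name.

    Assumption: the first element of each run is the one kept.
    """
    merged = []
    for element in elements:
        if merged and merged[-1].name == element.name:
            continue
        merged.append(element)
    return merged


def ancestor_codes(code, level):
    """Derive the codes of levels 1..level from one code by zero-padding prefixes."""
    return [code[:PREFIX_LEN[lv]].ljust(12, "0") for lv in range(1, level + 1)]


def supplement(elements, table):
    """Keep the code of the lowest-level element and rebuild the elements above it.

    `table` is a hypothetical lookup from code to AddressElement.
    """
    if not elements:
        return []
    lowest = max(elements, key=lambda e: e.level)
    return [table[c] for c in ancestor_codes(lowest.code, lowest.level)]


# Made-up entries; the codes do not correspond to real divisions.
table = {
    "990000000000": AddressElement("Province A", "990000000000", 1),
    "990100000000": AddressElement("City B", "990100000000", 2),
    "990101000000": AddressElement("County C", "990101000000", 3),
    "990101001000": AddressElement("Town D", "990101001000", 4),
}

sequence = [
    table["990100000000"],
    table["990100000000"],  # duplicate match of the same name
    table["990101001000"],
]
full_address = supplement(merge_adjacent(sequence), table)
print([element.name for element in full_address])
```

In this sketch the province and county elements, which were never matched in the input, are recovered from the township code alone; a real implementation would also need to handle codes missing from the lookup.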
CN202211210796.4A 2022-09-30 2022-09-30 Method, device and equipment for extracting administrative address and storage medium Pending CN115544979A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211210796.4A CN115544979A (en) 2022-09-30 2022-09-30 Method, device and equipment for extracting administrative address and storage medium

Publications (1)

Publication Number Publication Date
CN115544979A true CN115544979A (en) 2022-12-30

Family

ID=84731893

Country Status (1)

Country Link
CN (1) CN115544979A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117271693A (en) * 2023-10-17 2023-12-22 中运科技股份有限公司 Automatic judging method for arrival attribution of traffic route based on big data analysis
CN117271693B (en) * 2023-10-17 2024-04-26 中运科技股份有限公司 Automatic judging method for arrival attribution of traffic route based on big data analysis

Similar Documents

Publication Publication Date Title
CN108628811B (en) Address text matching method and device
CN107145577A (en) Address standardization method, device, storage medium and computer
CN112560468B (en) Meteorological early warning text processing method, related device and computer program product
CN108959244A (en) Address word segmentation method and apparatus
CN108733810B (en) Address data matching method and device
CN112733551A (en) Text analysis method and device, electronic equipment and readable storage medium
CN115544979A (en) Method, device and equipment for extracting administrative address and storage medium
CN116414823A (en) Address positioning method and device based on word segmentation model
CN109460398A (en) Complementing method, device and the electronic equipment of time series data
CN116414824A (en) Administrative division information identification and standardization processing method, device and storage medium
CN111552527A (en) Method, device and system for translating characters in user interface and storage medium
CN109697224B (en) Bill message processing method, device and storage medium
CN114840642A (en) Event extraction method, device, equipment and storage medium
CN111401051B (en) Express information analysis method and system
CN112861532B (en) Address standardization processing method, device, equipment and online searching system
CN114036414A (en) Method and device for processing interest points, electronic equipment, medium and program product
CN113449002A (en) Vehicle recommendation method and device, electronic equipment and storage medium
CN113495845A (en) Data testing method and device, electronic equipment and storage medium
CN111949706A (en) Land big data distributed mining analysis-oriented storage method
CN111538914A (en) Address information processing method and device
CN116701719B (en) Data processing method, device, computer equipment and readable storage medium
CN117729176B (en) Method and device for aggregating application program interfaces based on network address and response body
CN116484850A (en) Administrative four-level address extraction method, device, equipment and storage medium
CN109697250B (en) Bill information extraction method and device and storage medium
CN115098684A (en) Network model establishing method, equipment and storage medium for 5G user identification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination