CN106709824B - Building evaluation method based on semantic analysis of web text - Google Patents


Info

Publication number
CN106709824B
Authority
CN
China
Prior art keywords
building
professional
word
vocabulary
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611159450.0A
Other languages
Chinese (zh)
Other versions
CN106709824A (en)
Inventor
赵渺希
郭振松
梁景宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN201611159450.0A priority Critical patent/CN106709824B/en
Publication of CN106709824A publication Critical patent/CN106709824A/en
Application granted granted Critical
Publication of CN106709824B publication Critical patent/CN106709824B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/08 Construction

Landscapes

  • Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a building evaluation method based on semantic analysis of web text. The method comprises: selecting a professional building forum, acquiring web text with LocoySpider software, and screening and sorting it; performing semantic analysis on the web text with the jieba word segmentation tool and a Chinese word frequency analysis tool, and performing screening, matching and a nonparametric test against the word-segmentation-class word frequency table of the Modern Chinese Corpus to establish a network building professional corpus; and analyzing the characteristic vocabulary of an individual building case, comparing it with the network building professional corpus, and analyzing the difference in attention that the online public and professional building designers pay to the individual building case.

Description

Building evaluation method based on semantic analysis of web text
Technical Field
The invention relates to a building evaluation method, in particular to a building evaluation method based on web text semantic analysis, and belongs to the field of building evaluation.
Background
With the advent of the information age and the network society, the variety of architectural media has become increasingly rich. Besides traditional print media such as newspapers and magazines, the rise of new media such as social software, professional building forums and Tieba (post bars) provides new platforms and tools for building commentary. In recent years, many buildings with nicknames such as 'autumn pants', 'big underpants' and 'big intestine tower' have drawn attention online, attracted wide interest from netizens and the public, and set off a round of building criticism, with a wide influence on building design and building commentary. Diversified architectural media play an increasingly important role in the field of building reviews and have a profound influence on the subject, content, form and value standards of building reviews [1]. Given the role that new online media now play in the building field, how different groups such as designers and the public differ in their perception of buildings, and how new-era online media tools can effectively promote public participation in building design, are topics worthy of in-depth research.
With the continuous improvement of information technology, methods for word frequency analysis, semantic analysis and comment tendency analysis are maturing. Zhangming et al. (2009) invented a Chinese web page classification method based on keyword frequency analysis, which filters noise with a regular expression filter and uses a word segmenter and a keyword frequency analyzer to perform fuzzy classification of a web page and obtain the category to which it belongs [1]; Wangyi et al. (2013) invented a semantic analysis method and system, which segments the corpus and samples iteratively along the document and word dimensions and performs semantic analysis on the resulting converged sampling model [2]; Shiliu (2014) invented a method and device for extracting domain keywords, which generates a word frequency matrix and applies a set algorithm to extract the keywords of the domain [3]; Zhao Miaoxi et al. (2016) invented a method for generating an urban cognitive map based on internet word frequency, which maps urban cognition measures collected from network data onto a city map [4]; Wu Qiong et al. (2009) invented a cross-domain text sentiment orientation analysis method, which builds matrix relations from a text set, calculates sentiment scores with the matrix and normalizes them [5]; Zhongdingfu (Beijing) Technology Development Co., Ltd. (2011) invented a system and method for analyzing the tendency of short texts, which identifies the semantic structure of sentences, searches them for preset tendency words and tendency patterns, and analyzes their tendency [6].
Wumingfen et al. (2013) invented an automatic classification system for tendency texts and a method for implementing it, which classify texts based on a sentiment classification syntax tree library and a dependency graph library [7]; Donelili et al. (2013) invented a text tendency analysis method and a commodity review tendency discriminator based on it, which determine text tendency through dependency grammar analysis and a sentiment dictionary calculation engine [8]; a further invention discloses a method and device for determining text tendency, which determine the tendency of a sentence containing an industry feature word according to a preset industry feature word dictionary and a text classification model [9].
Therefore, using web text to establish a professional building corpus and to study public preferences among different building schemes allows more of the language of building commentary to be reflected in building design, and promotes the development of building evaluation and building design.
The references mentioned above are as follows:
[1] Zhangming, ridge dragon, Luyanhong, Von source, Yanrui, Wang Pan. [P]. Patent application publication No. CN101593200, 2009-12-02.
[2] Wangyi, zhao schanmin, sun jonglong, rigor, wangli peak, junk, royal bin. Semantic analysis method and system [P]. Guangdong: patent application publication No. CN104346339A, 2015-02-11.
[3] Shiliu. A method and device for extracting domain keywords [P]. Beijing: patent application publication No. CN103870575A, 2014-06-18.
[4] Zhao Miaoxi, Huangjunhao, Linyan willow, zhongguang. An urban cognitive map generation method based on internet word frequency [P]. Patent application publication No. CN105574259A, 2016-05-11.
[5] Wu Qiong, Tan Tubo, section persistence, Cheng Zhi. A cross-domain text emotional orientation analysis method [P]. Beijing: patent application publication No. CN101714135A, 2010-05-26.
[6] System and method for trend analysis for short text [P]. Beijing: patent application publication No. CN102541840A, 2012-07-04.
[7] Wumingfen, Chentao, Liuxing forest. An automatic classification system of tendency texts and an implementation method [P]. Guangdong: patent application publication No. CN102930042A, 2013-02-13.
[8] Dongli, Zhao flourishing, Zhang Xiang, Wang Ru. A text tendency analysis method and a commodity comment tendency discriminator based on the method [P]. Shaanxi: patent application publication No. CN103455562A, 2013-12-18.
[9] Method and apparatus for determining text orientation [P]. Beijing: patent application publication No. CN104572616A, 2015-04-29.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a building evaluation method based on semantic analysis of web texts.
The purpose of the invention can be achieved by adopting the following technical scheme:
A building evaluation method based on semantic analysis of web text comprises the following steps:
S1, selecting a professional building forum, acquiring web text with LocoySpider software, and screening and sorting the web text;
S2, performing semantic analysis on the web text with a Chinese word segmentation tool and a Chinese word frequency analysis tool, and performing screening, matching and a nonparametric test against the word-segmentation-class word frequency table of the Modern Chinese Corpus, to establish a network building professional corpus;
S3, analyzing the characteristic vocabulary of an individual building case, comparing it with the network building professional corpus, and analyzing the difference in attention that the online public and professional building designers pay to the individual building case.
Preferably, in step S1, selecting a professional building forum, acquiring web text with LocoySpider software, and screening and sorting it specifically comprises:
S11, selecting a professional building forum with sufficient comment samples as the data source;
S12, creating and editing a new collection task in LocoySpider, analyzing the source code of the professional building forum's web page structure, and selecting the fields immediately before and after each target item as identification strings for capturing the required web page information, where the main tag information obtained by crawling comprises the forum topic, the commenting user name, the comment time and the comment content;
S13, configuring the content collection rules of the LocoySpider task, and running the task to crawl the relevant data;
S14, refining and sorting the acquired comment data by the tags of forum topic, commenting user name, comment time and comment content, and removing forum bulletins and advertisement posts.
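The screening in step S14 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Comment` record, the `NOISE_MARKERS` list and the `clean_comments` helper are hypothetical names, and the text does not say how bulletins and advertisements are recognized, so simple keyword matching stands in for that decision.

```python
from dataclasses import dataclass

# Hypothetical record layout mirroring the four crawled tags named in S12.
@dataclass
class Comment:
    topic: str
    user: str
    time: str
    content: str

# Assumed markers of bulletin/advertisement posts; a real forum needs its own list.
NOISE_MARKERS = ("公告", "广告", "announcement", "advertisement")

def clean_comments(records):
    """Drop empty posts and posts whose topic or content carries a noise marker."""
    kept = []
    for r in records:
        text = r.content.strip()
        if not text:
            continue  # nothing to analyze
        if any(m in r.topic or m in text for m in NOISE_MARKERS):
            continue  # treated as a bulletin or advertisement
        kept.append(Comment(r.topic.strip(), r.user.strip(), r.time.strip(), text))
    return kept

sample = [
    Comment("Stadium scheme A", "u1", "2016-05-01 10:00", "The roof structure is elegant."),
    Comment("Forum announcement", "admin", "2016-05-01 09:00", "Please follow the rules."),
    Comment("Stadium scheme A", "u2", "2016-05-01 11:00", "advertisement: cheap tiles"),
]
cleaned = clean_comments(sample)
print([c.user for c in cleaned])
```

Keyword matching is a deliberately crude filter; in practice the cleaned list would probably still need a manual pass.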
Preferably, in step S2, performing semantic analysis on the web text with the jieba word segmentation tool and a Chinese word frequency analysis tool, and performing screening, matching and a nonparametric test against the word-segmentation-class word frequency table of the Modern Chinese Corpus to establish the network building professional corpus, specifically comprises:
S21, converting the screened and sorted professional building forum comment data into txt text format, and segmenting it with the jieba word segmentation tool to form a vocabulary list of the professional building forum comments;
S22, according to the vocabulary list formed in step S21, using a Chinese word frequency statistics tool to count, for the professional building forum comment data, the frequency, repetition count, percentage and de-duplicated percentage of each word;
S23, according to the word frequency table of the Modern Chinese Corpus on the corpus online website, matching and obtaining a certain number of word samples together with their word frequencies in the professional building forum and in the overall Modern Chinese Corpus;
S24, performing standard normalization on the two groups of word frequency data;
S25, importing the normalized data into SPSS software, performing a nonparametric test on the two groups of word frequencies with the two-paired-samples nonparametric test command, and judging whether the population distributions of the two paired samples differ significantly;
S26, when the population distributions of the two paired samples differ significantly, analyzing the importance of the professional building forum vocabulary based on the TextRank algorithm;
S27, according to the word importance data formed in step S26, sorting the professional building forum vocabulary from high to low importance, screening out the high-frequency words of the Modern Chinese Corpus according to its word frequency table on the corpus online website, and taking the remaining words as the network building professional vocabulary;
S28, classifying and sorting the network building professional vocabulary formed in step S27 by building type, building function, building shape, traffic layout, building environment, building color, building materials and structure, spatial layout, building results, building components and building roles, and establishing the network building professional corpus.
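The counting, screening and classification in steps S22, S27 and S28 can be illustrated with a minimal Python sketch. It assumes the comments have already been segmented into tokens (for example by jieba); the function names, the tiny token list, the stand-in set of general high-frequency words and the category assignments are all hypothetical. For brevity the words are ranked by raw frequency here, whereas S27 ranks them by TextRank importance.

```python
from collections import Counter

def word_frequency_table(tokens):
    """Return {word: (count, share of all tokens)} for a segmented text (S22)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: (c, c / total) for w, c in counts.items()}

def professional_vocabulary(ranked_words, general_high_freq):
    """Keep ranked words that are not general-language high-frequency words (S27)."""
    return [w for w in ranked_words if w not in general_high_freq]

def classify(words, category_of):
    """Group words under hand-assigned categories (S28)."""
    corpus = {}
    for w in words:
        corpus.setdefault(category_of.get(w, "unclassified"), []).append(w)
    return corpus

# Assumed pre-segmented forum text.
tokens = ["立面", "的", "结构", "立面", "我们", "幕墙", "的"]
table = word_frequency_table(tokens)

# Stand-in ranking by frequency; ties keep first-seen order.
ranked = [w for w, _ in Counter(tokens).most_common()]

# Hypothetical high-frequency words taken from a general-language frequency table.
general_high_freq = {"的", "我们"}
vocab = professional_vocabulary(ranked, general_high_freq)

# Hypothetical manual category assignments.
category_of = {
    "立面": "building shape",
    "结构": "building materials and structure",
    "幕墙": "building components",
}
corpus = classify(vocab, category_of)
print(corpus)
```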
Preferably, in step S3, analyzing the characteristic vocabulary of the individual building case, comparing it with the network building professional corpus, and analyzing the difference in attention that the online public and professional building designers pay to the individual building case specifically comprises:
S31, converting the screened and sorted individual building case comment data into txt text format, and segmenting it with a Chinese word segmentation tool to form a vocabulary list of the individual building case comments;
S32, according to the vocabulary list formed in step S31, using a Chinese word frequency statistics tool to count, for the individual building case comment data, the frequency, repetition count, percentage and de-duplicated percentage of each word;
S33, according to the word frequency table of the Modern Chinese Corpus on the corpus online website, matching and obtaining a certain number of word samples together with their word frequencies in the individual building case comments and in the overall Modern Chinese Corpus;
S34, performing standard normalization on the two groups of word frequency data;
S35, importing the normalized data into SPSS software, performing a nonparametric test on the two groups of word frequencies with the two-paired-samples nonparametric test command, and judging whether the population distributions of the two paired samples differ significantly;
S36, when the population distributions of the two paired samples differ significantly, analyzing the importance of the building scheme vocabulary based on the TextRank algorithm;
S37, according to the word importance data formed in step S36, sorting the individual building case vocabulary from high to low importance, screening out the high-frequency words of the Modern Chinese Corpus that appear in it according to the word frequency table of the Modern Chinese Corpus on the corpus online website, and taking the remaining words as the characteristic vocabulary of the individual building case;
S38, comparing the characteristic vocabulary of the individual building case formed in step S37 with the network building professional corpus, and analyzing the difference in attention that the online public and professional building designers pay to the individual building case.
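The text does not spell out how the comparison in step S38 is carried out. One possible reading, sketched below purely as an assumption, is to split the case's characteristic words into those that also appear in the professional corpus (grouped by corpus category) and those that do not. The function name and all sample words are hypothetical.

```python
def compare_with_corpus(case_feature_words, professional_corpus):
    """Split case feature words into corpus-shared words (by category) and the rest."""
    word_to_category = {
        w: cat for cat, words in professional_corpus.items() for w in words
    }
    shared = {}
    public_only = []
    for w in case_feature_words:
        cat = word_to_category.get(w)
        if cat is None:
            public_only.append(w)  # not part of the professional vocabulary
        else:
            shared.setdefault(cat, []).append(w)
    return shared, public_only

# Hypothetical professional corpus and case feature words.
professional_corpus = {
    "building shape": ["facade", "massing"],
    "building materials and structure": ["steel", "truss"],
}
case_feature_words = ["facade", "steel", "autumn pants", "ticket price"]

shared, public_only = compare_with_corpus(case_feature_words, professional_corpus)
print(shared)
print(public_only)
```

Words that land in `public_only` are candidates for topics the public raises that professional vocabulary does not cover; interpreting them remains a manual step.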
Preferably, performing standard normalization on the two groups of word frequency data specifically comprises:
supposing that the i-th word frequency of the j-th vocabulary list is α_ij, the standardized value θ_ij is obtained after standard normalization; the specific formula is as follows:
Figure GDA0002440971430000051
where i = 1, 2, …, x and j = 1, 2.
Preferably, performing a nonparametric test on the two groups of word frequencies with the two-paired-samples nonparametric test command, and judging whether the population distributions of the two paired samples differ significantly, is specifically as follows:
following the sign test procedure, the observed values of the first group of samples are subtracted from the corresponding observed values of the second group to give the differences β_ij; a positive difference is marked with a plus sign and a negative difference with a minus sign; when the difference equals 0, the corresponding building individual case is deleted and the number of samples x is reduced accordingly;
the difference data are retained and sorted in ascending order of absolute value to obtain the corresponding rank values β_i, and the positive rank sum W+, the negative rank sum W-, the positive average rank U+ and the negative average rank U- are calculated respectively.
The specific calculation formulas are as follows:
Figure GDA0002440971430000061
U+ = W+/m, U- = W-/n
where m and n denote the numbers of positive and negative rank values, respectively;
the test statistic Z and the associated probability value Sig. are calculated by SPSS and compared with the set significance level to judge whether the two groups of sample data differ significantly, as follows:
W = min(W+, W-)
Figure GDA0002440971430000062
where n' is the number of valid samples remaining after the pairs with zero difference have been deleted;
if the obtained probability value is less than or equal to the set significance level, the population distributions from which the two paired samples come are considered to differ significantly; if the obtained probability value is higher than the set significance level, they are considered not to differ significantly.
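The procedure described above matches the Wilcoxon signed-rank test for two paired samples. The following pure-Python sketch reproduces its steps outside SPSS. Two points are assumptions rather than content of the text: the Z statistic uses the textbook normal approximation with mean n'(n'+1)/4 and variance n'(n'+1)(2n'+1)/24, and no tie or continuity correction is applied, so values reported by SPSS may differ slightly. The function name and the sample data are hypothetical.

```python
import math

def wilcoxon_signed_rank(first, second):
    """Paired signed-rank test: rank sums, mean ranks, Z and two-sided p-value."""
    # Differences (second minus first); zero differences are dropped.
    diffs = [b - a for a, b in zip(first, second) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank absolute differences in ascending order, averaging ranks over ties.
    order = sorted(range(n), key=lambda k: abs(diffs[k]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos = [r for r, d in zip(ranks, diffs) if d > 0]
    neg = [r for r, d in zip(ranks, diffs) if d < 0]
    w_plus, w_minus = sum(pos), sum(neg)
    w = min(w_plus, w_minus)

    # Normal approximation (assumed form; no tie or continuity correction).
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability

    return {
        "W+": w_plus,
        "W-": w_minus,
        "U+": w_plus / len(pos) if pos else 0.0,
        "U-": w_minus / len(neg) if neg else 0.0,
        "Z": z,
        "p": p,
        "n_valid": n,
    }

# Hypothetical normalized word frequencies for the same words in two sources.
result = wilcoxon_signed_rank([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(result)
```

With a significance level of 0.05, a returned `p` at or below 0.05 would correspond to the "significant difference" branch described above.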
Preferably, the importance of the vocabulary is expressed as follows:
Figure GDA0002440971430000063
where P(V_i) is the importance of word i in the text, d is the damping coefficient, In(V_i) is the set of text segments containing word i, Out(V_j) is the set of words in text segment j, and |Out(V_j)| is the number of elements in that set.
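As an illustration of how such importance scores can be computed, the sketch below iterates the commonly published unweighted TextRank update, P(V_i) = (1 - d) + d * Σ_{j ∈ In(V_i)} P(V_j) / |Out(V_j)|, over a word co-occurrence graph. Several things here are assumptions, not content of the text: the update is the standard form rather than a transcription of the equation image, the graph is built by linking words that occur within a small window of each other, d = 0.85 is the conventional default, and the function name and tokens are hypothetical.

```python
from collections import defaultdict

def textrank(tokens, window=2, d=0.85, iterations=50):
    """Score words by iterating the unweighted TextRank update on a co-occurrence graph."""
    # Undirected graph: link each word to the words that follow it within `window`.
    neighbors = defaultdict(set)
    for i, w in enumerate(tokens):
        for u in tokens[i + 1 : i + 1 + window]:
            if u != w:
                neighbors[w].add(u)
                neighbors[u].add(w)

    scores = {w: 1.0 for w in set(tokens)}
    for _ in range(iterations):
        # Each word receives an equal share of every neighbor's current score.
        scores = {
            w: (1 - d) + d * sum(scores[u] / len(neighbors[u]) for u in neighbors[w])
            for w in scores
        }
    return scores

# Hypothetical segmented comment text.
tokens = ["roof", "steel", "roof", "glass", "roof", "facade"]
scores = textrank(tokens, window=1)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

A fixed iteration count keeps the sketch short; a convergence threshold on the change in scores would be the more usual stopping rule.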
Preferably, the method further comprises:
S4, classifying the overall comment data of the individual building case by building scheme, and analyzing the elements to which the online public pays attention in the different schemes.
Preferably, in step S4, classifying the overall comment data of the individual building case by building scheme, and analyzing the elements to which the online public pays attention in the different schemes, specifically comprises:
S41, classifying the comments on the building case in the professional building forum by scheme, and converting each class into txt file format;
S42, according to the vocabulary list formed in step S31, using a Chinese word frequency statistics tool to count, for each of the building scheme comment data sets formed in step S41, the frequency, repetition count, percentage and de-duplicated percentage of each word;
S43, according to the word frequency data formed in step S42, taking the high-frequency word data and performing standard normalization, as follows:
supposing that the i-th word frequency in the high-frequency word data is α_i, the standardized value θ_i is obtained after standard normalization; the specific formula is as follows:
Figure GDA0002440971430000071
where i = 1, 2, …, x;
S44, determining the characteristic vocabulary of each building scheme: supposing that the standardized value of the i-th word frequency of the j-th scheme is P_ij, the word frequency significance value of that standardized value is
Figure GDA0002440971430000072
The specific calculation formula is as follows:
Figure GDA0002440971430000073
where i = 1, 2, …, x and j = 1, 2;
S45, taking the words that satisfy
Figure GDA0002440971430000074
as the characteristic vocabulary of each building scheme, thereby obtaining the elements to which the online public pays attention in the different schemes.
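Steps S41 and S42 amount to grouping comments by scheme and counting words within each group, which can be sketched as follows. The sketch stops at per-scheme relative frequencies: the normalization and significance formulas of S43 to S45 exist only as equation images here, so they are not reproduced or guessed. The function name, the scheme labels and the tokens are hypothetical, and the comments are assumed to be segmented already.

```python
from collections import Counter

def scheme_word_shares(comments_by_scheme, top_n=3):
    """For each scheme, return its top_n words with their share of that scheme's tokens."""
    result = {}
    for scheme, token_lists in comments_by_scheme.items():
        counts = Counter(tok for tokens in token_lists for tok in tokens)
        total = sum(counts.values())
        result[scheme] = [(w, c / total) for w, c in counts.most_common(top_n)]
    return result

# Hypothetical segmented comments, already grouped by building scheme (S41).
comments_by_scheme = {
    "scheme 1": [["roof", "steel", "roof"], ["roof", "light"]],
    "scheme 2": [["plaza", "traffic"], ["plaza", "plaza", "green"]],
}
shares = scheme_word_shares(comments_by_scheme, top_n=2)
print(shares)
```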
Compared with the prior art, the invention has the following beneficial effects:
1. The method uses the web text of professional building forum comments on large public building designs: it acquires the forum's web text with LocoySpider software, performs semantic analysis on it with a Chinese word segmentation tool and a Chinese word frequency analysis tool, and performs screening, matching and a nonparametric test against the word-segmentation-class word frequency table of the Modern Chinese Corpus to establish a network building professional corpus, which effectively fills the lack of a relevant corpus in the existing field of building commentary.
2. By analyzing the characteristic vocabulary of an individual building case, the method can analyze the difference in attention that the online public and professional building designers pay to that case. This helps the language of building commentary adapt to the new media environment, allows more of that language to be reflected in building design, and promotes the development of building evaluation and building design.
3. The method can classify the overall comment data of an individual building case by building scheme and analyze it to obtain the characteristic words of each scheme, so that professional building designers can learn which elements the online public pays attention to in the different schemes and determine the most appropriate scheme.
Drawings
Fig. 1 is a flowchart of a building evaluation method according to embodiment 1 of the present invention.
Fig. 2 is a chart comparing the relative proportions of high-frequency words in the network building professional corpus with those of the same words in the Modern Chinese Corpus in embodiment 2 of the present invention.
Fig. 3 is a schematic diagram of the architectural competition schemes for the Zhangjiakou Olympic Sports Center in embodiment 2 of the present invention.
Fig. 4 is a chart of the relative proportions of the characteristic words of the building schemes in embodiment 2 of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Example 1:
As shown in Fig. 1, the building evaluation method of this embodiment uses the web text of comments on large public building designs in a professional building forum to establish a professional corpus of building commentary in the network environment, and analyzes the difference in attention that designers and the online public pay to individual building cases. It comprises the following steps:
1) selecting a professional building forum, acquiring web text with LocoySpider software, and screening and sorting it;
1.1) selecting a professional building forum with sufficient comment samples as the data source;
1.2) creating and editing a new collection task in LocoySpider, analyzing the source code of the professional building forum's web page structure, and selecting the fields immediately before and after each target item as identification strings for capturing the required web page information, where the main tag information obtained by crawling comprises the forum topic, the commenting user name, the comment time, the comment content and the like;
1.3) configuring the content collection rules of the LocoySpider task, and running the task to crawl the relevant data;
1.4) refining and sorting the acquired comment data by the tags of forum topic, commenting user name, comment time and comment content, and removing posts such as forum bulletins and advertisements.
2) performing semantic analysis on the web text with a Chinese word segmentation tool and a Chinese word frequency analysis tool, and performing screening, matching and a nonparametric test against the word-segmentation-class word frequency table of the Modern Chinese Corpus, to establish a network building professional corpus;
2.1) converting the screened and sorted professional building forum comment data into txt text format, and segmenting it with the jieba word segmentation tool to form a vocabulary list of the professional building forum comments;
2.2) according to the vocabulary list formed in step 2.1), using a Chinese word frequency statistics tool to count, for the professional building forum comment data, the frequency, repetition count, percentage and de-duplicated percentage of each word;
2.3) according to the word frequency table of the Modern Chinese Corpus on the corpus online website (www.cncorpus.org), matching and obtaining a certain number of word samples together with their word frequencies in the professional building forum and in the overall Modern Chinese Corpus;
2.4) performing standard normalization on the two groups of word frequency data: supposing that the i-th word frequency of the j-th vocabulary list is α_ij, the standardized value θ_ij is obtained after standard normalization; the specific formula is as follows:
Figure GDA0002440971430000091
where i = 1, 2, …, x and j = 1, 2;
2.5) importing the normalized data into SPSS software, performing a nonparametric test on the two groups of word frequencies with the two-paired-samples nonparametric test command, and judging whether the population distributions of the two paired samples differ significantly, specifically:
following the sign test procedure, the observed values of the first group of samples are subtracted from the corresponding observed values of the second group to give the differences β_ij; a positive difference is marked with a plus sign and a negative difference with a minus sign; when the difference equals 0, the corresponding building individual case is deleted and the number of samples x is reduced accordingly;
the difference data are retained and sorted in ascending order of absolute value to obtain the corresponding rank values β_i, and the positive rank sum W+, the negative rank sum W-, the positive average rank U+ and the negative average rank U- are calculated respectively.
The specific calculation formulas are as follows:
Figure GDA0002440971430000101
U+ = W+/m, U- = W-/n (3)
where m and n denote the numbers of positive and negative rank values, respectively;
the test statistic Z and the associated probability value Sig. are calculated by SPSS and compared with the set significance level to judge whether the two groups of sample data differ significantly;
W = min(W+, W-) (4)
Figure GDA0002440971430000102
where n' is the number of valid samples remaining after the pairs with zero difference have been deleted;
if the obtained probability value is less than or equal to the set significance level, the population distributions from which the two paired samples come are considered to differ significantly; if the obtained probability value is higher than the set significance level, they are considered not to differ significantly;
2.6) when the population distributions of the two paired samples differ significantly, analyzing the importance of the professional building forum vocabulary based on the TextRank algorithm, with the following formula:
Figure GDA0002440971430000103
where P(V_i) is the importance (PR value) of word i in the text, d is the damping coefficient, In(V_i) is the set of text segments containing word i, Out(V_j) is the set of words in text segment j, and |Out(V_j)| is the number of elements in that set; the words are ranked from high to low importance, and the higher a word ranks, the more important it is in the comments;
2.7) according to the word importance data formed in step 2.6), sorting the professional building forum vocabulary from high to low importance, screening out the high-frequency words of the Modern Chinese Corpus according to its word frequency table on the corpus online website, and taking the remaining words as the network building professional vocabulary;
2.8) classifying and sorting the network building professional vocabulary formed in step 2.7) by building type, building function, building shape, traffic layout, building environment, building color, building materials and structure, spatial layout, building results, building components, building roles and the like, and establishing the network building professional corpus;
3) analyzing the feature vocabulary of the individual building case, comparing it with the network building professional corpus, and analyzing the difference in the attention paid to the individual building case by the online public and by professional building designers;
3.1) converting the screened and sorted comment data of the individual building case into txt text format, and performing word segmentation with a Chinese word segmentation tool to form a vocabulary list of the individual building case comments;
3.2) according to the vocabulary list formed in step 3.1), counting the frequency, the repetition count, the percentage and the deduplicated percentage of each word in the individual building case comment data with a Chinese word frequency statistics tool;
3.3) according to the word frequency table of the modern Chinese corpus on the Corpus Online website, matching and obtaining a certain number of word samples together with their word frequencies in the individual building case comments and in the overall modern Chinese corpus;
3.4) standardizing the two groups of word frequency data using formula (1);
3.5) importing the standardized data into SPSS software, performing a nonparametric test on the two groups of word frequencies with the two-paired-samples nonparametric test command, and judging whether the population distributions of the two paired samples differ significantly, using formulas (2) to (5) above;
3.6) when the population distributions of the two paired samples differ significantly, analyzing the importance of the building scheme vocabulary based on the TextRank algorithm, using formula (6);
3.7) according to the word importance data formed in step 3.6), ranking the individual building case vocabulary by importance from high to low, screening out the high-frequency words of the modern Chinese corpus according to the word frequency table of the modern Chinese corpus on the Corpus Online website, and taking the remaining words as the feature vocabulary of the individual building case;
3.8) comparing the feature vocabulary of the individual building case formed in step 3.7) with the network building professional corpus, and analyzing the difference in the attention paid to the individual building case by the online public (ordinary citizens) and by professional building designers;
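The word frequency statistics of steps 3.1) and 3.2) can be sketched as below. The sketch assumes the comments have already been segmented into tokens by a Chinese word segmentation tool, and it computes only the count and the percentage share of each word; the "repetition count" and "deduplicated percentage" are not defined in this text and are therefore not reproduced. The function name and sample tokens are hypothetical.

```python
from collections import Counter


def word_frequency_table(tokens):
    """Return (word, count, percentage of all tokens), most frequent first."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return [(word, n, 100.0 * n / total) for word, n in counts.most_common()]


# Hypothetical, already segmented comment tokens.
tokens = ["gymnasium", "shape", "gymnasium", "cost", "shape", "gymnasium"]
table = word_frequency_table(tokens)
```

Here `table[0]` is `("gymnasium", 3, 50.0)`: the word occurs three times among six tokens.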
4) classifying the overall comment data of the individual building case by building scheme, and analyzing the elements of the different schemes that the online public pays attention to;
4.1) classifying the comments on the individual building case in the professional building forum by scheme, and converting each group into txt file format;
4.2) according to the vocabulary list formed in step 3.1), counting the frequency, the repetition count, the percentage and the deduplicated percentage of each word in each set of building scheme comment data formed in step 4.1) with a Chinese word frequency statistics tool;
4.3) according to the word frequency data formed in step 4.2), taking the high-frequency word data and standardizing them as follows:
suppose the i-th word frequency in the high-frequency word data is α_i and its standard value after standardization is θ_i; the specific formula is as follows:
Figure GDA0002440971430000121
where i = 1, 2, …, x;
4.4) determining the feature vocabulary of each building scheme: suppose the standard value of the i-th word frequency of the j-th scheme is P_ij; then its word frequency significance value is
Figure GDA0002440971430000122
The specific calculation formula is as follows:
Figure GDA0002440971430000123
where i = 1, 2, …, x; j = 1, 2;
4.5) taking
Figure GDA0002440971430000124
vocabulary as the feature vocabulary of the building scheme, thereby obtaining the elements of the different schemes that the online public pays attention to, so that professional building designers can understand those elements and determine the most suitable building scheme.
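The standardization of word frequencies in step 4.3) can be illustrated with the sketch below. The patent's formula (7) is a figure that is not reproduced in this text, so the sketch assumes ordinary z-score standardization (subtract the mean, divide by the population standard deviation); that choice, the function name and the sample frequencies are assumptions, not the patent's stated formula. The word frequency significance value of step 4.4) is likewise given only as a figure and is not sketched.

```python
from statistics import mean, pstdev


def standardize(frequencies):
    """Map raw word frequencies alpha_i to standard values theta_i.

    Assumption: theta_i = (alpha_i - mean) / population standard deviation.
    """
    mu = mean(frequencies)
    sigma = pstdev(frequencies)
    if sigma == 0:
        # All frequencies are equal, so no word stands out.
        return [0.0 for _ in frequencies]
    return [(alpha - mu) / sigma for alpha in frequencies]


# Hypothetical high-frequency word counts for one building scheme.
theta = standardize([2, 4, 4, 4, 5, 5, 7, 9])
```

With these sample counts the mean is 5 and the standard deviation is 2, so the largest count (9) maps to a standard value of 2.0.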
Example 2:
This embodiment is an application example. The comments of the ABBS building forum and of the Zhangjiakou Daily WeChat subscription account's Zhangjiakou Gymnasium design scheme voting platform are selected as the research cases: 4401 posts and 32801 building comments from the building scheme board and the building exchange board of the ABBS building forum, and 4662 comments from the Zhangjiakou Gymnasium design scheme voting platform. The specific implementation steps of the whole process are as follows:
1) selecting the ABBS building forum and the Zhangjiakou Daily WeChat subscription account's Zhangjiakou Gymnasium design scheme voting platform, acquiring the web text with Locoy Spider software, and screening and sorting it.
1.1) analyzing the source code of the ABBS web page structure;
1.2) selecting the corresponding fields before and after the target content as identification strings for capturing the required web page information, the main tag information crawled comprising forum topics, commenter user names, comment times, comment contents and the like;
1.3) configuring the content collection rules of the Locoy Spider task and running the task to crawl the relevant data; because the ABBS building forum has a multi-level site structure, several Locoy Spider tasks are established to obtain the required comment data;
1.4) analyzing the source code of the web page structure of the Zhangjiakou Daily WeChat subscription account's Zhangjiakou Gymnasium design scheme voting platform;
1.5) selecting the corresponding fields before and after the target content as identification strings for capturing the required web page information, the main tag information crawled being the comment content;
1.6) configuring the content collection rules of the Locoy Spider task and running the task to crawl the relevant data;
1.7) completing and sorting the obtained comment data by tags such as forum topic, commenter user name, comment time and comment content, and rejecting irrelevant posts such as forum bulletins and advertisements.
2) Performing semantic analysis of the web text with a Chinese word frequency analysis tool and a Chinese word segmentation tool, and screening, matching and nonparametrically testing it against the word-class word frequency table of the modern Chinese corpus, so as to establish the network building professional corpus.
2.1) converting the screened and sorted forum comment data into txt text format, and performing word segmentation with the "jieba" Chinese word segmentation tool to form a vocabulary list of the ABBS forum comments;
2.2) according to the vocabulary list formed in step 2.1), counting the frequency, the repetition count, the percentage and the deduplicated percentage of each word in the forum comment data formed in step 2.1) with a Chinese word frequency statistics tool;
2.3) according to the word frequency table of the modern Chinese corpus on the Corpus Online website (www.cncorpus.org), matching and obtaining the word samples ranked in the top 50 by the word frequency formed in step 2.2), together with their word frequencies in the building forum and in the overall modern Chinese corpus; a portion of the modern Chinese corpus is shown in Table 1 below.
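The idea in steps 1.2) and 1.5) of using the fields before and after the target content as identification strings can be sketched in a few lines. This is only an illustration of the extraction principle and does not use Locoy Spider itself; the function name, the marker strings and the sample page markup are hypothetical.

```python
def extract_between(page, before, after):
    """Return every substring of `page` enclosed by the two marker strings."""
    results = []
    start = 0
    while True:
        begin = page.find(before, start)
        if begin == -1:
            break
        begin += len(before)
        end = page.find(after, begin)
        if end == -1:
            break
        results.append(page[begin:end].strip())
        start = end + len(after)
    return results


# Hypothetical page source; real forum markup will differ.
page = ('<div class="reply">Nice massing.</div>'
        '<div class="reply"> Too costly. </div>')
comments = extract_between(page, '<div class="reply">', '</div>')
```

For the sample page, `comments` holds the two comment strings with surrounding whitespace removed.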
Figure GDA0002440971430000141
Table 1: Modern Chinese corpus (excerpt)
2.4) standardizing the two groups of word frequency data; see formula (1) of Example 1;
2.5) importing the standardized data into SPSS software, and performing a nonparametric test on the two groups of word frequencies with the two-paired-samples nonparametric test command;
2.5.1) according to the sign test method, subtracting the observed values β_ij of the first group of samples from the corresponding observed values of the second group of samples. If a difference is positive, it is recorded with a positive sign; if it is negative, it is recorded with a negative sign; if it equals 0, the case is deleted and the number of samples x is reduced accordingly.
2.5.2) retaining the difference data, sorting them in ascending order of absolute value, finding the corresponding rank values β_i, and calculating the positive rank sum W+, the negative rank sum W-, the positive average rank U+ and the negative average rank U-; see formulas (2) and (3) of Example 1;
2.5.3) calculating the test statistic Z and the associated probability value Sig with SPSS, and comparing Sig with the set significance level to judge whether the two groups of sample data differ significantly; see formulas (4) and (5) of Example 1;
The absolute value of the test statistic Z obtained by calculation is 114.477 and the associated probability value Sig is 0.000, which shows that the building professional corpus differs significantly from the overall modern Chinese corpus and that the professional building forum has feature vocabulary to be analyzed further.
2.6) analyzing the importance of the building forum vocabulary based on the TextRank algorithm; see formula (6) of Example 1; the words are ranked by importance from high to low, and the higher a word ranks, the more important it is in the comments;
2.7) according to the word importance data formed in step 2.6), ranking the building forum vocabulary by importance from high to low, screening out the high-frequency words of the modern Chinese corpus according to the word frequency table of the modern Chinese corpus on the Corpus Online website, and taking the remaining words as the network building professional vocabulary;
2.8) classifying and sorting the network building professional vocabulary formed in step 2.7) by building type, building function, building shape, traffic layout, building environment, building color, building materials and structure, spatial layout, building results, building components, building roles and the like, and establishing the network building professional corpus shown in Table 2 below; the high-frequency words of the network building professional corpus are compared with the vocabulary of the modern Chinese corpus in fig. 2.
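The paired-samples test of steps 2.5.1) to 2.5.3) can be sketched in plain Python as below. The embodiment runs the test in SPSS and formulas (2) to (5) are figures not reproduced in this text, so the sketch assumes the usual Wilcoxon signed-rank procedure with a normal approximation and no tie correction in the variance; SPSS output may therefore differ slightly when ties occur. The function name and the sample data are hypothetical.

```python
import math


def wilcoxon_signed_rank(first, second):
    """Return (W+, W-, Z, two-sided p) for two paired samples."""
    # Second group minus first group; pairs with zero difference are dropped.
    diffs = [b - a for a, b in zip(first, second) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank absolute differences in ascending order, averaging tied ranks.
    order = sorted(range(n), key=lambda k: abs(diffs[k]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, diff in zip(ranks, diffs) if diff > 0)
    w_minus = sum(r for r, diff in zip(ranks, diffs) if diff < 0)
    w = min(w_plus, w_minus)
    # Normal approximation (assumed form, without tie correction).
    z = (w - n * (n + 1) / 4) / math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    p = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, z, p


# Hypothetical standardized word frequencies for the two corpora.
first = [1.0, 2.0, 3.0, 4.0, 5.0]
second = [1.5, 1.0, 5.0, 4.0, 8.0]
w_plus, w_minus, z, p = wilcoxon_signed_rank(first, second)
```

If `p` is at or below the chosen significance level, the two groups are judged to differ significantly, which is the decision rule the steps above describe.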
Figure GDA0002440971430000151
Table 2: Network building professional corpus
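Steps 2.7) and 2.8) amount to filtering a ranked word list and then grouping what remains. A minimal sketch follows, assuming the general-corpus high-frequency words and the word-to-category mapping are available as plain Python collections; both function names, the sample words and the category assignments are hypothetical and are not taken from Table 2.

```python
def professional_vocabulary(ranked_words, general_high_frequency):
    """Keep ranked words that are not general high-frequency words."""
    common = set(general_high_frequency)
    return [word for word in ranked_words if word not in common]


def group_by_category(words, category_of):
    """Group words by category; unmapped words go to 'uncategorized'."""
    groups = {}
    for word in words:
        groups.setdefault(category_of.get(word, "uncategorized"), []).append(word)
    return groups


# Hypothetical inputs, ordered by importance.
ranked = ["the", "facade", "is", "atrium", "cantilever"]
general = ["the", "is", "of"]
vocab = professional_vocabulary(ranked, general)

category_of = {"facade": "building components", "atrium": "spatial layout"}
corpus = group_by_category(vocab, category_of)
```

The importance order of the input is preserved in `vocab`, so each category in `corpus` also lists its words from most to least important.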
3) Analyzing the feature vocabulary of the individual building case to identify what the online public pays attention to in the individual building case and how this differs from the attention of designers.
3.1) converting the screened and sorted comment data of the individual building case into txt text format, and performing word segmentation with the jieba word segmentation tool to form a vocabulary list of the individual building case comments;
3.2) according to the vocabulary list formed in step 3.1), counting the frequency, the repetition count, the percentage and the deduplicated percentage of each word in the individual building case comment data with a Chinese word frequency statistics tool;
3.3) according to the word frequency table of the modern Chinese corpus on the Corpus Online website, matching and obtaining the word samples ranked in the top 50 by the word frequency formed in step 3.2), together with their word frequencies in the individual building case comments and in the overall modern Chinese corpus;
3.4) standardizing the two groups of word frequency data; see formula (1) of Example 1;
3.5) importing the standardized data into SPSS software, and performing a nonparametric test on the two groups of word frequencies with the two-paired-samples nonparametric test command; see formulas (2) to (5) of Example 1; the absolute value of the test statistic Z obtained by calculation is 7.513 and the associated probability value Sig is 0.000, so the comment vocabulary of the Zhangjiakou Gymnasium individual building case differs significantly from the modern Chinese corpus;
3.6) analyzing the importance of the individual building case vocabulary based on the TextRank algorithm; see formula (6) of Example 1;
3.7) according to the word importance data formed in step 3.6), ranking the individual building case vocabulary by importance from high to low, screening out the high-frequency words of the modern Chinese corpus according to the word frequency table of the modern Chinese corpus on the Corpus Online website, and taking the remaining words as the feature vocabulary of the individual building case;
3.8) comparing the feature vocabulary of the individual building case formed in step 3.7) with the building professional corpus, and analyzing the difference in attention between the online public and professional building designers;
4) classifying the overall comment data of the individual building case by building scheme, and analyzing the elements of the different schemes that the online public pays attention to; the building schemes of this embodiment are shown in fig. 3;
4.1) classifying the comments on the individual building case in the professional building forum by scheme, and converting each group into txt file format;
4.2) according to the vocabulary list formed in step 3.1), counting the frequency, the repetition count, the percentage and the deduplicated percentage of each word in each set of building scheme comment data formed in step 4.1) with a Chinese word frequency statistics tool;
4.3) according to the word frequency data formed in step 4.2), taking the high-frequency word data and standardizing them; see formula (7) of Example 1;
4.4) determining the feature vocabulary of each building scheme: suppose the standard value of the i-th word frequency of the j-th scheme is P_ij; then its word frequency significance value is
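The comparison in step 3.8) can be pictured as a set comparison between the feature vocabulary of the individual building case and the professional corpus. The sketch below is one simple way to do it under that assumption; the function name, the result keys and the sample words are hypothetical.

```python
def attention_difference(case_features, professional_corpus):
    """Split vocabulary into shared, public-only and professional-only words."""
    case = set(case_features)
    professional = set(professional_corpus)
    return {
        "shared": sorted(case & professional),
        "public_only": sorted(case - professional),
        "professional_only": sorted(professional - case),
    }


# Hypothetical word lists.
case_features = ["atmosphere", "shape", "cost", "function"]
professional_corpus = ["shape", "function", "structure"]
difference = attention_difference(case_features, professional_corpus)
```

Words under `public_only` are those the online public raises but the professional corpus does not contain, which is where the attention of the two groups diverges.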
Figure GDA0002440971430000171
See formula (8) of example 1;
4.5) taking
Figure GDA0002440971430000172
vocabulary as the feature vocabulary of the building scheme, as shown in Table 3 below;
Building scheme | Number of comments | Feature vocabulary of the building scheme
Scheme two | 965 | atmosphere, building, characteristic, space, function, shape
Scheme three | 132 | comprehensive, idea, simple, cost
Scheme five | 3222 | building, comprehensive, practical, beautiful, elegant
Table 3: Feature vocabulary of each Zhangjiakou building scheme
The feature vocabularies of the building schemes (schemes two, three and five) are compared in fig. 4. Table 3 and fig. 4 show the elements of each scheme that the online public pays attention to, on the basis of which professional building designers can determine the most suitable building scheme.
In conclusion, the method uses the web text of professional building forum comments on large public building designs: the web text is acquired with Locoy Spider software, semantically analyzed with a Chinese word segmentation tool and a Chinese word frequency analysis tool, and screened, matched and nonparametrically tested against the word-class word frequency table of the modern Chinese corpus to establish the network building professional corpus, which effectively supplements the related corpora currently lacking in the field of building reviews.
The above are only preferred embodiments of the present invention, but the protection scope of the present invention is not limited thereto; any equivalent substitution or change made by a person skilled in the art according to the technical solution and inventive concept of the present invention, within the scope disclosed by the present invention, falls within the protection scope of the present invention.

Claims (7)

1. A building evaluation method based on semantic analysis of web text, characterized in that the method comprises the following steps:
S1, selecting a professional building forum, acquiring web text with Locoy Spider software, and screening and sorting the web text;
S2, performing semantic analysis of the web text with a Chinese word segmentation tool and a Chinese word frequency analysis tool, and screening, matching and nonparametrically testing it against the word-class word frequency table of the modern Chinese corpus to establish a network building professional corpus;
S3, analyzing the feature vocabulary of the individual building case, comparing it with the network building professional corpus, and analyzing the difference in the attention paid to the individual building case by the online public and by professional building designers;
S4, classifying the overall comment data of the individual building case by building scheme, and analyzing the elements of the different schemes that the online public pays attention to;
in step S3, analyzing the feature vocabulary of the individual building case, comparing it with the network building professional corpus, and analyzing the difference in the attention paid to the individual building case by the online public and by professional building designers specifically comprises:
S31, converting the screened and sorted comment data of the individual building case into txt text format, and performing word segmentation with a Chinese word segmentation tool to form a vocabulary list of the individual building case comments;
S32, according to the vocabulary list formed in step S31, counting the frequency, the repetition count, the percentage and the deduplicated percentage of each word in the individual building case comment data with a Chinese word frequency statistics tool;
S33, according to the word frequency table of the modern Chinese corpus on the Corpus Online website, matching and obtaining a certain number of word samples together with their word frequencies in the individual building case comments and in the overall modern Chinese corpus;
S34, standardizing the two groups of word frequency data;
S35, importing the standardized data into SPSS software, performing a nonparametric test on the two groups of word frequencies with the two-paired-samples nonparametric test command, and judging whether the population distributions of the two paired samples differ significantly;
S36, when the population distributions of the two paired samples differ significantly, analyzing the importance of the building scheme vocabulary based on the TextRank algorithm;
S37, according to the word importance data formed in step S36, ranking the individual building case vocabulary by importance from high to low, screening out the high-frequency words of the modern Chinese corpus that appear in the individual building case vocabulary according to the word frequency table of the modern Chinese corpus on the Corpus Online website, and taking the remaining words as the feature vocabulary of the individual building case;
S38, comparing the feature vocabulary of the individual building case formed in step S37 with the network building professional corpus, and analyzing the difference in the attention paid to the individual building case by the online public and by professional building designers.
2. The building evaluation method based on semantic analysis of web text according to claim 1, characterized in that in step S1, selecting a professional building forum, acquiring web text with Locoy Spider software, and screening and sorting the web text specifically comprises:
S11, selecting a professional building forum with sufficient comment samples as the data source;
S12, creating and editing a Locoy Spider task with Locoy Spider software, analyzing the source code of the web page structure of the professional building forum, and selecting the corresponding fields before and after the target content as identification strings for capturing the required web page information, the main tag information crawled comprising the professional building forum topic, the commenter user name, the comment time and the comment content;
S13, configuring the content collection rules of the Locoy Spider task, and running the task to crawl the relevant data;
S14, completing and sorting the acquired comment data by the tags of professional building forum topic, commenter user name, comment time and comment content, and rejecting professional building forum bulletins and advertisement posts.
3. The building evaluation method based on semantic analysis of web text according to claim 1, characterized in that in step S2, performing semantic analysis of the web text with the jieba word segmentation tool and a Chinese word frequency analysis tool, and screening, matching and nonparametrically testing it against the word-class word frequency table of the modern Chinese corpus to establish a network building professional corpus specifically comprises:
S21, converting the screened and sorted professional building forum comment data into txt text format, and performing word segmentation with the jieba word segmentation tool to form a vocabulary list of the professional building forum comments;
S22, according to the vocabulary list formed in step S21, counting the frequency, the repetition count, the percentage and the deduplicated percentage of each word in the professional building forum comment data with a Chinese word frequency statistics tool;
S23, according to the word frequency table of the modern Chinese corpus on the Corpus Online website, matching and obtaining a certain number of word samples together with their word frequencies in the professional building forum and in the overall modern Chinese corpus;
S24, standardizing the two groups of word frequency data;
S25, importing the standardized data into SPSS software, performing a nonparametric test on the two groups of word frequencies with the two-paired-samples nonparametric test command, and judging whether the population distributions of the two paired samples differ significantly;
S26, when the population distributions of the two paired samples differ significantly, analyzing the importance of the professional building forum vocabulary based on the TextRank algorithm;
S27, according to the word importance data formed in step S26, ranking the professional building forum vocabulary by importance from high to low, screening out the high-frequency words of the modern Chinese corpus according to the word frequency table of the modern Chinese corpus on the Corpus Online website, and taking the remaining words as the network building professional vocabulary;
S28, classifying and sorting the network building professional vocabulary formed in step S27 by building type, building function, building shape, traffic layout, building environment, building color, building materials and structure, spatial layout, building results, building components and building roles, and establishing the network building professional corpus.
4. The building evaluation method based on semantic analysis of web text according to claim 1 or 3, characterized in that standardizing the two groups of word frequency data specifically comprises:
suppose the i-th word frequency of the j-th group's vocabulary list is α_ij and its standard value after standardization is θ_ij; the specific formula is as follows:
Figure FDA0002440971420000031
where i = 1, 2, …, n; j = 1, 2.
5. The building evaluation method based on semantic analysis of web text according to claim 1 or 3, characterized in that performing a nonparametric test on the two groups of word frequencies with the two-paired-samples nonparametric test command and judging whether the population distributions of the two paired samples differ significantly specifically comprises:
according to the sign test method, subtracting the observed values β_ij of the first group of samples from the corresponding observed values of the second group of samples; if a difference is positive, recording it with a positive sign; if it is negative, recording it with a negative sign; if it equals 0, deleting the corresponding case and reducing the number of samples x accordingly;
retaining the difference data, sorting them in ascending order of absolute value, finding the corresponding rank values β_i, and calculating the positive rank sum W+, the negative rank sum W-, the positive average rank U+ and the negative average rank U-;
The specific calculation formula is as follows:
Figure FDA0002440971420000041
or
Figure FDA0002440971420000042
U+ = W+/m or U- = W-/n
where m and n are the numbers of positive rank values and negative rank values, respectively;
calculating the test statistic Z and the associated probability value Sig with SPSS, and comparing Sig with the set significance level to judge whether the two groups of sample data differ significantly, as follows:
W = min(W+, W-)
Figure FDA0002440971420000043
where n' is the number of valid samples remaining after the pairs with zero difference are deleted;
if the resulting probability value is less than or equal to the set significance level, the population distributions from which the two paired samples come are considered to differ significantly; if the resulting probability value is greater than the set significance level, they are considered not to differ significantly.
6. The building evaluation method based on semantic analysis of web text according to claim 1 or 3, characterized in that the importance of the vocabulary is calculated by the following formula:
Figure FDA0002440971420000044
where P(V_i) is the importance of word i, d is the damping coefficient, In(V_i) is the set of text segments containing word i, Out(V_j) is the set of text segments of word j, and |Out(V_j)| is the number of elements in that set.
7. The building evaluation method based on semantic analysis of web text according to claim 1, characterized in that in step S4, classifying the overall comment data of the individual building case by building scheme and analyzing the elements of the different schemes that the online public pays attention to specifically comprises:
S41, classifying the comments on the individual building case in the professional building forum by scheme, and converting each group into txt file format;
S42, according to the vocabulary list formed in step S31, counting the frequency, the repetition count, the percentage and the deduplicated percentage of each word in each set of building scheme comment data formed in step S41 with a Chinese word frequency statistics tool;
S43, according to the word frequency data formed in step S42, taking the high-frequency word data and standardizing them as follows:
suppose the i-th word frequency in the high-frequency word data is α_i and its standard value after standardization is θ_i; the specific formula is as follows:
Figure FDA0002440971420000054
where i = 1, 2, …, n;
S44, determining the feature vocabulary of each building scheme: suppose the standard value of the i-th word frequency of the j-th scheme is P_ij; then its word frequency significance value is
Figure FDA0002440971420000051
The specific calculation formula is as follows:
Figure FDA0002440971420000052
where i = 1, 2, …, n; j = 1, 2;
S45, taking
Figure FDA0002440971420000053
vocabulary as the feature vocabulary of the building scheme, thereby obtaining the elements of the different schemes that the online public pays attention to.
CN201611159450.0A 2016-12-15 2016-12-15 Building evaluation method based on semantic analysis of web text Active CN106709824B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611159450.0A CN106709824B (en) 2016-12-15 2016-12-15 Building evaluation method based on semantic analysis of web text

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611159450.0A CN106709824B (en) 2016-12-15 2016-12-15 Building evaluation method based on semantic analysis of web text

Publications (2)

Publication Number Publication Date
CN106709824A CN106709824A (en) 2017-05-24
CN106709824B true CN106709824B (en) 2020-07-28

Family

ID=58937794

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611159450.0A Active CN106709824B (en) 2016-12-15 2016-12-15 Building evaluation method based on semantic analysis of web text

Country Status (1)

Country Link
CN (1) CN106709824B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408808B (en) * 2018-09-12 2023-08-22 中国传媒大学 Evaluation method and evaluation system for literature works
CN111553006B (en) * 2020-04-23 2021-04-13 山东建筑大学 Large-scale rural residential layout generation design method in northern plain area
CN111723208B (en) * 2020-06-28 2023-04-18 西南财经大学 Conditional classification tree-based legal decision document multi-classification method and device and terminal

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009230663A (en) * 2008-03-25 2009-10-08 Kddi Corp Apparatus for detecting abnormal condition in web page, program, and recording medium
CN102135961A (en) * 2010-01-22 2011-07-27 北京金山软件有限公司 Method and device for determining domain feature words
CN102654873A (en) * 2011-03-03 2012-09-05 苏州同程旅游网络科技有限公司 Tourism information extraction and aggregation method based on Chinese word segmentation
CN103226578A (en) * 2013-04-02 2013-07-31 浙江大学 Method for identifying websites and finely classifying web pages in medical field
CN105022725A (en) * 2015-07-10 2015-11-04 河海大学 Text emotional tendency analysis method applied to field of financial Web

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009230663A (en) * 2008-03-25 2009-10-08 Kddi Corp Apparatus for detecting abnormal condition in web page, program, and recording medium
CN102135961A (en) * 2010-01-22 2011-07-27 北京金山软件有限公司 Method and device for determining domain feature words
CN102654873A (en) * 2011-03-03 2012-09-05 苏州同程旅游网络科技有限公司 Tourism information extraction and aggregation method based on Chinese word segmentation
CN103226578A (en) * 2013-04-02 2013-07-31 浙江大学 Method for identifying websites and finely classifying web pages in medical field
CN105022725A (en) * 2015-07-10 2015-11-04 河海大学 Text emotional tendency analysis method applied to field of financial Web

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
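Semantic evaluation of the perceptual awareness of space in ancient-city commercial streets; Li Wenlei (李雯蕾); South Architecture (《南方建筑》); 2016-01-31 (No. 1); pp. 39-45 *
Domain-specific text recognition and classification; Chu Jinzheng (褚金正); China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》); 2005-11-15; pp. 31-35, 47-48 *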

Also Published As

Publication number Publication date
CN106709824A (en) 2017-05-24

Similar Documents

Publication Publication Date Title
CN105824959B (en) Public opinion monitoring method and system
US8239189B2 (en) Method and system for estimating a sentiment for an entity
CN104820629B (en) Intelligent public sentiment accident emergency treatment system and method
CN109829166B (en) People and host customer opinion mining method based on character-level convolutional neural network
CN112699246A (en) Domain knowledge pushing method based on knowledge graph
CN109271477A (en) Method and system for building a taxonomy library from the Internet
CN111950273A (en) Network public opinion emergency automatic identification method based on emotion information extraction analysis
CN109344187B (en) Structured processing system for judicial judgment case information
CN110472203B (en) Article duplicate checking and detecting method, device, equipment and storage medium
CN107315738A (en) Innovation degree evaluation method for text information
CN112015721A (en) E-commerce platform storage database optimization method based on big data
CN108038099B (en) Low-frequency keyword identification method based on word clustering
CN106709824B (en) Building evaluation method based on semantic analysis of web text
CN1687924A (en) Method for producing internet personage information search engine
CN109492105A (en) Text sentiment classification method based on multi-feature ensemble learning
CN108363699A (en) A kind of netizen's school work mood analysis method based on Baidu's mhkc
CN112149422B (en) Dynamic enterprise news monitoring method based on natural language
CN110825998A (en) Website identification method and readable storage medium
CN114077705A (en) Method and system for portraying media account on social platform
CN110910175A (en) Tourist ticket product portrait generation method
KR101838573B1 (en) Place Preference Analysis Method based on Sentimental Analysis using Spatial Sentiment Lexicon
CN111460100A (en) Criminal legal document and criminal name recommendation method and system
CN109543049B (en) Method and system for automatically pushing materials according to writing characteristics
CN104809253B (en) Internet data analysis system
CN115730078A (en) Event knowledge graph construction method and device for class case retrieval and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant