CN115243250B - Method, system and storage medium for acquiring wifi portrait - Google Patents


Info

Publication number
CN115243250B
CN115243250B CN202210880497.5A CN202210880497A CN115243250B CN 115243250 B CN115243250 B CN 115243250B CN 202210880497 A CN202210880497 A CN 202210880497A CN 115243250 B CN115243250 B CN 115243250B
Authority
CN
China
Prior art keywords
wifi
ssid
name
class
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210880497.5A
Other languages
Chinese (zh)
Other versions
CN115243250A (en)
Inventor
尹雅露
莫志强
陈志勇
方宏源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Merit Interactive Co Ltd
Original Assignee
Merit Interactive Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Merit Interactive Co Ltd
Priority to CN202210880497.5A
Publication of CN115243250A
Application granted
Publication of CN115243250B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W8/00 Network data management
    • H04W8/26 Network addressing or numbering for mobility support
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W84/00 Network topologies
    • H04W84/02 Hierarchically pre-organised networks, e.g. paging networks, cellular networks, WLAN [Wireless Local Area Network] or WLL [Wireless Local Loop]
    • H04W84/10 Small scale networks; Flat hierarchical networks
    • H04W84/12 WLAN [Wireless Local Area Networks]

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a method, a system and a storage medium for acquiring wifi portraits. First, the SSID name of a wifi, which contains Chinese characters, is acquired; the SSID name is then input into a preset wifi classification model to obtain the category portrait of the wifi. The preset wifi classification model is obtained by training a plurality of different text-cnn models on a plurality of equal-sized training wifi subsets randomly extracted from a training wifi set, together with the category subsets to which those wifi subsets belong. This significantly improves the robustness of the preset wifi classification model, so that the acquired wifi category portrait is more accurate.

Description

Method, system and storage medium for acquiring wifi portrait
Technical Field
The application relates to the field of data processing, in particular to a method, a system and a storage medium for acquiring wifi portraits.
Background
Because of its convenience, wifi plays an increasingly important role in people's daily work and life. Since the location of a wifi is usually fairly fixed, characterizing the wifi portrait further helps to construct the user portrait of the users who connect to that wifi. Acquiring the wifi portrait from the wifi's name (SSID name) is a common method. In the prior art, people are accustomed to naming wifi with Chinese characters and other characters, so for some wifi the category portrait is easily obtained from the Chinese characters in the name; for example, a wifi of the shopping-mall category may be named after a certain mall. However, even when Chinese characters are used, for a considerable share of wifi the category attribute is still difficult to identify from the name. How to acquire the portrait of a wifi named with Chinese characters is therefore a technical problem to be solved.
Disclosure of Invention
To address the above technical problem, the application adopts the following technical solution: a method for acquiring a wifi portrait, comprising the following steps: S100, acquiring an SSID name of the wifi, wherein the SSID name contains Chinese characters; S200, obtaining the class portrait of the wifi based on the SSID name and a preset wifi classification model.
A system for capturing wifi portraits, the system comprising a processor and a non-transitory computer readable storage medium storing at least one instruction or at least one program, the processor loading and executing the at least one instruction or at least one program to implement the method described above.
A computer-readable storage medium storing a program or instructions that cause a computer to perform the steps of the method described above.
According to the method, the SSID name of a wifi, which contains Chinese characters, is first acquired, and the SSID name is then input into a preset wifi classification model to obtain the class portrait of the wifi. The preset wifi classification model is obtained by training a plurality of different text-cnn models on a plurality of equal-sized training wifi subsets randomly extracted from a training wifi set, together with the category subsets to which those wifi subsets belong. In this way, the robustness of the preset wifi classification model is improved, so that the acquired wifi class portrait is more accurate.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a method for obtaining wifi portrait according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to fall within the scope of the application.
The embodiment of the application provides a method for acquiring wifi portrait, which is shown in figure 1 and comprises the following steps:
S100, acquiring the SSID name of the wifi, wherein the SSID name contains Chinese characters. As those skilled in the art know, the SSID is the wireless network name of the wifi, and the user can change the SSID name as needed. In this step, a regular expression may be used to determine whether the SSID name contains Chinese characters; alternatively, this may be determined against a preset Chinese character library. In one embodiment, the preset Chinese character library includes all Chinese characters; in another embodiment, it is determined from the Chinese characters contained in the SSID names of all acquired wifi.
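The two checks described in S100 can be sketched as follows. This is only an illustration: the function names are hypothetical, and treating "Chinese character" as the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) is an assumption of the sketch, not something the description specifies.

```python
import re

# Assumed definition of "Chinese character": the basic CJK Unified Ideographs
# block. Extension blocks are deliberately left out of this sketch.
HAN_PATTERN = re.compile(r"[\u4e00-\u9fff]")


def contains_chinese(ssid: str) -> bool:
    """Regular-expression check: True if any character falls in the range."""
    return HAN_PATTERN.search(ssid) is not None


def contains_chinese_by_library(ssid: str, char_library: set) -> bool:
    """Library check: True if any character is in the preset character set."""
    return any(ch in char_library for ch in ssid)
```

The library variant corresponds to the second embodiment, where the character set would be collected from the SSID names already acquired.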
S200, obtaining the class portrait of the wifi based on the SSID name and a preset wifi classification model. In the present application, any classification technology in the prior art, such as SVM, may be used as the preset wifi classification model, and in the preferred embodiment of the present application, the preset wifi classification model is a text-cnn model. Specifically, the obtaining of the preset wifi classification model includes:
S301, obtaining an SSID name set S = {SSID_1, SSID_2, ..., SSID_n} of a training wifi set and a corresponding category set L = {L_1, L_2, ..., L_n}, wherein the SSID name of the i-th wifi in the training wifi set is SSID_i, its corresponding category is L_i, L_i is one of M preset categories, SSID_i contains Chinese characters, and 1 ≤ i ≤ n. In the present application, the M preset categories may be customized, for example a traffic category, a peripheral tour category, a hotel category, and so on; these are only examples and are not intended to limit the scope of the present application. Those skilled in the art will recognize that, to improve the robustness of the classification model, the training wifi set should contain as many samples of each of the M preset categories as possible, so as to cover all possible situations.
S302, training the text-cnn model by taking the name set S and the belonging class set L as input data of the text-cnn to obtain the preset wifi classification model. Specifically, the method comprises the following steps:
S3021, obtaining a vectorized name set SV = {SSIDV_1, SSIDV_2, ..., SSIDV_n} based on the name set S. Specifically, in the present application, any vectorization method in the prior art, such as one-hot or word2vec, may be used to vectorize SSID_i into SSIDV_i. In this step, those skilled in the art can count all the Chinese and non-Chinese characters that may be used in wifi SSID names, and vectorize the SSID names based on the total number of counted characters. For example, suppose the counted characters number 5 (assuming only the five characters east, west, south, north and middle) and the SSID names are vectorized with the one-hot method. The vectorized data corresponding to the 5 characters (east, west, south, north, middle) may then be set to 10000, 01000, 00100, 00010 and 00001, and when the SSID name of a wifi is "southeast", vectorizing it yields a 2×5 matrix: [[1, 0, 0, 0, 0]; [0, 0, 1, 0, 0]].
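The one-hot example in S3021 can be reproduced with a short sketch. The helper names are hypothetical, and the character order (east, west, south, north, middle, written here as 东西南北中) is an assumption inferred from the example matrix, in which "south" occupies the third position.

```python
def build_vocab(chars):
    """Map each counted character to its position in the one-hot vector."""
    return {ch: i for i, ch in enumerate(chars)}


def one_hot_encode(ssid, vocab):
    """Return a len(ssid) x len(vocab) matrix with one one-hot row per character."""
    matrix = []
    for ch in ssid:
        row = [0] * len(vocab)
        row[vocab[ch]] = 1  # raises KeyError for characters outside the vocabulary
        matrix.append(row)
    return matrix


# Assumed order: east, west, south, north, middle.
vocab = build_vocab("东西南北中")
southeast = one_hot_encode("东南", vocab)
```

With this ordering, `southeast` is the 2×5 matrix given in the example.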
S3022, obtaining an equal-length name set SH = {SSIDH_1, SSIDH_2, ..., SSIDH_n} based on the vectorized name set SV = {SSIDV_1, SSIDV_2, ..., SSIDV_n}. In this step, the maximum length may be taken from a preset maximum wifi name length or from the longest element in the vectorized name set SV, and zero-padding is applied so that SSIDH_1, SSIDH_2, ..., SSIDH_n all have the same number of matrix rows. For example, following the example in step S3021, if the longest element in SV or the preset maximum wifi name length is 4, the SSID name "southeast" becomes [[1, 0, 0, 0, 0]; [0, 0, 1, 0, 0]; [0, 0, 0, 0, 0]; [0, 0, 0, 0, 0]] after zero-padding, i.e., a 4×5 matrix.
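A minimal sketch of the zero-padding in S3022 follows. The function name is hypothetical, and truncating names longer than the maximum length is an assumption of the sketch; the description only covers the padding case.

```python
def pad_to_length(matrix, max_len, width):
    """Append all-zero rows until the matrix has max_len rows.

    Longer inputs are truncated to max_len rows (an assumption of this sketch).
    """
    padded = [list(row) for row in matrix[:max_len]]
    while len(padded) < max_len:
        padded.append([0] * width)
    return padded


# The 2x5 "southeast" matrix from the S3021 example, padded to 4 rows.
southeast = [[1, 0, 0, 0, 0], [0, 0, 1, 0, 0]]
padded = pad_to_length(southeast, max_len=4, width=5)
```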
S3023, inputting the equal-length name set SH = {SSIDH_1, SSIDH_2, ..., SSIDH_n} and the category set L into a text-cnn model for training to obtain the preset wifi classification model. To obtain the best wifi category portrait, the number of convolution kernel sizes of the text-cnn model is in the range [3, 5], preferably 5, with 128 convolution kernels of each size. Preferably, the convolution kernels have sizes 1, 2, 3, 4 and 5 respectively. Using these five kernel sizes allows the effective features in each wifi's SSID name to be extracted comprehensively, further improving classification accuracy. Table 1 shows part of the recognition accuracy data of the application for different wifi category portraits.
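To make the kernel configuration concrete, the following sketch computes the feature vector that five kernel sizes with 128 kernels each would produce for one padded name matrix. It is not the patented model: the weights are random rather than trained, the ReLU and max-over-time pooling follow the commonly used text-cnn design rather than anything stated here, and the function names are hypothetical. A trained model would pass these features through a classification layer over the M categories. Note that the padded length must be at least the largest kernel size (5), so the example pads to 6 rows.

```python
import random


def conv_max_pool(matrix, kernel):
    """Slide one kernel (k x width weights) down the rows, apply ReLU,
    and keep the largest response (max-over-time pooling)."""
    k = len(kernel)
    responses = []
    for start in range(len(matrix) - k + 1):
        total = 0.0
        for i in range(k):
            for x, w in zip(matrix[start + i], kernel[i]):
                total += x * w
        responses.append(max(total, 0.0))  # ReLU
    return max(responses)


def text_cnn_features(matrix, kernel_sizes=(1, 2, 3, 4, 5), kernels_per_size=128, seed=0):
    """Concatenate the pooled responses of randomly initialised kernels."""
    rng = random.Random(seed)
    width = len(matrix[0])
    features = []
    for k in kernel_sizes:
        for _ in range(kernels_per_size):
            kernel = [[rng.uniform(-1, 1) for _ in range(width)] for _ in range(k)]
            features.append(conv_max_pool(matrix, kernel))
    return features


# One-hot "southeast" matrix padded to 6 rows so that a size-5 kernel fits.
name = [[1, 0, 0, 0, 0], [0, 0, 1, 0, 0]] + [[0] * 5 for _ in range(4)]
features = text_cnn_features(name)
```

The resulting vector has 5 × 128 = 640 entries, one per kernel.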
TABLE 1
                 Predictive score > 0    Predictive score > 0.5    Predictive score > 0.6    Predictive score > 0.7
Accuracy         53%                     66%-70%                   73.5%-75%                 75%-78%
Coverage rate    100%                    80%                       65%                       50%
As can be seen from Table 1, for a given test set, the prediction accuracy can be obtained for different prediction score thresholds. When the threshold is 0, 100% of the samples in the test set have a prediction score > 0, and the prediction accuracy over those samples is 53%. When the threshold is 0.5, 80% of the samples have a prediction score > 0.5, and the prediction accuracy over those samples is 66%-70% (the interval arises because some wifi cannot be objectively verified: if the unverifiable wifi are counted as correctly identified, the accuracy is 70%; otherwise it is 66%). When the threshold is 0.6, 65% of the samples have a prediction score > 0.6, and the prediction accuracy over those samples is 73.5%-75%. When the threshold is 0.7, 50% of the samples have a prediction score > 0.7, and the prediction accuracy over those samples is 75%-78%. According to the method, the SSID name of a wifi, which contains Chinese characters, is first acquired, and the SSID name is then input into a preset wifi classification model to obtain the class portrait of the wifi. Through the preset wifi classification model, the class portrait of the wifi can be effectively obtained.
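The relationship between threshold, coverage rate and accuracy described above can be sketched as follows. The function name is hypothetical and the toy data is made up for illustration; it is not taken from the tables.

```python
def accuracy_and_coverage(scores, predicted, actual, threshold):
    """Coverage: share of samples whose prediction score exceeds the threshold.
    Accuracy: share of those covered samples whose predicted category is right."""
    covered = [i for i, s in enumerate(scores) if s > threshold]
    if not covered:
        return None, 0.0
    correct = sum(1 for i in covered if predicted[i] == actual[i])
    return correct / len(covered), len(covered) / len(scores)


# Made-up toy data for illustration only.
scores = [0.9, 0.8, 0.55, 0.3]
predicted = ["hotel", "traffic", "hotel", "shopping"]
actual = ["hotel", "traffic", "shopping", "hotel"]
acc, cov = accuracy_and_coverage(scores, predicted, actual, threshold=0.5)
```

Raising the threshold keeps fewer samples but, as in the tables, tends to keep the ones the model is more certain about.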
In a preferred embodiment of the present application, the obtaining the preset wifi classification model includes:
S401, obtaining an SSID name set S = {SSID_1, SSID_2, ..., SSID_n} of a training wifi set and a corresponding category set L = {L_1, L_2, ..., L_n}, wherein the SSID name of the i-th wifi in the training wifi set is SSID_i, its corresponding category is L_i, L_i is one of M preset categories, SSID_i contains Chinese characters, and 1 ≤ i ≤ n.
S402, acquiring, based on the SSID name set S and the category set L, the SSID name subsets S_1, S_2, ..., S_K of K groups of training wifi subsets and the corresponding category subsets L_1, L_2, ..., L_K, wherein the j-th name subset S_j = {SSID^j_1, SSID^j_2, ..., SSID^j_m} consists of m names randomly extracted from the SSID name set S, the corresponding category subset is L_j = {L^j_1, L^j_2, ..., L^j_m}, the category corresponding to SSID^j_t is L^j_t, L^j_t ∈ L, m/n equals a preset grouping ratio threshold, 1 ≤ j ≤ K, and 1 ≤ t ≤ m. Specifically, in the present application, the preset grouping ratio threshold has a value range of [0.7, 0.9], preferably 0.8, and K has a value range of [3, 5], preferably 5.
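The subset extraction in S402 can be sketched as below, using the preferred values K = 5 and a grouping ratio of 0.8 as defaults. The function name is hypothetical, and sampling without replacement inside each subset is an assumption; the description only says the names are randomly extracted.

```python
import random


def draw_training_subsets(names, labels, k=5, ratio=0.8, seed=None):
    """Draw k subsets, each holding m = round(ratio * n) (name, label) pairs.

    Within one subset the pairs are sampled without replacement (an assumption
    of this sketch); different subsets are drawn independently and may overlap.
    """
    rng = random.Random(seed)
    n = len(names)
    m = round(ratio * n)
    subsets = []
    for _ in range(k):
        picked = rng.sample(range(n), m)
        subsets.append(([names[i] for i in picked], [labels[i] for i in picked]))
    return subsets
```

Each returned pair of lists keeps every name aligned with its own category, so subset j can be passed directly to the j-th model.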
S403, respectively inputting the SSID name subsets S_1, S_2, ..., S_K and the corresponding category subsets L_1, L_2, ..., L_K into K different text-cnn classification models for training to obtain the preset wifi classification model. In this step, S_1 and L_1 are input into the 1st text-cnn classification model for training to obtain the 1st trained text-cnn model, S_2 and L_2 are input into the 2nd text-cnn classification model for training to obtain the 2nd trained text-cnn model, and so on, so as to obtain K different trained text-cnn models. When the class portrait of a wifi needs to be predicted, the SSID name of the wifi is input into each of the K trained text-cnn models to obtain K class predictions, a vote is taken over the K class predictions, and the class portrait of the wifi is finally obtained. The voting mechanism may employ any prior-art technique, such as the commonly used majority rule, in which the minority yields to the majority.
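The majority-rule vote over the K predictions can be sketched in a few lines. The function name is hypothetical, and the tie-breaking behaviour is a property of this sketch rather than something the description defines.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the category predicted by the most models.

    On a tie, the category that appears first in the list wins, because
    Counter.most_common preserves first-seen order among equal counts.
    """
    return Counter(predictions).most_common(1)[0][0]


# Five hypothetical model outputs for one SSID name.
votes = ["hotel", "shopping", "hotel", "traffic", "hotel"]
winner = majority_vote(votes)
```

Choosing an odd K, such as the preferred value of 5, reduces the chance of ties between two categories, although ties remain possible when more than two categories receive votes.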
To obtain the best wifi category portrait, the number of convolution kernel sizes of the text-cnn model is in the range [3, 5], preferably 5, with 128 convolution kernels of each size. Preferably, the convolution kernels have sizes 1, 2, 3, 4 and 5 respectively. Using these five kernel sizes allows the effective features in each wifi's SSID name to be extracted comprehensively, further improving classification accuracy. Table 2 shows part of the recognition accuracy data for different wifi category portraits based on the preset wifi classification model obtained in this way.
TABLE 2
                 Predictive score > 0    Predictive score > 0.5    Predictive score > 0.6    Predictive score > 0.7
Accuracy         58.9%-72.1%             69.6%-78.5%               72.8%-79.4%               78.3%-83.9%
Coverage rate    100%                    86%                       73%                       59%
According to the contents of table 2, the predicted result of the preset wifi classification model is obviously better than the data in table 1 in terms of coverage rate and accuracy, and the effect of the preset wifi classification model obtained by the above preferred embodiment is better. Table 3 shows partial wifi prediction accuracy data of the preset wifi classification model according to the preferred embodiment of the present application for different wifi class portraits.
TABLE 3
Category portrait                Classification prediction accuracy
Individuals                      0.9
Leisure and recreation           0.9
Company/enterprise               0.8
Peripheral tour                  0.6
Colleges and universities        0.9
Household building materials     0.7
Gourmet food                     0.6
Shopping                         0.8
......                           ......
From the content of Table 3, the method provided by the application can accurately acquire the wifi class portrait, can meet the requirements of people on the wifi class portrait to a certain extent, and has strong adaptability.
The embodiment of the application also provides a system for acquiring wifi portrait, which comprises a processor and a non-transitory computer readable storage medium, wherein the storage medium is used for storing at least one instruction or at least one section of program, and the processor loads and executes the at least one instruction or the at least one section of program to realize the method provided by the embodiment.
A computer-readable storage medium storing a program or instructions that cause a computer to execute the method provided by the above-described embodiment.
Embodiments of the present application also provide a non-transitory computer readable storage medium that may be disposed in an electronic device to store at least one instruction or at least one program for implementing one of the methods embodiments, the at least one instruction or the at least one program being loaded and executed by the processor to implement the methods provided by the embodiments described above.
Embodiments of the present application also provide an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
Embodiments of the present application also provide a computer program product comprising program code for causing an electronic device to carry out the steps of the method according to the various exemplary embodiments of the application as described in the specification, when said program product is run on the electronic device.
While certain specific embodiments of the application have been described in detail by way of example, it will be appreciated by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the application. Those skilled in the art will also appreciate that many modifications may be made to the embodiments without departing from the scope and spirit of the application. The scope of the application is defined by the appended claims.

Claims (4)

1. A method for acquiring a wifi portrait, characterized by comprising the following steps:
S100, acquiring an SSID name of the wifi, wherein the SSID name contains Chinese characters;
S200, acquiring the class portraits of the wifi based on the SSID name and a preset wifi classification model;
the obtaining of the preset wifi classification model comprises the following steps:
S401, acquiring an SSID name set S = {SSID_1, SSID_2, ..., SSID_n} of a training wifi set and a corresponding category set L = {L_1, L_2, ..., L_n}, wherein the SSID name of the i-th wifi in the training wifi set is SSID_i, its corresponding category is L_i, L_i is one of M preset categories, SSID_i contains Chinese characters, and 1 ≤ i ≤ n;
S402, acquiring, based on the SSID name set S and the category set L, the SSID name subsets S_1, S_2, ..., S_K of K groups of training wifi subsets and the corresponding category subsets L_1, L_2, ..., L_K, wherein the j-th name subset S_j = {SSID^j_1, SSID^j_2, ..., SSID^j_m} consists of m names randomly extracted from the SSID name set S, the corresponding category subset is L_j = {L^j_1, L^j_2, ..., L^j_m}, the category corresponding to SSID^j_t is L^j_t, L^j_t ∈ L, m/n equals a preset grouping ratio threshold, 1 ≤ j ≤ K, and 1 ≤ t ≤ m;
S403, respectively inputting the SSID name subset S 1、S2、...、SK and the corresponding category subset L 1、L2、...、LK into K different text-cnn classification models for training to obtain the preset wifi classification model; further, when predicting a class portrait of wifi, respectively inputting SSID names of the wifi into K trained different text-cnn models to obtain K different class predicted values, voting based on the K different class predicted values, and finally obtaining the class portrait of wifi;
the value range of the preset grouping ratio threshold is [0.7, 0.9]; the text-cnn model uses five convolution kernel sizes, with 128 convolution kernels of each size; the convolution kernels have sizes 1, 2, 3, 4 and 5 respectively.
2. The method of claim 1, wherein K has a value range of [3, 5].
3. A system for capturing wifi portraits, the system comprising a processor and a non-transitory computer readable storage medium storing at least one instruction or at least one program, wherein the processor loads and executes the at least one instruction or at least one program to implement the method of claim 1 or 2.
4. A computer readable storage medium storing a program or instructions for causing a computer to perform the steps of the method according to claim 1 or 2.
CN202210880497.5A 2022-07-25 2022-07-25 Method, system and storage medium for acquiring wifi portrait Active CN115243250B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210880497.5A CN115243250B (en) 2022-07-25 2022-07-25 Method, system and storage medium for acquiring wifi portrait

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210880497.5A CN115243250B (en) 2022-07-25 2022-07-25 Method, system and storage medium for acquiring wifi portrait

Publications (2)

Publication Number Publication Date
CN115243250A (en) 2022-10-25
CN115243250B (en) 2024-05-28

Family

ID=83675540

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210880497.5A Active CN115243250B (en) 2022-07-25 2022-07-25 Method, system and storage medium for acquiring wifi portrait

Country Status (1)

Country Link
CN (1) CN115243250B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650721A (en) * 2016-12-28 2017-05-10 吴晓军 Industrial character identification method based on convolution neural network
WO2019192397A1 (en) * 2018-04-04 2019-10-10 华中科技大学 End-to-end recognition method for scene text in any shape
CN112200177A (en) * 2020-07-21 2021-01-08 山东文多网络科技有限公司 Single number identification method and device based on bill picking scanning piece big data
CN113688237A (en) * 2021-08-10 2021-11-23 北京小米移动软件有限公司 Text classification method, and training method and device of text classification network
CN114443840A (en) * 2021-12-27 2022-05-06 天翼云科技有限公司 Text classification method, device and equipment
CN114579763A (en) * 2022-03-08 2022-06-03 安徽理工大学 Character-level confrontation sample generation method for Chinese text classification task

Also Published As

Publication number Publication date
CN115243250A (en) 2022-10-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant