CN115618086A - Method for performing word segmentation analysis based on webpage - Google Patents

Method for performing word segmentation analysis based on webpage

Info

Publication number
CN115618086A
Authority
CN
China
Prior art keywords
data
value
webpage
data area
mouse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211550705.1A
Other languages
Chinese (zh)
Other versions
CN115618086B (en
Inventor
马云
叶伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yonghong Tech Co ltd
Original Assignee
Beijing Yonghong Tech Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yonghong Tech Co ltd
Priority to CN202211550705.1A
Publication of CN115618086A
Application granted
Publication of CN115618086B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Computer Interaction (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention belongs to the technical field of data processing and particularly relates to a method for performing word segmentation analysis based on a webpage, comprising the following steps: a client device requests access to a data webpage from a download server, via an intermediate server and over a communication network, so as to obtain and display the data webpage, which is divided into different data areas for display; on the data webpage, when the mouse focus passes over a data area, the method judges whether to enlarge that data area so as to show the user the data displayed in it; the user selects different data areas by clicking with the mouse, and statistical analysis of the selected data areas is performed automatically, including counting for text-type and date-type data, and counting, summing and averaging for numerical-type data.

Description

Method for performing word segmentation analysis based on webpage
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a method for performing word segmentation analysis based on a webpage.
Background
With the development of computer application technology, it is becoming increasingly common to view data, such as experimental reports, by accessing web pages. This online way of viewing data is more flexible than conventional offline viewing: a user who has a terminal device that can connect to a network can view data at any time, regardless of location. However, with the online data viewing methods of the prior art, data can generally only be viewed and cannot be counted or analyzed, which limits the convenience of online data viewing to a certain extent.
Disclosure of Invention
In the invention, the client device requests access to the data web page from the download server via the intermediate server; the intermediate server periodically downloads the corresponding data web page from the download server and sends the client device only data web pages that the client device has not already downloaded; and the client device in turn displays the data web page.
In order to achieve the above object, the present invention provides a method for performing word segmentation analysis based on a web page, which mainly comprises the following steps:
a client device requests access to a data webpage from a download server, via an intermediate server and over a communication network, so as to obtain and display the data webpage, wherein the data webpage is divided into different data areas for display, and each data area displays a different type of data, the types including text, date and numerical value;
on the data webpage, when the mouse focus passes over a data area, judging whether to enlarge the data area so as to show the user the data displayed in it; the user selects different data areas by clicking with the mouse, and statistical analysis of the selected data areas is then performed automatically, the statistical analysis comprising counting for text-type and date-type data, and counting, summing and averaging for numerical-type data.
As a preferred technical solution of the present invention, a client device applies for accessing a data web page to a download server via an intermediate server through a communication network to obtain and display the data web page, including the following steps:
the intermediate server judges whether a preset time point is reached, when the time point is not reached, the intermediate server repeatedly judges whether the preset time point is reached, and when the time point is reached, the next step is continued;
the intermediate server sends a webpage access request to the download server, downloads a data webpage from the download server, extracts characteristic data from the downloaded data webpage as a second characteristic value, and correspondingly stores the second characteristic value and the downloaded data webpage in a self history table;
the client device sends a webpage access request to the intermediate server, the webpage access request comprises a physical address of the client device, a webpage address accessed by the client device and a first characteristic value of the client device, and the intermediate server forwards the webpage access request to the download server to start downloading a data webpage from the download server;
the intermediate server extracts feature data from the downloaded data web page as a new second feature value, compares the new second feature value with a corresponding second feature value in a history record table of the intermediate server to judge whether the new second feature value is consistent with the corresponding second feature value in the history record table of the intermediate server, immediately stops downloading the data web page from the download server when the new second feature value is consistent with the corresponding second feature value, finishes downloading the data web page from the download server when the new second feature value is inconsistent with the corresponding second feature value, and correspondingly updates the history record table of the intermediate server by using the new second feature value and the corresponding data web page;
and judging the consistency of the first characteristic value of the client equipment and the corresponding second characteristic value of the intermediate server based on a history record table stored by the intermediate server, wherein when the first characteristic value of the client equipment is inconsistent with the corresponding second characteristic value of the intermediate server, the intermediate server sends the second characteristic value and the data webpage corresponding to the second characteristic value to the client equipment, and the client equipment uses the second characteristic value as the first characteristic value of the client equipment when sending a webpage access request next time.
As a preferred technical solution of the present invention, the history table stored in the intermediate server includes a physical address of the client device, a web page address accessed by the client device, a first characteristic value of the client device, a second characteristic value of the intermediate server, and a data web page downloaded by the intermediate server.
As a preferred technical solution of the present invention, on the data web page, when a focus of a mouse passes through the data area, determining whether the data area is enlarged includes the following steps:
taking the upper left corner of the data webpage as an origin, establishing an x coordinate axis in the transverse direction, establishing a y coordinate axis in the longitudinal direction to form a coordinate system of the data webpage, and acquiring a first coordinate value of a mouse focus in the coordinate system when the mouse focus enters the data area;
judging whether a preset time interval passes or not, if not, judging whether the mouse focus is in the data area or not, when the mouse focus is in the data area, repeatedly judging whether the preset time interval passes or not, otherwise, ending all the steps, and if the time interval passes, continuing the next step;
acquiring a second coordinate value of the mouse focus under a coordinate system, connecting the first coordinate value and the second coordinate value under the coordinate system, regarding a connecting line direction from the first coordinate value to the second coordinate value as a moving direction of the mouse focus, and meanwhile, judging whether the mouse focus can reach other data areas if the mouse focus moves along the moving direction all the time;
when the mouse focus would reach another data area if it kept moving in the moving direction, judging that the data area is not to be enlarged; and when it would not reach another data area, judging that the data area is to be enlarged.
As a preferred technical solution of the present invention, on the data web page, when a focus of a mouse passes through the data area, whether the data area is enlarged is determined, which further includes the following steps:
taking the upper left corner of the data webpage as an origin, establishing an x coordinate axis in the transverse direction, establishing a y coordinate axis in the longitudinal direction to form a coordinate system of the data webpage, and acquiring a coordinate value of a mouse focus under the coordinate system when the mouse focus enters the data area;
judging whether a mouse focus is in the data area, if not, finishing all the steps, if so, continuously judging whether a preset time interval passes, if not, skipping to judge whether the mouse focus is in the data area, and if so, continuing the next step;
and acquiring the coordinate value of the mouse focus under the coordinate system again, judging whether the data area is enlarged, jumping to judge whether the mouse focus is in the data area when the data area is judged not to be enlarged, and enlarging the data area when the data area is judged to be enlarged.
As a preferred technical solution of the present invention, the determining whether to enlarge the data area further includes the following steps:
taking the direction of the line from the second most recently acquired coordinate value to the most recently acquired coordinate value as the moving direction of the mouse focus, and judging whether the mouse focus would reach another data area if it kept moving in that direction; if it would reach another data area, continuing to the next step, and otherwise judging that the data area is to be enlarged;
calculating, in the coordinate system, the distance between the most recently acquired coordinate value and the second most recently acquired coordinate value, and dividing that distance by the time interval to obtain the speed value at the most recently acquired coordinate value; then dividing the difference between the speed value at the most recently acquired coordinate value and the speed value at the second most recently acquired coordinate value by the time interval to obtain an acceleration value;
and calculating the speed value that the mouse focus would have on reaching the other data area, based on the speed value at the most recently acquired coordinate value, the acceleration value, and the distance from the most recently acquired coordinate value to the other data area; if this speed value is less than or equal to a preset speed threshold, judging that the data area is to be enlarged, and if it is greater than the preset speed threshold, judging that the data area is not to be enlarged.
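The speed test in the steps above can be illustrated with a short Python sketch. The text does not state how the arrival speed is derived from the speed, the acceleration and the distance, so the sketch assumes uniform acceleration along a straight line (arrival speed squared = v² + 2·a·d) and treats a pointer that would stop before reaching the other data area as arriving with speed zero. The function names, the three-sample interface and the threshold are illustrative assumptions, not part of the disclosure.

```python
import math

Point = tuple[float, float]


def arrival_speed(p0: Point, p1: Point, p2: Point,
                  interval: float, distance_to_area: float) -> float:
    """Estimate the pointer speed on reaching another data area.

    p0, p1, p2 are three consecutive samples of the mouse focus (oldest first),
    taken one preset time interval apart.
    """
    v_prev = math.dist(p0, p1) / interval   # speed at the second most recent sample
    v_last = math.dist(p1, p2) / interval   # speed at the most recent sample
    accel = (v_last - v_prev) / interval    # change of speed per unit time
    # Assumption: uniform acceleration, so v_arrive^2 = v_last^2 + 2 * a * d.
    squared = v_last ** 2 + 2 * accel * distance_to_area
    # A non-positive value means the pointer would stop before arriving.
    return math.sqrt(squared) if squared > 0 else 0.0


def should_enlarge_by_speed(p0: Point, p1: Point, p2: Point, interval: float,
                            distance_to_area: float, threshold: float) -> bool:
    """Enlarge when the estimated arrival speed does not exceed the threshold."""
    return arrival_speed(p0, p1, p2, interval, distance_to_area) <= threshold
```

Under these assumptions, a pointer that is slowing down sharply is taken to be settling on the current data area, while a pointer moving at constant speed toward another area is taken to be passing through.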
The invention also provides a system for performing word segmentation analysis based on the webpage, which mainly comprises the following modules:
the client device module is used for requesting access to the data webpage from the download server module via the intermediate server module so as to obtain and display the data webpage; for judging, when the mouse focus passes over a data area in the data webpage, whether to enlarge the data area so as to show the user the data displayed in it; and for automatically performing statistical analysis of the selected data areas when the user selects different data areas by clicking with the mouse;
the intermediate server module is used for downloading the data web pages from the corresponding downloading server module according to the self history table at a preset time point, generating a second characteristic value according to the downloaded data web pages, processing the web page access request sent by the client equipment module and sending the second characteristic value and the data web pages to the client equipment module;
and the download server module is used for storing and updating the data webpage corresponding to the webpage address accessed by the client equipment module.
Compared with the prior art, the invention has the beneficial effects that at least:
1. in the invention, a client device requests access to a data webpage from a download server via an intermediate server so as to obtain and display the data webpage, which is divided into different data areas for display; when the mouse focus passes over a data area in the data webpage, the invention judges whether to enlarge the data area so as to show the user the data displayed in it, and when the user selects different data areas by clicking with the mouse, statistical analysis of the selected data areas is performed automatically, including counting for text-type and date-type data, and counting, summing and averaging for numerical-type data;
2. the invention solves the problem that prior-art online data viewing methods generally allow data only to be viewed, not counted or analyzed; it sends the client device only data webpages that the client device has not already downloaded, which reduces the communication burden between the client device and the download server; and it decides, according to the user's intention, whether to enlarge a data area when the mouse focus passes over it in the data webpage displayed by the client device, thereby improving the user's experience of viewing data online.
Drawings
FIG. 1 is a flowchart illustrating the steps of a method for performing word segmentation analysis based on a web page according to the present invention;
FIG. 2 is a block diagram of a system for performing a word segmentation analysis based on a web page according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It will be understood that the terms "first," "second," and the like may be used herein to describe various elements, but these elements should not be limited by these terms unless otherwise specified. These terms are only used to distinguish one element from another. For example, a first xx script may be referred to as a second xx script, and similarly, a second xx script may be referred to as a first xx script, without departing from the scope of the present application.
The invention firstly provides a method for performing word segmentation analysis based on a webpage as shown in figure 1, which is mainly realized by executing the following steps:
firstly, a client device requests access to a data webpage from a download server, via an intermediate server and over a communication network, so as to obtain and display the data webpage, wherein the data webpage is divided into different data areas for display, and each data area displays a different type of data, the types including text, date and numerical value;
and secondly, on the data webpage, when the mouse focus passes over a data area, judging whether to enlarge the data area so as to show the user the data displayed in it; the user selects different data areas by clicking with the mouse, and statistical analysis of the selected data areas is performed automatically, the statistical analysis comprising counting for text-type and date-type data, and counting, summing and averaging for numerical-type data.
Specifically, the inventor finds that in practice, as long as a user has a client device capable of connecting to a network, the user can view data at any time by accessing a data webpage regardless of where the user is, but generally only can view the data, but cannot count and analyze the data, so that convenience of the online data viewing method is limited to a certain extent.
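The per-type statistics named in the second step can be sketched as follows. This is a minimal Python illustration under stated assumptions: a selected data area is modelled as a plain list of values of one type, and the function name `summarize` and the result keys are hypothetical; the text does not prescribe any particular data structure.

```python
from datetime import date
from statistics import mean
from typing import Union

Value = Union[str, date, int, float]


def summarize(values: list[Value]) -> dict[str, float]:
    """Return the statistics the text assigns to the data type of one area.

    Text-type and date-type data are only counted; numerical-type data are
    counted, summed and averaged.
    """
    if not values:
        return {"count": 0}
    # bool is a subclass of int in Python, so it is excluded explicitly.
    numeric = all(isinstance(v, (int, float)) and not isinstance(v, bool)
                  for v in values)
    if numeric:
        return {"count": len(values), "sum": sum(values), "average": mean(values)}
    return {"count": len(values)}
```

For example, `summarize([1, 2, 3])` yields a count of 3, a sum of 6 and an average of 2, while `summarize(["a", "b"])` yields only a count of 2.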
Further, the client device applies for accessing the data web page from the download server through the intermediate server via the communication network to obtain and display the data web page, including the following steps:
firstly, the intermediate server judges whether a preset time point has been reached; if not, it repeats the judgment, and once the time point is reached, it continues to the next step;
secondly, the intermediate server sends a webpage access request to the download server, downloads a data webpage from the download server, extracts characteristic data from the downloaded data webpage as a second characteristic value, and correspondingly stores the second characteristic value and the downloaded data webpage in a self history table;
thirdly, the client device sends a webpage access request to the intermediate server, wherein the webpage access request comprises a physical address of the client device, a webpage address accessed by the client device and a first characteristic value of the client device, and the intermediate server forwards the webpage access request to the download server to start downloading a data webpage from the download server;
fourthly, the intermediate server extracts feature data from the downloaded data web page as a new second feature value, compares the new second feature value with a corresponding second feature value in a history record table of the intermediate server to judge whether the new second feature value is consistent with the corresponding second feature value in the history record table of the intermediate server, immediately stops downloading the data web page from the download server when the new second feature value is consistent with the corresponding second feature value, finishes downloading the data web page from the download server when the new second feature value is inconsistent with the corresponding second feature value, and correspondingly updates the history record table of the intermediate server by using the new second feature value and the corresponding data web page;
and fifthly, judging the consistency between the first characteristic value of the client device and the second characteristic value of the corresponding intermediate server based on the history table stored in the intermediate server, wherein when the first characteristic value of the client device is inconsistent with the second characteristic value of the corresponding intermediate server, the intermediate server sends the second characteristic value and the data webpage corresponding to the second characteristic value to the client device, and the client device uses the second characteristic value as the first characteristic value of the client device when sending the webpage access request next time.
Specifically, the inventor considers that online data viewing requires accessing a page on a server and sending that page to the user's client device. If the page holds a large amount of data and the server sends the whole page every time the client device requests access to it, the communication burden between the client device and the server becomes heavy and the experience of online data viewing suffers. The first to fifth steps above are provided to address this technical problem.
In the first to fifth steps, first, when the preset time point arrives, the intermediate server downloads from the download server the data web pages corresponding to the entries in its history table, extracts feature data from each downloaded data web page as that page's second feature value, and stores the second feature value together with the downloaded data web page in the history table. The preset time point may be midnight each day, so that one occurs every twenty-four hours, and the feature data may be the update time of the data web page, placed at the head of the data web page. Second, the client device sends a web page access request to the intermediate server, the intermediate server forwards the request to the download server, and the download server allows the download of the data web page to begin. The intermediate server then determines a new second feature value from the data web page being downloaded and looks up the history table using the client device's web page access request, so as to compare the new second feature value with the corresponding second feature value in the history table. When the two are consistent, the data web page on the download server has not changed since the intermediate server last downloaded it, and the download is stopped immediately; when they are inconsistent, the download is completed and the history table is updated with the new second feature value and the corresponding data web page. Finally, the first feature value of the client device is compared with the corresponding second feature value of the intermediate server. When the first feature value and the second feature value differ, the second feature value and the corresponding data web page are sent to the client device; when they are the same, the client device has already downloaded the corresponding data web page, and the data web page is not sent again. The first to fifth steps thus ensure that only data web pages that have not yet been downloaded are sent to the client device.
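The flow of the first to fifth steps can be sketched in Python as follows. This is a simplified model under stated assumptions, not the disclosed implementation: the download server is a plain dictionary mapping a web page address to an (update time, page body) pair, the update time plays the role of the feature value, the history table is keyed by web page address only, and the class and method names are hypothetical. Reading the update time before the body stands in for inspecting the head of the page while the download is still in progress.

```python
from typing import Optional


class IntermediateServer:
    def __init__(self, origin: dict[str, tuple[str, str]]):
        self.origin = origin   # hypothetical stand-in for the download server
        self.history: dict[str, tuple[str, str]] = {}  # address -> (second value, page)
        self.full_downloads = 0

    def _refresh(self, address: str) -> None:
        """Start a download and abandon it if the feature value is unchanged."""
        new_value, body = self.origin[address]   # the head of the page arrives first
        cached = self.history.get(address)
        if cached is not None and cached[0] == new_value:
            return                               # consistent: stop downloading
        self.full_downloads += 1                 # inconsistent: finish the download
        self.history[address] = (new_value, body)

    def scheduled_download(self, addresses: list[str]) -> None:
        """Steps one and two: refresh every tracked page at the preset time point."""
        for address in addresses:
            self._refresh(address)

    def handle_request(self, address: str,
                       first_value: Optional[str]) -> tuple[str, Optional[str]]:
        """Steps three to five: return (second value, page or None)."""
        self._refresh(address)
        second_value, page = self.history[address]
        if first_value == second_value:
            return second_value, None            # client already holds this version
        return second_value, page
```

The client would keep the returned second feature value and send it as its first feature value with the next request, so that an unchanged page is never transferred twice.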
Further, the history table stored in the intermediate server includes a physical address of the client device, an address of a web page accessed by the client device, a first characteristic value of the client device, a second characteristic value of the intermediate server, and the data web page, corresponding to the second characteristic value, that the intermediate server downloaded from the download server. Specifically, the client device here refers to a client device to which the intermediate server has previously sent a data webpage; the first feature value of the client device is substantially the same as the second feature value that the intermediate server last sent to that client device; and the second feature value of the intermediate server is the one that the intermediate server determined from the most recently downloaded data webpage.
Further, on the data web page, when the focus of the mouse passes through the data area, whether the data area is enlarged is judged, which includes the following steps:
firstly, taking the upper left corner of the data webpage as an origin, establishing an x coordinate axis in the transverse direction and a y coordinate axis in the longitudinal direction to form a coordinate system of the data webpage, and acquiring a first coordinate value of the mouse focus in the coordinate system when the mouse focus enters a data area;
secondly, judging whether a preset time interval has elapsed; if it has not, judging whether the mouse focus is still in the data area, and if the focus is still in the data area, repeating the judgment of whether the preset time interval has elapsed, otherwise ending all the steps; if the time interval has elapsed, continuing to the next step;
thirdly, acquiring a second coordinate value of the mouse focus in the coordinate system, connecting the first coordinate value and the second coordinate value in the coordinate system, taking the direction of the line from the first coordinate value to the second coordinate value as the moving direction of the mouse focus, and judging whether the mouse focus would reach another data area if it kept moving in that direction;
and fourthly, when the mouse focus would reach another data area by continuing in the moving direction, judging that the data area is not to be enlarged, and when it would not reach another data area, judging that the data area is to be enlarged.
Specifically, when the client device displays the data web page, if a data area were enlarged every time the mouse focus passed over it, data areas that the user does not intend to view might be enlarged, which would degrade the user experience.
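The direction test of the first to fourth steps can be sketched in Python as follows. The sketch assumes that data areas are axis-aligned rectangles in the page coordinate system (origin at the upper left corner, y increasing downward) and uses a standard ray-versus-rectangle slab test to decide whether the pointer, continuing from the second coordinate value along its moving direction, would reach another area. The `Rect` type, the function names and the treatment of a stationary pointer are assumptions for illustration.

```python
from typing import NamedTuple

Point = tuple[float, float]


class Rect(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def ray_hits(rect: Rect, origin: Point, direction: Point) -> bool:
    """Slab test: does origin + t * direction (t >= 0) pass through rect?"""
    t_min, t_max = 0.0, float("inf")
    axes = ((origin[0], direction[0], rect.left, rect.right),
            (origin[1], direction[1], rect.top, rect.bottom))
    for o, d, low, high in axes:
        if d == 0:
            if not low <= o <= high:
                return False          # parallel to this slab and outside it
        else:
            t1, t2 = (low - o) / d, (high - o) / d
            t_min = max(t_min, min(t1, t2))
            t_max = min(t_max, max(t1, t2))
            if t_min > t_max:
                return False
    return True


def should_enlarge(first: Point, second: Point, other_areas: list[Rect]) -> bool:
    """Enlarge only if the moving direction leads to no other data area."""
    direction = (second[0] - first[0], second[1] - first[1])
    if direction == (0, 0):
        return True   # assumption: a stationary pointer signals intent to view
    return not any(ray_hits(area, second, direction) for area in other_areas)
```

With another data area to the right of the pointer, moving right suppresses the enlargement, whereas moving down or left does not.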
Further, on the data web page, when the focus of the mouse passes through the data area, whether the data area is enlarged is judged, and the method further comprises the following steps:
firstly, establishing an x coordinate axis in the transverse direction and a y coordinate axis in the longitudinal direction by taking the upper left corner of the data webpage as an origin to form a coordinate system of the data webpage, and acquiring a coordinate value of a mouse focus under the coordinate system when the mouse focus enters the data area;
secondly, judging whether the mouse focus is in the data area; if not, ending all the steps; if so, judging whether a preset time interval has elapsed; if the interval has not elapsed, jumping back to judging whether the mouse focus is in the data area, and if it has elapsed, continuing to the next step;
and thirdly, acquiring coordinate values of the mouse focus under the coordinate system again, judging whether the data area is enlarged or not, jumping to judge whether the mouse focus is in the data area or not when the data area is judged not to be enlarged, and enlarging the data area when the data area is judged to be enlarged.
Specifically, through the first to third steps above the inventor further provides a second method of determining whether to enlarge the data area according to the user's intention. First, the coordinate system of the data webpage is again established, and when the mouse focus enters the data area, the coordinate value of the mouse focus is acquired and stored. Second, it is determined whether the mouse focus is in the data area: if it is not, no further steps are executed; if it is, it is determined whether the preset time interval has elapsed, and if the interval has not elapsed, the check of whether the mouse focus is in the data area is repeated. Finally, once the time interval has elapsed, the coordinate value of the mouse focus is acquired and stored again, and it is determined whether to enlarge the data area: if the result is not to enlarge, the process returns to determining whether the mouse focus is in the data area; if the result is to enlarge, the data area is enlarged and displayed. In this way, a plurality of coordinate values of the mouse focus are obtained successively through the first to third steps.
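The loop formed by the first to third steps can be sketched as follows. This is a simplified simulation rather than an event-driven browser implementation: it assumes the pointer positions are supplied as a sequence in which the first item is the position on entering the data area and each later item is the position when one more preset time interval has elapsed. The callables `in_area` and `decide` are hypothetical placeholders for the area test and the enlargement decision.

```python
def hover_loop(positions, in_area, decide):
    """Simulate the sampling loop of the second method.

    positions: non-empty iterable of (x, y); item 0 is taken on entering the
               data area, each later item after one more time interval.
    in_area:   callable (x, y) -> bool, True while the focus is in the area.
    decide:    callable history -> bool, True if the area should be enlarged.
    Returns (enlarged, history).
    """
    it = iter(positions)
    history = [next(it)]          # step 1: store the coordinate on entry
    for pos in it:
        if not in_area(pos):      # step 2: focus left the area, stop
            return False, history
        history.append(pos)       # step 3: interval elapsed, sample again
        if decide(history):
            return True, history  # enlarge and display the data area
        # otherwise fall through and repeat from step 2
    return False, history
```

Keeping the whole history, rather than only the latest two samples, is a design choice made here so that a decision function needing three samples (for speed and acceleration) can be plugged in unchanged.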
Further, determining whether to enlarge the data area further comprises the following steps:
firstly, taking the direction of the line from the second most recently acquired coordinate value to the most recently acquired coordinate value as the moving direction of the mouse focus, and determining whether the mouse focus would reach another data area if it kept moving in that direction; if it would reach another data area, proceeding to the next step, and otherwise determining to enlarge the data area;
secondly, calculating the distance value between the third most recently acquired coordinate value and the second most recently acquired coordinate value in the coordinate system, and dividing this distance value by the time interval to obtain the speed value at the second most recently acquired coordinate value; calculating the distance value between the second most recently acquired coordinate value and the most recently acquired coordinate value in the coordinate system, and dividing this distance value by the time interval to obtain the speed value at the most recently acquired coordinate value; and dividing the difference obtained by subtracting the speed value at the second most recently acquired coordinate value from the speed value at the most recently acquired coordinate value by the time interval to obtain an acceleration value;
and thirdly, calculating the speed value of the mouse focus on reaching the other data area based on the speed value at the most recently acquired coordinate value, the acceleration value, and the distance value from the most recently acquired coordinate value to the other data area; determining to enlarge the data area if this speed value is less than or equal to a preset speed threshold, and determining not to enlarge the data area if it is greater than the preset speed threshold.
Specifically, whether to enlarge the data area is determined from the coordinate values of the mouse focus acquired successively in the first to third steps of the second method. The direction of the line from the second most recently acquired coordinate value to the most recently acquired coordinate value is taken as the future moving direction of the mouse focus, where the second most recently acquired coordinate value is the one acquired immediately before the most recent one. If the mouse focus, continuing in this direction, would not reach any other data area, the data area should be enlarged and displayed. If it would reach another data area, the distance value between the third most recently acquired coordinate value and the second most recently acquired coordinate value is divided by the time interval, and the result is taken as the speed value at the second most recently acquired coordinate value; if no third most recently acquired coordinate value exists, it is directly determined not to enlarge the data area. Likewise, the distance value between the second most recently acquired coordinate value and the most recently acquired coordinate value is divided by the time interval to give the speed value at the most recently acquired coordinate value, and the difference between the two speed values (the latter minus the former) divided by the time interval gives the acceleration value. The mouse focus is then treated as moving in a straight line with uniform acceleration at this acceleration value, starting at the speed value at the most recently acquired coordinate value and covering the straight-line distance, along the moving direction, from the most recently acquired coordinate value to the other data area, so that the speed value of the mouse focus on reaching the other data area can be calculated. If this speed value is less than or equal to the speed threshold, the mouse focus will not actually reach the other data area, and the data area should be enlarged and displayed; if it is greater than the speed threshold, the mouse focus can reach the other data area, and the data area should not be enlarged. The speed threshold may be 0.
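The speed and acceleration calculation described above can be sketched as follows. The sketch assumes the standard uniform-acceleration relation v² = v₀² + 2·a·d for the arrival speed, and it assumes (this is an interpretation, not stated in the source) that when v₀² + 2·a·d is not positive the pointer stops before reaching the other data area, so the arrival speed is reported as 0. The function names and the `history` list layout (oldest sample first) are illustrative.

```python
import math


def arrival_speed(before_previous, previous, latest, interval, distance_to_area):
    """Estimated speed of the mouse focus on reaching the other data area."""
    v_prev = math.dist(before_previous, previous) / interval  # speed at `previous`
    v_last = math.dist(previous, latest) / interval           # speed at `latest`
    accel = (v_last - v_prev) / interval
    # Uniform acceleration along a straight line: v^2 = v0^2 + 2 * a * d.
    squared = v_last ** 2 + 2.0 * accel * distance_to_area
    # Assumption: a non-positive value means the pointer stops short of the area.
    return math.sqrt(squared) if squared > 0.0 else 0.0


def should_enlarge_by_speed(history, interval, distance_to_area, threshold=0.0):
    """history: sampled (x, y) positions, oldest first, newest last."""
    if len(history) < 3:
        # No third most recent sample: directly decide not to enlarge.
        return False
    speed = arrival_speed(*history[-3:], interval, distance_to_area)
    return speed <= threshold
```

For example, a pointer that covered 10 units and then 5 units in successive unit intervals is decelerating; with 100 units left to the next data area it is predicted to stop short, so the current area is enlarged, whereas with only 2 units left it is predicted to arrive and the area is not enlarged.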
Referring to fig. 2, the present invention further provides a system for performing word segmentation analysis based on a webpage, comprising a client device module, an intermediate server module, and a download server module, which are used to implement the method for performing word segmentation analysis based on a webpage described above. The functions of the modules are as follows:
the client device module is configured to request access to the data webpage from the download server module via the intermediate server module so as to obtain and display the data webpage; to determine, when the mouse focus passes over a data area in the data webpage, whether to enlarge the data area so as to show the user the data displayed in it; and, when the user selects different data areas by mouse click, to automatically perform the statistical analysis function on the selected data areas;
the intermediate server module is configured to download data webpages from the corresponding download server module at a preset time point according to its own history table, to generate a second characteristic value from each downloaded data webpage, to process webpage access requests sent by the client device module, and to send the second characteristic value and the data webpage to the client device module;
and the download server module is configured to store and update the data webpage corresponding to the webpage address accessed by the client device module.
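The interaction between the client device module and the intermediate server module can be illustrated with a minimal sketch. Several points are assumptions made only for illustration: the second characteristic value is computed here as a SHA-256 digest of the page bytes, the download server is represented by a hypothetical `fetch` callable, the history table is keyed by webpage address only, and no page is resent when the client's first characteristic value already matches. The sketch also omits the early termination of an in-progress download described in the claims.

```python
import hashlib


class IntermediateServer:
    """Simplified stand-in for the intermediate server module."""

    def __init__(self, fetch):
        self._fetch = fetch   # hypothetical callable: url -> page bytes
        self._history = {}    # url -> (second characteristic value, page)

    def refresh(self, url):
        """Scheduled download: store the page and its characteristic value."""
        page = self._fetch(url)
        # Assumption: a content digest serves as the characteristic value.
        value = hashlib.sha256(page).hexdigest()
        self._history[url] = (value, page)

    def handle_request(self, url, first_value):
        """Return (second characteristic value, page or None) for a client."""
        if url not in self._history:
            self.refresh(url)
        second_value, page = self._history[url]
        if first_value == second_value:
            # Assumption: the client copy is current, so nothing is resent.
            return second_value, None
        return second_value, page
```

A client would keep the returned value and send it as its first characteristic value in its next webpage access request, so that an unchanged page is not transferred again.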
It should be understood that, although the steps in the flowcharts of the embodiments of the present invention are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated otherwise herein, the order of the steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in the various embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, any combination should be considered to fall within the scope of the present specification as long as the combined features do not contradict one another.
The above embodiments express only several implementations of the present invention. Their description is relatively specific and detailed, but is not to be construed as limiting the scope of the present invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included therein.

Claims (7)

1. A method for performing word segmentation analysis based on a webpage is characterized by comprising the following steps:
the method comprises the steps that client equipment applies for accessing a data webpage to a download server through an intermediate server through a communication network so as to obtain and display the data webpage, wherein the data webpage is divided into different data areas for displaying, and each data area is used for displaying different types of data including text types, date types and numerical value types;
on the data webpage, when the mouse focus passes over a data area, determining whether to enlarge the data area so as to show the user the data displayed in the data area; and, when the user selects different data areas by mouse click, automatically performing a statistical analysis function on the selected data areas, wherein the statistical analysis function comprises counting for text-type data and date-type data, and counting, summing and averaging for numerical-type data.
2. A method for performing word segmentation analysis based on web pages as claimed in claim 1, wherein the client device applies for accessing the data web page to the download server via the intermediate server through the communication network to obtain and display the data web page, comprising the following steps:
the intermediate server judges whether a preset time point is reached, when the time point is not reached, the intermediate server repeatedly judges whether the preset time point is reached, and when the time point is reached, the next step is continued;
the intermediate server sends a webpage access request to the download server, downloads a data webpage from the download server, extracts characteristic data from the downloaded data webpage as a second characteristic value, and correspondingly stores the second characteristic value and the downloaded data webpage in a self history table;
the client device sends a webpage access request to the intermediate server, the webpage access request comprises a physical address of the client device, a webpage address accessed by the client device and a first characteristic value of the client device, and the intermediate server forwards the webpage access request to the download server to start downloading a data webpage from the download server;
the intermediate server extracts feature data from the downloaded data web page as a new second feature value, compares the new second feature value with a corresponding second feature value in a history record table of the intermediate server to judge whether the new second feature value is consistent with the corresponding second feature value in the history record table of the intermediate server, immediately stops downloading the data web page from the download server when the new second feature value is consistent with the corresponding second feature value, finishes downloading the data web page from the download server when the new second feature value is inconsistent with the corresponding second feature value, and correspondingly updates the history record table of the intermediate server by using the new second feature value and the corresponding data web page;
and judging the consistency of the first characteristic value of the client equipment and the corresponding second characteristic value of the intermediate server based on a history record table stored by the intermediate server, wherein when the first characteristic value of the client equipment is inconsistent with the corresponding second characteristic value of the intermediate server, the intermediate server sends the second characteristic value and the data webpage corresponding to the second characteristic value to the client equipment, and the client equipment uses the second characteristic value as the first characteristic value of the client equipment when sending a webpage access request next time.
3. The method of claim 2, wherein the intermediate server stores the history table comprising a physical address of the client device, an address of a web page accessed by the client device, a first characteristic value of the client device, a second characteristic value of the intermediate server, and a data web page downloaded by the intermediate server.
4. The method for performing word segmentation analysis based on the web page as claimed in claim 1, wherein the step of determining whether the data area is enlarged when the focus of the mouse passes through the data area on the data web page comprises the following steps:
taking the upper left corner of the data webpage as an origin, establishing an x coordinate axis in the transverse direction, establishing a y coordinate axis in the longitudinal direction to form a coordinate system of the data webpage, and acquiring a first coordinate value of a mouse focus under the coordinate system when the mouse focus enters the data area;
judging whether a preset time interval passes or not, if not, judging whether the mouse focus is in the data area or not, when the mouse focus is in the data area, repeatedly judging whether the preset time interval passes or not, otherwise, finishing all the steps, and if the time interval passes, continuing the next step;
acquiring a second coordinate value of the mouse focus under a coordinate system, connecting the first coordinate value and the second coordinate value under the coordinate system, regarding a connecting line direction from the first coordinate value to the second coordinate value as a moving direction of the mouse focus, and meanwhile, judging whether the mouse focus can reach other data areas if the mouse focus moves along the moving direction all the time;
when the mouse focus is moved in the moving direction all the time and can reach the other data area, the data area is judged not to be enlarged, and when the mouse focus is moved in the moving direction all the time and can not reach the other data area, the data area is judged to be enlarged.
5. The method for performing word segmentation analysis based on a web page as claimed in claim 1, wherein the step of judging whether the data area is enlarged when the mouse focus passes through the data area on the data web page further comprises the following steps:
taking the upper left corner of the data webpage as an origin, establishing an x coordinate axis in the transverse direction, establishing a y coordinate axis in the longitudinal direction to form a coordinate system of the data webpage, and acquiring a coordinate value of a mouse focus in the coordinate system when the mouse focus enters the data area;
judging whether a mouse focus is in the data area, if not, finishing all the steps, if so, continuously judging whether a preset time interval passes, if not, skipping to judge whether the mouse focus is in the data area, and if so, continuing the next step;
and acquiring the coordinate value of the mouse focus under the coordinate system again, judging whether the data area is enlarged, jumping to judge whether the mouse focus is in the data area when the data area is judged not to be enlarged, and enlarging the data area when the data area is judged to be enlarged.
6. The method of claim 5, wherein determining whether to enlarge the data area further comprises:
taking the direction of the line from the second most recently acquired coordinate value to the most recently acquired coordinate value as the moving direction of the mouse focus, and determining whether the mouse focus would reach another data area if it kept moving in that direction; if it would reach another data area, proceeding to the next step, and otherwise determining to enlarge the data area;
calculating a distance value between the coordinate value acquired for the last third time and the coordinate value acquired for the last second time in the coordinate system, simultaneously using the distance value divided by a time interval as a speed value of the coordinate value acquired for the last second time, calculating a distance value between the coordinate value acquired for the last second time and the coordinate value acquired for the last first time in the coordinate system, simultaneously using the distance value divided by the time interval as a speed value of the coordinate value acquired for the last first time, and further using a difference value of the speed value of the coordinate value acquired for the last first time minus the speed value of the coordinate value acquired for the last second time divided by the time interval to obtain an acceleration value;
and calculating a speed value of the mouse focus when the mouse focus reaches other data areas based on the speed value of the coordinate value acquired most recently for the first time, the acceleration value and a distance value from the coordinate value acquired most recently for the first time to other data areas, judging to enlarge the data areas if the speed value is less than or equal to a preset speed threshold, and judging not to enlarge the data areas if the speed value is greater than the preset speed threshold.
7. A system for performing word segmentation analysis based on a web page, for implementing the method according to any one of claims 1 to 6, comprising the following modules:
the client device module is configured to request access to the data webpage from the download server module via the intermediate server module so as to obtain and display the data webpage; to determine, when the mouse focus passes over a data area in the data webpage, whether to enlarge the data area so as to show the user the data displayed in it; and, when the user selects different data areas by mouse click, to automatically perform the statistical analysis function on the selected data areas;
the intermediate server module is used for downloading the data web pages from the corresponding downloading server module according to the self history table at a preset time point, generating a second characteristic value according to the downloaded data web pages, processing the web page access request sent by the client equipment module and sending the second characteristic value and the data web pages to the client equipment module;
and the download server module is used for storing and updating the data webpage corresponding to the webpage address accessed by the client equipment module.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211550705.1A CN115618086B (en) 2022-12-05 2022-12-05 Method for performing word segmentation analysis based on webpage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211550705.1A CN115618086B (en) 2022-12-05 2022-12-05 Method for performing word segmentation analysis based on webpage

Publications (2)

Publication Number Publication Date
CN115618086A true CN115618086A (en) 2023-01-17
CN115618086B CN115618086B (en) 2023-03-28

Family

ID=84880939

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211550705.1A Active CN115618086B (en) 2022-12-05 2022-12-05 Method for performing word segmentation analysis based on webpage

Country Status (1)

Country Link
CN (1) CN115618086B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050131868A1 (en) * 2003-12-10 2005-06-16 National Chiao Tung University Method for web content filtering
CN1959679A (en) * 2006-09-25 2007-05-09 北京爱笛星科技有限公司 Method for picking-up, and aggregating micro content of web page, and automatic updating system
CN101504671A (en) * 2009-03-05 2009-08-12 阿里巴巴集团控股有限公司 Visible processing method, apparatus and system for web page access behavior of users
CN101968715A (en) * 2010-10-15 2011-02-09 华南理工大学 Brain computer interface mouse control-based Internet browsing method
CN103902164A (en) * 2014-04-11 2014-07-02 魏新成 System and method for word-capturing search in browser window by clicking left mouse button
CN104881478A (en) * 2015-06-02 2015-09-02 吴小宇 Web page positioning identification system and method
CN106446128A (en) * 2016-09-20 2017-02-22 *** Tracing method and device of webpage access tracks
CN109947967A (en) * 2017-10-10 2019-06-28 腾讯科技(深圳)有限公司 Image-recognizing method, device, storage medium and computer equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
高世辉 (Gao Shihui): "Design and Implementation of a Network Operation and Maintenance Data Visualization ***" *

Also Published As

Publication number Publication date
CN115618086B (en) 2023-03-28

Similar Documents

Publication Publication Date Title
CN107784516B (en) Advertisement putting method and device
EP3819821A1 (en) User feature generating method, device, and apparatus, and computer-readable storage medium
CN108959644B (en) Search ranking method and device, computer equipment and storage medium
CN111031336B (en) Live broadcast list data updating method and device, electronic equipment and storage medium
CN108804548B (en) Test data query method, device, computer equipment and storage medium
CN109753603B (en) Product recommendation information display method and device, computer equipment and storage medium
CN107679077B (en) Paging implementation method and device, computer equipment and storage medium
CN112131331B (en) Map data processing method, map data processing device, computer equipment and storage medium
CN110555164B (en) Method, device, computer equipment and storage medium for generating group interest labels
CN109753421B (en) Service system optimization method and device, computer equipment and storage medium
CN110321480B (en) Recommendation information pushing method and device, computer equipment and storage medium
CN111488736B (en) Self-learning word segmentation method, device, computer equipment and storage medium
CN115618086B (en) Method for performing word segmentation analysis based on webpage
CN111026912B (en) IPTV-based collaborative recommendation method, device, computer equipment and storage medium
CN110503296B (en) Test method, test device, computer equipment and storage medium
CN110688400A (en) Data processing method, data processing device, computer equipment and storage medium
CN110555082A (en) data processing method, data processing device, computer equipment and storage medium
CN113038283B (en) Video recommendation method and device and storage medium
CN114331266A (en) Recommendation model training method and device and loading and unloading point recommendation method and device
CN114168876A (en) Page display method and device, computer equipment and computer readable storage medium
CN110162542B (en) Data page turning method and device based on cassandra, computer equipment and storage medium
CN109656948B (en) Bitmap data processing method and device, computer equipment and storage medium
CN109284260B (en) Big data file reading method and device, computer equipment and storage medium
CN113296860B (en) Progress bar refreshing method and device, computer equipment and readable storage medium
CN110472136B (en) Query result pushing method and device, storage medium and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A method for word segmentation analysis based on web pages

Granted publication date: 20230328

Pledgee: Industrial and Commercial Bank of China Limited Beijing Pilot Free Trade Zone International Business Service Area Sub-branch

Pledgor: BEIJING YONGHONG TECH CO.,LTD.

Registration number: Y2024110000217