CN111598114A - Method for determining hidden state sequence and method for determining functional type of block - Google Patents

Method for determining hidden state sequence and method for determining functional type of block

Info

Publication number
CN111598114A
CN111598114A
Authority
CN
China
Prior art keywords
hidden
probability
state
time slice
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910127322.5A
Other languages
Chinese (zh)
Other versions
CN111598114B (en)
Inventor
李勇
夏彤
金德鹏
孙福宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Tencent Technology Shenzhen Co Ltd
Tencent Dadi Tongtu Beijing Technology Co Ltd
Original Assignee
Tsinghua University
Tencent Technology Shenzhen Co Ltd
Tencent Dadi Tongtu Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University, Tencent Technology Shenzhen Co Ltd, Tencent Dadi Tongtu Beijing Technology Co Ltd filed Critical Tsinghua University
Priority to CN201910127322.5A priority Critical patent/CN111598114B/en
Publication of CN111598114A publication Critical patent/CN111598114A/en
Application granted granted Critical
Publication of CN111598114B publication Critical patent/CN111598114B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/29 Graphical models, e.g. Bayesian networks
    • G06F18/295 Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0204 Market segmentation
    • G06Q30/0205 Location or geographical consideration
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Complex Calculations (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a method for determining a hidden state sequence. The method includes: obtaining an observation sequence corresponding to a target block; determining, based on the observation sequence, the initial state probability and the state transition probability corresponding to the target block in a hidden Markov model, and the Gaussian distribution mean and Gaussian distribution variance commonly corresponding to the candidate blocks related to the hidden Markov model, the local probabilities that the target block is respectively in each hidden state of the hidden Markov model in each time slice covered by the observation sequence, and determining the back pointers respectively corresponding to the local probabilities; determining the hidden state of the target block in the last time slice covered by the observation sequence based on the maximum local probability among the local probabilities of the target block being in each hidden state in the last time slice; and performing optimal path backtracking based on the hidden state of the target block in the last time slice and the back pointers to obtain the hidden state sequence, so that the state transition situation of the block can be determined.

Description

Method for determining hidden state sequence and method for determining functional type of block
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for determining a hidden state sequence, a method and an apparatus for determining a functional type of a street block, a computer-readable storage medium, and a computer device.
Background
With the development of computer technology, urban areas are increasingly modeled based on observation data corresponding to the urban area (such as behavior data of population activities in the urban area), so as to evaluate the population flow characteristics of the urban area.
Traditionally, the co-occurrence relationship among time, place, and population activities is modeled through representation learning (such as cross-modal representation learning); for example, the activity of "eating" usually occurs at restaurant-type places during lunch or dinner hours. As shown in fig. 1, after the model learns the co-occurrence relationship among time, place, and population activities, any one of the three items can be used to infer the possible situations of the other two; for example, the model can be queried at different times so as to recover the pattern of change of population activities at different places. However, the conventional method does not support determining the state transition situation of a block, and therefore has certain limitations.
Disclosure of Invention
Based on this, it is necessary to provide a method and an apparatus for determining a hidden state sequence, a method and an apparatus for determining a function type of a neighborhood, a computer-readable storage medium, and a computer device, for solving the technical problem that the conventional technology does not support determining a state transition situation of the neighborhood.
A method of determining a sequence of hidden states, comprising:
acquiring an observation sequence corresponding to a target block;
determining local probabilities that the target block is respectively in each hidden state of the hidden Markov model in each time slice covered by the observation sequence based on the observation sequence, the initial state probability corresponding to the target block in the hidden Markov model, the state transition probability corresponding to the target block, the Gaussian distribution mean value corresponding to each candidate block related to the hidden Markov model, and the Gaussian distribution variance corresponding to each candidate block, and determining reverse pointers corresponding to each local probability;
determining the hidden state of the target block in the last time slice covered by the observation sequence based on the maximum local probability of the local probabilities of the target block in the hidden states in the last time slice;
and performing optimal path backtracking based on the hidden state of the target block in the last time slice and each back pointer to obtain a hidden state sequence.
A method for determining a functional type of a neighborhood, comprising:
acquiring observation sequences corresponding to all candidate blocks related to the hidden Markov model;
respectively determining local probabilities that each candidate block is in each hidden state of the hidden Markov model in each time slice covered by the corresponding observation sequence, based on the initial state probabilities respectively corresponding to the candidate blocks in the hidden Markov model, the state transition probabilities respectively corresponding to the candidate blocks, the Gaussian distribution mean commonly corresponding to the candidate blocks, and the Gaussian distribution variance commonly corresponding to the candidate blocks, and respectively determining, based on the local probabilities, the reverse pointers respectively corresponding to the local probabilities;
respectively determining the hidden state of each candidate block in the last time slice covered by the observation sequence based on the maximum local probability of the local probabilities of each candidate block in the hidden state in the last time slice;
performing optimal path backtracking based on the hidden state of each candidate block in the last time slice and each back pointer to obtain a hidden state sequence corresponding to each candidate block;
and clustering is carried out on the basis of the hidden state sequences respectively corresponding to the candidate blocks, and the function types of the candidate blocks are respectively determined from the candidate function types on the basis of the clustering result.
An apparatus for determining a sequence of hidden states, comprising:
the first observation sequence acquisition module is used for acquiring an observation sequence corresponding to a target block;
a first intermediate parameter determining module, configured to determine, based on the observation sequence, an initial state probability corresponding to the target block in a hidden markov model, a state transition probability corresponding to the target block, a gaussian distribution mean value corresponding to each candidate block related to the hidden markov model in common, and a gaussian distribution variance corresponding to each candidate block in common, local probabilities that the target block is in each hidden state of the hidden markov model in each time slice covered by the observation sequence, respectively, and determine reverse pointers corresponding to each local probability, respectively;
a first end hidden state determining module, configured to determine, based on a maximum local probability of local probabilities that the target block is in each hidden state in a last time slice covered by the observation sequence, a hidden state in which the target block is located in the last time slice;
and the first hidden state sequence determining module is used for performing optimal path backtracking on the basis of the hidden state of the target block in the last time slice and each back pointer to obtain a hidden state sequence.
An apparatus for determining a functional type of a neighborhood, comprising:
the second observation sequence acquisition module is used for acquiring observation sequences corresponding to the candidate blocks related to the hidden Markov model;
a second intermediate parameter determining module, configured to determine, based on an initial state probability corresponding to each candidate block in the hidden markov model, a state transition probability corresponding to each candidate block, a gaussian distribution mean value corresponding to each candidate block, and a gaussian distribution variance corresponding to each candidate block, local probabilities that each candidate block is in each hidden state of the hidden markov model in each time slice covered by the observation sequence, respectively, and determine, based on each local probability, reverse pointers corresponding to each local probability, respectively;
a second end hidden state determining module, configured to determine, based on a maximum local probability of local probabilities of the candidate blocks being in the hidden states in a last time slice covered by the observation sequence, a hidden state of the candidate blocks in the last time slice;
a second hidden state sequence determining module, configured to perform optimal path backtracking based on the hidden state of each candidate block in the last time slice and each backward pointer, so as to obtain a hidden state sequence corresponding to each candidate block;
and the function type determining module is used for clustering based on the hidden state sequences respectively corresponding to the candidate blocks and respectively determining the function type of each candidate block from the candidate function types based on the clustering result.
A computer-readable storage medium, storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method as described above.
A computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the method as described above.
Based on the scheme, the local probabilities of the hidden states of the hidden markov model of the target block in the time slices covered by the observation sequence are determined based on the observation sequence corresponding to the target block, the initial state probability corresponding to the target block in the hidden markov model, the state transition probability corresponding to the target block, the mean gaussian distribution commonly corresponding to the candidate blocks related to the hidden markov model and the variance of the gaussian distribution commonly corresponding to the candidate blocks, and the corresponding hidden state sequence is obtained according to the local probabilities. Therefore, the hidden state sequence corresponding to the block is obtained through the hidden Markov model, the state transition condition of the block can be determined, and the limitation in the traditional mode is broken.
Drawings
FIG. 1 is a modeling result based on characterization learning in the conventional art;
FIG. 2 is a diagram of an application environment for determination of a hidden state sequence in one embodiment;
FIG. 3 is a flow diagram illustrating the determination of a hidden state sequence in one embodiment;
FIG. 4 is a diagram illustrating street blocks within a central city of Beijing city as determined from road networks in one embodiment;
FIG. 5 is a diagram illustrating a mapping between hidden states and active behavior features in one embodiment;
FIG. 6 is a diagram illustrating an observation sequence corresponding to a target block in one embodiment;
FIG. 7 is a schematic flow chart diagram illustrating a method for training a hidden Markov model in one embodiment;
FIG. 8 is a flowchart illustrating a method for determining a functional type of a neighborhood according to an embodiment;
FIG. 9 is a diagram illustrating hidden states corresponding to a street block under actual test according to an embodiment;
FIG. 10 is a diagram illustrating the distribution of functions in a neighborhood of a city under actual test in one embodiment;
FIG. 11 is a graphical illustration of the predicted outcome of population mobility behavior in actual testing in one embodiment;
FIG. 12 is a block diagram showing the structure of a hidden state sequence determining apparatus according to an embodiment;
FIG. 13 is a block diagram showing the configuration of a function type determining apparatus of a neighborhood in one embodiment;
FIG. 14 is a block diagram showing a configuration of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In this document, with respect to descriptions of numerical ranges, terms such as "or more" are understood to be inclusive of the stated number; for example, "two or more" means equal to or greater than two.
The method for determining the hidden state sequence provided by the embodiments of the present application can be applied to the application environment shown in fig. 2. The application environment may relate to the terminal 210 and the server 220, and the terminal 210 and the server 220 may be connected through a network.
Specifically, the terminal 210 acquires an observation sequence corresponding to the target block and sends the observation sequence to the server 220. The server 220 receives an observation sequence corresponding to a target block; further, based on the observation sequence, the initial state probability corresponding to the target block in the hidden Markov model, the state transition probability corresponding to the target block, the Gaussian distribution mean value corresponding to all candidate blocks related to the hidden Markov model, and the Gaussian distribution variance corresponding to all candidate blocks, the local probability that the target block is in each hidden state of the hidden Markov model in each time slice covered by the observation sequence is determined, and a reverse pointer corresponding to each local probability is determined; determining the hidden state of the target block in the last time slice covered by the observation sequence based on the maximum local probability of the local probabilities of the target block in the hidden states in the last time slice; and then, performing optimal path backtracking based on the hidden state of the target block in the last time slice and each back pointer to obtain a hidden state sequence.
In other embodiments, the terminal 210 may also independently perform the above-mentioned series of steps from obtaining the observation sequence corresponding to the target block to obtaining the hidden state sequence, without the participation of the server 220.
The terminal 210 may specifically include at least one of a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, a wearable device, and the like, but is not limited thereto. Server 220 may be implemented as a stand-alone server or as a server cluster comprised of multiple servers.
In one embodiment, as shown in FIG. 3, a method of determining a sequence of hidden states is provided. The method is applied to a computer device (such as the terminal 210 or the server 220 in fig. 2) for example. The method may include the following steps S302 to S308.
S302, acquiring an observation sequence corresponding to the target block.
A block is a polygonal geographic area bounded by the geographic boundaries of streets. Specifically, the geographic boundaries of streets may be extracted from the road network of a city, so that the blocks within the city are determined based on the extracted geographic boundaries. FIG. 4 shows the 665 blocks in the central urban area of Beijing determined from the secondary road network of Beijing. It can be understood that the road network is a natural division of the basic geographic units of human activities in a city; the blocks determined by the road network therefore tend to be more homogeneous in function, and people living in the same block tend to have similar life patterns.
Correspondingly, the target block is a block which needs to determine a hidden state sequence corresponding to the observation sequence based on the corresponding observation sequence through the hidden markov model. In this embodiment, a hidden markov model may be trained in advance based on each observation sequence corresponding to each of two or more blocks, the two or more blocks may be candidate blocks related to the hidden markov model, and the target block may be selected from each candidate block related to the hidden markov model. For example, a hidden markov model is trained based on the observation sequences corresponding to 665 blocks in the central city of beijing city shown in fig. 4, the 665 blocks are candidate blocks related to the hidden markov model, and the target block can be selected from the 665 blocks.
The observation sequence corresponding to the target block may include population activity data of the target block in two or more time slices. The population activity data may relate to activity behavior features corresponding to the population activities of the block, such as population flow numbers and access frequencies for predetermined types of Points of Interest (POI). Specifically, the population flow numbers may include the number of people moving in, the number of people staying, and the number of people moving out; the predetermined types of points of interest may include at least one of the 9 types of restaurant, company, institution, shopping, service, attraction, entertainment, education, and residence, for example the 4 types of restaurant, education, attraction, and residence, or all 9 types of restaurant, company, institution, shopping, service, attraction, entertainment, education, and residence.
Assuming that the target block is the r-th candidate block related to the hidden Markov model, the observation sequence corresponding to the target block can be represented as O_r = {O_{r,1}, O_{r,2}, O_{r,3}, ..., O_{r,N}}, where N represents the total number of time slices covered by the observation sequence corresponding to the r-th candidate block, and O_{r,n} represents the population activity data of the r-th candidate block in the n-th time slice, n = 1, 2, 3, ..., N. For example, for April 2018, a total of 30 days, each day is divided into 24 time slices at intervals of 1 hour, giving 720 time slices in total; if the population activity data of the r-th block in these 720 time slices form an observation sequence, then N equals 720 and the observation sequence can be represented as O_r = {O_{r,1}, O_{r,2}, O_{r,3}, ..., O_{r,720}}.
Further, O_{r,n} = {O_{r,n,1}, O_{r,n,2}, O_{r,n,3}, ..., O_{r,n,M}}, where M represents the total number of activity behavior features covered by the population activity data in the observation sequence corresponding to the r-th block, and O_{r,n,m} represents the m-th activity behavior feature in the population activity data of the r-th candidate block in the n-th time slice, m = 1, 2, 3, ..., M. For example, if the activity behavior features related to the population activity data in the observation sequence corresponding to the r-th block are the number of people moving in, the number of people staying, the number of people moving out, and the access frequencies for the restaurant, education, attraction, and residence types of points of interest, a total of 7 activity behavior features are involved and M equals 7. For another example, if the activity behavior features are the number of people moving in, the number of people staying, the number of people moving out, and the access frequencies for the 9 types of points of interest of restaurant, company, institution, shopping, service, attraction, entertainment, education, and residence, a total of 12 activity behavior features are involved and M equals 12.
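By way of a non-limiting illustration, the observation sequence described above can be held as a two-dimensional array; the sketch below assumes Python with NumPy, the 7-feature example given above, and placeholder random values, none of which are prescribed by this description.

```python
import numpy as np

# Illustrative sketch: observation sequence O_r for one block, assuming
# N = 720 hourly time slices (30 days x 24 hours) and M = 7 activity
# behavior features (move-in, stay, move-out, plus 4 POI visit frequencies).
N_TIME_SLICES = 720
FEATURES = ["move_in", "stay", "move_out",
            "restaurant", "education", "attraction", "residence"]
M_FEATURES = len(FEATURES)

# O_r[n, m] is the m-th activity behavior feature of the block in the
# n-th time slice (filled here with random placeholder values).
rng = np.random.default_rng(0)
O_r = rng.random((N_TIME_SLICES, M_FEATURES))

print(O_r.shape)  # (720, 7)
```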
S304, based on the observation sequence, the initial state probability corresponding to the target block in the hidden Markov model, the state transition probability corresponding to the target block, the Gaussian distribution mean value corresponding to all candidate blocks related to the hidden Markov model, and the Gaussian distribution variance corresponding to all candidate blocks, determining the local probability that the target block is respectively in each hidden state of the hidden Markov model in each time slice covered by the observation sequence, and determining the reverse pointer corresponding to each local probability.
In this embodiment, the model parameters of the hidden Markov model may include the initial state probability π_r corresponding to each candidate block related to the hidden Markov model, the state transition probability A_r corresponding to each candidate block, the Gaussian distribution mean μ commonly corresponding to the candidate blocks, and the Gaussian distribution variance σ commonly corresponding to the candidate blocks. That is, the model parameters of the hidden Markov model can be expressed as θ = {π_r, A_r, μ, σ}, r = 1, 2, 3, ..., R, where R is the total number of candidate blocks involved in the hidden Markov model.
For example, a hidden Markov model is trained in advance based on the observation sequences corresponding to 3 blocks, where the 3 blocks are the candidate blocks related to the hidden Markov model. The model parameters of the hidden Markov model may then include: the initial state probability π_1 and the state transition probability A_1 corresponding to the 1st candidate block, the initial state probability π_2 and the state transition probability A_2 corresponding to the 2nd candidate block, the initial state probability π_3 and the state transition probability A_3 corresponding to the 3rd candidate block, the Gaussian distribution mean μ commonly corresponding to the 3 candidate blocks, and the Gaussian distribution variance σ commonly corresponding to the 3 candidate blocks.
Suppose the r-th candidate block related to the hidden Markov model is determined as the target block. The initial state probability corresponding to the target block is then the initial state probability π_r corresponding to the r-th candidate block, which may include: the probabilities that the r-th candidate block is respectively in each hidden state of the hidden Markov model in the 1st time slice covered by the observation sequence. Specifically, π_r can be represented by a 1 × K matrix, i.e., π_r = [π_{r,1} π_{r,2} π_{r,3} ... π_{r,K}], where K represents the total number of hidden states of the hidden Markov model and π_{r,k} represents the probability that the r-th candidate block is in the k-th hidden state of the hidden Markov model in the 1st time slice, k = 1, 2, 3, ..., K.
The state transition probability corresponding to the target block is the state transition probability A_r corresponding to the r-th candidate block, which may include: the probabilities that the r-th candidate block transitions between every two hidden states of the hidden Markov model. Specifically, A_r can be represented as a K × K matrix, namely:

$$A_r = \begin{bmatrix} A_{r,1,1} & A_{r,1,2} & \cdots & A_{r,1,K} \\ A_{r,2,1} & A_{r,2,2} & \cdots & A_{r,2,K} \\ \vdots & \vdots & \ddots & \vdots \\ A_{r,K,1} & A_{r,K,2} & \cdots & A_{r,K,K} \end{bmatrix}$$

where A_{r,j,k} represents the probability of the r-th candidate block transitioning from the j-th hidden state to the k-th hidden state, k = 1, 2, 3, ..., K, j = 1, 2, 3, ..., K, and K represents the total number of hidden states of the hidden Markov model.
The Gaussian distribution mean μ commonly corresponding to the candidate blocks involved in the hidden Markov model may include: the means of the Gaussian distributions obeyed by the probabilities of generating each activity behavior feature related to the population activity data in the observation sequence under the condition that a candidate block is in each hidden state of the hidden Markov model. Specifically, μ can be represented as a K × M matrix, i.e.:

$$\mu = \begin{bmatrix} \mu_{1,1} & \mu_{1,2} & \cdots & \mu_{1,M} \\ \mu_{2,1} & \mu_{2,2} & \cdots & \mu_{2,M} \\ \vdots & \vdots & \ddots & \vdots \\ \mu_{K,1} & \mu_{K,2} & \cdots & \mu_{K,M} \end{bmatrix}$$

where μ_{k,m} represents the mean of the Gaussian distribution obeyed by the probability of generating the m-th activity behavior feature under the condition that a candidate block is in the k-th hidden state of the hidden Markov model, k = 1, 2, 3, ..., K, m = 1, 2, 3, ..., M, K is the total number of hidden states of the hidden Markov model, and M is the total number of activity behavior features related to the population activity data in the observation sequence.
The Gaussian distribution variance σ commonly corresponding to the candidate blocks involved in the hidden Markov model includes: the variances of the Gaussian distributions obeyed by the probabilities of generating each activity behavior feature related to the population activity data in the observation sequence under the condition that a candidate block is in each hidden state of the hidden Markov model. Like μ, σ can be represented as a K × M matrix, i.e.:

$$\sigma = \begin{bmatrix} \sigma_{1,1} & \sigma_{1,2} & \cdots & \sigma_{1,M} \\ \sigma_{2,1} & \sigma_{2,2} & \cdots & \sigma_{2,M} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{K,1} & \sigma_{K,2} & \cdots & \sigma_{K,M} \end{bmatrix}$$

where σ_{k,m} represents the variance of the Gaussian distribution obeyed by the probability of generating the m-th activity behavior feature under the condition that a candidate block is in the k-th hidden state of the hidden Markov model, k = 1, 2, 3, ..., K, m = 1, 2, 3, ..., M, K is the total number of hidden states of the hidden Markov model, and M is the total number of activity behavior features related to the population activity data in the observation sequence.
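By way of a non-limiting illustration, the parameter set θ = {π_r, A_r, μ, σ} described above can be grouped as follows; the container and field names are assumptions of this sketch, while the array shapes follow the 1 × K, K × K, and K × M matrices defined above.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class HMMParams:
    """Sketch of the parameter set theta = {pi_r, A_r, mu, sigma}."""
    pi: np.ndarray     # shape (R, K): initial state probability per candidate block
    A: np.ndarray      # shape (R, K, K): state transition probability per candidate block
    mu: np.ndarray     # shape (K, M): Gaussian means shared by all candidate blocks
    sigma: np.ndarray  # shape (K, M): Gaussian variances shared by all candidate blocks

R, K, M = 665, 100, 12  # e.g. 665 blocks, 100 hidden states, 12 activity features
params = HMMParams(
    pi=np.full((R, K), 1.0 / K),      # uniform placeholder initialization
    A=np.full((R, K, K), 1.0 / K),
    mu=np.zeros((K, M)),
    sigma=np.ones((K, M)),
)
```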
Hidden states are parameters that can be used to characterize the population activity characteristics of a block. In particular, the hidden states may be used to characterize the population density, population flow, and population activity type of a block. The population density and population flow can be represented by the population flow numbers of the block, such as the number of people moving in, the number of people moving out, and the number of people staying, and the population activity type can be represented by the access frequencies for predetermined types of points of interest. The hidden states of the hidden Markov model can be preset based on actual requirements, for example as the 100 hidden states (State_1 to State_100) shown in fig. 5. Any one of the hidden states State_1 to State_100 has 12 characteristic parameters, which are respectively used for representing the number of people moving into the block, the number of people moving out (Leaving), the number of people staying, and the access frequencies for points of interest of the restaurant (Restaurant), company (Company), institution (Agency), shopping (Shopping), service (Service), attraction, entertainment, education (Education), and residence (Residence) types.
For the hidden Markov model, when the observation sequence and the model parameters of the hidden Markov model are known, decoding can be performed by the Viterbi algorithm, so as to determine the hidden state sequence corresponding to the observation sequence.
The decoding process is a recursive calculation process, that is, for each time slice covered by the observation sequence corresponding to the target block, based on the emission probability of the population activity data in the time slice in the observation sequence corresponding to the target block under the condition that the target block is in the target hidden state of the hidden markov model in the time slice, the state transition probability corresponding to the target block in the hidden markov model, and the local probabilities that the target block is in the hidden states of the hidden markov model in the previous time slice adjacent to the time slice, the local probability that the target block is in the target hidden state in the time slice is determined. It can be understood that, the hidden states of the hidden markov model are sequentially used as target hidden states, so that the local probabilities that the target block is respectively in the hidden states of the hidden markov model in the time slices covered by the corresponding observation sequences can be determined.
Specifically, the local probability that the target block is in the k-th hidden state of the hidden Markov model in the n-th time slice may refer to the maximum value among the probabilities corresponding to all state transition paths that end with the target block being in the k-th hidden state in the n-th time slice, and may be denoted δ_n(k).

The local probability δ_n(k) that the target block is in the k-th hidden state of the hidden Markov model in the n-th time slice can be calculated by the following formula:

$$\delta_n(k) = \left[\max_{1 \le j \le K} \delta_{n-1}(j)\, A_{r,j,k}\right] \cdot b_k(O_{r,n})$$

where δ_{n-1}(j) represents the local probability that the target block is in the j-th hidden state of the hidden Markov model in the (n-1)-th time slice; A_{r,j,k} represents the probability of transitioning from the j-th hidden state to the k-th hidden state; max_{1≤j≤K} δ_{n-1}(j) A_{r,j,k} represents the maximum value of δ_{n-1}(1)A_{r,1,k}, δ_{n-1}(2)A_{r,2,k}, δ_{n-1}(3)A_{r,3,k}, ..., and δ_{n-1}(K)A_{r,K,k}; b_k(O_{r,n}) represents the emission probability of generating the population activity data in the n-th time slice in the observation sequence corresponding to the target block under the condition that the target block is in the k-th hidden state in the n-th time slice; and K represents the total number of hidden states of the hidden Markov model.
It can be understood that, for the 1st time slice, there is no preceding time slice, so the local probability δ_1(k) that the target block is in the k-th hidden state of the hidden Markov model in the 1st time slice can be obtained by initialization according to the following formula:

$$\delta_1(k) = \pi_{r,k} \cdot b_k(O_{r,1})$$

where π_{r,k} represents the probability that the target block is in the k-th hidden state of the hidden Markov model in the 1st time slice, and b_k(O_{r,1}) represents the emission probability of generating the population activity data in the 1st time slice in the observation sequence corresponding to the target block under the condition that the target block is in the k-th hidden state in the 1st time slice.

After δ_1(k) is obtained by initialization, δ_2(k), δ_3(k), ..., and δ_N(k) can be obtained by recursion through the formula above, where N represents the total number of time slices covered by the observation sequence corresponding to the target block.
After the local probabilities that the target block is respectively in each hidden state of the hidden Markov model in each time slice covered by the observation sequence are determined, the corresponding back pointers can be determined based on the local probabilities. The local probabilities and the back pointers may have a one-to-one correspondence. The back pointer corresponding to the local probability δ_n(k) of being in the k-th hidden state of the hidden Markov model in the n-th time slice may refer to the hidden state of the (n-1)-th node in the state transition path that maximizes the probability of the target block being in the k-th hidden state in the n-th time slice, and can be written as ψ_n(k).

Specifically, ψ_n(k) can be calculated by the following formula:

$$\psi_n(k) = \arg\max_{1 \le j \le K} \left[\delta_{n-1}(j)\, A_{r,j,k}\right]$$

where the argmax operator is used to determine the index j that maximizes the expression in brackets (i.e., δ_{n-1}(j) A_{r,j,k}). The parameters δ_{n-1}(j), A_{r,j,k}, and K used here have the same definitions as above and are not described again.

It should be noted that the back pointer ψ_1(k) corresponding to the local probability δ_1(k) of the k-th hidden state of the hidden Markov model in the 1st time slice has no practical significance, so ψ_1(k) can be set to 0, i.e., ψ_1(1), ψ_1(2), ψ_1(3), ..., and ψ_1(K) can all be set to 0.
In addition, the local probabilities that the target block is in each hidden state of the hidden Markov model in each time slice covered by the observation sequence, together with the back pointers corresponding to the local probabilities, can be represented by a matrix D of K rows and N columns, where the element in the k-th row and n-th column records the local probability δ_n(k) and its corresponding back pointer ψ_n(k), K represents the total number of hidden states of the hidden Markov model, and N represents the total number of time slices covered by the observation sequence corresponding to the target block.
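To make the recursion for the local probabilities and back pointers concrete, the following non-limiting sketch computes δ_n(k) and ψ_n(k) for one target block. It works in log space as a common numerical safeguard (an implementation choice, not required by this description) and assumes a precomputed N × K matrix of log emission probabilities log_b; a sketch of computing that matrix follows the emission probability formula later in this description.

```python
import numpy as np

def viterbi_forward(log_pi_r, log_A_r, log_b):
    """Compute log local probabilities (delta) and back pointers (psi).

    log_pi_r: shape (K,)   log initial state probabilities of the target block
    log_A_r:  shape (K, K) log state transition probabilities of the target block
    log_b:    shape (N, K) log emission probabilities, log_b[n, k] = log b_k(O_{r,n+1})
    """
    N, K = log_b.shape
    log_delta = np.empty((N, K))
    psi = np.zeros((N, K), dtype=int)           # psi_1(k) carries no meaning, kept as 0

    log_delta[0] = log_pi_r + log_b[0]          # initialization: delta_1(k)
    for n in range(1, N):
        # scores[j, k] = log(delta_{n-1}(j)) + log(A_{r,j,k})
        scores = log_delta[n - 1][:, None] + log_A_r
        psi[n] = np.argmax(scores, axis=0)      # back pointer psi_n(k)
        log_delta[n] = scores[psi[n], np.arange(K)] + log_b[n]
    return log_delta, psi
```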
S306, based on the maximum local probability of the local probabilities of the target block respectively in the hidden states in the last time slice covered by the observation sequence, determining the hidden state of the target block in the last time slice.
The local probabilities of being in each hidden state in the last time slice covered by the observation sequence corresponding to the target block are the local probabilities of being in each hidden state in the N-th time slice, namely δ_N(1), δ_N(2), δ_N(3), ..., and δ_N(K).

The maximum local probability is the local probability with the largest value among the local probabilities of being in each hidden state in the last time slice covered by the observation sequence corresponding to the target block. Suppose that δ_N(3) is the largest of δ_N(1), δ_N(2), δ_N(3), ..., and δ_N(K); then δ_N(3) is the maximum local probability.

In this embodiment, the hidden state corresponding to the maximum local probability may be determined as the hidden state of the target block in the last time slice. For example, if δ_N(3) is determined from δ_N(1), δ_N(2), δ_N(3), ..., and δ_N(K) to be the maximum local probability corresponding to the target block, the 3rd hidden state is the hidden state of the target block in the last time slice. For another example, if δ_N(K) is determined from δ_N(1), δ_N(2), δ_N(3), ..., and δ_N(K) to be the maximum local probability corresponding to the target block, the K-th hidden state is the hidden state of the target block in the last time slice.
S308, optimal path backtracking is carried out on the basis of the hidden state of the target block in the last time slice and each back pointer, and a hidden state sequence is obtained.
Specifically, in the process of performing optimal path backtracking, the hidden state q*_n in the n-th time slice in the hidden state sequence may be determined based on the following formula:

$$q^*_n = \psi_{n+1}(q^*_{n+1})$$

where q*_{n+1} represents the hidden state in the (n+1)-th time slice in the hidden state sequence corresponding to the target block.

It will be appreciated that, after the hidden state q*_N in the last time slice (i.e., the N-th time slice) is determined, the hidden state q*_{N-1} in the (N-1)-th time slice can be determined through the formula q*_{N-1} = ψ_N(q*_N), the hidden state in the (N-2)-th time slice can then be determined through q*_{N-2} = ψ_{N-1}(q*_{N-1}), and so on, until the hidden state in the 1st time slice is finally determined through q*_1 = ψ_2(q*_2). Thereby, the hidden state sequence {q*_1, q*_2, q*_3, ..., q*_N} corresponding to the target block is obtained.
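The optimal path backtracking described above can be sketched as follows, consuming the log_delta and psi arrays produced by the forward-pass sketch above; returning 0-based state indices is an implementation convenience of this illustration.

```python
import numpy as np

def viterbi_backtrack(log_delta, psi):
    """Recover the hidden state sequence by optimal path backtracking."""
    N, _ = log_delta.shape
    states = np.empty(N, dtype=int)
    states[-1] = int(np.argmax(log_delta[-1]))   # hidden state in the last time slice
    for n in range(N - 2, -1, -1):
        states[n] = psi[n + 1, states[n + 1]]    # q*_n = psi_{n+1}(q*_{n+1})
    return states
```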
In addition, the hidden state sequence corresponding to the target block can be used for characterizing the population activity characteristics of the target block in different time slices.
The method for determining the hidden state sequence determines the local probability that the target block is respectively in each hidden state of the hidden Markov model in each time slice covered by the observation sequence based on the observation sequence corresponding to the target block, the initial state probability corresponding to the target block in the hidden Markov model, the state transition probability corresponding to the target block, the Gaussian distribution mean value commonly corresponding to each candidate block related to the hidden Markov model and the Gaussian distribution variance commonly corresponding to each candidate block, and then obtains the corresponding hidden state sequence. Therefore, the hidden state sequence corresponding to the block is obtained through the hidden Markov model, the state transition condition of the block can be determined, and the limitation in the traditional mode is broken.
It should be noted that the conventional technology provides a solution that models the co-occurrence relationship among time, place, and population activities through representation learning (such as cross-modal representation learning). Besides being unable to determine the state transition situation of a block, that scheme cannot distinguish the states of different blocks and does not support parallel processing of data.
In contrast, the model parameters of the hidden markov model in the present application include initial state probabilities corresponding to the candidate blocks related to the hidden markov model, state transition probabilities corresponding to the candidate blocks, gaussian distribution means corresponding to the candidate blocks, and gaussian distribution variances corresponding to the candidate blocks, and the candidate blocks have the initial state probabilities and the state transition probabilities corresponding to each other, so that differences in state transition between the candidate blocks due to different function types of the candidate blocks, that is, states of different blocks can be differentiated.
In another embodiment, a hidden markov model corresponding to each block may be learned using an observation sequence corresponding to each block, and model parameters of the hidden markov model include an initial state probability corresponding to each block, a state transition probability corresponding to each block, and an observation probability corresponding to each block. However, this scheme also cannot reflect the difference in state transition between blocks due to the difference in the types of functions to which the blocks belong.
Alternatively, for each block, a hidden markov model corresponding to the block may be learned using the observation sequence corresponding to the block. However, learning a hidden markov model for each block respectively faces the problem of insufficient learning of the model due to sparse training data, and the hidden markov models are independent, so that the association between the blocks cannot be established.
However, in the present application, a plurality of blocks correspond to the same hidden markov model, and the model parameters of the hidden markov model include initial state probabilities corresponding to the blocks related to the hidden markov model, state transition probabilities corresponding to the blocks, a gaussian distribution mean corresponding to the blocks, and a gaussian distribution variance corresponding to the blocks. On one hand, each block has corresponding initial state probability and state transition probability, so that the difference of state transition between blocks due to different function types can be reflected; on the other hand, the observation sequences corresponding to the blocks are used for learning a hidden Markov model together, but the observation sequences corresponding to the blocks are not used for learning the hidden Markov models respectively, so that the problems that the training data are sparse, the model learning is insufficient, and the association between the blocks cannot be established are effectively solved.
In one embodiment, the step of obtaining the observation sequence corresponding to the target block, i.e. step S302, may include the following steps: acquiring an original observation sequence corresponding to a target block; the original observation sequence comprises original population activity data of a target block in more than two time slices, and activity behavior characteristics related to each original population activity data comprise population floating number and access frequency aiming at predetermined types of interest points; and carrying out maximum value normalization on the population flowing number in each original population activity data and the TF-IDF parameters corresponding to the access frequency aiming at the interest points of the preset type in each original population activity data to obtain an observation sequence corresponding to the target block.
Assuming that the target block is the r-th candidate block, the original observation sequence corresponding to the r-th candidate block can be represented as X_r = {X_{r,1}, X_{r,2}, X_{r,3}, ..., X_{r,N}}, where N represents the total number of time slices covered by the observation sequence corresponding to the r-th candidate block, and X_{r,n} represents the original population activity data of the r-th candidate block in the n-th time slice, r = 1, 2, 3, ..., R, n = 1, 2, 3, ..., N, with R representing the total number of candidate blocks involved in the hidden Markov model.

Further, X_{r,n} = {X_{r,n,1}, X_{r,n,2}, X_{r,n,3}, ..., X_{r,n,M}}, where M is the total number of activity behavior features the original population activity data relates to, and X_{r,n,m} represents the m-th activity behavior feature in the original population activity data of the r-th candidate block in the n-th time slice, m = 1, 2, 3, ..., M.
Suppose the m-th activity behavior feature X_{r,n,m} in the original population activity data of the r-th candidate block in the n-th time slice belongs to a population flow number (e.g., X_{r,n,m} is the number of people moving in, the number of people staying, or the number of people moving out). Then X_{r,n,m} can be maximum-value normalized through the following formula to obtain the normalization result O_{r,n,m}:

$$O_{r,n,m} = \frac{X_{r,n,m}}{\max\limits_{1 \le n' \le N} X_{r,n',m}}$$

where max_{1≤n'≤N} X_{r,n',m} represents the maximum value of X_{r,1,m}, X_{r,2,m}, X_{r,3,m}, ..., and X_{r,N,m}.
Suppose the m-th activity behavior feature X_{r,n,m} in the original population activity data of the r-th candidate block in the n-th time slice belongs to an access frequency for a predetermined type of point of interest (e.g., X_{r,n,m} is the access frequency for a point of interest of the restaurant, company, institution, shopping, service, attraction, entertainment, education, or residence type). Then the TF-IDF parameter Y_{r,n,m} corresponding to X_{r,n,m} may first be calculated by applying a TF-IDF (term frequency-inverse document frequency) weighting to the access frequency, where F represents the total number of access-frequency features for predetermined types of points of interest. For example, if each piece of original population activity data in the original observation sequence corresponding to the r-th candidate block relates to access frequencies for the 9 types of points of interest of restaurant, company, institution, shopping, service, attraction, entertainment, education, and residence, then F equals 9. For another example, if each piece of original population activity data relates to access frequencies for the 4 types of points of interest of restaurant, education, attraction, and residence, then F equals 4.
Furthermore, the TF-IDF parameter Y_{r,n,m} is maximum-value normalized through the following formula to obtain the normalization result O_{r,n,m}:

$$O_{r,n,m} = \frac{Y_{r,n,m}}{\max\limits_{1 \le n' \le N} Y_{r,n',m}}$$

where max_{1≤n'≤N} Y_{r,n',m} represents the maximum value of Y_{r,1,m}, Y_{r,2,m}, Y_{r,3,m}, ..., and Y_{r,N,m}.
It should be noted that, for the original observation sequence X_r = {X_{r,1}, X_{r,2}, X_{r,3}, ..., X_{r,N}} corresponding to the r-th candidate block, with X_{r,n} = {X_{r,n,1}, X_{r,n,2}, X_{r,n,3}, ..., X_{r,n,M}}, r = 1, 2, 3, ..., R, n = 1, 2, 3, ..., N, suppose {X_{r,n,1}, X_{r,n,2}, X_{r,n,3}} belong to population flow numbers and {X_{r,n,4}, X_{r,n,5}, X_{r,n,6}, ..., X_{r,n,M}} belong to access frequencies for predetermined types of points of interest. Then X_{r,n,1}, X_{r,n,2}, and X_{r,n,3} are each maximum-value normalized to obtain the normalization results O_{r,n,1}, O_{r,n,2}, and O_{r,n,3}. Further, the TF-IDF parameters Y_{r,n,4}, Y_{r,n,5}, Y_{r,n,6}, ..., and Y_{r,n,M} respectively corresponding to X_{r,n,4}, X_{r,n,5}, X_{r,n,6}, ..., and X_{r,n,M} are calculated, and then Y_{r,n,4}, Y_{r,n,5}, Y_{r,n,6}, ..., and Y_{r,n,M} are each maximum-value normalized to obtain the normalization results O_{r,n,4}, O_{r,n,5}, O_{r,n,6}, ..., and O_{r,n,M}. Thereby, the population activity data O_{r,n} = {O_{r,n,1}, O_{r,n,2}, O_{r,n,3}, ..., O_{r,n,M}} of the r-th candidate block in the n-th time slice is obtained, and then the observation sequence O_r = {O_{r,1}, O_{r,2}, O_{r,3}, ..., O_{r,N}} corresponding to the r-th candidate block is obtained.
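A non-limiting sketch of the preprocessing described above follows. The split of the first three feature columns as population flow numbers mirrors the example above; the tfidf_weight function is only a generic placeholder, since the exact TF-IDF formula is not reproduced in this text.

```python
import numpy as np

def max_normalize(column):
    """Maximum-value normalization of one feature over all time slices."""
    peak = column.max()
    return column / peak if peak > 0 else column

def tfidf_weight(poi_counts):
    """Placeholder TF-IDF weighting for POI access frequencies.

    poi_counts: shape (N, F) raw access frequencies for F POI types.
    The exact TF-IDF formula of the original filing is not reproduced here;
    this is only one common variant, shown for illustration.
    """
    tf = poi_counts / np.maximum(poi_counts.sum(axis=1, keepdims=True), 1e-12)
    idf = np.log((1 + poi_counts.shape[0]) / (1 + (poi_counts > 0).sum(axis=0)))
    return tf * idf

def build_observation_sequence(X_r):
    """X_r: shape (N, M); columns 0..2 are population flow numbers,
    columns 3..M-1 are POI access frequencies (per the example above)."""
    O_r = np.empty_like(X_r, dtype=float)
    for m in range(3):                           # flow numbers: normalize directly
        O_r[:, m] = max_normalize(X_r[:, m])
    Y = tfidf_weight(X_r[:, 3:])                 # POI frequencies: TF-IDF first
    for m in range(Y.shape[1]):
        O_r[:, 3 + m] = max_normalize(Y[:, m])
    return O_r
```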
In addition, to describe with reference to an actual example: maximum-value normalization is performed on the population flow numbers (the number of people moving in, the number of people staying, and the number of people moving out) related to the population activity data in the original observation sequence corresponding to the Qinghua Garden block in Beijing, the corresponding TF-IDF parameters are calculated for the access frequencies for the predetermined types of points of interest that the data relates to, and maximum-value normalization is then performed on the calculated TF-IDF parameters; the observation sequence corresponding to the Qinghua Garden block as shown in fig. 6 can thus be obtained.
In one embodiment, the manner of determining the local probability that the target block is in any hidden state of the hidden markov model within any time slice covered by the observation sequence may comprise the following steps: determining the emission probability of population activity data in the time slice in the observation sequence generated under the condition that the target block is in the hidden state in the time slice based on the population activity data in the time slice in the observation sequence, the Gaussian distribution mean value which is jointly corresponding to each candidate block related to the hidden Markov model and the Gaussian distribution variance which is jointly corresponding to each candidate block; and determining the local probability of the target block in the hidden state in the time slice based on the local probability of each hidden state of the target block in the hidden Markov model in the previous time slice adjacent to the time slice, the state transition probability corresponding to the target block in the hidden Markov model and the emission probability.
It should be noted that, the local probability that the target block is in the hidden state in the first time slice covered by the observation sequence is determined based on the probability corresponding to the hidden state in the initial state probability corresponding to the target block and the emission probability of the demographic activity data generated in the first time slice in the observation sequence under the condition that the target block is in the hidden state in the first time slice.
In one embodiment, the step of determining the emission probability of generating the population activity data in the time slice in the observation sequence under the condition that the target block is in the hidden state in the time slice, based on the population activity data in the time slice in the observation sequence, the Gaussian distribution mean commonly corresponding to the candidate blocks related to the hidden Markov model, and the Gaussian distribution variance commonly corresponding to the candidate blocks, may include the following step: determining the emission probability of the target block generating the population activity data under the condition that the target block is in the hidden state of the hidden Markov model in the time slice, based on the means of the Gaussian distributions obeyed by the probabilities of generating each activity behavior feature related to the population activity data in the observation sequence under the condition of being in the hidden state, the variances of the Gaussian distributions obeyed by the probabilities of generating each activity behavior feature under the condition of being in the hidden state, and the population activity data of the target block in the time slice.
Specifically, the emission probability b_k(O_{r,n}) of the target block generating the population activity data in the n-th time slice in the observation sequence, under the condition that the target block is in the k-th hidden state in the n-th time slice, can be calculated by the following formula:

$$b_k(O_{r,n}) = \prod_{m=1}^{M} \frac{1}{\sqrt{2\pi\sigma_{k,m}}}\exp\!\left(-\frac{(O_{r,n,m}-\mu_{k,m})^2}{2\sigma_{k,m}}\right)$$

where O_{r,n,m} represents the m-th activity behavior feature related to the population activity data of the target block in the n-th time slice; μ_{k,m} represents the mean of the Gaussian distribution obeyed by the probability of generating the m-th activity behavior feature under the condition of being in the k-th hidden state; σ_{k,m} represents the variance of the Gaussian distribution obeyed by the probability of generating the m-th activity behavior feature under the condition of being in the k-th hidden state; and M represents the total number of activity behavior features related to each piece of population activity data in the observation sequence.
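A non-limiting sketch of computing the (log) emission probabilities consumed by the Viterbi sketch earlier is given below; it evaluates the per-feature Gaussian densities with σ_{k,m} treated as a variance, in line with the formula above, and sums their logarithms (equivalent to the product in probability space).

```python
import numpy as np

def log_emission_matrix(O_r, mu, sigma, eps=1e-12):
    """log_b[n, k] = log b_k(O_{r,n+1}) under a diagonal Gaussian model.

    O_r:   shape (N, M) observation sequence of the target block
    mu:    shape (K, M) Gaussian means (shared by all candidate blocks)
    sigma: shape (K, M) Gaussian variances (shared by all candidate blocks)
    """
    var = np.maximum(sigma, eps)                 # guard against zero variance
    # Broadcast to shape (N, K, M): per-feature log Gaussian density.
    diff = O_r[:, None, :] - mu[None, :, :]
    log_pdf = -0.5 * (np.log(2 * np.pi * var)[None, :, :] + diff ** 2 / var[None, :, :])
    return log_pdf.sum(axis=2)                   # product over features -> sum of logs
```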
In one embodiment, after the step of determining the hidden state of the target block in the last time slice, the method further comprises the following steps: and predicting the hidden state of the target block in the time slice next to the last time slice based on the hidden state of the target block in the last time slice covered by the observation sequence and the state transition probability corresponding to the target block in the hidden Markov model.
It can be understood that the model parameters of the hidden markov model include the state transition probability corresponding to the target block, that is, the state transition probability corresponding to the target block is determined given the hidden markov model. As described above, the state transition probability corresponding to the target block includes the probability that the target block transitions between every two hidden states of the hidden markov model, and thus the hidden state of the target block in the last time slice covered by the observation sequence corresponding to the target block is determined
Figure RE-GDA0002012891050000126
Thereafter, the slave hidden state can be determined based on the state transition probability corresponding to the target block
Figure RE-GDA0002012891050000127
Starting to carry out the maximum transition probability of the state transition, thereby taking the hidden state corresponding to the maximum transition probability as the hidden state of the target block in the time slice next to the last time slice covered by the observation sequence.
For example, suppose the state transition probability A_r corresponding to the target block is expressed as the K × K matrix A_r = [A_{r,j,k}], j = 1,2,…,K, k = 1,2,…,K, and suppose the hidden state s*_{r,N} of the target block in the last time slice covered by its corresponding observation sequence is the 3rd hidden state. Then the maximum transition probability among the state transitions starting from s*_{r,N} is determined from A_{r,3,1}, A_{r,3,2}, A_{r,3,3}, …, and A_{r,3,K}. For instance, if A_{r,3,2} is the largest value among A_{r,3,1}, A_{r,3,2}, A_{r,3,3}, …, and A_{r,3,K}, the 2nd hidden state can be regarded as the hidden state of the target block in the time slice next to the last time slice covered by the observation sequence corresponding to the target block.
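A minimal sketch of this next-state prediction, assuming the state transition probability of the target block is available as a K × K NumPy array A_r whose row j holds the probabilities of transitioning from the j-th hidden state; all names and values here are illustrative assumptions.

```python
import numpy as np

def predict_next_hidden_state(A_r, last_state):
    """Pick the hidden state reachable from `last_state` with maximum transition probability.

    A_r        : (K, K) transition matrix of the target block, A_r[j, k] = P(j -> k)
    last_state : index of the hidden state in the last time slice (0-based)
    """
    return int(np.argmax(A_r[last_state]))  # column index of the largest entry in that row

# Hypothetical example with K = 4 hidden states; the block ends in state 2 (the "3rd" state, 1-based).
A_r = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.3, 0.3, 0.2, 0.2],
])
print(predict_next_hidden_state(A_r, last_state=2))  # -> 1, i.e. the "2nd" hidden state (1-based)
```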
In one embodiment, after the step of predicting the hidden state of the target block in the time slice next to the last time slice, the method may further include the following step: predicting the population activity data of the target block in the time slice next to the last time slice, based on the hidden state of the target block in that time slice and the Gaussian distribution mean in the hidden Markov model.
It can be understood that the hidden Markov model includes the following model parameter: the Gaussian distribution mean commonly corresponding to the candidate blocks related to the hidden Markov model, which is determined once the hidden Markov model is given. As described above, the Gaussian distribution mean may include the mean of the Gaussian distribution obeyed by the probability of generating each activity behavior feature related to the population activity data in the observation sequence, under the condition that the target block is in each hidden state of the hidden Markov model. Therefore, after the hidden state s*_{r,N+1} of the target block in the time slice next to the last time slice is determined, the means of the Gaussian distributions obeyed by the probabilities of generating each activity behavior feature related to the population activity data in the observation sequence corresponding to the target block, under the condition that the target block is in the hidden state s*_{r,N+1}, can be taken from the Gaussian distribution mean as the population activity data of the target block in the time slice next to the last time slice.
For example, suppose the Gaussian distribution mean μ commonly corresponding to the candidate blocks related to the hidden Markov model is expressed as the K × M matrix μ = [μ_{k,m}], k = 1,2,…,K, m = 1,2,…,M, and suppose the hidden state of the target block in the time slice next to the last time slice covered by its corresponding observation sequence is determined to be the 3rd hidden state. Then {μ_{3,1}, μ_{3,2}, μ_{3,3}, …, μ_{3,M}} can be taken as the population activity data of the target block in the time slice next to the last time slice, where M is the total number of activity behavior features related to the population activity data in the observation sequence corresponding to the candidate block.
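Continuing the previous sketch under the same assumptions, the predicted population activity data is simply the row of the shared Gaussian mean matrix selected by the predicted hidden state:

```python
import numpy as np

def predict_population_activity(mu, next_state):
    """Return the M Gaussian means of the predicted hidden state as the predicted
    population activity data for the next time slice.

    mu         : (K, M) shared Gaussian mean matrix, mu[k, m] = mu_{k,m}
    next_state : predicted hidden state index for the next time slice (0-based)
    """
    return mu[next_state]  # vector {mu_{k,1}, ..., mu_{k,M}} for k = next_state

# Hypothetical example with K = 4 hidden states and M = 3 activity behavior features.
mu = np.array([
    [0.2, 0.1, 0.3],
    [0.6, 0.4, 0.2],
    [0.8, 0.7, 0.1],
    [0.3, 0.5, 0.5],
])
print(predict_population_activity(mu, next_state=1))  # -> [0.6, 0.4, 0.2]
```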
In one embodiment, as shown in fig. 7, the hidden markov model training method may include the following steps S702 to S710.
S702, acquiring observation sequences corresponding to the candidate blocks respectively.
S704, in the current iteration, the current intermediate state probability and the current intermediate state transition probability corresponding to each candidate block are determined based on the observation sequence corresponding to each candidate block, the initial state probability and the state transition probability corresponding to each candidate block determined last time, the Gaussian distribution mean value corresponding to each candidate block together, and the Gaussian distribution variance corresponding to each candidate block together.
S706, determining the current initial state probability of each candidate block based on each current intermediate state probability, and determining the current state transition probability of each candidate block based on each current intermediate state transition probability.
S708, based on the current intermediate state probabilities, determining the current Gaussian distribution mean and the current Gaussian distribution variance commonly corresponding to the candidate blocks.
S710, when the iteration termination condition is met, obtaining the hidden Markov model based on the initial state probability and the state transition probability respectively corresponding to each candidate block determined for the last time, and the Gaussian distribution mean and the Gaussian distribution variance commonly corresponding to the candidate blocks.
For hidden Markov models, when the observation sequence is known, the model parameters can be learned through the Baum-Welch algorithm, so that the hidden Markov model is determined.
The process of training the hidden markov model (i.e., the process of learning the model parameters of the hidden markov model) with a known sequence of observations is an iterative process. Specifically, in each iteration, for each candidate block, based on the observation sequence corresponding to the candidate block, the initial state probability and the state transition probability corresponding to the candidate block determined last time, and the gaussian distribution mean and the gaussian distribution variance commonly corresponding to the candidate blocks related to the hidden markov model, the current intermediate state probability and the current intermediate state transition probability corresponding to the candidate block are determined, so as to determine the current intermediate state probabilities respectively corresponding to the candidate blocks and the current intermediate state transition probabilities respectively corresponding to the candidate blocks.
The current intermediate state probability corresponding to the r-th candidate block can be expressed as γ_r, which can be represented as the following N × K matrix:

γ_r = [γ_{r,n,k}], n = 1,2,3,…,N, k = 1,2,3,…,K

where γ_{r,n,k} represents the probability that the r-th candidate block is in the k-th hidden state in the n-th time slice, N is the total number of time slices covered by the observation sequence corresponding to the r-th candidate block, and K is the total number of hidden states of the hidden Markov model.
The current intermediate state transition probability corresponding to the r-th candidate block can be expressed as ξ_r, which can be represented as a P × K matrix with P = K(N - 1), i.e.:

ξ_r = [ξ_{r,n,j,k}], n = 2,3,4,…,N, j = 1,2,3,…,K, k = 1,2,3,…,K

where ξ_{r,n,j,k} represents the probability that the r-th candidate block is in the j-th hidden state in the (n-1)-th time slice and in the k-th hidden state in the n-th time slice, N is the total number of time slices covered by the observation sequence corresponding to the r-th candidate block, and K is the total number of hidden states of the hidden Markov model.
In addition, the probability γ_{r,n,k} that the r-th candidate block is in the k-th hidden state in the n-th time slice can be calculated by the following formula:

\gamma_{r,n,k} = \frac{\alpha_{r,n,k}\,\beta_{r,n,k}}{\sum_{k'=1}^{K}\alpha_{r,N,k'}}

where α_{r,n,k} represents the forward probability corresponding to the condition that the r-th candidate block is in the k-th hidden state in the n-th time slice, that is, the probability that the r-th candidate block is in the k-th hidden state in the n-th time slice and generates the corresponding population activity data (i.e., {O_{r,1}, O_{r,2}, O_{r,3}, …, O_{r,n}}) in the observation sequence corresponding to the r-th candidate block in the n-th time slice and each time slice before it; β_{r,n,k} represents the backward probability corresponding to the condition that the r-th candidate block is in the k-th hidden state in the n-th time slice, that is, the probability of generating the corresponding population activity data (i.e., {O_{r,n+1}, O_{r,n+2}, O_{r,n+3}, …, O_{r,N}}) in each time slice after the n-th time slice under the condition that the r-th candidate block is in the k-th hidden state in the n-th time slice, where N is the total number of time slices covered by the observation sequence corresponding to the r-th candidate block; and the denominator Σ_{k'=1}^{K} α_{r,N,k'} is the probability of generating the whole observation sequence corresponding to the r-th candidate block (i.e., {O_{r,1}, O_{r,2}, O_{r,3}, …, O_{r,N}}), obtained from the forward probabilities in the N-th time slice.
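Assuming the forward probabilities and backward probabilities of one candidate block are already available as N × K NumPy arrays (minimal sketches of their computation are given further below), the formula above can be evaluated for all time slices at once as in the following sketch; the names alpha, beta and gamma_from_alpha_beta are assumptions of this illustration.

```python
import numpy as np

def gamma_from_alpha_beta(alpha, beta):
    """Per-time-slice state posteriors gamma_{r,n,k} for one candidate block.

    alpha : (N, K) forward probabilities alpha_{r,n,k}
    beta  : (N, K) backward probabilities beta_{r,n,k}
    """
    seq_prob = alpha[-1].sum()      # probability of the whole observation sequence
    return alpha * beta / seq_prob  # gamma[n, k] = alpha[n, k] * beta[n, k] / P(O_r)
```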
The probability ξ_{r,n,j,k} that the r-th candidate block is in the j-th hidden state in the (n-1)-th time slice and in the k-th hidden state in the n-th time slice can be calculated by the following formula:

\xi_{r,n,j,k} = \frac{\alpha_{r,n-1,j}\,A_{r,j,k}^{(t)}\,B_{r,n,k}\,\beta_{r,n,k}}{\sum_{k'=1}^{K}\alpha_{r,N,k'}}

where α_{r,n-1,j} represents the forward probability corresponding to the condition that the r-th candidate block is in the j-th hidden state in the (n-1)-th time slice, that is, the probability that the r-th candidate block is in the j-th hidden state in the (n-1)-th time slice and generates the corresponding observation data (i.e., {O_{r,1}, O_{r,2}, O_{r,3}, …, O_{r,n-1}}) in the observation sequence corresponding to the r-th candidate block in the (n-1)-th time slice and each time slice before it; A_{r,j,k}^{(t)} represents the last determined probability that the r-th candidate block transitions from the j-th hidden state to the k-th hidden state; B_{r,n,k} represents the emission probability of generating the population activity data in the n-th time slice of the observation sequence corresponding to the r-th candidate block (i.e., {o_{r,n,1}, o_{r,n,2}, o_{r,n,3}, …, o_{r,n,M}}, where M is the total number of activity behavior features related to the population activity data in the observation sequence) under the condition that the r-th candidate block is in the k-th hidden state in the n-th time slice; and the definitions of β_{r,n,k} and the denominator Σ_{k'=1}^{K} α_{r,N,k'} are the same as above and are not repeated here.
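Under the same assumptions, a corresponding sketch for the pairwise probabilities ξ, where A_prev denotes the last determined K × K transition matrix of the block and B the N × K emission probabilities; the names are again illustrative only.

```python
import numpy as np

def xi_from_alpha_beta(alpha, beta, A_prev, B):
    """Pairwise state posteriors xi_{r,n,j,k} for n = 2..N of one candidate block.

    alpha  : (N, K) forward probabilities
    beta   : (N, K) backward probabilities
    A_prev : (K, K) last determined transition matrix A_r^{(t)}
    B      : (N, K) emission probabilities B_{r,n,k}
    Returns an (N-1, K, K) array; entry [n-1, j, k] is the probability of being in
    state j in time slice n-1 and state k in time slice n.
    """
    seq_prob = alpha[-1].sum()
    # xi[n-1, j, k] = alpha[n-1, j] * A_prev[j, k] * B[n, k] * beta[n, k] / P(O_r)
    return (alpha[:-1, :, None] * A_prev[None, :, :] *
            B[1:, None, :] * beta[1:, None, :]) / seq_prob
```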
In addition, the emission probability B_{r,n,k} of generating the population activity data in the n-th time slice of the observation sequence corresponding to the r-th candidate block, under the condition that the r-th candidate block is in the k-th hidden state in the n-th time slice, can be calculated by the following formula:

B_{r,n,k} = \prod_{m=1}^{M} \frac{1}{\sqrt{2\pi\sigma_{k,m}}} \exp\left(-\frac{(o_{r,n,m}-\mu_{k,m})^{2}}{2\sigma_{k,m}}\right)

It should be noted that the definitions of the parameters σ_{k,m}, μ_{k,m}, o_{r,n,m} and M here are the same as above and are not repeated here.
In addition, the forward probability α_{r,n,k} corresponding to the condition that the r-th candidate block is in the k-th hidden state in the n-th time slice can be calculated by the following recursive formula:

\alpha_{r,n,k} = \left(\sum_{j=1}^{K}\alpha_{r,n-1,j}\,A_{r,j,k}^{(t)}\right)B_{r,n,k}

where α_{r,n-1,j} represents the forward probability corresponding to the condition that the r-th candidate block is in the j-th hidden state in the (n-1)-th time slice, that is, the probability that the r-th candidate block is in the j-th hidden state in the (n-1)-th time slice and generates the corresponding population activity data (i.e., {O_{r,1}, O_{r,2}, O_{r,3}, …, O_{r,n-1}}) in the observation sequence corresponding to the r-th candidate block in the (n-1)-th time slice and each time slice before it. The definitions of the parameters A_{r,j,k}^{(t)}, B_{r,n,k} and K here are the same as above and are not repeated here.
The forward probability corresponding to the condition that the r-th candidate block is in the k-th hidden state in the 1st time slice is

\alpha_{r,1,k} = \pi_{r,k}^{(t)}\,B_{r,1,k}

where π_{r,k}^{(t)} represents the last determined probability that the r-th candidate block is in the k-th hidden state in the 1st time slice, and B_{r,1,k} represents the emission probability of generating the population activity data in the 1st time slice of the observation sequence corresponding to the r-th candidate block (i.e., {o_{r,1,1}, o_{r,1,2}, o_{r,1,3}, …, o_{r,1,M}}) under the condition that the r-th candidate block is in the k-th hidden state in the 1st time slice.
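The forward recursion and its initialization can be sketched as follows, with pi_prev standing for the last determined initial state probability π_r^{(t)} of the block and B for the N × K emission probabilities computed with the formula above; the names are assumptions of this sketch. In practice the unscaled probabilities shrink rapidly as N grows, so a scaled or log-space variant would normally be preferred; the plain form is kept here only to mirror the formulas.

```python
import numpy as np

def forward_probabilities(pi_prev, A_prev, B):
    """Forward probabilities alpha_{r,n,k} for one candidate block.

    pi_prev : (K,)   last determined initial state probability pi_r^{(t)}
    A_prev  : (K, K) last determined transition matrix A_r^{(t)}
    B       : (N, K) emission probabilities B_{r,n,k}
    """
    N, K = B.shape
    alpha = np.zeros((N, K))
    alpha[0] = pi_prev * B[0]                      # initialization at the 1st time slice
    for n in range(1, N):
        alpha[n] = (alpha[n - 1] @ A_prev) * B[n]  # sum over states of the previous time slice
    return alpha
```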
In addition, the backward probability β_{r,n,k} corresponding to the condition that the r-th candidate block is in the k-th hidden state in the n-th time slice can be calculated by the following recursive formula:

\beta_{r,n,k} = \sum_{j=1}^{K}A_{r,k,j}^{(t)}\,B_{r,n+1,j}\,\beta_{r,n+1,j}

where β_{r,n+1,j} represents the backward probability corresponding to the condition that the r-th candidate block is in the j-th hidden state in the (n+1)-th time slice, that is, the probability of generating the corresponding population activity data (i.e., {O_{r,n+2}, O_{r,n+3}, O_{r,n+4}, …, O_{r,N}}) in each time slice after the (n+1)-th time slice under the condition that the r-th candidate block is in the j-th hidden state in the (n+1)-th time slice, where N is the total number of time slices covered by the observation sequence corresponding to the r-th candidate block; B_{r,n+1,j} represents the emission probability of generating the population activity data in the (n+1)-th time slice of the observation sequence corresponding to the r-th candidate block (i.e., {o_{r,n+1,1}, o_{r,n+1,2}, o_{r,n+1,3}, …, o_{r,n+1,M}}, where M is the total number of activity behavior features related to the population activity data in the observation sequence corresponding to the r-th candidate block) under the condition that the r-th candidate block is in the j-th hidden state in the (n+1)-th time slice; and A_{r,k,j}^{(t)} represents the last determined probability P(s_{r,n+1} = j | s_{r,n} = k) that the r-th candidate block, being in the k-th hidden state in the n-th time slice, is in the j-th hidden state in the (n+1)-th time slice.
The backward probability corresponding to the condition that the r-th candidate block is in the k-th hidden state in the last time slice (i.e., the N-th time slice) covered by its corresponding observation sequence is

\beta_{r,N,k} = 1, \quad k = 1,2,3,\ldots,K.
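A matching sketch for the backward recursion, including the termination condition β_{r,N,k} = 1; the same naming assumptions apply.

```python
import numpy as np

def backward_probabilities(A_prev, B):
    """Backward probabilities beta_{r,n,k} for one candidate block.

    A_prev : (K, K) last determined transition matrix A_r^{(t)}
    B      : (N, K) emission probabilities B_{r,n,k}
    """
    N, K = B.shape
    beta = np.zeros((N, K))
    beta[-1] = 1.0                                   # beta_{r,N,k} = 1 in the last time slice
    for n in range(N - 2, -1, -1):
        beta[n] = A_prev @ (B[n + 1] * beta[n + 1])  # sum over states of time slice n+1
    return beta
```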
It should be noted that, in each iteration, the current intermediate state probabilities and the current intermediate state transition probabilities corresponding to the candidate blocks need to be calculated based on the last determined model parameters. However, for the first determination of the model parameters in the first iteration, there are no previously determined model parameters, so before performing the first determination of the model parameters in the first iteration, the model parameters of the hidden Markov model may be initialized to obtain initial values of the model parameters, which may be represented as θ^{(0)} = {π_r^{(0)}, A_r^{(0)}, μ^{(0)}, σ^{(0)}}. Then, when the model parameters are determined for the first time in the first iteration, the current intermediate state probabilities and the current intermediate state transition probabilities corresponding to the candidate blocks are calculated based on these initial values of the model parameters.
In a current iteration, after determining current intermediate state probabilities and current intermediate state transition probabilities respectively corresponding to candidate blocks related to a hidden markov model once, for each candidate block, determining a current initial state probability of the candidate block based on the current intermediate state probability corresponding to the candidate block, and determining a current state transition probability of the candidate block based on the current intermediate state transition probability corresponding to the candidate block, thereby determining current initial state probabilities respectively corresponding to the candidate blocks once and current state transition probabilities respectively corresponding to the candidate blocks once.
Specifically, the current initial state probability π_r^{(t+1)} corresponding to the r-th candidate block may include: the current probabilities that the r-th candidate block is in each hidden state of the hidden Markov model in the 1st time slice, i.e., π_{r,k}^{(t+1)}, k = 1,2,3,…,K, where K is the total number of hidden states of the hidden Markov model.
The current probability π_{r,k}^{(t+1)} that the r-th candidate block is in the k-th hidden state of the hidden Markov model in the 1st time slice can be calculated by the following formula:

\pi_{r,k}^{(t+1)} = \gamma_{r,1,k}

where γ_{r,1,k} represents the probability, in the current intermediate state probability corresponding to the r-th candidate block, that the r-th candidate block is in the k-th hidden state in the 1st time slice.
The current state transition probability A_r^{(t+1)} corresponding to the r-th candidate block may include: the current probabilities that the r-th candidate block transitions between every two hidden states of the hidden Markov model, i.e., A_{r,j,k}^{(t+1)}, j = 1,2,3,…,K, k = 1,2,3,…,K, where K is the total number of hidden states of the hidden Markov model.
The current probability A_{r,j,k}^{(t+1)} that the r-th candidate block transitions from the j-th hidden state to the k-th hidden state can be calculated by the following formula:

A_{r,j,k}^{(t+1)} = \frac{\sum_{n=2}^{N}\xi_{r,n,j,k}}{\sum_{n=2}^{N}\sum_{k'=1}^{K}\xi_{r,n,j,k'}}

where the definitions of the parameters ξ_{r,n,j,k} and N are the same as those of the corresponding parameters above and are not repeated here.
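Given the γ and ξ of one candidate block, the two per-block updates above can be sketched as follows; gamma and xi are assumed to be the arrays produced by the earlier sketches.

```python
import numpy as np

def update_initial_and_transition(gamma, xi):
    """Per-block updates of the initial state and state transition probabilities.

    gamma : (N, K)      gamma_{r,n,k}
    xi    : (N-1, K, K) xi_{r,n,j,k}
    Returns (pi_new, A_new) with shapes (K,) and (K, K).
    """
    pi_new = gamma[0]                                # pi_{r,k}^{(t+1)} = gamma_{r,1,k}
    A_new = xi.sum(axis=0)                           # numerator: sum over n = 2..N
    A_new = A_new / A_new.sum(axis=1, keepdims=True) # normalize each source state j
    return pi_new, A_new
```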
In the current iteration, after the current initial state probability of each candidate block and the current state transition probability of each candidate block are determined once, the current gaussian distribution mean value commonly corresponding to each candidate block can be further determined based on the current intermediate state probability respectively corresponding to each candidate block, and the current gaussian distribution variance commonly corresponding to each candidate block is determined based on the current intermediate state probability respectively corresponding to each candidate block.
Specifically, the current gaussian distribution mean value commonly corresponding to each candidate block may be determined based on the current intermediate state probability respectively corresponding to each candidate block and the observation sequence respectively corresponding to each candidate block. In addition, the current gaussian distribution variance commonly corresponding to each candidate block can be determined based on the current intermediate state probability respectively corresponding to each candidate block, the observation sequence respectively corresponding to each candidate block, and the current gaussian distribution mean.
After the current model parameters of the hidden Markov model (i.e., the current initial state probability corresponding to each candidate block, the current state transition probability corresponding to each candidate block, and the current Gaussian distribution mean and the current Gaussian distribution variance commonly corresponding to the candidate blocks) are determined in the current iteration, it can be determined whether the iteration termination condition is satisfied. The iteration termination condition is a condition for judging whether the current model parameters have converged. The iteration termination condition may be preset based on actual requirements; for example, it may include, but is not limited to, the number of iterations corresponding to the current round of iteration being greater than or equal to a predetermined threshold.
If the iteration termination condition is met, the last determined model parameters (the initial state probability and the state transition probability corresponding to each candidate block determined at the last time, the Gaussian distribution mean value corresponding to each candidate block together, and the Gaussian distribution variance corresponding to each candidate block together) are the final model parameters of the hidden Markov model. If the iteration termination condition is not met, the next iteration can be executed, and the model training is continued.
It can be understood that the trained hidden markov model can be used to determine hidden state sequences corresponding to candidate blocks involved in the hidden markov model.
In the model training method above, for each block, the current intermediate state probability and the current intermediate state transition probability corresponding to the block are determined based on the observation sequence corresponding to the block, the last determined initial state probability and state transition probability corresponding to the block, and the Gaussian distribution mean and Gaussian distribution variance commonly corresponding to the blocks related to the hidden Markov model; the current initial state probability of each block is then determined based on the current intermediate state probability corresponding to the block, and the current state transition probability of each block is determined based on the current intermediate state transition probability corresponding to the block. Therefore, in the iterative computation process, the operations for determining the current initial state probability and the current state transition probability of the individual blocks can be performed in parallel, which effectively reduces the time complexity, so that the method can be applied to large-scale data scenarios, that is, it supports long-term and fine-grained learning of the dynamics of places.
In addition, the conventional technology includes schemes that model the co-occurrence relationship among time, place and human activity by means of representation learning (for example, cross-modal representation learning). Besides the defect that parallel processing of data is difficult to support, such schemes cannot determine how the state of a block transitions over time and cannot distinguish the states of different blocks.
However, in the present application, the trained hidden markov model estimates the population flow characteristics of the blocks, and the model parameters of the trained hidden markov model include state transition probabilities corresponding to the blocks, so that the state transition situation of each block can be determined, and the states of different blocks can be distinguished.
In another approach, a single hidden Markov model could be learned jointly from the observation sequences corresponding to the blocks, with the model parameters of the hidden Markov model consisting of one initial state probability, one state transition probability and one observation probability shared by all blocks. However, this approach cannot reflect the differences in state transitions between blocks that arise because the blocks belong to different function types.
Alternatively, for each block, a hidden markov model corresponding to the block may be learned using the observation sequence corresponding to the block. However, learning a hidden markov model for each block respectively faces the problem of insufficient learning of the model due to sparse training data, and the hidden markov models are independent, so that the association between the blocks cannot be established.
In contrast, in the model training method provided by the present application, one hidden Markov model is learned jointly using the observation sequences corresponding to the respective blocks, and the model parameters of the hidden Markov model include the initial state probability corresponding to each block related to the hidden Markov model, the state transition probability corresponding to each block, the Gaussian distribution mean commonly corresponding to the blocks, and the Gaussian distribution variance commonly corresponding to the blocks. On one hand, each block has its own corresponding initial state probability and state transition probability, so the differences in state transitions between blocks caused by their different function types can be reflected; on the other hand, the observation sequences corresponding to the blocks are used to learn one hidden Markov model jointly, rather than to learn separate hidden Markov models, which effectively avoids the problems that sparse training data leads to insufficient model learning and that the association between the blocks cannot be established.
In an embodiment, in the current iteration, the step of determining the current intermediate state probability and the current intermediate state transition probability respectively corresponding to each candidate block based on the observation sequence respectively corresponding to each candidate block, the initial state probability and the state transition probability respectively corresponding to each candidate block determined last time, the Gaussian distribution mean commonly corresponding to the candidate blocks, and the Gaussian distribution variance commonly corresponding to the candidate blocks, that is, step S704, may include the following steps: in the current iteration, determining the current target sequence segment corresponding to each candidate block, where the current target sequence segment corresponding to a candidate block is a sequence segment, among the sequence segments contained in the observation sequence corresponding to the candidate block, that has not yet been used as a target sequence segment in the current iteration; and determining the current intermediate state probability and the current intermediate state transition probability respectively corresponding to each candidate block based on each current target sequence segment, the initial state probability and the state transition probability respectively corresponding to each candidate block determined last time, the Gaussian distribution mean commonly corresponding to the candidate blocks, and the Gaussian distribution variance commonly corresponding to the candidate blocks.
Accordingly, after the step of determining the current Gaussian distribution mean and the current Gaussian distribution variance commonly corresponding to the candidate blocks based on the current intermediate state probabilities respectively corresponding to the candidate blocks, that is, after step S708, the following step may further be included: returning to the step of determining the current target sequence segment corresponding to each candidate block until all the sequence segments contained in the observation sequences corresponding to the candidate blocks have been used as target sequence segments in the current iteration, and then judging whether the iteration termination condition is met.
In this embodiment, for each candidate block, the observation sequence corresponding to the candidate block may be split into two or more sequence segments. Accordingly, in each iteration, for each candidate block, the current intermediate state probability and the current intermediate state transition probability corresponding to the candidate block can be calculated once based on each sequence segment included in the observation sequence corresponding to the candidate block. It can be understood that, each time the current intermediate state probability and the current intermediate state transition probability corresponding to each candidate block involved in the hidden markov model are calculated, the current model parameters of the hidden markov model should be determined once (that is, the current initial state probability and the current state transition probability corresponding to each candidate block, respectively, and the current gaussian distribution mean and the current gaussian distribution variance corresponding to each candidate block involved in the hidden markov model are determined once).
In connection with the foregoing example, suppose the observation sequence corresponding to each candidate block related to the hidden Markov model includes the population activity data within 720 time slices; then, for the observation sequence corresponding to each candidate block, the observation sequence may be split into 15 sequence segments, each covering 48 time slices. For example, the observation sequence O_r = {O_{r,1}, O_{r,2}, O_{r,3}, …, O_{r,720}} corresponding to the r-th candidate block is split into 15 sequence segments: the 1st sequence segment is {O_{r,1}, O_{r,2}, O_{r,3}, …, O_{r,48}}, the 2nd sequence segment is {O_{r,49}, O_{r,50}, O_{r,51}, …, O_{r,96}}, and so on, until the 15th sequence segment, which is {O_{r,673}, O_{r,674}, O_{r,675}, …, O_{r,720}}.
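The segment split described here is a plain slicing of the observation sequence; a minimal sketch, assuming the sequence is held as an N × M NumPy array, is given below.

```python
import numpy as np

def split_observation_sequence(obs, segment_len=48):
    """Split an (N, M) observation sequence into consecutive segments of `segment_len` time slices."""
    N = obs.shape[0]
    return [obs[start:start + segment_len] for start in range(0, N, segment_len)]

# Hypothetical example: 720 time slices, M = 3 features -> 15 segments of 48 slices each.
obs = np.zeros((720, 3))
segments = split_observation_sequence(obs)
print(len(segments), segments[0].shape)  # 15 (48, 3)
```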
Accordingly, in the (t+1)-th iteration, the process may be as follows. For each candidate block, the 1st sequence segment in the observation sequence corresponding to the candidate block (for example, the 1st sequence segment of the r-th candidate block may be {O_{r,1}, O_{r,2}, O_{r,3}, …, O_{r,48}}) is first determined as the current target sequence segment corresponding to the candidate block; the current intermediate state probability and the current intermediate state transition probability corresponding to the candidate block are determined based on the 1st sequence segment corresponding to the candidate block and the last determined model parameters (namely, the initial state probability and the state transition probability corresponding to the candidate block obtained in the t-th iteration, and the Gaussian distribution mean and the Gaussian distribution variance commonly corresponding to the candidate blocks related to the hidden Markov model); the current initial state probability of the candidate block is then determined based on the current intermediate state probability corresponding to the candidate block, and the current state transition probability of the candidate block is determined based on the current intermediate state transition probability corresponding to the candidate block. Further, the current Gaussian distribution mean and the current Gaussian distribution variance commonly corresponding to the candidate blocks are determined based on the current intermediate state probabilities respectively corresponding to the candidate blocks related to the hidden Markov model.
Then, for each candidate block, the 2nd sequence segment in the observation sequence corresponding to the candidate block (for example, the 2nd sequence segment of the r-th candidate block may be {O_{r,49}, O_{r,50}, O_{r,51}, …, O_{r,96}}) is determined as the current target sequence segment corresponding to the candidate block; the current intermediate state probability and the current intermediate state transition probability corresponding to the candidate block are determined based on the 2nd sequence segment corresponding to the candidate block and the last determined model parameters (namely, the initial state probability and the state transition probability corresponding to the candidate block obtained from the 1st sequence segment in the (t+1)-th iteration, and the Gaussian distribution mean and the Gaussian distribution variance commonly corresponding to the candidate blocks related to the hidden Markov model); the current initial state probability of the candidate block is determined based on the current intermediate state probability corresponding to the candidate block, and the current state transition probability of the candidate block is determined based on the current intermediate state transition probability corresponding to the candidate block. Further, the current Gaussian distribution mean and the current Gaussian distribution variance commonly corresponding to the candidate blocks are determined based on the current intermediate state probabilities respectively corresponding to the candidate blocks related to the hidden Markov model.
This continues in turn until the 15th sequence segment in the observation sequence corresponding to each candidate block is determined as the current target sequence segment corresponding to the candidate block, and the similar steps are performed based on the 15th sequence segment corresponding to each candidate block: the current intermediate state probability and the current intermediate state transition probability corresponding to the candidate block are determined, the current initial state probability and the current state transition probability of the candidate block are determined, and the current Gaussian distribution mean and the current Gaussian distribution variance commonly corresponding to the candidate blocks are determined based on the current intermediate state probabilities respectively corresponding to the candidate blocks related to the hidden Markov model.
At this point, the (t+1)-th iteration is completed, and whether the iteration termination condition is met can be judged. If so, the trained hidden Markov model is obtained based on the last determined model parameters (namely, in the (t+1)-th iteration, the initial state probability and the state transition probability respectively corresponding to each candidate block obtained based on the 15th sequence segments, and the Gaussian distribution mean and the Gaussian distribution variance commonly corresponding to the candidate blocks); if not, the next iteration (i.e., the (t+2)-th iteration) is executed, and the processing procedure in the (t+2)-th iteration is similar to that in the (t+1)-th iteration, which is not repeated here.
In one embodiment, for each candidate block, the current intermediate state probability corresponding to the candidate block includes the current probabilities that the candidate block is in each hidden state in each target time slice covered by the corresponding observation sequence. In addition, the current Gaussian distribution mean μ^{(t+1)} includes the current means of the Gaussian distributions obeyed by the probabilities of generating each activity behavior feature related to the population activity data in the observation sequence under each hidden state of the hidden Markov model, i.e., μ_{k,m}^{(t+1)}, k = 1,2,3,…,K, m = 1,2,3,…,M.
Accordingly, the manner of determining the current mean of the Gaussian distribution obeyed by the probability of generating any activity behavior feature related to the population activity data in the observation sequence under any hidden state of the hidden Markov model may include the following step: determining the current mean of the Gaussian distribution obeyed by the probability of generating the activity behavior feature under the condition of the hidden state, based on the current probabilities that the candidate blocks are in the hidden state in each target time slice and the activity behavior feature related to the population activity data of the candidate blocks in each target time slice.
Specifically, the current mean μ_{k,m}^{(t+1)} of the Gaussian distribution obeyed by the probability of generating the m-th activity behavior feature under the condition of being in the k-th hidden state can be calculated by the following formula:

\mu_{k,m}^{(t+1)} = \frac{\sum_{r=1}^{R}\sum_{n=1}^{N1}\gamma_{r,n,k}\,o_{r,n,m}}{\sum_{r=1}^{R}\sum_{n=1}^{N1}\gamma_{r,n,k}}

where γ_{r,n,k} represents the current probability that the r-th candidate block is in the k-th hidden state in the n-th target time slice; o_{r,n,m} represents the m-th activity behavior feature related to the population activity data of the r-th candidate block in the n-th target time slice; R represents the total number of candidate blocks related to the hidden Markov model; and N1 represents the total number of target time slices covered by the observation sequence corresponding to the r-th candidate block.
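Pooling γ over all R candidate blocks, the shared mean update above can be sketched as follows; gammas and observations are assumed to be lists holding one array per candidate block, restricted to the target time slices.

```python
import numpy as np

def update_shared_means(gammas, observations):
    """Shared Gaussian mean update mu_{k,m}^{(t+1)}.

    gammas       : list of R arrays, each (N1, K), gamma_{r,n,k} over the target time slices
    observations : list of R arrays, each (N1, M), features o_{r,n,m} over the same time slices
    Returns a (K, M) array of updated means.
    """
    num = sum(g.T @ o for g, o in zip(gammas, observations))  # (K, M) weighted feature sums
    den = sum(g.sum(axis=0) for g in gammas)                  # (K,)  total state responsibilities
    return num / den[:, None]
```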
It should be noted that, if the observation sequence corresponding to each candidate block is split into two or more sequence segments, then in each iteration, for each candidate block, the current intermediate state probability and the current intermediate state transition probability corresponding to the candidate block are calculated once based on each sequence segment included in the observation sequence corresponding to the candidate block, and the current mean μ_{k,m}^{(t+1)} of the Gaussian distribution obeyed by the probability of generating the m-th activity behavior feature under the condition of being in the k-th hidden state is then calculated. In this case, the target time slices covered by the observation sequence corresponding to the r-th candidate block described above are the time slices covered by the current target sequence segment, that is, N1 may be equal to the total number of time slices covered by the current target sequence segment. For the example of splitting an observation sequence containing population activity data in 720 time slices into 15 sequence segments of 48 time slices each, N1 would then be equal to 48.
In addition, if the observation sequence corresponding to each candidate block is not split into two or more sequence segments, then in each iteration, for each candidate block, the current intermediate state probability and the current intermediate state transition probability corresponding to the candidate block are calculated once based on the complete observation sequence corresponding to the candidate block, and the current mean μ_{k,m}^{(t+1)} of the Gaussian distribution obeyed by the probability of generating the m-th activity behavior feature under the condition of being in the k-th hidden state is then calculated. In this case, the target time slices covered by the observation sequence corresponding to the r-th candidate block are the time slices covered by the complete observation sequence, that is, N1 may be equal to the total number of time slices covered by the complete observation sequence corresponding to the r-th candidate block. For the example of an observation sequence containing population activity data in 720 time slices, N1 would then be equal to 720.
In one embodiment, the current Gaussian distribution variance σ^{(t+1)} includes the current variances of the Gaussian distributions obeyed by the probabilities of generating each activity behavior feature related to the population activity data in the observation sequence under each hidden state of the hidden Markov model, i.e., σ_{k,m}^{(t+1)}, k = 1,2,3,…,K, m = 1,2,3,…,M.
Accordingly, the manner of determining the current variance of the Gaussian distribution obeyed by the probability of generating any activity behavior feature related to the population activity data in the observation sequence under any hidden state of the hidden Markov model may include the following step: determining the current variance of the Gaussian distribution obeyed by the probability of generating the activity behavior feature under the condition of the hidden state, based on the current probabilities that the candidate blocks are in the hidden state in each target time slice, the activity behavior feature related to the population activity data of the candidate blocks in each target time slice, and the current mean of the Gaussian distribution obeyed by the probability of generating the activity behavior feature under the condition of the hidden state.
Specifically, the current variance σ_{k,m}^{(t+1)} of the Gaussian distribution obeyed by the probability of generating the m-th activity behavior feature under the condition of being in the k-th hidden state can be calculated by the following formula:

\sigma_{k,m}^{(t+1)} = \frac{\sum_{r=1}^{R}\sum_{n=1}^{N1}\gamma_{r,n,k}\,(o_{r,n,m}-\mu_{k,m}^{(t+1)})^{2}}{\sum_{r=1}^{R}\sum_{n=1}^{N1}\gamma_{r,n,k}}

where the definitions of the parameters γ_{r,n,k}, o_{r,n,m}, μ_{k,m}^{(t+1)}, R and N1 are the same as those described above and are not repeated here.
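The shared variance update can be sketched in the same way, reusing the means just updated; the same naming assumptions apply.

```python
import numpy as np

def update_shared_variances(gammas, observations, mu_new):
    """Shared Gaussian variance update sigma_{k,m}^{(t+1)}.

    gammas       : list of R arrays, each (N1, K)
    observations : list of R arrays, each (N1, M)
    mu_new       : (K, M) means mu_{k,m}^{(t+1)} from the current iteration
    Returns a (K, M) array of updated variances.
    """
    K, M = mu_new.shape
    num = np.zeros((K, M))
    den = np.zeros(K)
    for g, o in zip(gammas, observations):
        diff_sq = (o[:, None, :] - mu_new[None, :, :]) ** 2  # (N1, K, M) squared deviations
        num += (g[:, :, None] * diff_sq).sum(axis=0)         # weight by gamma, sum over time
        den += g.sum(axis=0)
    return num / den[:, None]
```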
In one embodiment, as shown in FIG. 8, a method for determining the functional type of a neighborhood is provided. The method is applied to a computer device (such as the terminal 210 or the server 220 in fig. 2) for example. The method may include the following steps S802 to S810.
S802, acquiring observation sequences corresponding to the candidate blocks related to the hidden Markov model.
S804, based on the initial state probability corresponding to each candidate block in the hidden Markov model, the state transition probability corresponding to each candidate block, the Gaussian distribution mean value corresponding to each candidate block together and the Gaussian distribution variance corresponding to each candidate block together, respectively determining the local probability of each candidate block in each hidden state of the hidden Markov model in each time slice covered by the observation sequence, and based on each local probability, determining the reverse pointer corresponding to each local probability respectively.
S806, based on the maximum local probability of the local probabilities of the candidate blocks in the hidden states in the last time slice covered by the observation sequence, the hidden state of each candidate block in the last time slice is determined.
S808, performing optimal path backtracking based on the hidden state of each candidate block in the last time slice and each backward pointer, so as to obtain the hidden state sequence corresponding to each candidate block.
S810, clustering based on the hidden state sequences respectively corresponding to the candidate blocks, and determining the function type to which each candidate block belongs from the candidate function types based on the clustering result.
The function type can be used to represent the functions undertaken by a block. The candidate function types may be preset based on actual needs, for example, tourist attractions, residential areas, general areas, business areas, schools, composite areas, companies, and others.
In this embodiment, the hidden state sequences corresponding to the candidate blocks related to the hidden markov model are determined by the hidden state sequence determining method provided in any embodiment of the present application, and then clustering is performed based on the hidden state sequences corresponding to the candidate blocks, and the function types to which the candidate blocks belong are determined from the candidate function types based on the clustering result.
Specifically, after the hidden state sequence S*_r corresponding to each candidate block is determined, where r = 1,2,3,…,R and R represents the total number of candidate blocks related to the hidden Markov model, the sequence distance between every two hidden state sequences is determined, that is, the sequence distances between the hidden state sequence corresponding to the 1st candidate block and the hidden state sequences corresponding to the 2nd candidate block, the 3rd candidate block, …, and the R-th candidate block, respectively, the sequence distances between the hidden state sequence corresponding to the 2nd candidate block and the hidden state sequences corresponding to the 3rd candidate block, the 4th candidate block, …, and the R-th candidate block, respectively, and so on. Then, the candidate blocks are clustered based on the sequence distances between every two hidden state sequences through a K-means clustering algorithm, thereby determining a plurality of clusters, where each cluster corresponds to a candidate function type. For each candidate block, the cluster to which the candidate block belongs is determined, so that the function type to which the candidate block belongs is determined.
In addition, the manner of determining the sequence distance between two hidden state sequences may specifically be as follows: calculate the state distance between the hidden states corresponding to the same time slice in the two hidden state sequences by means of the Euclidean distance, and then determine the sequence distance between the two hidden state sequences based on the state distances within the time slices. Specifically, the ratio of the sum of the state distances between the hidden states within the time slices to the total number of time slices may be taken as the sequence distance between the two hidden state sequences.
For example, take the hidden state sequence S*_1 = {s*_{1,1}, s*_{1,2}, s*_{1,3}, …, s*_{1,N}} corresponding to the 1st candidate block and the hidden state sequence S*_2 = {s*_{2,1}, s*_{2,2}, s*_{2,3}, …, s*_{2,N}} corresponding to the 2nd candidate block. The state distance d1 between s*_{1,1} and s*_{2,1}, the state distance d2 between s*_{1,2} and s*_{2,2}, the state distance d3 between s*_{1,3} and s*_{2,3}, …, and the state distance dN between s*_{1,N} and s*_{2,N} are calculated by means of the Euclidean distance, and the sequence distance between the hidden state sequence S*_1 and the hidden state sequence S*_2 is then determined based on d1, d2, d3, …, and dN. Specifically, the sequence distance between the hidden state sequence S*_1 and the hidden state sequence S*_2 may be (d1 + d2 + d3 + … + dN) / N.
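For illustration only, the state distance, the sequence distance and a pairwise distance matrix over all candidate blocks can be sketched as below. Taking the state distance as the Euclidean distance between the Gaussian mean vectors of the two hidden states is an assumption of this sketch (one possible reading of the embodiment), as is every name used; the K-means-style clustering performed on top of such distances is not shown.

```python
import numpy as np

def state_distance(k1, k2, mu):
    """Euclidean distance between two hidden states, taken here (as an assumption)
    as the distance between their Gaussian mean vectors mu[k1] and mu[k2]."""
    return float(np.linalg.norm(mu[k1] - mu[k2]))

def sequence_distance(seq1, seq2, mu):
    """Average per-time-slice state distance between two hidden state sequences of equal length N."""
    dists = [state_distance(k1, k2, mu) for k1, k2 in zip(seq1, seq2)]
    return sum(dists) / len(dists)

def pairwise_sequence_distances(sequences, mu):
    """Symmetric R x R matrix of sequence distances between the blocks' hidden state sequences."""
    R = len(sequences)
    D = np.zeros((R, R))
    for i in range(R):
        for j in range(i + 1, R):
            D[i, j] = D[j, i] = sequence_distance(sequences[i], sequences[j], mu)
    return D
```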
It should be noted that, determining the function type to which the candidate block belongs can provide reference for city planning and city infrastructure, and can also directly guide the introduction of new points of interest and the location of shops.
It should be understood that, although the steps in the flowcharts referred to in the foregoing embodiments are shown in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated otherwise, the execution order of the steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in each flowchart may include multiple sub-steps or multiple stages, which are not necessarily performed at the same time but may be performed at different times, and these sub-steps or stages are not necessarily performed sequentially, but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The functional characteristics of the technical scheme provided by the present application are explained below in combination with actual tests. The technical scheme provided by the present application was tested on the population activity data of 2,000,000 users in 665 central blocks of Beijing in April 2018. The population activity data in this one-month period was divided into a training set and a test set: the population activity data of the first three weeks formed the training set, which was used as the existing observation sequences to learn the model parameters of the hidden Markov model, and the test set was used to verify the performance of the trained hidden Markov model.
First, in the actual tests, 100 hidden states as shown in fig. 5 were learned on the 665 central blocks of Beijing. To demonstrate the ability of the technical scheme in the present application to discover hidden states and to reveal the dynamics of blocks within a city, a series of specific examples and detailed explanations are given below.
Fig. 9 shows the mean values corresponding to the hidden states that frequently appear in the hidden state sequences corresponding to the blocks, and shows the transition situations of the hidden states on workdays and non-workdays, respectively. It can be understood that the normal weekends in April and the three-day Tomb-Sweeping Day (Qingming Festival) holiday on April 5 (Thursday), April 6 (Friday) and April 7 (Saturday) are non-workdays, while the normal workdays and April 8 (Sunday) are workdays.
First, consider the discovered hidden states. Each hidden state shown in fig. 9 has semantics in two aspects: (1) population density and population flow, for example, hidden state 32 indicates high population density and high population flow, state 21 indicates low population flow and high population density, and state 17 indicates low population density and low population flow; (2) the access frequency of different types of points of interest, for example, hidden state 31 indicates that the most frequently visited type of point of interest is the education type, and state 21 indicates that the most frequently visited type of point of interest is the attraction type. As shown in fig. 10 (a), hidden state 79 appears in the block during the day because Tsinghua University occupies most of the area of the block, while hidden state 99 appears in the block where Peking University is located.
Next, consider the dynamics represented by the state transition processes. It is apparent from fig. 10 that the dynamics of blocks within a city are periodic, since the states in the same time period on different days are generally the same. It is also noted that the dynamics of some regions, as shown in fig. 9 (f), differ greatly between workdays and non-workdays, while those of other regions, as shown in fig. 9 (c), are very similar on workdays and non-workdays.
Taking the dynamics of the block where Tsinghua University is located as an example, as shown in fig. 9 (a), there are fewer people at night than in the daytime, because the mean values of hidden state 70 and hidden state 31 are smaller than that of hidden state 79. Furthermore, on workdays, sudden crowd movements occur, because hidden state 32 appears at 8:00-9:00 and 17:00-19:00. The transitions from hidden state 70 to hidden state 32 and from hidden state 32 to hidden state 79 on workdays reveal a dynamic feature: only students live in the area at night, and more teachers enter the school in the morning, so the population is denser than at night. In comparison with (a) and (b) in fig. 9, fig. 9 (c) shows that the population density is consistently high and the most frequently visited type of point of interest is the attraction type, whether on workdays or non-workdays.
In addition, the performance of the technical scheme provided by the present application in determining function types within a city was further evaluated in the actual tests. Fig. 10 shows the clustering results for the blocks and the geographical distribution of the corresponding areas. The technical scheme obtains 8 function types on this data set, namely tourist attractions, residential areas, general areas, business areas, schools, composite areas, companies, and others, and the functions of some blocks (including the blocks shown in fig. 9) were verified on the map by manual labeling, which shows that the technical scheme provided by the present application can effectively determine the function types within a city. Furthermore, this result was compared in the actual tests with a state-of-the-art function type determination method, namely an LDA model (Latent Dirichlet Allocation model) using points of interest and mobility; the result of the actual tests is similar to the processing result of the LDA model, with a Normalized Mutual Information (NMI) of 0.25 (range from -0.5 to 1). In conclusion, blocks with more common states and similar state transition processes are more likely to have the same functions, which proves that the technical scheme of the present application can infer the distribution of functional blocks in the whole city.
Meanwhile, the performance of the technical scheme in predicting population mobility behavior was evaluated in the actual tests. The prediction results are shown in fig. 11; (a) in fig. 11 illustrates the difference between the predicted value and the actual value of the number of persons staying in the block where Tsinghua University is located from April 22 to April 30, 2018.
In order to further demonstrate the superiority of the technical scheme of the present application, it was compared with a common hidden Markov model in the actual tests. The comparison results of the indexes are shown in (b) in fig. 11: the average RMSE (Root Mean Square Error) of the population flow prediction is 0.195, and the Top-3 accuracy when predicting the most frequently visited point of interest is 41.4%, so the technical scheme in the present application is obviously superior to the common hidden Markov model. In summary, the technical scheme in the present application can be effectively applied to people-flow prediction and frequently-visited point-of-interest prediction for blocks within a city.
In one embodiment, as shown in fig. 12, a determination apparatus 1200 for a hidden state sequence is provided. The apparatus may include the following modules 1202 to 1208.
A first observation sequence obtaining module 1202, configured to obtain an observation sequence corresponding to a target block.
A first intermediate parameter determining module 1204, configured to determine, based on the observation sequence, an initial state probability corresponding to the target block in the hidden markov model, a state transition probability corresponding to the target block, a gaussian distribution mean value corresponding to each candidate block related to the hidden markov model, and a gaussian distribution variance corresponding to each candidate block, local probabilities that the target block is in each hidden state of the hidden markov model within each time slice covered by the observation sequence, respectively, and determine reverse pointers corresponding to each local probability, respectively.
A first end hidden state determining module 1206, configured to determine, based on a maximum local probability of local probabilities that the target block is in each hidden state in a last time slice covered by the observation sequence, a hidden state in which the target block is located in the last time slice.
The first hidden state sequence determining module 1208 is configured to perform optimal path backtracking based on the hidden state of the target block in the last time slice and each reverse pointer, so as to obtain a hidden state sequence.
In one embodiment, as shown in fig. 13, a determination apparatus 1300 of the functional type of the neighborhood is provided. The apparatus may include the following modules 1302 to 1310.
A second observation sequence obtaining module 1302, configured to obtain observation sequences corresponding to candidate blocks related to the hidden markov model, respectively;
a second intermediate parameter determining module 1304, configured to determine local probabilities of the candidate blocks in hidden states of the hidden markov model within time slices covered by the observation sequence based on initial state probabilities corresponding to the candidate blocks, state transition probabilities corresponding to the candidate blocks, gaussian distribution mean values corresponding to the candidate blocks, and gaussian distribution variances corresponding to the candidate blocks, respectively, and determine reverse pointers corresponding to the local probabilities based on the local probabilities;
a second end hidden state determining module 1306, configured to determine hidden states of the candidate blocks in a last time slice covered by the observation sequence based on a maximum local probability of local probabilities of the candidate blocks in the hidden states in the last time slice;
a second hidden state sequence determining module 1308, configured to perform optimal path backtracking based on the hidden state of each candidate block in the last time slice and each backward pointer, so as to obtain a hidden state sequence corresponding to each candidate block;
the function type determining module 1310 is configured to perform clustering based on the hidden state sequences corresponding to the candidate blocks, and determine the function type to which each candidate block belongs from the candidate function types based on the clustering result.
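As an illustration of the clustering step performed by the function type determining module 1310, the following is a minimal sketch that summarizes each block's hidden state sequence as a state-occupancy histogram and groups the blocks with k-means; the helper name, the histogram summary, and the choice of one cluster per candidate function type are assumptions of the sketch, not requirements of the disclosure.

```python
import numpy as np
from sklearn.cluster import KMeans

def function_types_from_state_sequences(state_seqs, n_states, n_types):
    """Cluster candidate blocks by their hidden-state usage.

    state_seqs: list of per-block hidden state sequences (ints in [0, n_states))
    n_states:   number of hidden states in the hidden Markov model
    n_types:    number of candidate function types (one cluster per type here)
    """
    # Summarize each sequence as the fraction of time spent in each hidden state.
    histograms = np.array([np.bincount(np.asarray(seq), minlength=n_states) / len(seq)
                           for seq in state_seqs])
    labels = KMeans(n_clusters=n_types, n_init=10, random_state=0).fit_predict(histograms)
    return labels  # cluster index per block; each cluster is then assigned a function type
```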
It should be noted that, for the specific definition of the technical features in the hidden state sequence determination apparatus 1200, reference may be made to the definition of the hidden state sequence determination method above, and for the specific definition of the technical features in the block function type determination apparatus 1300, reference may be made to the definition of the block function type determination method above; details are not repeated here. The modules in the above apparatuses may be implemented in whole or in part by software, by hardware, or by a combination of the two. The modules may be embedded, in hardware form, in or independent of a processor of the computer device, or stored, in software form, in a memory of the computer device, so that the processor can invoke them and execute the operations corresponding to the modules.
In an embodiment, a computer device is provided, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the above method of determining a hidden state sequence and/or the above method of determining a functional type of a block.
An internal block diagram of a computer device in one embodiment is shown in fig. 14. The computer device may specifically be the server 220 in fig. 2. As shown in fig. 14, the computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor is configured to provide computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory: the non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The network interface is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements the above method of determining a hidden state sequence and/or the above method of determining a functional type of a block.
Those skilled in the art will appreciate that the structure shown in fig. 14 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution of the present application is applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, taking the hidden state sequence determining apparatus 1200 provided in the present application as an example, the apparatus may be implemented in the form of a computer program, and the computer program may run on a computer device as shown in fig. 14. The memory of the computer device may store the program modules constituting the hidden state sequence determining apparatus 1200, such as the first observation sequence obtaining module 1202, the first intermediate parameter determining module 1204, the first end hidden state determining module 1206, and the first hidden state sequence determining module 1208 shown in fig. 12. The computer program constituted by these program modules causes the processor to execute the steps of the hidden state sequence determination method of the embodiments of the present application described in this specification.
For example, the computer device shown in fig. 14 may execute step S302 through the first observation sequence obtaining module 1202 in the hidden state sequence determining apparatus 1200 shown in fig. 12, execute step S304 through the first intermediate parameter determining module 1204, and so on.
It will be understood by those skilled in the art that all or part of the processes of the methods in the above embodiments may be implemented by a computer program, which may be stored in a non-volatile computer-readable storage medium and which, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, a database, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Accordingly, in an embodiment, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, causes the processor to perform the above method of determining a hidden state sequence and/or the above method of determining a functional type of a block.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of this specification.
The above embodiments express only several implementations of the present application, and their descriptions are relatively specific and detailed, but they should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art may make several variations and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (15)

1. A method of determining a sequence of hidden states, comprising:
acquiring an observation sequence corresponding to a target block;
determining, based on the observation sequence, the initial state probability corresponding to the target block in the hidden Markov model, the state transition probability corresponding to the target block, and the Gaussian distribution mean and Gaussian distribution variance jointly corresponding to the candidate blocks related to the hidden Markov model, local probabilities that the target block is in each hidden state of the hidden Markov model within each time slice covered by the observation sequence, and determining a back pointer corresponding to each local probability;
determining the hidden state of the target block in the last time slice covered by the observation sequence based on the maximum local probability of the local probabilities of the target block in the hidden states in the last time slice;
and performing optimal path backtracking based on the hidden state of the target block in the last time slice and each back pointer to obtain a hidden state sequence.
2. The method of claim 1, wherein the acquiring of the observation sequence corresponding to the target block comprises:
acquiring an original observation sequence corresponding to the target block, wherein the original observation sequence comprises original population activity data of the target block in more than two time slices, and the activity behavior features involved in each piece of original population activity data comprise a population flow number and a visit frequency for points of interest of a preset type;
and performing maximum-value normalization on the population flow number in each piece of original population activity data and on the TF-IDF parameter corresponding to the visit frequency for the points of interest of the preset type in each piece of original population activity data, to obtain the observation sequence corresponding to the target block.
3. The method of claim 1, wherein determining the local probability that the target block is in any hidden state of the hidden Markov model within any time slice covered by the observation sequence comprises:
determining, based on the population activity data in the time slice in the observation sequence, the Gaussian distribution mean jointly corresponding to the candidate blocks related to the hidden Markov model, and the Gaussian distribution variance jointly corresponding to the candidate blocks, an emission probability of generating the population activity data in the time slice in the observation sequence under the condition that the target block is in the hidden state in the time slice;
and determining the local probability of the target block being in the hidden state in the time slice based on the local probabilities that the target block is in each hidden state of the hidden Markov model in the previous time slice adjacent to the time slice, the state transition probability corresponding to the target block in the hidden Markov model, and the emission probability;
wherein the local probability of the target block being in the hidden state in the first time slice covered by the observation sequence is determined based on the probability corresponding to the hidden state in the initial state probability corresponding to the target block and the emission probability of generating the population activity data in the first time slice in the observation sequence under the condition that the target block is in the hidden state in the first time slice.
4. The method according to claim 3, wherein the Gaussian distribution mean comprises a mean of the Gaussian distribution obeyed by the probability of generating each activity behavior feature involved in the population activity data of the observation sequence under the condition of each hidden state of the hidden Markov model, and the Gaussian distribution variance comprises a variance of the Gaussian distribution obeyed by the probability of generating each activity behavior feature involved in the population activity data of the observation sequence under the condition of each hidden state of the hidden Markov model;
the determining, based on the population activity data in the time slice in the observation sequence, the Gaussian distribution mean jointly corresponding to the candidate blocks related to the hidden Markov model, and the Gaussian distribution variance jointly corresponding to the candidate blocks, of the emission probability of generating the population activity data in the time slice in the observation sequence under the condition that the target block is in the hidden state in the time slice comprises:
determining the emission probability of the target block generating the population activity data in the time slice under the condition of being in the hidden state of the hidden Markov model, based on the mean and the variance of the Gaussian distribution obeyed by the probability of generating each activity behavior feature involved in the population activity data of the observation sequence under the condition of being in the hidden state, and on the population activity data of the target block in the time slice.
5. The method of claim 1, wherein the training of the hidden Markov model comprises:
acquiring observation sequences corresponding to the candidate blocks respectively;
in the current round of iteration, determining the current intermediate state probability and the current intermediate state transition probability respectively corresponding to each candidate block based on the observation sequence respectively corresponding to each candidate block, the initial state probability and the state transition probability respectively corresponding to each candidate block determined last time, the Gaussian distribution mean value commonly corresponding to each candidate block and the Gaussian distribution variance commonly corresponding to each candidate block;
determining the current initial state probability of each candidate block based on each current intermediate state probability, and determining the current state transition probability of each candidate block based on each current intermediate state transition probability;
determining, based on the current intermediate state probabilities, a current Gaussian distribution mean and a current Gaussian distribution variance jointly corresponding to the candidate blocks;
and when an iteration termination condition is met, obtaining the hidden Markov model based on the finally determined initial state probability and state transition probability respectively corresponding to each candidate block and the finally determined Gaussian distribution mean and Gaussian distribution variance jointly corresponding to the candidate blocks.
6. The method of claim 5, wherein the current intermediate state probability corresponding to a candidate block comprises a current probability that the candidate block is in each hidden state in each target time slice among the time slices covered by the corresponding observation sequence, and the current Gaussian distribution mean comprises a current mean of the Gaussian distribution obeyed by the probability of generating each activity behavior feature involved in the population activity data of the observation sequence under the condition of each hidden state of the hidden Markov model;
a manner of determining the current mean of the Gaussian distribution obeyed by the probability of generating any activity behavior feature involved in the population activity data of the observation sequence under the condition of any hidden state of the hidden Markov model comprises:
determining the current mean of the Gaussian distribution obeyed by the probability of generating the activity behavior feature under the condition of being in the hidden state, based on the current probabilities that the candidate blocks are in the hidden state in the target time slices and the activity behavior feature involved in the population activity data of the candidate blocks in the target time slices.
7. The method according to claim 6, wherein the current Gaussian distribution variance comprises a current variance of the Gaussian distribution obeyed by the probability of generating each activity behavior feature involved in the population activity data of the observation sequence under the condition of each hidden state of the hidden Markov model;
a manner of determining the current variance of the Gaussian distribution obeyed by the probability of generating any activity behavior feature involved in the population activity data of the observation sequence under the condition of any hidden state of the hidden Markov model comprises:
determining the current variance of the Gaussian distribution obeyed by the probability of generating the activity behavior feature under the condition of being in the hidden state, based on the current probabilities that the candidate blocks are in the hidden state in the target time slices, the activity behavior feature involved in the population activity data of the candidate blocks in the target time slices, and the current mean of the Gaussian distribution obeyed by the probability of generating the activity behavior feature under the condition of being in the hidden state.
8. The method according to any one of claims 5 to 7, wherein the determining, in the current iteration, of the current intermediate state probability and the current intermediate state transition probability respectively corresponding to each candidate block based on the observation sequence respectively corresponding to each candidate block, the initial state probability and the state transition probability respectively corresponding to each candidate block determined last time, the Gaussian distribution mean jointly corresponding to the candidate blocks, and the Gaussian distribution variance jointly corresponding to the candidate blocks comprises:
in the current iteration, determining a current target sequence segment respectively corresponding to each candidate block, wherein the current target sequence segment corresponding to a candidate block is a sequence segment that has not yet been used as a target sequence segment in the current iteration, among the sequence segments included in the observation sequence corresponding to the candidate block;
determining the current intermediate state probability and the current intermediate state transition probability respectively corresponding to each candidate block based on each current target sequence segment, the initial state probability and the state transition probability respectively corresponding to each candidate block determined last time, and the Gaussian distribution mean and the Gaussian distribution variance jointly corresponding to the candidate blocks;
after the determining, based on the current intermediate state probabilities respectively corresponding to the candidate blocks, of the current Gaussian distribution mean and the current Gaussian distribution variance jointly corresponding to the candidate blocks, the method further comprises:
returning to the step of determining the current target sequence segments respectively corresponding to the candidate blocks, until all sequence segments included in the observation sequences corresponding to the candidate blocks have been used as target sequence segments in the current iteration, and then judging whether an iteration termination condition is met.
9. The method of claim 1, further comprising, after determining the hidden state of the target block within the last time slice:
and predicting the hidden state of the target block in the time slice next to the last time slice based on the hidden state of the target block in the last time slice covered by the observation sequence and the state transition probability corresponding to the target block in the hidden Markov model.
10. The method of claim 9, further comprising, after predicting the hidden state of the target block in a time slice next to the last time slice:
and predicting the population activity data of the target block in the time slice next to the last time slice based on the predicted hidden state of the target block in that time slice and the Gaussian distribution mean in the hidden Markov model.
11. A method for determining a functional type of a block, comprising:
acquiring observation sequences corresponding to all candidate blocks related to the hidden Markov model;
determining, based on the initial state probabilities respectively corresponding to the candidate blocks in the hidden Markov model, the state transition probabilities respectively corresponding to the candidate blocks, and the Gaussian distribution mean and Gaussian distribution variance jointly corresponding to the candidate blocks, local probabilities that each candidate block is in each hidden state of the hidden Markov model within each time slice covered by the corresponding observation sequence, and determining, based on the local probabilities, a back pointer corresponding to each local probability;
determining the hidden state of each candidate block in the last time slice covered by the corresponding observation sequence based on the maximum local probability among the local probabilities that the candidate block is in each hidden state in the last time slice;
performing optimal path backtracking based on the hidden state of each candidate block in the last time slice and each back pointer to obtain a hidden state sequence corresponding to each candidate block;
and performing clustering based on the hidden state sequences respectively corresponding to the candidate blocks, and determining, from the candidate function types and based on a clustering result, the function type to which each candidate block belongs.
12. An apparatus for determining a sequence of hidden states, comprising:
the first observation sequence acquisition module is used for acquiring an observation sequence corresponding to a target block;
a first intermediate parameter determining module, configured to determine, based on the observation sequence, an initial state probability corresponding to the target block in a hidden Markov model, a state transition probability corresponding to the target block, and a Gaussian distribution mean and a Gaussian distribution variance jointly corresponding to candidate blocks related to the hidden Markov model, local probabilities that the target block is in each hidden state of the hidden Markov model within each time slice covered by the observation sequence, and to determine a back pointer corresponding to each local probability;
a first end hidden state determining module, configured to determine, based on a maximum local probability of local probabilities that the target block is in each hidden state in a last time slice covered by the observation sequence, a hidden state in which the target block is located in the last time slice;
and the first hidden state sequence determining module is used for performing optimal path backtracking on the basis of the hidden state of the target block in the last time slice and each back pointer to obtain a hidden state sequence.
13. An apparatus for determining a functional type of a block, comprising:
the second observation sequence acquisition module is used for acquiring observation sequences corresponding to the candidate blocks related to the hidden Markov model;
a second intermediate parameter determining module, configured to determine, based on an initial state probability corresponding to each candidate block in the hidden Markov model, a state transition probability corresponding to each candidate block, and a Gaussian distribution mean and a Gaussian distribution variance jointly corresponding to the candidate blocks, local probabilities that each candidate block is in each hidden state of the hidden Markov model within each time slice covered by the corresponding observation sequence, and to determine, based on the local probabilities, a back pointer corresponding to each local probability;
a second end hidden state determining module, configured to determine, based on a maximum local probability of local probabilities of the candidate blocks being in the hidden states in a last time slice covered by the observation sequence, a hidden state of the candidate blocks in the last time slice;
a second hidden state sequence determining module, configured to perform optimal path backtracking based on the hidden state of each candidate block in the last time slice and each backward pointer, so as to obtain a hidden state sequence corresponding to each candidate block;
and the function type determining module is used for clustering based on the hidden state sequences respectively corresponding to the candidate blocks and respectively determining the function type of each candidate block from the candidate function types based on the clustering result.
14. A computer-readable storage medium, storing a computer program which, when executed by a processor, causes the processor to carry out the steps of the method according to any one of claims 1 to 11.
15. A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the method according to any one of claims 1 to 11.
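The following sketches are illustrative only and do not limit the claims. First, a minimal reading of the observation construction in claim 2, assuming the raw data per time slice consist of a population flow number and visit counts per preset point-of-interest type; the particular TF-IDF weighting and the function names shown here are assumptions of the sketch.

```python
import numpy as np

def build_observation_sequence(flow_counts, visit_counts):
    """flow_counts:  (T,)   raw population flow number per time slice for one block
    visit_counts: (T, P) visit counts per preset POI type and time slice
    Returns a (T, 1 + P) observation sequence with max-value-normalized features."""
    flow = np.asarray(flow_counts, float)
    visits = np.asarray(visit_counts, float)
    # TF: share of visits per POI type within a time slice;
    # IDF: down-weight POI types visited in almost every time slice.
    tf = visits / np.maximum(visits.sum(axis=1, keepdims=True), 1.0)
    df = (visits > 0).sum(axis=0)
    idf = np.log((len(visits) + 1) / (df + 1)) + 1.0
    tfidf = tf * idf
    # Maximum-value normalization of both feature groups, as in claim 2.
    flow_n = flow / max(flow.max(), 1e-12)
    tfidf_n = tfidf / max(tfidf.max(), 1e-12)
    return np.column_stack([flow_n, tfidf_n])
```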
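For reference, the emission probability and the local-probability recursion described in claims 3 and 4 can be written compactly as below, assuming the D activity behavior features are conditionally independent given the hidden state; the symbols pi_j, a_ij, mu_{j,d}, sigma^2_{j,d}, delta_t(j), and psi_t(j) are our notation, not the claims'.

```latex
b_j(o_t) \;=\; \prod_{d=1}^{D}
  \frac{1}{\sqrt{2\pi\sigma_{j,d}^{2}}}
  \exp\!\Bigl(-\frac{(o_{t,d}-\mu_{j,d})^{2}}{2\sigma_{j,d}^{2}}\Bigr),
\qquad
\delta_{1}(j) = \pi_{j}\, b_j(o_1),
\qquad
\delta_{t}(j) = \Bigl[\max_{i}\ \delta_{t-1}(i)\, a_{ij}\Bigr]\, b_j(o_t),
\qquad
\psi_{t}(j) = \arg\max_{i}\ \delta_{t-1}(i)\, a_{ij}.
```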
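Next, a minimal sketch of the pooled Gaussian re-estimation in claims 6 and 7, assuming gammas[b][t, j] holds the current probability that candidate block b is in hidden state j in target time slice t; the gamma-weighted mean and variance below are the standard tied-Gaussian updates and represent one reading of the claims, with illustrative names.

```python
import numpy as np

def update_shared_gaussians(gammas, observations, eps=1e-6):
    """Re-estimate the Gaussian mean/variance jointly corresponding to all candidate blocks.

    gammas:       list of (T_b, N) arrays; gammas[b][t, j] = current probability that
                  block b is in hidden state j in target time slice t
    observations: list of (T_b, D) arrays of population activity features per block
    Returns (means, variances), each of shape (N, D).
    """
    N = gammas[0].shape[1]
    D = observations[0].shape[1]
    weight_sum = np.zeros(N)
    weighted_obs = np.zeros((N, D))
    for g, o in zip(gammas, observations):
        weight_sum += g.sum(axis=0)
        weighted_obs += g.T @ o
    means = weighted_obs / (weight_sum[:, None] + eps)
    weighted_sq = np.zeros((N, D))
    for g, o in zip(gammas, observations):
        diff = o[:, None, :] - means[None, :, :]          # (T_b, N, D)
        weighted_sq += (g[:, :, None] * diff ** 2).sum(axis=0)
    variances = weighted_sq / (weight_sum[:, None] + eps)
    return means, variances
```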
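Finally, a minimal sketch of the forecasting steps in claims 9 and 10, assuming the predicted hidden state is the most probable successor under the target block's transition row and the predicted population activity data is the Gaussian mean of that state; variable names are illustrative.

```python
import numpy as np

def predict_next(last_state, A, means):
    """Predict the hidden state and population activity data for the time slice
    following the last time slice covered by the observation sequence.

    last_state: hidden state of the target block in the last time slice
    A:          (N, N) state transition probabilities for the target block
    means:      (N, D) Gaussian distribution means in the hidden Markov model
    """
    next_state = int(np.argmax(A[last_state]))   # most probable successor state
    predicted_activity = means[next_state]       # expected activity features in that state
    return next_state, predicted_activity
```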