CN106777091A - Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment - Google Patents
Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment
- Publication number
- CN106777091A (application CN201611150486.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- medical
- attribute
- skyline
- filtering
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
- G06F19/324—
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment, intended to solve the problem that result sets returned over existing mobile networks are too large. Technical essentials: the system comprises a doctor terminal, a client terminal and a cloud. The doctor terminal collects real-time medical stream data in the O2O environment, and the client terminal initiates a query request that contains several related medical attributes, the degree of preference for each medical attribute, and an input threshold for each attribute. Based on the medical stream data and the query request, the cloud extracts in parallel, from the full space, the subspace of the medical attributes named in the query request. Effects: feedback data can be screened through continuous interaction in the form of data streams; accuracy is high; the system suits big-data environments; and the interaction platform offered to the user side and the data side provides accurate, satisfactory information, improving both the accuracy of the user's final decision and the speed of the decision-making experience.
Description
Technical field
The invention belongs to the field of database research. It is a Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment, and relates to large-scale data analysis, massive data processing in a mobile O2O environment, and intelligent medical data processing and application development.
Background art
With the rapid development of the Internet, mobile O2O has emerged: offline business opportunities are combined with the Internet, so that the Internet becomes the front desk of transactions. As living standards improve, people pay ever more attention to health, and medical retrieval is increasingly common. The explosive growth of massive data means that traditional single-machine data analysis and processing techniques are increasingly unable to meet the demands of data-intensive analysis and processing. The subspace Skyline algorithm, an important variant of the Skyline query, is widely applied in fields such as multi-factor decision-making. With the development of the mobile Internet, subspace Skyline queries in mobile Internet environments have attracted much attention. A subspace Skyline algorithm performs the Skyline query on the set of attributes the user is interested in; how to obtain the Skyline result of an arbitrary subspace efficiently remains a challenging task.
In medical research, several medical factors frequently act together in a single typical case. For a patient with a tumor, for example, choosing between excising the lesion and conservative treatment usually requires a comprehensive assessment of all the factors involved and of their joint effect: how well the patient's age allows them to tolerate surgery, the likelihood that the tumor cells will spread, and the patient's own preference. The historical case data and the various kinds of observed medical data associated with these factors are also very large in volume. This calls for an effective filtering and retrieval method, based on multiple medical factors, for large-scale data.
To address the high cost of computing Skylines over massive medical data, Skyline processing in a distributed environment is a good solution. The medical retrieval result set fed back in a mobile O2O environment is too large, that is, Skyline points that deviate from the user's preferences are returned as results and affect the user's final decision. We designed and implemented the present invention to solve this problem.
Summary of the invention
In view of the defects and deficiencies in the background art described above, the invention provides a Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment. It solves the problem that result sets under existing mobile networks are too large, that is, that Skyline points deviating from the user's preferences are returned as results and affect the user's final decision. The invention also improves on deficiencies of existing subspace Skyline query methods in order to raise accuracy and real-time performance.
To achieve these goals, the technical solution adopted by the invention is a Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment, comprising a doctor terminal, a client terminal and a cloud. The doctor terminal collects real-time medical stream data in the O2O environment. The client terminal initiates a query request that contains several related medical attributes, the degree of preference for each medical attribute, and an input threshold for each attribute. Based on the medical stream data and the query request, the cloud extracts in parallel from the full space the k-dimensional (k ≤ d) subspace of the medical attributes named in the query request, where d is the dimensionality of the full space and k is the dimensionality of the subspace data, and sorts the k-dimensional subspace by the client's degree of preference for each medical attribute to obtain an ordered k-dimensional grid index. The cloud scans the data into a k-dimensional grid that partitions the user-defined space unequally and, according to the threshold of each attribute, prunes using the grid dominance relation on a single dimension: grid cells on that dimension that are dominated in the Skyline sense are cut away together with the data they contain, which filters the data. The cloud then runs a subspace Skyline query on the remaining data to obtain the Skyline result for the medical attributes the user requested; during this query, the parallel stream data are bidirectionally filtered and merged.
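The grid pruning step can be illustrated with a minimal Python sketch. It assumes smaller values are better on every attribute, uses hypothetical split values for the unequal partition, and applies cell dominance across all dimensions at once; the text does not specify the single-dimension variant in enough detail to reproduce it, so this is an approximation rather than the patented procedure. All function names are illustrative.

```python
from bisect import bisect_right
from collections import defaultdict


def cell_of(point, boundaries):
    """Map a point to its grid cell; boundaries[i] is the sorted list of
    (possibly unequally spaced) split values on dimension i."""
    return tuple(bisect_right(b, v) for b, v in zip(boundaries, point))


def build_grid(points, boundaries):
    """Scan the points once and bucket them by grid cell."""
    grid = defaultdict(list)
    for p in points:
        grid[cell_of(p, boundaries)].append(p)
    return dict(grid)


def cell_dominates(a, b):
    """With smaller-is-better values, cell a dominates cell b when a's index
    is strictly lower on every dimension: every point in a then beats every
    point in b on every attribute."""
    return all(x < y for x, y in zip(a, b))


def prune_grid(grid):
    """Drop every non-empty cell that is dominated by another non-empty cell."""
    cells = list(grid)
    return {c: grid[c] for c in cells
            if not any(cell_dominates(o, c) for o in cells)}


# Hypothetical split values, e.g. derived from the user's attribute thresholds.
boundaries = [[1.0, 5.0], [2.0, 3.0]]
points = [(0.5, 1.0), (3.0, 2.5), (6.0, 4.0), (0.2, 3.5)]
survivors = prune_grid(build_grid(points, boundaries))
```

Pruning whole cells avoids point-by-point dominance tests for data that cannot appear in the Skyline; only the points in surviving cells need to be compared individually.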
The sorting method is as follows. The partial order on attribute importance is defined by assuming a binary partial order > on the subspace F, where > denotes the "more important than" relation between attributes in F. Let f_1 and f_2 be two attributes on F, f_1, f_2 ∈ F. If f_1 is more important than f_2, their partial order is written f_1 > f_2.
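As a small illustration of the ordering, the sketch below encodes attribute importance as a numeric score and sorts the subspace attributes accordingly. The numeric encoding is an assumption made for the example (the text defines only the relation f_1 > f_2), and the attribute names are hypothetical.

```python
def order_attributes(importance):
    """Return attribute names from most to least important.

    `importance` maps attribute name -> numeric score (a hypothetical encoding
    of the relation f1 > f2: a higher score means more important)."""
    return sorted(importance, key=lambda name: importance[name], reverse=True)


def project(record, ordered_attributes):
    """Project a record (dict) onto the ordered subspace as a k-tuple."""
    return tuple(record[name] for name in ordered_attributes)


importance = {"price": 1, "effect": 3, "postoperative_impact": 2}
ordered = order_attributes(importance)
row = project({"price": 20, "effect": 0.1, "postoperative_impact": 0.3,
               "recovery_days": 14}, ordered)
```

The projection drops attributes outside the subspace (here `recovery_days`) and fixes the dimension order that later filtering steps iterate over.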
The filtering specifically consists of performing the A-filtering method and the ε-filtering method, where:
The A-filtering method proceeds as follows. The optimization function of the multi-objective decision is defined as min(f_1(x), f_2(x), ..., f_k(x)), where x ∈ P and f_i(x) is the value of data object x on the i-th attribute;
…
The formula above computes R_1, the set of medical data objects whose values on the first attribute deviate most from the user's preference, where R_0 denotes the initial data set P; from this, the set of medical data whose values on the first attribute are relatively good is obtained. Next, within that set, formula (3.2) gives the set of medical data objects with the worst values on the second attribute, which are then removed from the set. Proceeding in the same way for each attribute finally yields the set of medical data objects whose values are relatively good on all k attributes.
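A minimal sketch of this dimension-by-dimension removal is given below. It assumes the minimization convention of the optimization function (the largest value on an attribute is the worst) and removes only the worst-valued objects on each attribute in turn. The tie-handling rule is an assumption added so the candidate set never becomes empty; it is not stated in the text.

```python
def a_filter(points, k):
    """Remove, attribute by attribute, the objects with the worst value.

    points: list of k-tuples ordered by attribute importance.
    Smaller values are treated as better (min f_i convention)."""
    remaining = list(points)
    for i in range(k):
        if not remaining:
            break
        worst = max(p[i] for p in remaining)
        kept = [p for p in remaining if p[i] < worst]
        # Assumption: if every object ties on this attribute, keep them all
        # rather than emptying the candidate set.
        if kept:
            remaining = kept
    return remaining


data = [(1, 5, 2), (2, 1, 9), (9, 2, 3), (3, 9, 1), (4, 3, 4)]
filtered = a_filter(data, k=3)
```

In this example one object is removed per attribute, leaving the two objects that are never the worst on any of the three attributes.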
The ε-filtering method proceeds as follows. The value of ε_i is set according to the user's degree of preference for each medical attribute. Any user can give a maximum threshold for each attribute; if a data object's value on an attribute does not satisfy the corresponding threshold, the data object is filtered out. Each user is modeled by these thresholds, and ε_i is computed from the threshold that the j-th user sets on the i-th attribute and from max{f_i}, the maximum value that data in the data set P take on the i-th attribute. Threshold filtering thus removes the data objects whose values on the attributes the user cares about do not meet the user's requirements.
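The threshold step can be sketched as follows. Because the formula for ε_i is not reproduced in the text, the sketch assumes one plausible reading: ε_i is the user's threshold on attribute i divided by max{f_i}, and an object passes when its normalized value does not exceed ε_i. This normalization is an assumption, and the function name and sample values are hypothetical.

```python
def epsilon_filter(points, thresholds):
    """Keep objects whose normalized value on every attribute is within the
    user's tolerance.

    Assumed tolerance: eps_i = threshold_i / max{f_i} (not confirmed by the
    text). Attribute values are assumed positive, smaller being better."""
    if not points:
        return []
    maxima = [max(p[i] for p in points) for i in range(len(thresholds))]
    eps = [t / m for t, m in zip(thresholds, maxima)]
    return [p for p in points
            if all(v / m <= e for v, m, e in zip(p, maxima, eps))]


candidates = [(2.0, 4.0), (8.0, 1.0), (5.0, 5.0)]
user_thresholds = (6.0, 4.5)   # hypothetical maximum acceptable values
accepted = epsilon_filter(candidates, user_thresholds)
```

Under this reading the normalized test is equivalent to comparing each raw value with the user's threshold; the normalized form only makes tolerances comparable across attributes with different scales.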
The subspace Skyline query on the remaining data is carried out as follows. Assume a d-dimensional medical data space S = {s_1, s_2, ..., s_d}, and let P be the medical data set on S; each data point p_i ∈ P is a d-dimensional data point in S whose coordinates are the medical data on each dimension. F is a subspace of the medical data space S, that is, F ⊆ S with |F| = k and k ≤ d. For a medical data object p_i on S, its projection onto the subspace F is denoted p_i' and is a k-tuple. p_i' belongs to the subspace Skyline query result if and only if no point p_j' on the subspace F dominates p_i'. The query result is returned to the client terminal.
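The definition translates directly into a short Python sketch. It uses the standard dominance test under the smaller-is-better convention and a simple quadratic scan; the function names are illustrative and the data are made up for the example.

```python
def project(point, dims):
    """Projection p' of a d-dimensional point onto the subspace dims."""
    return tuple(point[d] for d in dims)


def dominates(a, b):
    """a dominates b: no worse on every dimension, strictly better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def subspace_skyline(points, dims):
    """Return the points whose projections are dominated by no other
    projection on the chosen subspace."""
    projected = [project(p, dims) for p in points]
    return [p for p, q in zip(points, projected)
            if not any(dominates(o, q) for o in projected)]


P = [(1, 9, 5), (2, 2, 7), (3, 3, 1), (9, 1, 2)]
on_first_two = subspace_skyline(P, dims=(0, 1))
on_full_space = subspace_skyline(P, dims=(0, 1, 2))
```

The point (3, 3, 1) is in the full-space Skyline but not in the Skyline of the subspace formed by the first two attributes, which shows why the result depends on the attributes the user selects.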
Real-time medical stream data in the O2O environment are collected as follows. The doctor uses a mobile multimedia data collector over a mobile network and, for multiple factors of the patient, periodically samples the individual patient's state with Internet-of-Things multi-factor sensing chips that operate in parallel on stream data; after feature extraction and filtering, the data are submitted to the cloud. The medical stream data and the query request are handled as stream data: using time-window monitoring, with the time window as the unit of size, they are batched and sent to the cloud as a stream over the wireless network.
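Time-window batching of this kind can be sketched as fixed-size (tumbling) windows over timestamped readings. The window type, the timestamp format and the function name are assumptions for illustration; the text states only that data are batched by time window before transmission.

```python
from collections import defaultdict


def batch_by_window(readings, window_seconds):
    """Group (timestamp, payload) readings into consecutive fixed-size
    time windows and return the batches in time order."""
    windows = defaultdict(list)
    for timestamp, payload in readings:
        windows[int(timestamp // window_seconds)].append(payload)
    return [windows[w] for w in sorted(windows)]


# Hypothetical sensor readings: (seconds since start, measurement label).
readings = [(0.5, "a"), (1.2, "b"), (4.9, "c"), (5.0, "d"), (11.0, "e")]
batches = batch_by_window(readings, window_seconds=5)
```

Each batch would then be sent to the cloud as one unit of the stream.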
The bidirectional filtering and merging are carried out as follows. On the stream-merging processor, Skyline filtering is applied to the data streams in the time windows in their order of arrival. If data within the same time window stand in a dominance relation, a filter operation is performed. Because data in the same time window enter on an equal footing, two filtering steps are used: forward Skyline filtering and reverse Skyline filtering. Forward Skyline filtering is the A-filtering method: after the user initiates a query request from the client terminal, it applies the optimization function of the multi-objective decision and finally obtains the set of medical data objects whose values are relatively good on all k attributes. Reverse Skyline filtering is ε-filtering: after forward Skyline filtering, a maximum threshold is set according to the user's degree of preference for each medical attribute, and any data object whose value on an attribute does not satisfy the corresponding threshold is filtered out.
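One way to picture the stream-merging processor is the sketch below: each arriving window is passed through a forward filter and a reverse filter, reduced to its local Skyline, and merged into a running result. The two filter functions are simplified, hypothetical stand-ins for A-filtering and ε-filtering, and the merge rule (Skyline of the union) is an assumption rather than a detail given in the text.

```python
def dominates(a, b):
    """a dominates b under the smaller-is-better convention."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points):
    """Points dominated by no other point in the list."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def merge_windows(windows, forward_filter, reverse_filter):
    """Process windows in arrival order and keep a running merged Skyline."""
    merged = []
    for window in windows:
        candidates = reverse_filter(forward_filter(window))
        merged = skyline(merged + skyline(candidates))
    return merged


def drop_worst_on_first(window):
    """Stand-in forward filter: remove the worst values on attribute 0."""
    if not window:
        return []
    worst = max(p[0] for p in window)
    return [p for p in window if p[0] < worst]


def within_threshold(window):
    """Stand-in reverse filter: hypothetical maximum of 8 on attribute 1."""
    return [p for p in window if p[1] <= 8]


windows = [[(1, 8), (4, 4), (6, 6)], [(2, 2), (9, 1), (3, 9)]]
result = merge_windows(windows, drop_worst_on_first, within_threshold)
```

Because the running result is updated after every window, a partial answer is available as soon as the first window has been processed.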
Beneficial effects: the intelligent mobile client adapts to the functional requirements and usage habits of different users, initiates query requests, and exchanges information with the multi-medical-factor subspace Skyline query system in the mobile O2O environment. Feedback data can be screened through continuous interaction in the form of data streams; accuracy is high; the system suits big-data environments; and the interaction platform offered to the user side and the data side provides accurate, satisfactory information, improving both the accuracy of the user's final decision and the speed of the decision-making experience.
Brief description of the drawings
Fig. 1 is the system model of the subspace Skyline query of the invention;
Fig. 2 is the formula of the A-filtering method of the invention;
Fig. 3 is the test data set of the invention;
Fig. 4 is the index file built with MapReduce in the invention;
Fig. 5 is the subspace Skyline query process of the invention;
Fig. 6 is the symbol definitions for the subspace Skyline query algorithm of the invention;
Fig. 7 is the subspace Skyline query algorithm of the invention.
Detailed description of the embodiments
Embodiment 1: A Skyline double-filtering retrieval method based on multiple medical factors in a mobile O2O environment, comprising the following steps:
S1. The doctor terminal collects real-time medical stream data in the O2O environment, and the client terminal initiates a query request that contains several related medical attributes, the degree of preference for each medical attribute, and an input threshold for each attribute;
S2. Based on the medical stream data and the query request, the cloud extracts in parallel from the full space the k-dimensional (k ≤ d) subspace of the medical attributes named in the query request, where d is the dimensionality of the full space and k is the dimensionality of the subspace data; that is, the k fields (dimensions) the user is interested in are extracted from the full space. The k-dimensional subspace is sorted by the client's degree of preference for each medical attribute to obtain an ordered k-dimensional grid index. Preferably, the sorting method is as follows: the partial order on attribute importance is defined by assuming a binary partial order > on the subspace F, where > denotes the "more important than" relation between attributes in F. Let f_1 and f_2 be two attributes on F, f_1, f_2 ∈ F. If f_1 is more important than f_2, their partial order is written f_1 > f_2.
S3. The data are scanned into a k-dimensional grid that partitions the user-defined space unequally and, according to the threshold of each attribute, pruned using the grid dominance relation on a single dimension: grid cells on that dimension that are dominated in the Skyline sense are cut away together with the data they contain, which filters the data. The filtering specifically consists of performing the A-filtering method and the ε-filtering method, where:
The A-filtering method proceeds as follows. The optimization function of the multi-objective decision is defined as min(f_1(x), f_2(x), ..., f_k(x)), where x ∈ P and f_i(x) is the value of data object x on the i-th attribute;
…
The formula above computes R_1, the set of medical data objects whose values on the first attribute deviate most from the user's preference, where R_0 denotes the initial data set P; from this, the set of medical data whose values on the first attribute are relatively good is obtained. Next, within that set, formula (3.2) gives the set of medical data objects with the worst values on the second attribute, which are then removed from the set. Proceeding in the same way for each attribute finally yields the set of medical data objects whose values are relatively good on all k attributes.
The ε-filtering method proceeds as follows. The value of ε_i is set according to the user's degree of preference for each medical attribute. Any user can give a maximum threshold for each attribute; if a data object's value on an attribute does not satisfy the corresponding threshold, the data object is filtered out. Each user is modeled by these thresholds, and ε_i is computed from the threshold that the j-th user sets on the i-th attribute and from max{f_i}, the maximum value that data in the data set P take on the i-th attribute. Threshold filtering thus removes the data objects whose values on the attributes the user cares about do not meet the user's requirements.
S4. A subspace Skyline query is run on the remaining data to obtain the Skyline result for the medical attributes the user requested; during this query, the parallel stream data are bidirectionally filtered and merged. The subspace Skyline query on the remaining data is carried out as follows. Assume a d-dimensional medical data space S = {s_1, s_2, ..., s_d}, and let P be the medical data set on S; each data point p_i ∈ P is a d-dimensional data point in S whose coordinates are the medical data on each dimension. F is a subspace of the medical data space S, that is, F ⊆ S with |F| = k and k ≤ d. For a medical data object p_i on S, its projection onto the subspace F is denoted p_i' and is a k-tuple. p_i' belongs to the subspace Skyline query result if and only if no point p_j' on the subspace F dominates p_i'. The query result is returned to the client terminal.
In this embodiment, real-time medical stream data in the O2O environment are collected in step S1 as follows. The doctor uses a mobile multimedia data collector over a mobile (3G/4G) network and, for multiple factors of the patient (such as ECG monitoring traces, blood pressure, body temperature and pulse rate), periodically samples the individual patient's state with Internet-of-Things multi-factor sensing chips that operate in parallel on stream data; after feature extraction and filtering, the data are submitted to the cloud. The medical stream data and the query request are handled as stream data: using time-window monitoring, with the time window as the unit of size, they are batched and sent to the cloud as a stream over the wireless network.
The bidirectional filtering and merging in step S4 are carried out as follows. On the stream-merging processor, Skyline filtering is applied to the data streams in the time windows in their order of arrival. If data within the same time window stand in a dominance relation, a filter operation is performed. Because data in the same time window enter on an equal footing, two filtering steps are used: forward Skyline filtering and reverse Skyline filtering. Forward Skyline filtering is the A-filtering method: after the user initiates a query request from the client terminal, it applies the optimization function of the multi-objective decision and finally obtains the set of medical data objects whose values are relatively good on all k attributes. Reverse Skyline filtering is ε-filtering: after forward Skyline filtering, a maximum threshold is set according to the user's degree of preference for each medical attribute, and any data object whose value on an attribute does not satisfy the corresponding threshold is filtered out.
In a practical application of this embodiment, consider a doctor choosing a surgical plan. The factors to consider include the price of the operation, the effect of the operation, the postoperative impact, the speed of recovery, and so on. Suppose that, according to the user's requirements, the three attributes of most concern are the effect of the operation, the price of the operation and the postoperative impact. Because the patient has a certain economic capacity, sorting by attribute importance adjusts the order of these subspace attributes to {effect of the operation, postoperative impact, price of the operation}, as shown in Fig. 5(b). For the user, the data objects p_5 and p_6 in data set P do not have a good curative effect, so by the filtering principle of the A-filtering method they should not be returned to the user in the result set. They are therefore marked red in Fig. 5(c) and deleted before any dominance comparison is made. Likewise, the data objects p_4 and p_8 are marked orange in Fig. 5(c) and deleted because their postoperative impact deviates from the user's preferences. The process continues until every subspace attribute has been considered. In each pass, the A-filtering method can only remove the data objects whose attribute values are worst, that is, those that deviate most from the user's preferences. The data are then filtered again with the other, tolerance-based method, ε-filtering. Suppose that, after consulting the patient, the doctor asks for surgical plans with a success rate above 80% and with prognostic function recovering to above 90%. These thresholds can be used to filter out the data objects whose values on the attributes of concern (surgical effect, postoperative impact, operation price) fail the user's hard requirements. After filtering, a subspace Skyline query is run on the remaining data objects, as shown in Fig. 5(d). The final result set p_1, p_2 is returned to the user.
This embodiment further relates to a Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment, comprising a doctor terminal, a client terminal and a cloud. The doctor terminal collects real-time medical stream data in the O2O environment. The client terminal initiates a query request that contains several related medical attributes, the degree of preference for each medical attribute, and an input threshold for each attribute. Based on the medical stream data and the query request, the cloud extracts in parallel from the full space the k-dimensional (k ≤ d) subspace of the medical attributes named in the query request, where d is the dimensionality of the full space and k is the dimensionality of the subspace data, and sorts the k-dimensional subspace by the client's degree of preference for each medical attribute to obtain an ordered k-dimensional grid index. The cloud scans the data into a k-dimensional grid that partitions the user-defined space unequally and, according to the threshold of each attribute, prunes using the grid dominance relation on a single dimension: grid cells on that dimension that are dominated in the Skyline sense are cut away together with the data they contain, which filters the data. The cloud then runs a subspace Skyline query on the remaining data to obtain the Skyline result for the medical attributes the user requested; during this query, the parallel stream data are bidirectionally filtered and merged.
Embodiment 2: A Skyline double-filtering retrieval method based on multiple medical factors in a mobile O2O environment. It addresses the high cost of processing massive multi-medical-factor data with subspace Skyline algorithms in a mobile O2O environment and the very large result sets that follow. To meet each user's specific needs, representative points are selected according to the different user demands. The method consists mainly of three filtering methods, A-filtering, ε-filtering and D-filtering, and is executed in the following steps:
S1. The mobile O2O environment and the multi-medical-factor subspace Skyline query system are set up on mobile terminals; the mobile terminals collect medical stream data in real time, and the three filtering methods A-filtering, ε-filtering and D-filtering are performed;
S2. The intelligent mobile client adapts to the functional requirements and usage habits of the medical users of different terminals and initiates query requests; the users of the terminals are mainly doctors and patients;
S3. The doctor uses a mobile multimedia data collector over a 3G/4G network to collect multiple factors of the patient, periodically sampling the individual patient's state as stream data with Internet-of-Things multi-factor sensing chips that can operate in parallel;
S4. Using time-window monitoring, the periodically collected data are merged during collection and transmitted as a stream over the wireless network;
S5. Data are collected in the mobile O2O environment thus built, undergo feature extraction and filtering, and are submitted to the cloud terminal or cloud data center, where the multi-medical-factor subspace Skyline query system performs distributed data analysis; the mobile side and the cloud exchange information as data streams, with bidirectional filtering and pruning optimization.
As a supplement to the technical solution, the multi-medical-factor subspace Skyline query system in the mobile O2O environment is a mobile cloud computing model built on a mobile-terminal/cloud-server architecture. The client is an application running on terminal devices such as mobile phones, tablet computers and wearable medical Internet-of-Things collection devices; these terminals communicate with the server over the Internet or a mobile 3G/4G network, sending query requests and receiving query results. To determine the multiple Skyline factors, the end user of the medical system selects, on the terminal, a degree of preference for several related medical attributes and enters a threshold for each attribute. The query request is then sent to the cloud server as stream data, batched with the time window as the unit of size. Notably, unlike a conventional C/S architecture, data are transmitted as streams sized by the time window, and the cloud server likewise receives and verifies data as a stream. When the data at either end exceed the time-window size, they wait. The data of each time window are distributed simultaneously to different compute nodes for parallel processing, which greatly improves the parallel execution efficiency of the Skyline bidirectional filtering algorithm.
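The per-window parallelism can be pictured with the following sketch, in which a local thread pool stands in for the distributed compute nodes. This substitution is an assumption made to keep the example self-contained; the worker count and function names are likewise illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def dominates(a, b):
    """a dominates b under the smaller-is-better convention."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def local_skyline(window):
    """Skyline of a single time window, computed independently of others."""
    return [p for p in window if not any(dominates(q, p) for q in window)]


def process_windows_in_parallel(windows, max_workers=4):
    """Hand each window to a worker; results come back in window order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(local_skyline, windows))


windows = [[(1, 2), (2, 3)], [(5, 5), (4, 6), (6, 4)]]
partial_results = process_windows_in_parallel(windows)
```

Each window is processed without reference to the others, so the work divides cleanly across workers; the partial Skylines still need a final merge step to remove points dominated across windows.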
The cloud-side model of the system is a lightweight-client architecture in a cloud environment, improved on the basis of Spark. The distributed Skyline computation is carried out mainly on the back-end cloud platform, which comprises two modules: a preprocessing module and a query module. In the preprocessing module, according to the time window size, the system scans the data in parallel into a multi-dimensional grid with user-defined, unequal partitioning of space. According to the threshold of each factor, pruning is performed using the grid dominance relation on one dimension: the grids of that dimension that are Skyline-dominated, together with the data in them, are cut away. The difference from traditional skyline processing is that, in the multi-dimensional grid mode, computation and processing are parallelized across the different dimensions of the multiple factors. The server receives the user's query request and performs three filter methods: A-filtering, ε-filtering and D-filtering. The first two extract, in parallel, the k-dimensional (k≤d) subspace of interest to the user from the full space, where d is the dimensionality of the full space, and sort its attributes according to the medical terminal's degree of preference for each factor to obtain an ordered k-dimensional grid index. The data are then scanned in a distributed manner and the dominance relations between data points are compared in a streaming manner to obtain the subspace Skyline result, which is returned to the terminal as a stream and bidirectionally filtered and merged using the third filter method.
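The grid pruning of the preprocessing module can be sketched as follows. The text does not give the exact dominance test between grids, so the sketch assumes that smaller attribute values are better and that one cell dominates another when its index is strictly lower on every dimension; cell_of, cell_dominates, prune_grid and the bounds layout are hypothetical names chosen for illustration.

```python
from bisect import bisect_right
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]
Cell = Tuple[int, ...]


def cell_of(p: Point, bounds: Sequence[Sequence[float]]) -> Cell:
    """Map a point to its grid cell; bounds[i] holds the sorted, possibly
    unequally spaced split values chosen for dimension i."""
    return tuple(bisect_right(b, v) for b, v in zip(bounds, p))


def cell_dominates(a: Cell, b: Cell) -> bool:
    """Assumed rule: cell a dominates cell b when a's index is strictly lower
    on every dimension, so every point in a beats every point in b."""
    return all(x < y for x, y in zip(a, b))


def prune_grid(points: List[Point], bounds: Sequence[Sequence[float]]) -> Dict[Cell, List[Point]]:
    """Bucket points into cells, then drop cells dominated by an occupied cell."""
    grid: Dict[Cell, List[Point]] = {}
    for p in points:
        grid.setdefault(cell_of(p, bounds), []).append(p)
    occupied = list(grid)
    return {c: pts for c, pts in grid.items()
            if not any(cell_dominates(o, c) for o in occupied)}
```

Cell-level pruning is deliberately conservative: a point may survive here and still be removed later by the point-level dominance comparison.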
Because distributed stream processing based on time windows is used, with parallel processing on multiple compute nodes, some virtualized compute nodes may return results slowly, or even crash, when resources are short or the computational load is heavy. In that case the computing task can be abandoned and handed to other nodes for processing. Whenever a distributed compute node completes, its result is pushed as a stream to the stream data collector, where the results are merged and optimized. The resulting user experience is that results appear quickly but are initially incomplete; as the stream of Skyline filter results accumulates, the final result becomes more and more accurate. Depending on the user's requirements, the user can make a decision once results of a certain precision have accumulated, without waiting for all results to be collected, which greatly improves the interactive experience.
As a supplement to the technical scheme, the subspace Skyline query is defined as follows. Assume a d-dimensional medical data space S = {s1, s2, ..., sd}, and let P be a medical data set on S, i.e. each data point pi ∈ P is a d-dimensional data point in S whose value on each dimension is one item of medical data, such as the heart-rhythm condition in heart disease. F is a subspace of the medical data space S, i.e. F ⊆ S, with |F| = k and k ≤ d. For a medical data object pi on S, its projection onto the subspace F is denoted pi′, a k-tuple. pi′ is a subspace Skyline result if and only if there is no point pj′ on the subspace F that dominates pi′.
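The definition above can be made concrete with a small sketch. It assumes that smaller values are better on every attribute, in line with the min(...) objective used for A-filtering, and uses a simple quadratic comparison rather than the distributed grid index; the function names are illustrative only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def project(p: Point, dims: Sequence[int]) -> Point:
    """Projection of a d-dimensional point onto the subspace given by dims."""
    return tuple(p[i] for i in dims)


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: no worse on every dimension, strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def subspace_skyline(points: List[Point], dims: Sequence[int]) -> List[Point]:
    """Return the points whose projection is not dominated by any other projection."""
    proj = [project(p, dims) for p in points]
    return [p for p, q in zip(points, proj)
            if not any(dominates(o, q) for o in proj)]
```

For example, with the points (1, 9, 5), (2, 2, 7), (3, 3, 1) and (4, 4, 4) and the subspace formed by the first two dimensions, the last two points are dominated by (2, 2, 7) and are dropped.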
As a supplement to the technical scheme, the partial order of attribute importance is defined as follows. Assume there is a binary partial order relation > on the subspace F, representing the "more important than" relation between attributes in F. Let f1, f2 ∈ F be two attributes; if f1 is more important than f2, their partial order is written f1 > f2. This yields an ordered k-dimensional subspace {f1, f2, ..., fk}. On the basis of the two definitions above, three filter methods are proposed, A-filtering, ε-filtering and D-filtering, to reduce the size of the result set returned by the subspace Skyline query.
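As an illustration of the ordered subspace, the following sketch turns a user's preference list into column indices of the full space. The attribute names and the helper order_subspace are hypothetical; the sketch also assumes the preference list is already a total order, which is the simplest case of the partial order defined above.

```python
from typing import List, Sequence, Tuple


def order_subspace(all_attrs: Sequence[str],
                   preference: Sequence[str]) -> Tuple[List[str], List[int]]:
    """Return the preferred attributes, most important first, together with
    their column indices in the full d-dimensional space."""
    unknown = [a for a in preference if a not in all_attrs]
    if unknown:
        raise ValueError(f"attributes not in the full space: {unknown}")
    dims = [list(all_attrs).index(a) for a in preference]
    return list(preference), dims
```

The returned index list can be passed directly to a filtering routine that processes attributes in order of importance.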
The processing steps of the A-filtering method are as follows. The optimization function of the multi-objective decision can be defined as min(f1(x), f2(x), ..., fk(x)), where x ∈ P and fi(x) is the value of data object x on the i-th attribute. The formulas are shown in Fig. 1:
Formula 1.1 computes the set R1 of data objects whose values on the first attribute deviate most from the medical user's preference, where R0 denotes the initial data set P. Removing R1 from R0 gives the set of data objects whose values on the first attribute are relatively good. Next, within that set, Formula 1.2 computes the set of data objects whose values on the second attribute are worst, and these are then removed from it. Proceeding in the same way finally yields the set of medical data objects whose values are relatively good on all k attributes.
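The A-filtering steps can be sketched as below. Because the formulas themselves are not reproduced in the text, this is only one reading of the prose: attributes are processed in order of importance, smaller values are assumed to be better, and on each attribute the objects holding the worst remaining value are removed. The function name a_filtering is hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def a_filtering(points: List[Point], dims: Sequence[int]) -> List[Point]:
    """For each attribute in preference order, drop the objects whose value
    on that attribute is the worst (largest) among the remaining objects."""
    remaining = list(points)
    for d in dims:
        values = [p[d] for p in remaining]
        if not values or max(values) == min(values):
            # Nothing to remove when the remaining objects all tie on this attribute.
            continue
        worst = max(values)
        remaining = [p for p in remaining if p[d] < worst]
    return remaining
```

With the points (1, 5), (2, 9), (9, 1) and (3, 3) and both attributes considered in order, the first pass removes (9, 1) and the second removes (2, 9).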
The processing steps of the ε-filtering method are as follows. ε-filtering builds on A-filtering by providing a corresponding threshold for each attribute. In Formula 1.4, εi (0 ≤ εi ≤ 1) is the tolerance threshold provided for the i-th attribute, and its value is set according to the medical user's preference. For example, for the internal-medicine treatment of a hypertensive patient who also has renal dysfunction, the purpose of treatment is to control hypertension effectively while reducing damage to renal function; the user therefore sends an order to the system that restricts the returned results to those that control hypertension and reduce renal damage. The thresholds are usually set by the medical user in advance and sent to the server together with the query request when the query is executed. In this way more of the data objects whose attribute values deviate from the user's preference are filtered out each time, and the final result set is correspondingly smaller.
As a supplement to the technical scheme, the above formulas are applied to the data set in Table 1. Assume that the attributes of the subspace F, sorted by importance according to user preference, are {Mileage, Price, OccupancyRate}, and that ε1, ε2, ε3 are the thresholds of the respective attributes. When ε1 = 1, R1 = {p5, p6} is obtained; under the second threshold setting, R1 = {p5, p6, p1, p9}.
Further, the subspace of interest to the user is extracted from the full-space index according to the medical user's preference. For example, when treating a hypertensive patient, the factors considered are three conditions: drug efficacy, drug side effects and patient complications. As required by the doctor, these three attributes are the ones of concern. If the user's demand for drug efficacy is relatively high, then after the system sorts by attribute importance, the order of the subspace attributes is adjusted to {drug efficacy, patient complications, drug side effects}. For a user who gives priority to drug efficacy, the data objects p5 and p6 in data set P are far from satisfying the condition, so by the filtering principle of A-filtering these two objects should not be returned to the user in the result set. p5 and p6 are therefore marked red and deleted before the dominance comparison. Similarly, data objects p4 and p8 deviate from the user's preference on the patient-complications attribute; they are marked orange and likewise deleted. The process continues in this way until all attributes of the subspace have been taken into account.
Further, each pass of A-filtering can only filter out the data objects whose attribute values are worst, i.e. deviate most from the user's preference. The data are therefore screened further with the tolerance-based ε-filtering method, whose specific steps are as follows. εi was introduced in Formula 1.4 above, and its value is set according to the user's preference. Any user can provide a maximum threshold for each attribute; if the value of a data object on an attribute does not satisfy the corresponding threshold, the data object is filtered out. Any user can be modeled by his or her per-attribute thresholds, and εi is calculated by the following formula:
where the threshold term denotes the threshold of the j-th user for the i-th attribute, and max{fi} denotes the maximum value on the i-th attribute over the data in data set P. These thresholds are used to filter out the data objects whose values on the attributes the user cares about do not satisfy the user's hard requirements.
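The ε-filtering step can be sketched under explicit assumptions, since the formula for εi is not reproduced in the text. The sketch assumes that εi is the user's threshold divided by max{fi}, capped at 1, that smaller values are better, and that an object is removed when its value on attribute i reaches εi · max{fi}. Under this reading ε = 1 removes only the worst objects, which matches the behaviour described for A-filtering. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def epsilon_from_thresholds(points: List[Point], dims: Sequence[int],
                            thresholds: Sequence[float]) -> List[float]:
    """Assumed normalisation: eps_i = threshold_i / max{f_i}, capped at 1."""
    eps = []
    for d, t in zip(dims, thresholds):
        m = max(p[d] for p in points)
        eps.append(min(1.0, t / m) if m > 0 else 1.0)
    return eps


def epsilon_filtering(points: List[Point], dims: Sequence[int],
                      eps: Sequence[float]) -> List[Point]:
    """Drop every object whose value on attribute i reaches eps_i * max{f_i},
    where the maximum is taken over the whole data set."""
    maxima = [max(p[d] for p in points) for d in dims]
    remaining = list(points)
    for d, e, m in zip(dims, eps, maxima):
        remaining = [p for p in remaining if p[d] < e * m]
    return remaining
```

Lowering εi for an attribute tightens the tolerance on it, so more objects are removed and the final result set shrinks.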
Further, a subspace Skyline query is performed on the data objects remaining after filtering, and the final result set {p1, p2} is returned to the user.
During Skyline query processing in the cloud, parallel stream data processing is used, so the result streams of fast compute nodes arrive first and those of slow compute nodes arrive later. The D-filtering process is therefore carried out on the stream merging processor: Skyline filtering is applied to the data streams that arrive successively within a time window, and if data in the same time window are in a dominance relation, a filter operation is performed. Because data within the same time window enter on an equal footing, two filtering steps are used, forward skyline filtering and reverse skyline filtering; this is why the method is called Double (bidirectional) filtering, i.e. D-filtering. It is in effect a further filtering and optimization of the streamed results, producing a smaller result set, speeding up retrieval and improving the user experience.
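The merging step can be illustrated with a sketch that takes one possible reading of the two passes: in the forward pass, newly arrived points dominated by the current results are discarded, and in the reverse pass, current results dominated by the surviving new points are discarded. This reading, the assumption that smaller values are better, and the names d_filter_merge and progressive_skyline are illustrative and are not taken from the patent.

```python
from typing import Iterable, Iterator, List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: no worse on every dimension, strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def d_filter_merge(current: List[Point], incoming: List[Point]) -> List[Point]:
    """Merge one arriving partial result into the current result set."""
    # Forward pass: drop arriving points dominated by existing results.
    survivors = [q for q in incoming if not any(dominates(p, q) for p in current)]
    # Points arriving in the same window are also compared with each other.
    survivors = [q for q in survivors if not any(dominates(o, q) for o in survivors)]
    # Reverse pass: drop existing results dominated by the surviving arrivals.
    kept = [p for p in current if not any(dominates(q, p) for q in survivors)]
    return kept + survivors


def progressive_skyline(batches: Iterable[List[Point]]) -> Iterator[List[Point]]:
    """Yield the merged result after each partial result arrives."""
    result: List[Point] = []
    for batch in batches:
        result = d_filter_merge(result, batch)
        yield list(result)
```

Each yielded snapshot is a usable intermediate answer, which corresponds to the behaviour described earlier in which results appear quickly and become more accurate as further partial results arrive.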
The above is only a preferred specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical scheme and inventive concept, shall fall within the scope of protection of the present invention.
Claims (6)
1. A Skyline double-filtering retrieval system based on multiple medical factors under a mobile O2O environment, characterized in that: it comprises a doctor terminal, a client terminal and a cloud; the doctor terminal is used to collect real-time medical stream data under the O2O environment, and the client terminal is used to initiate a query request, the query request comprising multiple related medical attributes, the preference for the medical attributes and the entered threshold of each attribute; based on the medical stream data and the query request, the cloud extracts in parallel from the full space the k-dimensional (k≤d) subspace of the relevant medical attributes in the query request, where d is the dimensionality of the full space and k is the dimensionality of the subspace, and sorts the k-dimensional subspace according to the client's preference for each medical attribute to obtain an ordered k-dimensional grid index; the cloud scans the data into a k-dimensional grid with user-defined, unequal partitioning of space and, according to the threshold of each attribute, prunes using the grid dominance relation on one dimension, cutting away the grids of that dimension that are Skyline-dominated together with the data in them, thereby filtering the data; the cloud performs a subspace Skyline query on the remaining data to obtain the Skyline result for the medical attributes requested by the user, and during the subspace Skyline query, bidirectional filtering and merging are performed on the parallel stream data.
2. The Skyline double-filtering retrieval system based on multiple medical factors under a mobile O2O environment as claimed in claim 1, characterized in that the method of sorting is: the partial order of attribute importance is defined by assuming a binary partial order relation > on the subspace F, where > represents the "more important than" relation between attributes in F; f1 and f2 are two attributes on F, f1, f2 ∈ F, and if f1 is more important than f2, their partial order is expressed as f1 > f2.
3. The Skyline double-filtering retrieval system based on multiple medical factors under a mobile O2O environment as claimed in claim 1, characterized in that the filtering specifically performs the A-filtering method and the ε-filtering method, wherein:
The steps of the A-filtering method are as follows: the optimization function of the multi-objective decision is defined as min(f1(x), f2(x), ..., fk(x)), where x ∈ P and fi(x) is the value of data object x on the i-th attribute;
…
The above formula computes the set R1 of medical data objects whose values on the first attribute deviate most from the user's preference, where R0 denotes the initial data set P; removing R1 from R0 gives the set of medical data whose values on the first attribute are relatively good; next, within that set, formula (3.2) gives the set of medical data objects whose values on the second attribute are worst, which are then removed from it; proceeding in the same way finally yields the set of medical data objects whose values are relatively good on all k attributes;
The steps of the ε-filtering method are as follows: the value of εi is set according to the user's preference for each medical attribute; any user can provide a maximum threshold for each attribute, and if the value of a data object on an attribute does not satisfy the corresponding threshold, the data object is filtered out; any user is modeled by his or her per-attribute thresholds, and εi is calculated by the following formula:
where the threshold term denotes the threshold of the j-th user for the i-th attribute and max{fi} denotes the maximum value on the i-th attribute over the data in data set P; the thresholds are used to filter out the data objects whose values on the attributes the user cares about do not satisfy the user's requirements.
4. The Skyline double-filtering retrieval system based on multiple medical factors under a mobile O2O environment as claimed in claim 1, characterized in that the method of performing the subspace Skyline query on the remaining data is: assume a d-dimensional medical data space S = {s1, s2, ..., sd}; P is a medical data set on the data space S, and each data point pi ∈ P is a d-dimensional data point in S whose value on each dimension is one item of medical data; F is a subspace of the medical data space S, i.e. F ⊆ S, with |F| = k and k ≤ d; for a medical data object pi on the data space S, its projection onto the subspace F is denoted p′i, a k-tuple; p′i is a subspace Skyline query result if and only if there is no point p′j on the subspace F that dominates p′i, and the query result is returned to the client terminal.
5. The Skyline double-filtering retrieval system based on multiple medical factors under a mobile O2O environment as claimed in claim 1, characterized in that the specific method of real-time medical stream data collection under the O2O environment is: the doctor uses a mobile-network-based mobile multimedia data collector which, for the multiple factors of the patient, uses parallel, stream-data-based Internet-of-Things multi-factor sensing chips to periodically collect the state of the individual patient at the same time, and the data after feature extraction and filtering are submitted to the cloud; the medical stream data and the query request are in stream form and, by means of time-window monitoring with the time window as the batch size, are collected in batches and transmitted to the cloud over the wireless network in a streaming manner.
6. The Skyline double-filtering retrieval system based on multiple medical factors under a mobile O2O environment as claimed in claim 1, characterized in that the method of bidirectional filtering and merging is: on the stream merging processor, skyline filtering is applied to the data streams that arrive successively within a time window, and if data in the same time window are in a dominance relation, a filter operation is performed; because data within the same time window enter on an equal footing, two filtering steps are used, forward skyline filtering and reverse skyline filtering; the forward skyline filtering is the A-filtering method which, after the user initiates a query request with the client terminal, uses the optimization function of the multi-objective decision to finally obtain the set of medical data objects whose values are relatively good on all k attributes; the reverse skyline filtering is ε-filtering which, after the forward skyline filtering, sets a maximum threshold according to the user's preference for each medical attribute, and if the value of a data object on an attribute does not satisfy the corresponding threshold, the data object is filtered out.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611150486.2A CN106777091A (en) | 2016-12-14 | 2016-12-14 | The double filtering searching systems of the Skyline based on many medical factors under mobile O2O environment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611150486.2A CN106777091A (en) | 2016-12-14 | 2016-12-14 | The double filtering searching systems of the Skyline based on many medical factors under mobile O2O environment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106777091A true CN106777091A (en) | 2017-05-31 |
Family
ID=58876930
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611150486.2A Pending CN106777091A (en) | 2016-12-14 | 2016-12-14 | The double filtering searching systems of the Skyline based on many medical factors under mobile O2O environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106777091A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103473320A (en) * | 2013-09-12 | 2013-12-25 | 南京大学 | Method for service combination facing cloud-spanning platform |
CN104809210A (en) * | 2015-04-28 | 2015-07-29 | 东南大学 | Top-k query method based on massive data weighing under distributed computing framework |
CN105183921A (en) * | 2015-10-23 | 2015-12-23 | 大连大学 | Shop addressing system based on bi-chromatic reverse nearest neighbor inquiry under mobile cloud computing environment |
Non-Patent Citations (2)
Title |
---|
YUANYUAN LI et al.: "Efficient Subspace Skyline Query based on User Preference using MapReduce", 《AD HOC NETWORKS》 * |
李媛媛 et al.: "基于时间序列的Global Skyline并行算法" (Parallel Global Skyline algorithm based on time series), 《***工程与电子技术》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109446294A (en) * | 2018-11-13 | 2019-03-08 | 嘉兴学院 | A kind of parallel mutual subspace Skyline querying method |
CN109446294B (en) * | 2018-11-13 | 2021-09-07 | 嘉兴学院 | Parallel mutual subspace Skyline query method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20170531 |