CN106777091A - Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment - Google Patents

Info

Publication number
CN106777091A
Authority
CN
China
Prior art keywords
data
medical
attribute
skyline
filtering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611150486.2A
Other languages
Chinese (zh)
Inventor
季长清
张敏
宋晗
李媛媛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University
Original Assignee
Dalian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University
Priority to CN201611150486.2A
Publication of CN106777091A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/953 - Querying, e.g. by the use of web search engines
    • G06F16/9535 - Search customisation based on user profiles and personalisation
    • G06F19/324

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment, intended to solve the problem that result sets returned over existing mobile networks are too large. The technical essentials are as follows. The system includes a doctor terminal, a client terminal and a cloud. The doctor terminal collects real-time medical stream data in the O2O environment, and the client terminal initiates a query request that contains several related medical attributes, the degree of preference for each medical attribute, and an input threshold for each attribute. Based on the medical stream data and the query request, the cloud extracts in parallel, from the full space, the subspace of the medical attributes named in the query request. The effects are as follows. Feedback data can be screened through continuous interaction in the form of data streams; accuracy is high; the system suits big-data environments; the interaction platform offered to the user and the data side provides accurate and satisfactory information; and both the accuracy of the user's final decision and the speed of the decision-making experience improve.

Description

Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment
Technical field
The invention belongs to the field of database research. It is a Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment, and it relates to large-scale data analysis, massive data processing in a mobile O2O environment, and intelligent medical data processing and application development.
Background technology
With the rapid development of the Internet, mobile O2O has emerged: offline business opportunities are combined with the Internet so that the Internet becomes the front end of transactions. As living standards improve, people pay more and more attention to health, and medical retrieval is increasingly common. The explosive growth of data means that traditional single-machine data analysis and processing can no longer meet the demands of today's data-intensive analysis. The subspace Skyline algorithm, an important variant of the Skyline query, is widely used in fields such as multi-factor decision-making. With the development of the mobile Internet, subspace Skyline queries in mobile Internet environments have attracted much attention. A subspace Skyline algorithm runs the Skyline query on the set of attributes the user is interested in, and obtaining the Skyline result of an arbitrary subspace efficiently remains a challenging task.
In medical research, several medical factors often act together in a single typical case. For a patient with a tumor, for example, choosing between removing the lesion and conservative treatment usually requires a comprehensive assessment of many factors and their joint effect, such as how well the patient's age allows surgery to be tolerated, the likelihood that the tumor cells will spread, and the patient's own preferences. The historical case data and the various observed medical data involved in these factors are also very large. This calls for an effective filtering and retrieval method for large-scale data based on multiple medical factors.
Skyline computation on massive medical data is expensive, and performing Skyline processing in a distributed environment is a good way to address this. The invention was designed and implemented to solve the problem that the medical retrieval result set returned in a mobile O2O environment is too large, that is, Skyline points that deviate from the user's preferences are returned as results and affect the user's final decision.
Summary of the invention
In view of the defects and deficiencies described in the background above, the invention provides a Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment. It addresses the problem that result sets under existing mobile networks are too large, that is, Skyline points that deviate from the user's preferences are returned as results and affect the user's final decision. The invention also improves on the shortcomings of existing subspace Skyline query methods in order to increase accuracy and real-time performance.
To achieve these objectives, the technical solution adopted by the invention is a Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment, including a doctor terminal, a client terminal and a cloud. The doctor terminal collects real-time medical stream data in the O2O environment. The client terminal initiates a query request that contains several related medical attributes, the degree of preference for each medical attribute, and an input threshold for each attribute. Based on the medical stream data and the query request, the cloud extracts in parallel, from the full space, the k-dimensional (k ≤ d) subspace of the medical attributes named in the query request, where d is the dimension of the full space and k is the dimension of the subspace, and sorts the k-dimensional subspace by the client's degree of preference for each medical attribute to obtain an ordered k-dimensional grid index. The cloud scans the data into a k-dimensional grid produced by a custom, unequal partition of the space and, according to the threshold of each attribute, prunes using the grid dominance relation in one dimension: grid cells of that dimension that are Skyline-dominated, together with the data in them, are cut away, which filters the data. The cloud then runs a subspace Skyline query on the remaining data to obtain the Skyline result for the medical attributes the user requested; during this query, the parallel stream data are bidirectionally filtered and merged.
The sorting method is as follows. The importance of attributes is defined by a partial order: assume a binary partial order relation > on the subspace F, where > means "more important than" among the attributes in F. If f1 and f2 are two attributes in F (f1, f2 ∈ F) and f1 is more important than f2, their partial order is written f1 > f2.
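The ordering of subspace attributes by importance can be illustrated with a minimal Python sketch. The integer importance scores and the function name `order_attributes` are assumptions made for illustration; the source only states that a relation f1 > f2 exists between attributes.

```python
from typing import Dict, List


def order_attributes(preference: Dict[str, int]) -> List[str]:
    """Return attribute names sorted from most to least important.

    `preference` maps each subspace attribute to a hypothetical integer
    importance score; a larger score for f1 than for f2 stands for f1 > f2.
    """
    # sorted() is stable, so equally important attributes keep input order.
    return sorted(preference, key=lambda name: -preference[name])


# Hypothetical preferences for three medical attributes.
prefs = {"price": 1, "effect": 3, "postoperative_impact": 2}
ordered = order_attributes(prefs)
```

The resulting list gives the dimension order that a k-dimensional grid index could follow.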
The filtering specifically consists of executing the A-filtering method and the ε-filtering method, wherein:
The steps of the A-filtering method are as follows. The optimization function of the multi-objective decision is defined as min(f1(x), f2(x), ..., fk(x)), where x ∈ P and fi(x) is the value of data object x on the i-th attribute.
The formula of the A-filtering method (Fig. 2) is used to compute R1, the set of medical data objects whose values on the first attribute deviate most from the user's preference, with R0 denoting the initial data set P. Removing R1 yields the set of medical data objects whose values on the first attribute are relatively good. Next, within that set, formula (3.2) gives the set of medical data objects with the worst values on the second attribute, which is then removed from the set. Proceeding in the same way finally yields the set of medical data objects whose values are relatively good on all k attributes.
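A minimal Python sketch of one possible reading of A-filtering follows. It assumes that every attribute is minimized, so that "worst" means the largest value, and that attributes are visited in preference order. The tie guard that skips an attribute on which all objects are equal is an added assumption, not something the source states.

```python
from typing import List, Sequence

Point = Sequence[float]


def a_filter(points: List[Point]) -> List[Point]:
    """Drop, attribute by attribute, the objects with the worst value.

    Assumption: all attributes are minimized, so the worst value on an
    attribute is the maximum among the objects still remaining.
    """
    remaining = list(points)
    if not remaining:
        return remaining
    k = len(remaining[0])
    for i in range(k):
        if len(remaining) <= 1:
            break
        worst = max(p[i] for p in remaining)
        best = min(p[i] for p in remaining)
        if worst == best:
            continue  # assumed guard: nothing deviates on this attribute
        remaining = [p for p in remaining if p[i] != worst]
    return remaining


# Hypothetical two-attribute data set.
candidates = [(1, 5), (2, 1), (9, 2), (3, 9)]
kept = a_filter(candidates)
```

Here the first pass removes (9, 2) and the second pass removes (3, 9), leaving the objects that are relatively good on both attributes.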
The steps of the ε-filtering method are as follows. The value of εi is set according to the user's degree of preference for each medical attribute. Any user can give a maximum threshold for each attribute; if the value of a data object on an attribute does not satisfy the corresponding threshold, the data object is filtered out. Each user is modeled in terms of these thresholds, and εi is computed from the threshold of the j-th user for the i-th attribute and from max{fi}, the maximum value on the i-th attribute over the data set P. The thresholds are used to filter out data objects whose values on the attributes the user cares about do not meet the user's requirements.
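The threshold test in ε-filtering can be sketched as below. This is a simplified illustration that compares raw attribute values with per-attribute maximum thresholds; it does not model the normalized εi, whose formula is not available here. The name `epsilon_filter` and the use of `None` for "no threshold" are assumptions.

```python
from typing import List, Optional, Sequence

Point = Sequence[float]


def epsilon_filter(points: List[Point],
                   thresholds: Sequence[Optional[float]]) -> List[Point]:
    """Keep objects whose value on every attribute is within the user's
    maximum threshold; None means the user gave no threshold."""
    return [
        p for p in points
        if all(t is None or v <= t for v, t in zip(p, thresholds))
    ]


# Hypothetical objects and thresholds: attribute 0 must not exceed 0.4.
objects = [(0.1, 0.3), (0.5, 0.2), (0.2, 0.9)]
passed = epsilon_filter(objects, (0.4, None))
```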
The method for running the subspace Skyline query on the remaining data is as follows. Assume a d-dimensional medical data space S = {s1, s2, ..., sd}, and let P be the medical data set on S, so that each data point pi ∈ P is a d-dimensional data point in S whose coordinate in each dimension is one item of medical data. F is a subspace of the medical data space S, i.e. F ⊆ S, with |F| = k and k ≤ d. The projection of a medical data object pi onto the subspace F is written pi' and is a k-tuple. pi' is a subspace Skyline query result if and only if no point pj' in the subspace F dominates pi'. The query result is returned to the client terminal.
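The definition above can be made concrete with a short Python sketch. It assumes the usual dominance relation under minimization (no worse on every attribute and strictly better on at least one); the function names and the quadratic comparison loop are illustrative choices, not the distributed algorithm of the invention.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: no worse on all attributes, strictly better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def project(p: Point, dims: Sequence[int]) -> Tuple[float, ...]:
    """Projection p' of a d-dimensional point onto the subspace dims."""
    return tuple(p[i] for i in dims)


def subspace_skyline(points: List[Point], dims: Sequence[int]) -> List[Point]:
    """Return the points whose projection is not dominated in the subspace."""
    projected = [project(p, dims) for p in points]
    return [
        p for p, q in zip(points, projected)
        if not any(dominates(r, q) for r in projected)
    ]


# Hypothetical 3-dimensional data; the subspace F uses dimensions 0 and 2.
data = [(1, 9, 4), (2, 8, 1), (3, 7, 5), (1, 9, 9)]
skyline = subspace_skyline(data, dims=(0, 2))
```

In this data, the projections (3, 5) and (1, 9) are both dominated by (1, 4), so only the first two points remain.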
The specific method for collecting real-time medical stream data in the O2O environment is as follows. The doctor uses a mobile multimedia data collector based on a mobile network and, for the multiple factors of a patient, uses multi-factor Internet-of-Things sensor chips that operate simultaneously and in parallel on stream data to collect the state of the individual patient at regular intervals. After feature extraction and filtering, the data are submitted to the cloud. The medical stream data and the query requests take the form of stream data: using time-window monitoring, with the time window as the unit of size, they are gathered in batches and streamed to the cloud over the wireless network.
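Batching by time window can be sketched as follows. The sketch assumes fixed-length, non-overlapping windows and readings given as (timestamp in seconds, payload) pairs; both are assumptions, since the source does not specify the window type or the record layout.

```python
from collections import defaultdict
from typing import Any, Iterable, List, Tuple


def window_batches(readings: Iterable[Tuple[float, Any]],
                   window_seconds: float) -> List[Tuple[int, List[Any]]]:
    """Group (timestamp, payload) readings into fixed-size time windows.

    Returns (window_index, payloads) pairs ordered by window index, ready
    to be sent to the cloud one batch at a time.
    """
    buckets = defaultdict(list)
    for timestamp, payload in readings:
        buckets[int(timestamp // window_seconds)].append(payload)
    return [(index, buckets[index]) for index in sorted(buckets)]


# Hypothetical sensor readings with a 5-second window.
stream = [(0.5, "a"), (1.2, "b"), (4.9, "c"), (5.0, "d"), (11.0, "e")]
batches = window_batches(stream, window_seconds=5)
```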
The bidirectional filtering and merging method is as follows. On the stream merging processor, skyline filtering is applied to the data streams in the time windows that arrive one after another; if data in the same time window have a dominance relation, a filtering operation is performed. Because the data in the same time window enter on an equal footing, two filtering steps are used, forward skyline filtering and reverse skyline filtering. Forward skyline filtering is the A-filtering method: after the user initiates a query request from the client terminal, it uses the optimization function of the multi-objective decision and finally obtains the set of medical data objects whose values are relatively good on all k attributes. Reverse skyline filtering is ε-filtering: after forward skyline filtering, a maximum threshold is set according to the user's degree of preference for each medical attribute, and a data object whose value on an attribute does not satisfy the corresponding threshold is filtered out.
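The merging of successive time windows on the stream merging processor might look like the following sketch. It shows only the dominance-based merge of a running skyline with a newly arrived window, under the assumption that all attributes are minimized; the A-filtering and ε-filtering steps would be applied to each window beforehand. The name `merge_window` is hypothetical.

```python
from typing import List, Sequence

Point = Sequence[float]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: no worse on all attributes, strictly better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def merge_window(current: List[Point], window: List[Point]) -> List[Point]:
    """Merge the points of a newly arrived window into the running skyline."""
    merged = list(current)
    for p in window:
        if any(dominates(q, p) for q in merged):
            continue  # p is dominated by an existing result, so drop it
        # p survives, so remove any earlier results that p dominates
        merged = [q for q in merged if not dominates(p, q)]
        merged.append(p)
    return merged


# Two hypothetical windows arriving one after the other.
first = merge_window([], [(2, 2), (1, 3), (3, 3)])
second = merge_window(first, [(1, 1), (0, 5)])
```

After the second window, (1, 1) replaces both earlier results because it dominates them, while (0, 5) is kept as incomparable.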
The beneficial effects are as follows. The intelligent mobile client adapts to the functional requirements and usage habits of different users, initiates query requests, and exchanges information with the multi-medical-factor subspace Skyline query system in the mobile O2O environment. Feedback data can be screened through continuous interaction in the form of data streams; accuracy is high; the system suits big-data environments; the interaction platform offered to the user and the data side provides accurate and satisfactory information; and both the accuracy of the user's final decision and the speed of the decision-making experience improve.
Brief description of the drawings
Fig. 1 is the system model of subspace Skyline inquiry of the invention;
Fig. 2 is the formula of A-filtering filter methods of the invention;
Fig. 3 is test data set of the invention;
Fig. 4 is the MapReduce index-construction file of the invention;
Fig. 5 is subspace Skyline query process of the invention;
Fig. 6 is the symbol definition of subspace Skyline query process algorithm of the present invention;
Fig. 7 is subspace Skyline query process algorithm of the invention.
Specific embodiment
Embodiment 1: A Skyline double-filtering retrieval method based on multiple medical factors in a mobile O2O environment, including the following steps:
S1. The doctor terminal collects real-time medical stream data in the O2O environment, and the client terminal initiates a query request that contains several related medical attributes, the degree of preference for each medical attribute, and an input threshold for each attribute;
S2. Based on the medical stream data and the query request, the cloud extracts in parallel, from the full space, the k-dimensional (k ≤ d) subspace of the medical attributes named in the query request, where d is the dimension of the full space and k is the dimension of the subspace, i.e. the number of fields (dimensions) of interest to the user that are extracted from the full space. The k-dimensional subspace is sorted by the client's degree of preference for each medical attribute to obtain an ordered k-dimensional grid index. Preferably, the sorting method is as follows: the importance of attributes is defined by a partial order; assume a binary partial order relation > on the subspace F, where > means "more important than" among the attributes in F; if f1 and f2 are two attributes in F (f1, f2 ∈ F) and f1 is more important than f2, their partial order is written f1 > f2;
S3. The data are scanned into a k-dimensional grid produced by a custom, unequal partition of the space. According to the threshold of each attribute, pruning uses the grid dominance relation in one dimension: grid cells of that dimension that are Skyline-dominated, together with the data in them, are cut away, which filters the data. The filtering specifically consists of executing the A-filtering method and the ε-filtering method, wherein:
The steps of the A-filtering method are as follows. The optimization function of the multi-objective decision is defined as min(f1(x), f2(x), ..., fk(x)), where x ∈ P and fi(x) is the value of data object x on the i-th attribute.
The formula of the A-filtering method (Fig. 2) is used to compute R1, the set of medical data objects whose values on the first attribute deviate most from the user's preference, with R0 denoting the initial data set P. Removing R1 yields the set of medical data objects whose values on the first attribute are relatively good. Next, within that set, formula (3.2) gives the set of medical data objects with the worst values on the second attribute, which is then removed from the set. Proceeding in the same way finally yields the set of medical data objects whose values are relatively good on all k attributes.
The steps of the ε-filtering method are as follows. The value of εi is set according to the user's degree of preference for each medical attribute. Any user can give a maximum threshold for each attribute; if the value of a data object on an attribute does not satisfy the corresponding threshold, the data object is filtered out. Each user is modeled in terms of these thresholds, and εi is computed from the threshold of the j-th user for the i-th attribute and from max{fi}, the maximum value on the i-th attribute over the data set P. The thresholds are used to filter out data objects whose values on the attributes the user cares about do not meet the user's requirements.
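The grid pruning in step S3 can be illustrated with a small Python sketch. It assumes minimized attributes, half-open cells defined by hypothetical, unequally spaced boundaries per dimension (which could, for instance, be placed at the attribute thresholds), and a simple cell-dominance rule: a non-empty cell whose index is strictly smaller in every dimension dominates another cell. The source does not spell out its grid-dominance rule, so this is one plausible instance only.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]
Cell = Tuple[int, ...]


def build_grid(points: List[Point],
               boundaries: Sequence[Sequence[float]]) -> Dict[Cell, List[Point]]:
    """Place each point in a cell of an unequally partitioned grid."""
    grid: Dict[Cell, List[Point]] = defaultdict(list)
    for p in points:
        cell = tuple(bisect_right(b, v) for b, v in zip(boundaries, p))
        grid[cell].append(p)
    return grid


def prune_dominated_cells(grid: Dict[Cell, List[Point]]) -> Dict[Cell, List[Point]]:
    """Drop every cell dominated by another non-empty cell.

    If cell o has a strictly smaller index than cell c in every dimension,
    each point in o is strictly smaller than each point in c on every
    attribute, so all data in c can be discarded without point comparisons.
    """
    cells = list(grid)
    return {
        c: pts for c, pts in grid.items()
        if not any(all(a < b for a, b in zip(o, c)) for o in cells)
    }


# Hypothetical boundaries: dimension 0 split at 1 and 5, dimension 1 at 2 and 10.
bounds = [[1.0, 5.0], [2.0, 10.0]]
pts = [(0.5, 1.0), (3.0, 4.0), (7.0, 12.0), (0.2, 11.0)]
pruned = prune_dominated_cells(build_grid(pts, bounds))
survivors = sorted(p for cell_pts in pruned.values() for p in cell_pts)
```

Only whole cells are compared here, which is what makes the pruning cheap; the surviving points still need a point-level Skyline query.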
S4. A subspace Skyline query is run on the remaining data to obtain the Skyline result for the medical attributes the user requested; during this query, the parallel stream data are bidirectionally filtered and merged. The method for running the subspace Skyline query on the remaining data is as follows. Assume a d-dimensional medical data space S = {s1, s2, ..., sd}, and let P be the medical data set on S, so that each data point pi ∈ P is a d-dimensional data point in S whose coordinate in each dimension is one item of medical data. F is a subspace of the medical data space S, i.e. F ⊆ S, with |F| = k and k ≤ d. The projection of a medical data object pi onto the subspace F is written pi' and is a k-tuple. pi' is a subspace Skyline query result if and only if no point pj' in the subspace F dominates pi'. The query result is returned to the client terminal.
In this embodiment, the specific method for collecting real-time medical stream data in the O2O environment in step S1 is as follows. The doctor uses a mobile multimedia data collector based on a mobile (3G/4G) network and, for the multiple factors of a patient (such as the ECG monitoring trace, blood pressure, body temperature and pulse rate), uses multi-factor Internet-of-Things sensor chips that operate simultaneously and in parallel on stream data to collect the state of the individual patient at regular intervals. After feature extraction and filtering, the data are submitted to the cloud. The medical stream data and the query requests take the form of stream data: using time-window monitoring, with the time window as the unit of size, they are gathered in batches and streamed to the cloud over the wireless network.
The bidirectional filtering and merging method in step S4 is as follows. On the stream merging processor, skyline filtering is applied to the data streams in the time windows that arrive one after another; if data in the same time window have a dominance relation, a filtering operation is performed. Because the data in the same time window enter on an equal footing, two filtering steps are used, forward skyline filtering and reverse skyline filtering. Forward skyline filtering is the A-filtering method: after the user initiates a query request from the client terminal, it uses the optimization function of the multi-objective decision and finally obtains the set of medical data objects whose values are relatively good on all k attributes. Reverse skyline filtering is ε-filtering: after forward skyline filtering, a maximum threshold is set according to the user's degree of preference for each medical attribute, and a data object whose value on an attribute does not satisfy the corresponding threshold is filtered out.
In practical use of this embodiment, consider a doctor choosing a surgical plan. The factors considered include the price of the operation, the effect of the operation, the postoperative impact, the speed of recovery, and so on. According to the user's requirements, three attributes receive the most attention: the effect of the operation, the price of the operation, and the postoperative impact. Because the patient has a certain economic capacity, after sorting by attribute importance the order of the subspace attributes is adjusted to {effect of the operation, postoperative impact, price of the operation}, as shown in Fig. 5(b). For this user, the data objects p5 and p6 in data set P have poor curative effect, so by the principle of the A-filtering method they should not be returned to the user as part of the result set. p5 and p6 are therefore marked red in Fig. 5(c) and are deleted before any dominance comparison. Similarly, the data objects p4 and p8 are marked orange in Fig. 5(c) and deleted because their postoperative effect deviates from the user's preference. The process continues in the same way until every subspace attribute has been considered. Each pass of the A-filtering method can only filter out the data objects whose value on one attribute is worst, i.e. deviates most from the user's preference. The data are then filtered again with the other method, the tolerance-based ε-filtering. Suppose that, after consulting the patient, the doctor requires a surgical plan with a success rate above 80% and a prognosis of functional recovery to 90% or more. These thresholds filter out the data objects whose values on the attributes the user cares about (surgical effect, postoperative impact, price) fail the user's hard requirements. A subspace Skyline query is then run on the data objects remaining after filtering, as shown in Fig. 5(d). The final result set, p1 and p2, is returned to the user.
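The threshold and Skyline steps of this example can be traced with a small Python sketch. All numeric values and the plan names p3 and p7 are invented for illustration and are not the data of Fig. 5; they were chosen so that the outcome has the same shape as the example. Success rate and functional recovery are treated as attributes to maximize and price as an attribute to minimize, which is an assumption.

```python
from typing import Dict, List, Tuple

Plan = Tuple[float, float, float]  # (success rate, functional recovery, price)

# Hypothetical surgical plans; the values are invented for this sketch.
plans: Dict[str, Plan] = {
    "p1": (0.95, 0.92, 30000.0),
    "p2": (0.85, 0.95, 20000.0),
    "p3": (0.84, 0.91, 25000.0),
    "p7": (0.90, 0.85, 10000.0),
}


def meets_thresholds(plan: Plan) -> bool:
    """Hard requirements from the example: success > 80%, recovery >= 90%."""
    success, recovery, _price = plan
    return success > 0.80 and recovery >= 0.90


def cost(plan: Plan) -> Tuple[float, float, float]:
    """Negate the maximized attributes so that every attribute is minimized."""
    success, recovery, price = plan
    return (-success, -recovery, price)


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def select_plans(candidates: Dict[str, Plan]) -> List[str]:
    """Threshold filtering followed by a Skyline query on what remains."""
    costs = {n: cost(p) for n, p in candidates.items() if meets_thresholds(p)}
    return sorted(
        n for n, c in costs.items()
        if not any(dominates(o, c) for o in costs.values())
    )


result = select_plans(plans)
```

With these invented values, p7 fails the recovery threshold and p3 is dominated by p2, so the plans p1 and p2 remain.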
This embodiment also relates to a Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment, including a doctor terminal, a client terminal and a cloud. The doctor terminal collects real-time medical stream data in the O2O environment. The client terminal initiates a query request that contains several related medical attributes, the degree of preference for each medical attribute, and an input threshold for each attribute. Based on the medical stream data and the query request, the cloud extracts in parallel, from the full space, the k-dimensional (k ≤ d) subspace of the medical attributes named in the query request, where d is the dimension of the full space and k is the dimension of the subspace, and sorts the k-dimensional subspace by the client's degree of preference for each medical attribute to obtain an ordered k-dimensional grid index. The cloud scans the data into a k-dimensional grid produced by a custom, unequal partition of the space and, according to the threshold of each attribute, prunes using the grid dominance relation in one dimension: grid cells of that dimension that are Skyline-dominated, together with the data in them, are cut away, which filters the data. The cloud then runs a subspace Skyline query on the remaining data to obtain the Skyline result for the medical attributes the user requested; during this query, the parallel stream data are bidirectionally filtered and merged.
Embodiment 2: A Skyline double-filtering retrieval method based on multiple medical factors in a mobile O2O environment. It addresses the high cost of processing massive multi-medical-factor data with subspace Skyline algorithms in a mobile O2O environment and the very large result sets that they return. Representative points are selected according to the specific needs of different users. The method consists mainly of three filtering methods, A-filtering, ε-filtering and D-filtering, and is executed in the following steps:
S1. The mobile O2O environment and the multi-medical-factor subspace Skyline query system are set up; mobile terminals are used to collect real-time medical stream data, and the three filtering methods, A-filtering, ε-filtering and D-filtering, are executed;
S2. The intelligent mobile client adapts to the functional requirements and usage habits of different medical end users and initiates query requests; the end users are mainly doctors and patients;
S3. The doctor uses a mobile multimedia data collector based on a 3G/4G network to collect multiple factors of a patient, using multi-factor Internet-of-Things sensor chips that can operate simultaneously and in parallel on stream data to collect the state of the individual patient at regular intervals;
S4. Using time-window monitoring, the data collected at regular intervals are merged during collection and streamed over the wireless network;
S5. The data acquired through the mobile O2O environment undergo feature extraction and filtering and are then submitted to the cloud terminal or cloud data center, which runs the multi-medical-factor subspace Skyline query system to analyze the data in a distributed manner; the mobile side and the cloud exchange data as streams, with bidirectional filtering and pruning optimization.
As a supplement to the technical solution, the multi-medical-factor subspace Skyline query system in the mobile O2O environment is a mobile cloud computing model based on a mobile terminal/cloud server architecture. The client is an application running on terminal devices such as mobile phones, tablet computers and wearable medical Internet-of-Things collection devices. These terminals communicate with the server over the Internet or a mobile 3G/4G network to send query requests and receive query results. To determine the factors used in the Skyline query, an end user of the medical system selects several related medical attributes, indicates a degree of preference for them, and enters a threshold for each attribute on the terminal. The query request is then sent as stream data: with the time window as the unit of size, it is gathered in batches and sent to the cloud server. Notably, unlike a conventional C/S architecture, data are transmitted as data streams sized by time window, and the cloud server likewise receives and verifies data as streams. When the data at either end exceed the time window size, that end waits. The data of each time window are distributed in parallel to different compute nodes for parallel processing, which can greatly improve the parallel execution efficiency of the Skyline bidirectional filtering algorithm.
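Distributing time windows to parallel workers can be sketched with Python's standard `concurrent.futures` module. Threads stand in for the compute nodes here purely for illustration; the source describes a cloud platform, and the function names and the per-window local Skyline step are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence

Point = Sequence[float]


def dominates(a: Point, b: Point) -> bool:
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def local_skyline(window: List[Point]) -> List[Point]:
    """Skyline of a single time window, computed on one worker."""
    return [p for p in window if not any(dominates(q, p) for q in window)]


def process_windows(windows: List[List[Point]],
                    max_workers: int = 4) -> List[List[Point]]:
    """Hand each window to a worker; results come back in window order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(local_skyline, windows))


# Two hypothetical time windows of two-attribute data.
windows = [[(1, 2), (2, 1), (3, 3)], [(5, 5), (4, 6), (6, 6)]]
per_window = process_windows(windows)
```

The per-window results would then be merged into a single Skyline on the stream merging processor.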
The cloud model of the system is an improved Spark-based architecture with lightweight clients in a cloud environment. The distributed Skyline computation is carried out mainly on the back-end cloud platform, which comprises two modules: a preprocessing module and a query module. In the preprocessing module, for each time window the system scans the data in parallel into a multi-dimensional grid whose cells divide the custom space unequally. According to the threshold of each factor, it prunes using the grid dominance relation on one dimension, cutting away the grid cells of that dimension that are dominated in the Skyline sense together with the data in them. The difference from traditional Skyline processing is that, with the multi-dimensional grid, computation and processing are parallelized across the different dimensions of the multiple factors. The server receives the user's query request and performs three filter methods: A-filtering, ε-filtering, and D-filtering. The first two can extract in parallel, from the full space, the k-dimensional (k ≤ d) subspace the user is interested in, where d is the dimensionality of the full space, and sort it by the medical terminal's degree of preference for each factor to obtain an ordered k-dimensional grid index. A distributed scan then compares the dominance relations of the data points as a stream to obtain the subspace Skyline result, which is returned to the terminal as a stream; the third filter method performs bidirectional filtering and merging.
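The grid-based pruning in the preprocessing module can be illustrated with a small sketch. It rests on several assumptions that are not in the source: smaller attribute values are preferred, the unequal grid is given as a list of boundary values per dimension, and a cell is pruned when another occupied cell has a strictly lower index in every dimension (a simplification of the per-dimension pruning the text describes). The names `cell_of` and `prune_dominated_cells` are hypothetical.

```python
from bisect import bisect_right


def cell_of(point, boundaries):
    """Map a point to its grid cell; boundaries[i] lists the cut points of
    dimension i, which need not be equally spaced."""
    return tuple(bisect_right(cuts, value) for value, cuts in zip(point, boundaries))


def prune_dominated_cells(points, boundaries):
    """Drop every cell (and its data) that another occupied cell dominates.

    Assumption: smaller is better. If cell o has a strictly lower index than
    cell c in every dimension, every point in o beats every point in c.
    """
    cells = {}
    for p in points:
        cells.setdefault(cell_of(p, boundaries), []).append(p)
    occupied = list(cells)
    survivors = {}
    for c in occupied:
        dominated = any(all(o[i] < c[i] for i in range(len(c))) for o in occupied)
        if not dominated:
            survivors[c] = cells[c]
    return survivors


if __name__ == "__main__":
    grid = [[10, 20], [10, 20]]
    data = [(5, 5), (15, 15), (25, 3), (3, 25)]
    print(prune_dominated_cells(data, grid))
```

Pruning at the cell level removes whole groups of points with one comparison, so the later point-by-point dominance check runs on far less data.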
Because the distributed stream processing based on time windows runs in parallel on multiple computing nodes, some virtualized computing nodes may return results slowly, or even crash, when their resource or computational load is high. In that case the computing task can be abandoned and handed to other nodes for processing. Whenever a distributed computing node completes, its result is pushed as a stream to the stream data collector, where the results are merged and optimized. The resulting user experience is that results appear quickly but are initially incomplete; as the stream of Skyline filter results accumulates, the final result becomes more and more accurate. Depending on the user's requirement, a decision can be made once results of a certain precision have accumulated, without waiting for all results to be collected, which greatly improves the interactive experience.
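The progressive behaviour described above, where partial results arrive node by node and the answer improves over time, can be sketched as an incremental merge. This is an illustrative sketch under the assumption that smaller attribute values are preferred; `merge_partial` and `progressive_skyline` are hypothetical names, not part of the described system.

```python
def dominates(a, b):
    """a dominates b: no worse on every attribute, strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def merge_partial(current, incoming):
    """Fold one node's partial result into the running Skyline."""
    merged = list(current)
    for p in incoming:
        if any(dominates(q, p) for q in merged):
            continue  # p is beaten by a point we already hold
        merged = [q for q in merged if not dominates(p, q)]
        if p not in merged:
            merged.append(p)
    return merged


def progressive_skyline(partial_results):
    """Yield a snapshot of the Skyline after each node's result arrives."""
    current = []
    for part in partial_results:
        current = merge_partial(current, part)
        yield list(current)


if __name__ == "__main__":
    node_outputs = [[(3, 3), (1, 5)], [(2, 2)], [(5, 1), (4, 4)]]
    for snapshot in progressive_skyline(node_outputs):
        print(snapshot)
```

A user could act on any intermediate snapshot; later snapshots only remove points that turn out to be dominated or add points that were still missing.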
As a supplement to the technical scheme, the subspace Skyline query is defined as follows. Assume a d-dimensional medical data space $S = \{s_1, s_2, \ldots, s_d\}$, and let P be the medical data set on S. Each data point $p_i \in P$ is a d-dimensional data point in S, and its value on one dimension is one kind of medical data, for example the heart-rhythm condition in heart disease. F is a subspace of the medical data space S, that is, $F \subseteq S$, with $|F| = k$ and $k \le d$. For a medical data object $p_i$ on S, its projection on the subspace F is denoted $p_i'$, a k-tuple. $p_i'$ is a subspace Skyline result if and only if no point $p_j'$ on the subspace F dominates $p_i'$.
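The definition above can be made concrete with a short sketch. It assumes that smaller values are preferred on every attribute and represents the subspace F as a tuple of dimension indices; the function names are hypothetical.

```python
def dominates(a, b):
    """a dominates b: no worse on every attribute, strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def project(point, dims):
    """Projection of a d-dimensional point onto the subspace given by dims."""
    return tuple(point[d] for d in dims)


def subspace_skyline(points, dims):
    """Points whose projection onto dims is not dominated by any other projection."""
    projected = [project(p, dims) for p in points]
    return [
        p
        for p, pp in zip(points, projected)
        if not any(dominates(q, pp) for q in projected)
    ]


if __name__ == "__main__":
    data = [(1, 9, 4), (2, 2, 8), (3, 3, 1), (4, 1, 9)]
    print(subspace_skyline(data, dims=(0, 1)))     # k = 2 subspace
    print(subspace_skyline(data, dims=(0, 1, 2)))  # full space, k = d = 3
```

In this toy data set the point (3, 3, 1) is in the full-space Skyline but drops out of the two-dimensional subspace Skyline, because its projection (3, 3) is dominated by (2, 2).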
As a supplement to the technical scheme, the partial order of attribute importance is defined as follows. Assume a binary partial order relation > on the subspace F, where > means "more important than" among the attributes of F. For two attributes $f_1, f_2 \in F$, if $f_1$ is more important than $f_2$, their partial order is written $f_1 > f_2$. This yields an ordered k-dimensional subspace $\{f_1, f_2, \ldots, f_k\}$. On the basis of these two definitions, three filter methods are proposed, A-filtering, ε-filtering, and D-filtering, to reduce the size of the result set returned by the subspace Skyline query.
The processing steps of the A-filtering method are as follows. The optimization function of the multi-objective decision can be defined as $\min(f_1(x), f_2(x), \ldots, f_k(x))$, where $x \in P$ and $f_i(x)$ is the value of data object x on the i-th attribute. The formulas are shown in Figure 1:
Formula 1.1 computes the set of data objects $R_1$ whose values on the first attribute deviate most from the medical user's preference, where $R_0$ denotes the initial data set P. Removing $R_1$ from $R_0$ gives the set $\tilde{R}_1$ of data that are relatively good on the first attribute. Next, within $\tilde{R}_1$, Formula 1.2 computes the set of data objects with the worst values on the second attribute, which is then removed from $\tilde{R}_1$. Proceeding in the same way finally yields the set $\tilde{R}_k$ of medical data objects that are relatively good on all k attributes.
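A minimal sketch of the A-filtering loop, following the pattern of Formulas 1.1 and 1.2: on each attribute, in order of importance, the objects with the largest (worst) value are removed from what remains. The function name and the tuple representation of data objects are assumptions made for the example.

```python
def a_filtering(points, ordered_dims):
    """Remove, attribute by attribute, the objects with the worst value.

    ordered_dims lists attribute indices from most to least important.
    Assumption: larger values are worse, matching the min(...) objective.
    """
    remaining = list(points)
    for d in ordered_dims:
        if not remaining:
            break
        worst = max(p[d] for p in remaining)
        # R_i is the set of objects attaining the worst value; drop it.
        remaining = [p for p in remaining if p[d] != worst]
    return remaining


if __name__ == "__main__":
    data = [(1, 5), (9, 2), (3, 8), (9, 1), (2, 2)]
    print(a_filtering(data, ordered_dims=[0, 1]))
```

Note that each pass removes only the objects tied for the worst value, which is why the description goes on to introduce a tolerance threshold.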
The processing steps of the ε-filtering method are as follows. ε-filtering builds on A-filtering by giving each attribute a corresponding threshold. In Formula 1.4, $\epsilon_i$ ($0 \le \epsilon_i \le 1$) is the tolerance threshold for the i-th attribute, and its value is set according to the medical user's preference. For example, consider internal-medicine treatment of hypertension in a patient who also has renal dysfunction: the goal of treatment is to control the hypertension effectively while reducing damage to renal function, so the user sends the system a command restricting the returned results to those that control hypertension and reduce renal damage. The medical user usually sets the thresholds in advance and sends them to the server together with the query request, and the query is then executed. In this way more data objects whose attribute values deviate from the user's preference are filtered out each time, and the final result set is correspondingly smaller.
As a supplement to the technical scheme, apply the above formulas to the data set in Table 1. Assume the attributes of the subspace F, sorted by importance according to user preference, are {Mileage, Price, OccupancyRate}, and that $\epsilon_1, \epsilon_2, \epsilon_3$ are the thresholds of the respective attributes. When $\epsilon_1 = 1$, $R_1 = \{p_5, p_6\}$; for a smaller value of $\epsilon_1$, $R_1 = \{p_5, p_6, p_1, p_9, \ldots\}$.
Further, the subspace the user is interested in is extracted from the full-space index according to the medical user's preference. For example, when treating a hypertensive patient, the factors considered are three conditions: drug efficacy, drug side effects, and patient complications. According to the doctor's requirements, these three attributes are compared. If the user's demand for drug efficacy is greatest, then after the system sorts by attribute importance the order of the subspace attributes is adjusted to {drug efficacy, patient complications, drug side effects}. For a user who prioritizes drug efficacy, the data objects $p_5$ and $p_6$ in data set P are far from the condition, so by the filtering principle of A-filtering these two objects should not be returned to the user as results. They are therefore marked red and deleted before the dominance comparison. Similarly, data objects $p_4$ and $p_8$ are marked orange and deleted because their patient-complication attribute deviates from the user's preference. The process continues until the attributes of all subspace dimensions have been considered.
Further, A-filtering can filter out in each step only the data objects whose attribute value is worst, that is, deviates most from the user's preference. The tolerance-based ε-filtering method then screens further. Its specific steps are as follows. $\epsilon_i$ was introduced with Formula 1.4 above, and its value is set according to the user's preference. Any user can give a maximum threshold for each attribute; if the value of a data object on an attribute does not satisfy the corresponding threshold, the data object is filtered out. Each user j is modeled by the thresholds $UP_j^i$ supplied for the attributes, and $\epsilon_i$ is calculated by the following formula:
$$\epsilon_i = \frac{UP_j^i}{\max\{f_i\}}$$
where $UP_j^i$ denotes the threshold of the j-th user for the i-th attribute, and $\max\{f_i\}$ denotes the maximum value on the i-th attribute over the data set P. These thresholds filter out the data objects whose values on the attributes the user cares about do not meet the user's hard requirements.
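The ε-filtering step can be sketched in the same style as A-filtering. Two assumptions are made here because the text leaves them open: the maximum in the cutoff is taken over the objects that remain after the previous attribute, so that ε = 1 reproduces A-filtering, and an object is removed when its value reaches ε times that maximum. The function names are hypothetical.

```python
def epsilon_from_threshold(user_threshold, points, d):
    """epsilon_i = UP_j^i / max{f_i}, with the maximum taken over the data set."""
    return user_threshold / max(p[d] for p in points)


def epsilon_filtering(points, ordered_dims, epsilons):
    """Remove objects whose value reaches epsilon_i times the current maximum.

    Assumption: larger values are worse, and the maximum is recomputed on
    the objects that survived the previous attribute.
    """
    remaining = list(points)
    for d, eps in zip(ordered_dims, epsilons):
        if not remaining:
            break
        cutoff = eps * max(p[d] for p in remaining)
        remaining = [p for p in remaining if p[d] < cutoff]
    return remaining


if __name__ == "__main__":
    data = [(10, 5), (8, 2), (4, 9), (2, 10), (5, 4)]
    print(epsilon_filtering(data, [0, 1], [0.75, 0.5]))
    print(epsilon_filtering(data, [0, 1], [1.0, 1.0]))
```

Lowering a threshold removes more objects per attribute, which matches the statement that the final result set becomes smaller than with A-filtering alone.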
Further, a subspace Skyline query is performed on the data objects remaining after filtering, and the final result set $\{p_1, p_2\}$ is returned to the user.
Because the cloud processes the Skyline query as parallel data streams, the result streams of fast computing nodes arrive first and those of slow computing nodes arrive later. The stream merging processor therefore performs D-filtering: Skyline filtering is applied to the data streams that arrive successively within a time window, and if data in the same time window have a dominance relation, the dominated data are filtered out. Because the data within one time window have equal standing, two filtration steps are used, forward Skyline filtering and reverse Skyline filtering, which is why the method is called double (bidirectional) filtering, i.e. D-filtering. It further filters and optimizes the streamed results to produce a smaller result set, speed up retrieval, and improve the user experience.
The above is only a preferred embodiment of the present invention, but the scope of protection of the invention is not limited to it. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the invention and according to its technical scheme and inventive concept, shall fall within the scope of protection of the invention.
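One possible reading of the forward and reverse passes in the passage above is sketched below: a forward scan in arrival order drops points dominated by earlier arrivals, and a reverse scan drops points dominated by later arrivals. This interpretation, the preference for smaller values, and the names `one_pass` and `d_filtering` are assumptions; claim 6 describes the two passes in terms of A-filtering and ε-filtering instead.

```python
def dominates(a, b):
    """a dominates b: no worse on every attribute, strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def one_pass(points):
    """Keep each point unless an already-kept point dominates it."""
    kept = []
    for p in points:
        if not any(dominates(q, p) for q in kept):
            kept.append(p)
    return kept


def d_filtering(window):
    """Two-pass filter over the results that arrived in one time window."""
    forward = one_pass(window)               # removes points beaten by earlier arrivals
    backward = one_pass(reversed(forward))   # removes points beaten by later arrivals
    return list(reversed(backward))          # restore arrival order


if __name__ == "__main__":
    arrivals = [(4, 4), (2, 2), (3, 3), (1, 5), (5, 1), (1, 6)]
    print(d_filtering(arrivals))
```

Under this reading the two passes together leave exactly the Skyline of the window, regardless of the order in which fast and slow nodes delivered their results.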

Claims (6)

1. A Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment, characterized in that it comprises a doctor terminal, a client terminal, and a cloud; the doctor terminal collects real-time medical stream data in the O2O environment, and the client terminal initiates a query request, the query request including multiple related medical attributes, the preference for the medical attributes, and the entered threshold for each attribute; based on the medical stream data and the query request, the cloud extracts in parallel from the full space the k-dimensional (k ≤ d) subspace of the medical attributes in the query request, where d is the dimensionality of the full space and k is the dimensionality of the subspace, and sorts the k-dimensional subspace according to the client's preference for each medical attribute to obtain an ordered k-dimensional grid index; the cloud scans the data into a k-dimensional grid whose cells divide the custom space unequally and, according to the threshold of each attribute, prunes using the grid dominance relation on one dimension, cutting away the grid cells of that dimension that are dominated in the Skyline sense together with the data in them, thereby filtering the data; the cloud performs a subspace Skyline query on the remaining data to obtain the Skyline result for the medical attributes requested by the user, and during the subspace Skyline query, bidirectional filtering and merging are performed on the parallel stream data.
2. The Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment as claimed in claim 1, characterized in that the sorting method is: the partial order of attribute importance is defined by assuming a binary partial order relation > on the subspace F, where > means "more important than" among the attributes of F; for two attributes $f_1, f_2 \in F$, if $f_1$ is more important than $f_2$, their partial order is expressed as $f_1 > f_2$.
3. The Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment as claimed in claim 1, characterized in that the filtering specifically performs the A-filtering method and the ε-filtering method, wherein:
the steps of the A-filtering method are as follows: the optimization function of the multi-objective decision is defined as $\min(f_1(x), f_2(x), \ldots, f_k(x))$, where $x \in P$ and $f_i(x)$ is the value of data object x on the i-th attribute;
$$R_1 = \arg\max_{x \in R_0} f_1(x) \quad (3.1)$$
$$R_2 = \arg\max_{x \in \tilde{R}_1} f_2(x) \quad (3.2)$$
$$R_k = \arg\max_{x \in \tilde{R}_{k-1}} f_{k-1}(x) \quad (3.3)$$
$$R_i = \{\, x \mid f_i(x) \ge \epsilon_i \max(f_i(x)),\ x \in \tilde{R}_{i-1} \,\} \quad (3.4)$$
Formula (3.1) computes the set $R_1$ of medical data objects whose values on the first attribute deviate most from the user's preference, where $R_0$ denotes the initial data set P; this gives the set $\tilde{R}_1$ of medical data that are relatively good on the first attribute; next, within $\tilde{R}_1$, formula (3.2) yields the set of medical data objects with the worst values on the second attribute, which is then removed from $\tilde{R}_1$; and so on, finally obtaining the set $\tilde{R}_k$ of medical data objects that are relatively good on all k attributes;
the steps of the ε-filtering method are as follows: the value of $\epsilon_i$ is set according to the user's preference for each medical attribute; any user can give a maximum threshold for each attribute, and if the value of a data object on an attribute does not satisfy the corresponding threshold, the data object is filtered out; each user j is modeled by the thresholds $UP_j^i$, and $\epsilon_i$ is calculated by the following formula:
$$\epsilon_i = \frac{UP_j^i}{\max\{f_i\}}$$
where $UP_j^i$ denotes the threshold of the j-th user for the i-th attribute, and $\max\{f_i\}$ denotes the maximum value on the i-th attribute over the data set P; the thresholds are used to filter out the data objects whose values on the attributes the user cares about do not meet the user's requirements.
4. The Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment as claimed in claim 1, characterized in that the method of performing the subspace Skyline query on the remaining data is: assume a d-dimensional medical data space $S = \{s_1, s_2, \ldots, s_d\}$; P is the medical data set on S, and each data point $p_i \in P$ is a d-dimensional data point in S whose value on one dimension is medical data; F is a subspace of the medical data space S, that is, $F \subseteq S$, with $|F| = k$ and $k \le d$; for a medical data object $p_i$ on S, its projection on the subspace F is denoted $p_i'$, a k-tuple; $p_i'$ is a subspace Skyline query result if and only if no point $p_j'$ on the subspace F dominates $p_i'$; and the query result is returned to the client terminal.
5. The Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment as claimed in claim 1, characterized in that the specific method of collecting real-time medical stream data in the O2O environment is: the doctor uses a mobile multimedia data collector based on the mobile network and, for the multiple factors of a patient, simultaneously uses parallel multi-factor Internet-of-Things sensing chips based on stream data to periodically collect the individual patient's state; the data, after feature extraction and filtering, are submitted to the cloud; the medical stream data and the query request take the form of stream data and, by time-window monitoring with the time window as the unit, are collected in batches and transmitted to the cloud as a stream over the wireless network.
6. The Skyline double-filtering retrieval system based on multiple medical factors in a mobile O2O environment as claimed in claim 1, characterized in that the method of bidirectional filtering and merging is: on the stream merging processor, Skyline filtering is applied to the data streams that arrive successively within a time window, and if data in the same time window have a dominance relation, a filter operation is performed; because the data in the same time window have equal standing, two filtration steps are used, forward Skyline filtering and reverse Skyline filtering; the forward Skyline filtering is the A-filtering method, which, after the user initiates a query request from the client terminal, uses the optimization function of the multi-objective decision to finally obtain the set of medical data objects that are relatively good on all k attributes; the reverse Skyline filtering is the ε-filtering method, which, after the forward Skyline filtering, sets a maximum threshold according to the user's preference for each medical attribute and filters out any data object whose value on an attribute does not satisfy the corresponding threshold.
CN201611150486.2A 2016-12-14 2016-12-14 The double filtering searching systems of the Skyline based on many medical factors under mobile O2O environment Pending CN106777091A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611150486.2A CN106777091A (en) 2016-12-14 2016-12-14 The double filtering searching systems of the Skyline based on many medical factors under mobile O2O environment

Publications (1)

Publication Number Publication Date
CN106777091A true CN106777091A (en) 2017-05-31

Family

ID=58876930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611150486.2A Pending CN106777091A (en) 2016-12-14 2016-12-14 The double filtering searching systems of the Skyline based on many medical factors under mobile O2O environment

Country Status (1)

Country Link
CN (1) CN106777091A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103473320A (en) * 2013-09-12 2013-12-25 南京大学 Method for service combination facing cloud-spanning platform
CN104809210A (en) * 2015-04-28 2015-07-29 东南大学 Top-k query method based on massive data weighing under distributed computing framework
CN105183921A (en) * 2015-10-23 2015-12-23 大连大学 Shop addressing system based on bi-chromatic reverse nearest neighbor inquiry under mobile cloud computing environment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YUANYUAN LI et al.: "Efficient Subspace Skyline Query based on User Preference using MapReduce", 《AD HOC NETWORKS》 *
李媛媛 et al.: "Parallel Global Skyline algorithm based on time series", 《***工程与电子技术》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109446294A (en) * 2018-11-13 2019-03-08 嘉兴学院 A kind of parallel mutual subspace Skyline querying method
CN109446294B (en) * 2018-11-13 2021-09-07 嘉兴学院 Parallel mutual subspace Skyline query method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170531