WO2020119105A1 - Method, equipment, storage medium, and apparatus for overpayment identification based on big data - Google Patents

Method, equipment, storage medium, and apparatus for overpayment identification based on big data

Info

Publication number
WO2020119105A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
payment
cell
data points
points
Prior art date
Application number
PCT/CN2019/095412
Other languages
English (en)
French (fr)
Inventor
黄越
陈明东
Original Assignee
平安医疗健康管理股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安医疗健康管理股份有限公司
Publication of WO2020119105A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Definitions

  • The present application relates to the technical field of abnormal data identification, and in particular to a method, equipment, storage medium, and apparatus for overpayment identification based on big data.
  • Because the medical insurance system is imperfect, an insured person receiving hospital treatment may, in certain periods, pay too much for anti-rejection drugs. For example, the cost of anti-rejection drugs exceeds 100,000 in the first year, or exceeds 80,000 per year in any year after the first.
  • At present, the main way to check for such overpayment is for staff of the Human Resources and Social Security Bureau to search the huge volume of itemized diagnosis and treatment data and verify whether the charges are abnormal.
  • This approach is prone to two kinds of problems: manual inspection inevitably misses cases, and it is inefficient, time-consuming, and costly.
  • The main purpose of the present application is to provide a method, equipment, storage medium, and apparatus for overpayment identification based on big data, aiming to solve the technical problem in the prior art of how to determine more conveniently whether payments for anti-rejection drugs are excessive.
  • the present application provides a method for overpayment identification based on big data.
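  • The method for overpayment identification based on big data includes the following steps:
  • Obtaining anti-rejection payment data of the patient and standardizing the anti-rejection payment data to obtain standardized payment data;
  • Counting the periodic payment fee of the patient according to the standardized payment data;
  • Determining, through a preset unit-based outlier detection algorithm, whether the periodic payment fee exceeds the first preset threshold.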
  • The present application also provides user equipment that includes a memory, a processor, and computer-readable instructions stored on the memory and executable on the processor, the computer-readable instructions being configured to implement the steps of the big data-based overpayment identification method described above.
  • The present application also provides a storage medium on which computer-readable instructions are stored; when the computer-readable instructions are executed by a processor, the steps of the big data-based overpayment identification method described above are implemented.
  • The present application also provides an overpayment identification apparatus based on big data, which includes:
  • a processing module configured to obtain anti-rejection payment data of the patient and standardize the anti-rejection payment data to obtain standardized payment data;
  • a statistics module configured to count the periodic payment fee of the patient according to the standardized payment data;
  • a mining module configured to determine whether the periodic payment fee exceeds a first preset threshold through a preset unit-based outlier detection algorithm.
  • In the present application, anti-rejection payment data of the patient is obtained and standardized to obtain standardized payment data, the periodic payment fee of the patient is counted according to the standardized payment data, and a preset unit-based outlier detection algorithm determines whether the periodic payment fee exceeds a first preset threshold. Because the periodic payment fee is obtained by standardizing and aggregating the patient's anti-rejection payment data, the preset unit-based outlier detection algorithm can accurately determine whether the periodic payment fee is excessive, which urges hospitals to charge reasonably and protects the interests of patients.
  • FIG. 1 is a schematic diagram of a user equipment structure of a hardware operating environment involved in an embodiment of the present application;
  • FIG. 2 is a schematic flowchart of a first embodiment of a method for overpayment identification based on big data in this application;
  • FIG. 3 is a schematic flowchart of a second embodiment of a method for overpayment identification based on big data in this application;
  • FIG. 4 is a schematic flowchart of a third embodiment of a method for overpayment identification based on big data in this application;
  • FIG. 5 is a structural block diagram of a first embodiment of a payment overpayment identification device based on big data in this application.
  • FIG. 1 is a schematic structural diagram of user equipment in a hardware operating environment according to an embodiment of the present application.
  • the user equipment may include: a processor 1001, such as a central processing unit (Central Processing Unit, CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005.
  • the communication bus 1002 is used to implement connection communication between these components.
  • The user interface 1003 may include a display (Display), and optionally may further include a standard wired interface and a wireless interface.
  • The wired interface of the user interface 1003 may be a USB interface in this application.
  • The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wireless Fidelity (Wi-Fi) interface).
  • The memory 1005 may be a high-speed random access memory (RAM), or may be a non-volatile memory (NVM), such as a disk memory.
  • The memory 1005 may optionally be a storage device independent of the foregoing processor 1001.
  • The structure shown in FIG. 1 does not constitute a limitation on the user equipment, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
  • the memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a payment excess recognition program based on big data.
  • the network interface 1004 is mainly used to connect to a background server and perform data communication with the background server;
  • the user interface 1003 is mainly used to connect to peripheral devices and perform data communication with the peripheral device;
  • the user equipment calls the computer-readable instructions stored in the memory 1005 through the processor 1001, and executes the method for overpayment identification based on big data provided by the embodiment of the present application.
  • FIG. 2 is a schematic flowchart of the first embodiment of the method for overpayment identification based on big data of the present application.
  • the method for overpayment identification based on big data includes the following steps:
  • Step S10 Obtain anti-rejection payment data of the patient, standardize the anti-rejection payment data, and obtain standardized payment data.
  • the execution subject of this embodiment is user equipment, and the user equipment may be an electronic device such as a personal computer or a server.
  • the application scenario of this embodiment is that a patient swipes a medical insurance card to immediately settle the cost of diagnosis and treatment during hospital treatment.
  • The medical insurance card records the patient's anti-rejection payment data, which includes the charging time, drug name, charge amount, and so on, and the anti-rejection payment data is uploaded to the Human Resources and Social Security core system. At regular intervals, for example once a year, a user uses the user equipment to mine the patient's anti-rejection payment data for anomalies, determining whether the payment amount corresponding to the anti-rejection payment data for a given period (such as one month) exceeds the threshold, thereby avoiding unreasonable outpatient charges and protecting the interests of patients.
  • The anti-rejection payment data recorded in the Human Resources and Social Security core system is generally non-standard text information. To make it easy to determine whether the payment fee is excessive, the anti-rejection payment data is standardized in advance, converting the differing payment data into standardized payment data that a computer can recognize.
  • The user equipment obtains the patient's anti-rejection payment data from the Human Resources and Social Security core system according to the patient's identity information, for subsequently determining whether the payment amount corresponding to the anti-rejection payment data for a given period exceeds the threshold.
  • the patient identity information includes information such as the patient's name and ID number, and the ID number is used to confirm the identity of the patient and manage the patient list.
  • Natural language processing (NLP) technology is used to convert the anti-rejection payment data into standardized payment data. The words in the anti-rejection payment data are represented as vectors; to represent each word in its context, a preset bidirectional recurrent neural network model encodes the vectors into a sentence matrix, and the attention model compresses the sentence matrix into a sentence vector. The sentence vector is the standardized payment data.
  • Step S20: Count the periodic payment fee of the patient according to the standardized payment data.
  • The periodic payment fee may be a monthly payment fee, a quarterly payment fee, or an annual payment fee, which is not limited in this embodiment.
  • The periodic payment fee of the patient is counted according to what needs to be checked. For example, before determining whether the monthly payment exceeds 10,000 yuan, the patient's monthly payment is counted from the standardized payment data; before determining whether the annual payment exceeds 100,000 yuan, the patient's annual payment is counted from the standardized payment data.
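  • As a minimal sketch of the counting in Step S20, the Python below sums charge amounts per patient and period. The record layout `(patient_id, charged_on, amount)` and the function name `periodic_fees` are assumptions made for illustration; the source does not specify how standardized payment data is stored.

```python
from collections import defaultdict
from datetime import date


def periodic_fees(records, period="month"):
    """Sum charge amounts per (patient, period) bucket.

    `records` is a hypothetical iterable of (patient_id, charged_on, amount)
    tuples standing in for the standardized payment data.
    """
    totals = defaultdict(float)
    for patient_id, charged_on, amount in records:
        if period == "month":
            key = (patient_id, charged_on.year, charged_on.month)
        elif period == "year":
            key = (patient_id, charged_on.year)
        else:
            raise ValueError(f"unsupported period: {period!r}")
        totals[key] += amount
    return dict(totals)


if __name__ == "__main__":
    sample = [
        ("p1", date(2018, 1, 5), 6000.0),
        ("p1", date(2018, 1, 20), 5000.0),
        ("p1", date(2018, 2, 3), 3000.0),
    ]
    print(periodic_fees(sample, "month"))
    print(periodic_fees(sample, "year"))
```

  • A quarterly bucket could be added in the same way; the embodiment leaves the choice of period open.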
  • Step S30: Determine whether the periodic payment fee exceeds the first preset threshold through a preset unit-based outlier detection algorithm.
  • The first preset threshold is the highest fee allowed to be paid in the period corresponding to the periodic payment fee. If the periodic payment fee exceeds the first preset threshold, the hospital has charged unreasonably; if it does not, the hospital's charging is relatively reasonable.
  • This embodiment needs to determine whether the periodic payment fee exceeds the first preset threshold, and a unit-based outlier detection algorithm detects whether a data set contains isolated points. Based on this property, the preset unit-based outlier detection algorithm is constructed to find the isolated points among the periodic payment fees, an isolated point being a fee that exceeds the first preset threshold.
  • In this embodiment, anti-rejection payment data of the patient is obtained and standardized to obtain standardized payment data, the periodic payment fee of the patient is counted according to the standardized payment data, and a preset unit-based outlier detection algorithm determines whether the periodic payment fee exceeds the first preset threshold. Because the periodic payment fee is obtained by standardizing and aggregating the patient's anti-rejection payment data, the preset unit-based outlier detection algorithm can accurately determine whether the periodic payment fee is excessive, which urges hospitals to charge reasonably and protects the interests of patients.
  • FIG. 3 is a schematic flowchart of the second embodiment of the method for overpayment identification based on big data of the present application. The second embodiment is proposed on the basis of the first embodiment shown in FIG. 2.
  • step S30 specifically includes:
  • To determine whether the periodic payment fee exceeds the first preset threshold, the preset unit-based outlier detection algorithm is used.
  • The data space where the periodic payment fee is located is divided into cells of equal side length, and the periodic payment fee is mapped to data points in the cells. The cell side length is determined from the first preset threshold and a preset formula, and serves as the basis for the division.
  • the preset formula is:
  • L is the cell side length and D is the first preset threshold.
  • If the data point in the cell is an isolated point, it is determined that the periodic payment fee corresponding to the isolated point exceeds the first preset threshold.
  • An isolated point is a data point that does not have enough neighbors. Because the cell side length is determined by the first preset threshold and the preset unit-based outlier detection algorithm divides the cells according to that side length, a detected isolated point is isolated relative to the first preset threshold. Therefore, if the data point in a cell is an isolated point, the periodic payment fee corresponding to it exceeds the first preset threshold; if the data point in a cell is not an isolated point, the corresponding periodic payment fee does not exceed the first preset threshold.
  • step S30 includes:
  • Step S301: Determine the cell side length according to the first preset threshold, divide the data space where the periodic payment fee is located into several cells according to the cell side length, and map the periodic payment fee to data points in the cells.
  • Step S302: Traverse each cell and count the number of first data points in the cell, the number of second data points in the cell's first-layer neighbors, and the number of third data points in the cell's second-layer neighbors.
  • The data space includes several cells. The first-layer neighbors of a cell are the single layer of cells adjacent to it, and the second-layer neighbors are the two layers of cells outside the first-layer neighbors. The number of first data points is the number of data points in the cell, the number of second data points is the number of data points in the cell's first-layer neighbors, and the number of third data points is the number of data points in the cell's second-layer neighbors.
  • Step S303: Determine whether the data point in the cell is an isolated point according to the number of the first data points, the number of the second data points, and the number of the third data points.
  • The number of neighbors of the data points in a cell can be determined from the number of first data points, the number of second data points, and the number of third data points, which in turn determines whether the data points in the cell are isolated points.
  • Each cell is traversed and these three counts are computed for it, so that it can be determined whether the data points in each cell are isolated points and hence whether the corresponding periodic payment fees exceed the first preset threshold. In this way, periodic payment fees exceeding the first preset threshold are identified accurately and comprehensively.
  • Step S304: If the data point in the cell is an isolated point, it is determined that the periodic payment fee corresponding to the isolated point exceeds the first preset threshold.
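  • Steps S301 and S302 can be sketched in Python as follows. The preset formula for the cell side length is not reproduced in this text, so the sketch assumes L = D / (2·sqrt(k)), the side length used in the standard cell-based outlier detection algorithm, with k = 2 dimensions (which matches a second layer that is two cells thick). The 2-D point representation and the names `cell_side`, `cell_of`, and `layer_counts` are likewise assumptions for illustration.

```python
import math
from collections import Counter


def cell_side(d, k=2):
    # Assumption: L = D / (2 * sqrt(k)), as in the standard cell-based
    # algorithm; the source's own formula is not available in this text.
    return d / (2 * math.sqrt(k))


def cell_of(point, side):
    """Map a 2-D point to the integer coordinates of its cell."""
    return tuple(math.floor(c / side) for c in point)


def layer_counts(points, side):
    """Return {cell: (n1, n2, n3)} for every occupied cell.

    n1: points in the cell itself
    n2: points in the first-layer neighbors (the ring 1 cell away)
    n3: points in the second-layer neighbors (the rings 2 and 3 cells away)
    """
    occupancy = Counter(cell_of(p, side) for p in points)
    result = {}
    for (cx, cy), n1 in occupancy.items():
        n2 = n3 = 0
        for dx in range(-3, 4):
            for dy in range(-3, 4):
                ring = max(abs(dx), abs(dy))  # distance measured in cells
                if ring == 0:
                    continue
                n = occupancy.get((cx + dx, cy + dy), 0)
                if ring == 1:
                    n2 += n
                else:
                    n3 += n
        result[(cx, cy)] = (n1, n2, n3)
    return result
```

  • Only occupied cells are visited, so the cost grows with the number of occupied cells rather than with the size of the data space.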
  • step S303 includes:
  • If the sum of the number of first data points and the number of second data points is greater than a second preset threshold, the data points in the cell are determined not to be isolated points;
  • If the sum of the number of first data points, the number of second data points, and the number of third data points is not greater than the second preset threshold, the data points in the cell are determined to be isolated points.
  • If the sum of the number of first data points and the number of second data points is greater than the second preset threshold, a small range centered on the cell contains a large number of data points, so the data points in the cell are not isolated points and the corresponding periodic payment fees do not exceed the first preset threshold.
  • If the sum of the number of first data points, the number of second data points, and the number of third data points is not greater than the second preset threshold, a larger range centered on the cell contains few data points, so the data points in the cell are isolated points and the corresponding periodic payment fees exceed the first preset threshold.
  • In the remaining case, the data points in the cell are treated as pending data points, and a distance algorithm determines one by one whether they are isolated points.
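  • The three-way decision in step S303 can be sketched in Python as below. The parameter `m` stands for the second preset threshold and `d` for the distance bound; the function names, the Euclidean distance check, and the convention of starting the count from `already_counted` (typically n1 + n2) are assumptions borrowed from the standard cell-based algorithm, since the source only names "a distance algorithm" for pending points.

```python
import math


def classify_cell(n1, n2, n3, m):
    """Classify a cell from its three counts and the second preset threshold m."""
    if n1 + n2 > m:
        return "not_isolated"   # many points in a small range around the cell
    if n1 + n2 + n3 <= m:
        return "isolated"       # few points even in the larger range
    return "pending"            # needs a point-by-point distance check


def is_isolated_by_distance(point, candidates, d, m, already_counted=0):
    """Distance check for one pending point.

    `candidates` are the points in the second-layer neighbors (excluding
    `point` itself). `already_counted` is the number of points known to be
    close enough without checking, e.g. n1 + n2 (a convention assumed here).
    """
    count = already_counted
    for other in candidates:
        if math.dist(point, other) <= d:
            count += 1
            if count > m:
                return False    # enough neighbors: not an isolated point
    return True
```

  • `math.dist` requires Python 3.8 or later.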
  • In this embodiment, the cell side length is determined according to the first preset threshold, the data space where the periodic payment fee is located is divided into several cells according to the cell side length, and the periodic payment fee is mapped to data points in the cells. Whether the data point in a cell is an isolated point is then determined, and if it is, the periodic payment fee corresponding to the isolated point is determined to exceed the first preset threshold. Because the cell side length is determined by the first preset threshold, an isolated point detected on the basis of that side length is isolated relative to the first preset threshold, so the detected isolated points indicate whether the corresponding periodic payment fees exceed the first preset threshold, which urges hospitals to charge reasonably and protects the interests of patients.
  • FIG. 4 is a schematic flowchart of the third embodiment of the method for overpayment identification based on big data of the present application. The third embodiment is proposed on the basis of the second embodiment shown in FIG. 3.
  • step S10 specifically includes:
  • Step S101 Obtain anti-rejection payment data of the patient, and perform word segmentation processing on the anti-rejection payment data to generate a word sequence.
  • Step S102 Convert words in the word sequence into word vectors to generate corresponding word vector sequences.
  • The anti-rejection payment data needs to be converted into standardized payment data that a computer can recognize, such as a vector.
  • Word segmentation is performed on the anti-rejection payment data to generate a word sequence, which contains each word and the order of the words in the anti-rejection payment data.
  • The words in the word sequence are converted into word vectors, and combining them with the order of the words yields the word vector sequence, which contains the word vectors of the anti-rejection payment data and their order.
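  • Steps S101 and S102 can be illustrated with the Python sketch below. The source does not say which segmentation method or embedding table is used; this sketch swaps in dictionary-based forward maximum matching and a plain lookup table, and the vocabulary, the vectors, and the names `segment` and `to_vectors` are all hypothetical.

```python
def segment(text, vocab, max_len=4):
    """Forward maximum matching: take the longest vocabulary word at each position.

    Characters that match no vocabulary word are emitted one at a time.
    """
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


def to_vectors(words, table, dim):
    """Look up each word's vector, keeping word order; unknown words map to zeros."""
    unknown = [0.0] * dim
    return [table.get(word, unknown) for word in words]


if __name__ == "__main__":
    # Hypothetical two-word vocabulary and 2-dimensional vectors.
    vocab = {"他克莫司", "胶囊"}
    table = {"他克莫司": [1.0, 0.0], "胶囊": [0.0, 1.0]}
    words = segment("他克莫司胶囊", vocab)
    print(words, to_vectors(words, table, dim=2))
```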
  • Step S103 encode the word vector sequence into a sentence matrix according to a preset bidirectional recurrent neural network model.
  • The preset bidirectional recurrent neural network (BRNN) model is a neural network model with a feedback structure. The word vector sequence is input into the model, which encodes it and outputs a sentence matrix; each row of the sentence matrix represents the meaning of a word in its context.
  • Step S104: Compress the sentence matrix into a sentence vector through a preset attention model, and use the sentence vector as standardized payment data.
  • An attention model selects, from a large amount of information, the information most critical to the current task. The preset attention model extracts the valid data from the sentence matrix and converts the valid data into a sentence vector.
  • the step S103 includes:
  • The word vector sequence is input into the preset bidirectional recurrent neural network model first in the forward direction and then in the reverse direction, so that the model encodes the word vector sequence and outputs a sentence matrix.
  • Forward input means feeding the word vectors of the word vector sequence into the model one per time step in their original positional order; reverse input means feeding the word vectors into the model one per time step in reverse order. At each time step, the input to the model also includes the model's output from the previous time step.
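  • The forward and reverse passes can be sketched in plain Python as follows. This is a minimal sketch assuming a simple tanh recurrent cell with hand-supplied weights; the source does not specify the cell type, the weights, or how the two directions are combined, so the concatenation per word and the names `rnn_pass` and `encode_sentence` are assumptions.

```python
import math


def rnn_pass(vectors, w_in, w_rec):
    """Run a simple tanh recurrent cell over `vectors`; return every hidden state.

    Each step combines the current word vector with the previous hidden state.
    """
    hidden = [0.0] * len(w_rec)
    states = []
    for x in vectors:
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], hidden))
            )
            for j in range(len(w_rec))
        ]
        states.append(hidden)
    return states


def encode_sentence(vectors, fwd, bwd):
    """Encode a word vector sequence into a sentence matrix, one row per word.

    `fwd` and `bwd` are (w_in, w_rec) weight pairs for the two directions.
    """
    forward = rnn_pass(vectors, *fwd)
    # Feed the sequence in reverse order, then realign the states to word order.
    backward = rnn_pass(vectors[::-1], *bwd)[::-1]
    return [f + b for f, b in zip(forward, backward)]
```

  • In practice the weights would be learned; fixed weights are used here only so the data flow is visible.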
  • step S104 includes:
  • The context vector expresses the contextual relationships between word vectors. Extracting the context vector from the sentence matrix through the preset attention model, and compressing the sentence matrix into a sentence vector according to the context vector, improves the accuracy and comprehensiveness of the sentence vector and thereby yields accurate standardized payment data.
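  • One common way to compress a sentence matrix with a context vector is softmax-weighted pooling, sketched below in Python. The source does not describe how its attention model derives or applies the context vector, so here the context vector is simply passed in as a hypothetical parameter and the dot-product scoring is an assumption.

```python
import math


def attention_pool(matrix, context):
    """Compress a sentence matrix into one sentence vector.

    Each row is scored by its dot product with `context`; the scores are
    turned into weights with a softmax, and the rows are averaged with
    those weights.
    """
    scores = [sum(h * c for h, c in zip(row, context)) for row in matrix]
    peak = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(matrix[0])
    vector = [sum(w * row[j] for w, row in zip(weights, matrix)) for j in range(dim)]
    return vector, weights
```

  • Rows that align more closely with the context vector receive larger weights and therefore contribute more to the sentence vector.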
  • In this embodiment, word segmentation is performed on the anti-rejection payment data to generate a word sequence, the words in the word sequence are converted into word vectors to generate the corresponding word vector sequence, the word vector sequence is encoded into a sentence matrix according to a preset bidirectional recurrent neural network model, and the sentence matrix is compressed into a sentence vector through a preset attention model, the sentence vector serving as the standardized payment data. Relying on the context vector improves the efficiency and accuracy of generating standardized payment data.
  • an embodiment of the present application further provides a storage medium, and the storage medium may be a non-volatile readable storage medium.
  • Computer-readable instructions are stored on the storage medium of the present application. When the computer-readable instructions are executed by the processor, the steps of the method for identifying a payment overpayment based on big data as described above are implemented.
  • the method implemented when the computer-readable instruction is executed can refer to various embodiments of the method for overpayment identification based on big data of the present application, and details are not described herein again.
  • an embodiment of the present application further proposes a payment overpayment identification device based on big data.
  • the payment overpayment identification device based on big data includes:
  • the processing module 10 is configured to obtain anti-rejection payment data of the patient, perform standardized processing on the anti-rejection payment data, and obtain standardized payment data.
  • The application scenario of this embodiment is that the patient swipes the medical insurance card to settle medical expenses in real time during hospital treatment, and the medical insurance card records the patient's anti-rejection payment data.
  • The anti-rejection payment data includes the charging time, drug name, charge amount, and so on, and is uploaded to the Human Resources and Social Security core system. At regular intervals, for example once a year, a user uses the user equipment to mine the patient's anti-rejection payment data for anomalies, determining whether the payment amount corresponding to the anti-rejection payment data for a given period (such as one month) exceeds the threshold, so as to avoid unreasonable outpatient charges and protect the interests of patients.
  • The anti-rejection payment data recorded in the Human Resources and Social Security core system is generally non-standard text information.
  • The anti-rejection payment data is standardized in advance, converting the differing payment data into standardized payment data that a computer can recognize.
  • The user equipment obtains the patient's anti-rejection payment data from the Human Resources and Social Security core system according to the patient's identity information, for subsequently determining whether the payment amount corresponding to the anti-rejection payment data for a given period exceeds the threshold.
  • the patient identity information includes information such as the patient's name and ID number, and the ID number is used to confirm the identity of the patient and manage the patient list.
  • Natural language processing (NLP) technology is used to convert the anti-rejection payment data into standardized payment data. The words in the anti-rejection payment data are represented as vectors; to represent each word in its context, a preset bidirectional recurrent neural network model encodes the vectors into a sentence matrix, and the attention model compresses the sentence matrix into a sentence vector. The sentence vector is the standardized payment data.
  • the statistics module 20 is configured to count the periodic payment of the patient according to the standardized payment data.
  • the periodic payment fee may be a monthly payment fee, a quarterly payment fee, or an annual payment fee, which is not limited in this embodiment.
  • The periodic payment fee of the patient is counted according to what needs to be checked. For example, before determining whether the monthly payment exceeds 10,000 yuan, the patient's monthly payment is counted from the standardized payment data; before determining whether the annual payment exceeds 100,000 yuan, the patient's annual payment is counted from the standardized payment data.
  • the mining module 30 is configured to determine whether the periodic payment fee exceeds a first preset threshold through a preset unit-based outlier detection algorithm.
  • The first preset threshold is the highest fee allowed to be paid in the period corresponding to the periodic payment fee. If the periodic payment fee exceeds the first preset threshold, the hospital has charged unreasonably; if it does not, the hospital's charging is relatively reasonable.
  • This embodiment needs to determine whether the periodic payment fee exceeds the first preset threshold, and a unit-based outlier detection algorithm detects whether a data set contains isolated points. Based on this property, the preset unit-based outlier detection algorithm is constructed to find the isolated points among the periodic payment fees, an isolated point being a fee that exceeds the first preset threshold.
  • In this embodiment, anti-rejection payment data of the patient is obtained and standardized to obtain standardized payment data, the periodic payment fee of the patient is counted according to the standardized payment data, and a preset unit-based outlier detection algorithm determines whether the periodic payment fee exceeds the first preset threshold. Because the periodic payment fee is obtained by standardizing and aggregating the patient's anti-rejection payment data, the preset unit-based outlier detection algorithm can accurately determine whether the periodic payment fee is excessive, which urges hospitals to charge reasonably and protects the interests of patients.
  • The mining module 30 is further configured to determine the cell side length according to the first preset threshold, divide the data space where the periodic payment fee is located into several cells according to the cell side length, and map the periodic payment fees to data points in the cells;
  • and, if the data point in the cell is an isolated point, to determine that the periodic payment fee corresponding to the isolated point exceeds the first preset threshold.
  • The mining module 30 is further configured to traverse each cell and count the number of first data points in the cell, the number of second data points in the cell's first-layer neighbors, and the number of third data points in the cell's second-layer neighbors;
  • and to determine, according to the number of first data points, the number of second data points, and the number of third data points, whether the data points in the cell are isolated points.
  • The mining module 30 is further configured to determine that the data points in the cell are not isolated points if the sum of the number of first data points and the number of second data points is greater than a second preset threshold;
  • and to determine that the data points in the cell are isolated points if the sum of the number of first data points, the number of second data points, and the number of third data points is not greater than the second preset threshold.
  • The processing module 10 is further configured to obtain anti-rejection payment data of the patient and perform word segmentation on it to generate a word sequence; convert the words in the word sequence into word vectors to generate the corresponding word vector sequence; encode the word vector sequence into a sentence matrix according to a preset bidirectional recurrent neural network model; and compress the sentence matrix into a sentence vector through a preset attention model, using the sentence vector as the standardized payment data.
  • The processing module 10 is further configured to input the word vector sequence into the preset bidirectional recurrent neural network model first in the forward direction and then in the reverse direction, so that the model encodes the word vector sequence and outputs a sentence matrix.
  • The processing module 10 is further configured to extract a context vector from the sentence matrix through the preset attention model and compress the sentence matrix into a sentence vector according to the context vector.
  • sequence numbers of the above embodiments of the present application are for description only, and do not represent the advantages and disadvantages of the embodiments.
  • several of these devices may be embodied by the same hardware item.
  • the use of the words first, second, and third does not indicate any order, and these words can be interpreted as names.
  • The methods in the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and can of course also be implemented in hardware, although in many cases the former is the better implementation.
  • The technical solution of the present application, in essence or in the part that contributes over the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a read-only memory (ROM)/random access memory (RAM), magnetic disk, or optical disk) and includes several instructions that cause a terminal device (which may be a mobile phone, computer, server, air conditioner, network device, or the like) to perform the methods described in the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Accounting & Taxation (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Technology Law (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

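A big data-based excess payment identification method, device, storage medium and apparatus. The method comprises: obtaining anti-rejection payment data of a patient, and standardizing the anti-rejection payment data to obtain standardized payment data (S10); computing periodic payment fees of the patient from the standardized payment data (S20); and determining, by means of a preset cell-based outlier detection algorithm, whether the periodic payment fees exceed a first preset threshold (S30).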

Description

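Big data-based excess payment identification method, device, storage medium and apparatus
This application claims priority to Chinese patent application No. 201811530549.6, filed with the Chinese Patent Office on December 13, 2018 and entitled "Big data-based excess payment identification method, device, storage medium and apparatus", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of abnormal data identification, and in particular to a big data-based excess payment identification method, device, storage medium and apparatus.
Background
Because the medical insurance system is imperfect, an insured person undergoing hospital treatment may be charged excessively for anti-rejection drugs during certain periods, for example more than 100,000 in the first year, or more than 80,000 per year in any later year.
At present, such excess payments are mainly screened by staff of the human resources and social security bureau, who search the huge volume of itemized treatment data and check whether charges are abnormal. This approach has two problems: manual screening inevitably misses cases, and it is inefficient, time-consuming and costly.
The above is intended only to assist understanding of the technical solution of this application and is not an admission that it is prior art.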
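Summary
The main purpose of this application is to provide a big data-based excess payment identification method, device, storage medium and apparatus, aiming to solve the technical problem in the prior art of how to determine more conveniently whether payments for anti-rejection drugs are excessive.
To achieve the above purpose, this application provides a big data-based excess payment identification method, which comprises the following steps:
obtaining anti-rejection payment data of a patient, and standardizing the anti-rejection payment data to obtain standardized payment data;
computing periodic payment fees of the patient from the standardized payment data;
determining, by means of a preset cell-based outlier detection algorithm, whether the periodic payment fees exceed a first preset threshold.
In addition, to achieve the above purpose, this application further proposes user equipment comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, the computer-readable instructions being configured to implement the steps of the big data-based excess payment identification method described above.
In addition, to achieve the above purpose, this application further proposes a storage medium storing computer-readable instructions which, when executed by a processor, implement the steps of the big data-based excess payment identification method described above.
In addition, to achieve the above purpose, this application further proposes a big data-based excess payment identification apparatus, which comprises:
a processing module, configured to obtain anti-rejection payment data of a patient and standardize the anti-rejection payment data to obtain standardized payment data;
a statistics module, configured to compute periodic payment fees of the patient from the standardized payment data;
a mining module, configured to determine, by means of a preset cell-based outlier detection algorithm, whether the periodic payment fees exceed a first preset threshold.
In this application, anti-rejection payment data of a patient is obtained and standardized to obtain standardized payment data, periodic payment fees of the patient are computed from the standardized payment data, and a preset cell-based outlier detection algorithm determines whether the periodic payment fees exceed a first preset threshold. Because the patient's anti-rejection payment data is standardized and aggregated into periodic payment fees, the preset cell-based outlier detection algorithm can accurately determine whether the periodic payment fees are excessive, which urges hospitals to charge reasonably and protects patients' interests.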
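Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of user equipment in the hardware operating environment involved in the embodiments of this application;
FIG. 2 is a schematic flowchart of a first embodiment of the big data-based excess payment identification method of this application;
FIG. 3 is a schematic flowchart of a second embodiment of the big data-based excess payment identification method of this application;
FIG. 4 is a schematic flowchart of a third embodiment of the big data-based excess payment identification method of this application;
FIG. 5 is a structural block diagram of a first embodiment of the big data-based excess payment identification apparatus of this application.
The realization of the purpose, functional features and advantages of this application will be further described with reference to the embodiments and the accompanying drawings.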
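Detailed Description
It should be understood that the specific embodiments described here are intended only to explain this application, not to limit it.
Referring to FIG. 1, FIG. 1 is a schematic structural diagram of user equipment in the hardware operating environment involved in the embodiments of this application.
As shown in FIG. 1, the user equipment may include a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004 and a memory 1005. The communication bus 1002 implements connection and communication between these components. The user interface 1003 may include a display, and may optionally also include standard wired and wireless interfaces; in this application the wired interface of the user interface 1003 may be a USB interface. The network interface 1004 may optionally include standard wired and wireless interfaces (such as a Wireless Fidelity (Wi-Fi) interface). The memory 1005 may be high-speed random access memory (RAM) or non-volatile memory (NVM), such as disk storage. The memory 1005 may optionally also be a storage device independent of the processor 1001.
Those skilled in the art will understand that the structure shown in FIG. 1 does not limit the user equipment, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
As shown in FIG. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module and a big data-based excess payment identification program.
In the user equipment shown in FIG. 1, the network interface 1004 is mainly used to connect to a backend server and exchange data with it; the user interface 1003 is mainly used to connect to peripherals and exchange data with them; the user equipment calls the computer-readable instructions stored in the memory 1005 through the processor 1001 and executes the big data-based excess payment identification method provided in the embodiments of this application.
Based on the above hardware structure, embodiments of the big data-based excess payment identification method of this application are proposed.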
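Referring to FIG. 2, FIG. 2 is a schematic flowchart of the first embodiment of the big data-based excess payment identification method of this application, on which the first embodiment of the method is proposed.
In the first embodiment, the big data-based excess payment identification method comprises the following steps:
Step S10: obtain anti-rejection payment data of a patient, and standardize the anti-rejection payment data to obtain standardized payment data.
It should be noted that this embodiment is executed by user equipment, which may be an electronic device such as a personal computer or a server. The application scenario is as follows: when a patient is treated at a hospital and settles treatment fees in real time by swiping a medical insurance card, the card records the patient's anti-rejection payment data, which includes the charge time, drug name, charge amount and so on, and the data is uploaded to the human resources and social security core system. At fixed intervals, for example once a year, a user uses the user equipment to mine the patient's anti-rejection payment data for anomalies and determines whether the amount paid over a period (for example one month) exceeds a threshold, thereby avoiding unreasonable outpatient charges and protecting the patient's interests. The anti-rejection payment data recorded in the core system is generally non-standardized text. To make it easy to determine whether payments are excessive, the data is standardized in advance, converting it into standardized payment data that a computer can process.
In a specific implementation, the user equipment obtains the patient's anti-rejection payment data from the human resources and social security core system according to the patient's identity information, for the subsequent determination of whether the amount paid over a period exceeds a threshold. The identity information includes the patient's name, ID card number and similar information; the ID card number is used to confirm the patient's identity and manage the patient list. This embodiment uses natural language processing (NLP) technology to convert the anti-rejection payment data into standardized payment data: the words in the data are represented as vectors; to capture the relationships between the words, a preset bidirectional recurrent neural network model encodes the vectors into a sentence matrix; and an attention model compresses the sentence matrix into a sentence vector, which is the standardized payment data.
Step S20: compute periodic payment fees of the patient from the standardized payment data.
It should be noted that the periodic payment fee may be a monthly, quarterly or annual payment fee, which this embodiment does not limit. To determine whether a patient's periodic payment fee exceeds a threshold, the fee is computed for the required period. For example, before determining whether the monthly payment fee exceeds 10,000 yuan, the patient's payment fee for each month is computed from the standardized payment data; before determining whether the annual payment fee exceeds 100,000 yuan, the patient's payment fee for each year is computed from the standardized payment data.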
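As an illustration of step S20, the following is a minimal sketch of per-period aggregation. It assumes the charge date and amount have already been recovered from the standardized payment data as plain values; the record layout `(patient_id, charge_date, amount)`, the function name `periodic_fees` and the sample records are hypothetical and do not come from the application.

```python
from collections import defaultdict
from datetime import date


def periodic_fees(records, period="month"):
    """Sum charge amounts per patient and per period.

    records: iterable of (patient_id, charge_date, amount) tuples (assumed layout).
    period:  "month", "quarter" or "year".
    """
    totals = defaultdict(float)
    for patient_id, charge_date, amount in records:
        if period == "month":
            key = (charge_date.year, charge_date.month)
        elif period == "quarter":
            key = (charge_date.year, (charge_date.month - 1) // 3 + 1)
        elif period == "year":
            key = (charge_date.year,)
        else:
            raise ValueError(f"unknown period: {period}")
        totals[(patient_id, key)] += amount
    return dict(totals)


# Made-up sample records for one patient.
records = [
    ("p1", date(2018, 1, 5), 3000.0),
    ("p1", date(2018, 1, 20), 4500.0),
    ("p1", date(2018, 2, 3), 2000.0),
]
monthly = periodic_fees(records, "month")
yearly = periodic_fees(records, "year")
```

Keying the totals by patient and period keeps the monthly, quarterly and annual cases in one function, which matches the statement that the period is not limited.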
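Step S30: determine, by means of a preset cell-based outlier detection algorithm, whether the periodic payment fees exceed a first preset threshold.
It should be noted that the first preset threshold is the maximum fee allowed within the period to which the periodic payment fee corresponds. If the periodic payment fee exceeds the first preset threshold, the hospital has charged unreasonably; if it does not, the hospital's charges are reasonable.
In a specific implementation, this embodiment aims to determine whether the periodic payment fees exceed the first preset threshold, and a cell-based outlier detection algorithm detects whether a data set contains outliers. Based on this property, the preset cell-based outlier detection algorithm is constructed to find outliers among the periodic payment fees; such an outlier is a fee that exceeds the first preset threshold.
In the first embodiment, anti-rejection payment data of a patient is obtained and standardized to obtain standardized payment data, periodic payment fees of the patient are computed from the standardized payment data, and a preset cell-based outlier detection algorithm determines whether the periodic payment fees exceed a first preset threshold. Because the patient's anti-rejection payment data is standardized and aggregated into periodic payment fees, the preset cell-based outlier detection algorithm can accurately determine whether the periodic payment fees are excessive, which urges hospitals to charge reasonably and protects patients' interests.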
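Referring to FIG. 3, FIG. 3 is a schematic flowchart of the second embodiment of the big data-based excess payment identification method of this application. The second embodiment is proposed on the basis of the first embodiment shown in FIG. 2.
In the second embodiment, step S30 specifically includes:
Determine a cell side length according to the first preset threshold, divide the data space in which the periodic payment fees lie into a plurality of cells according to the cell side length, and map the periodic payment fees to data points in the cells.
It should be noted that the preset cell-based outlier detection algorithm is used to determine whether the periodic payment fees exceed the first preset threshold.
In a specific implementation, the data space in which the periodic payment fees lie is divided into a plurality of cells of equal side length, and the periodic payment fees are mapped to data points in the cells. The cell side length, which is the basis for the division, is determined from the first preset threshold by a preset formula: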
Figure PCTCN2019095412-appb-000001
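where L is the cell side length and D is the first preset threshold.
Determine whether the data points in a cell are outliers.
If the data points in the cell are outliers, the periodic payment fees corresponding to the outliers are deemed to exceed the first preset threshold.
It can be understood that an outlier is a data point that does not have enough neighbors. Because the cell side length is determined by the first preset threshold, and the preset cell-based outlier detection algorithm partitions cells using that side length, the detected outliers are outliers relative to the first preset threshold. Therefore, if a data point in a cell is an outlier, the periodic payment fee corresponding to it is deemed to exceed the first preset threshold; if a data point in a cell is not an outlier, the corresponding periodic payment fee is deemed not to exceed the first preset threshold.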
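The partitioning step can be sketched as follows. The preset formula is an image that is not reproduced in this text, so the sketch assumes the side length commonly used by the classic cell-based outlier detection algorithm, L = D / (2·√k) for k-dimensional data; that choice, the function names `cell_side_length` and `map_to_cells`, and the sample fees are all assumptions rather than the application's own definitions.

```python
import math
from collections import defaultdict


def cell_side_length(d, k=1):
    """Assumed formula L = D / (2 * sqrt(k)); d is the first preset threshold."""
    return d / (2.0 * math.sqrt(k))


def map_to_cells(fees, side):
    """Group one-dimensional fees by the index of the cell that contains them."""
    cells = defaultdict(list)
    for fee in fees:
        cells[math.floor(fee / side)].append(fee)
    return dict(cells)


side = cell_side_length(10000.0)  # one-dimensional data, so k = 1
cells = map_to_cells([1200.0, 4800.0, 5200.0, 26000.0], side)
```

Storing only the non-empty cells in a dictionary avoids allocating a grid over the whole fee range when a few fees are far from the rest.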
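Further, step S30 includes:
Step S301: determine a cell side length according to the first preset threshold, divide the data space in which the periodic payment fees lie into a plurality of cells according to the cell side length, and map the periodic payment fees to data points in the cells.
Step S302: traverse the cells and, for each cell, count the first data point count in the cell, the second data point count in the cell's first-layer neighbors, and the third data point count in the cell's second-layer neighbors.
It can be understood that the data space contains a plurality of cells. The first-layer neighbors of a cell are the single layer of cells adjacent to it, and the second-layer neighbors are the two layers of cells outside the first layer. The first data point count is the number of data points in the cell, the second data point count is the number of data points in the cell's first-layer neighbors, and the third data point count is the number of data points in the cell's second-layer neighbors.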
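For one-dimensional fees, the three counts can be sketched as below. The sketch assumes cells are identified by integer indices, so the first layer is the cells at index offset 1 and the second layer is the cells at offsets 2 and 3 on each side; the function name `layer_counts` and the sample counts are hypothetical.

```python
def layer_counts(cell_counts, index):
    """Return (first, second, third) data point counts for the cell at `index`.

    cell_counts maps a cell index to the number of data points in that cell;
    cells missing from the mapping are empty.
    """
    first = cell_counts.get(index, 0)
    # First layer: the single layer of cells adjacent to the cell.
    second = sum(cell_counts.get(index + off, 0) for off in (-1, 1))
    # Second layer: the two layers of cells outside the first layer.
    third = sum(cell_counts.get(index + off, 0) for off in (-3, -2, 2, 3))
    return first, second, third


cell_counts = {0: 2, 1: 1, 3: 4, 5: 1}
counts_for_cell_1 = layer_counts(cell_counts, 1)
counts_for_cell_5 = layer_counts(cell_counts, 5)
```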
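Step S303: determine, from the first data point count, the second data point count and the third data point count, whether the data points in the cell are outliers.
It should be noted that the first, second and third data point counts indicate how many neighbors the data points in the cell have, and therefore whether those data points are outliers.
In a specific implementation, the cells are traversed and the first, second and third data point counts are computed for every cell. This makes it possible to determine whether the data points in each cell are outliers and whether the corresponding periodic payment fees exceed the first preset threshold, so that the periodic payment fees exceeding the first preset threshold are identified accurately and completely.
Step S304: if the data points in the cell are outliers, the periodic payment fees corresponding to the outliers are deemed to exceed the first preset threshold.
Further, step S303 includes:
if the sum of the first data point count and the second data point count is greater than a second preset threshold, deeming that the data points in the cell are not outliers;
if the sum of the first data point count, the second data point count and the third data point count is not greater than the second preset threshold, deeming that the data points in the cell are outliers;
otherwise, determining one by one whether each data point in the cell is an outlier.
It can be understood that if the sum of the first and second data point counts is greater than the second preset threshold, a small region centered on the cell contains a large number of data points, so the data points in the cell are not outliers and the corresponding periodic payment fees do not exceed the first preset threshold. If the sum of the first, second and third data point counts is not greater than the second preset threshold, a larger region centered on the cell contains few data points, so the data points in the cell are outliers and the corresponding periodic payment fees exceed the first preset threshold. If the sum of the first and second data point counts is not greater than the second preset threshold while the sum of all three counts is greater than it, the counts alone cannot settle whether the data points in the cell are outliers; in that case the data points in the cell are treated as pending data points, and a distance-based algorithm determines one by one whether each of them is an outlier.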
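The three-way rule above can be written as a small function. This is a sketch of the rule as stated; the function name `classify_cell`, the parameter name `m` for the second preset threshold, and the returned labels are hypothetical.

```python
def classify_cell(first, second, third, m):
    """Classify a cell from its three data point counts.

    m is the second preset threshold.
    """
    if first + second > m:
        return "not_outlier"      # dense small neighborhood
    if first + second + third <= m:
        return "outlier"          # sparse large neighborhood
    return "undetermined"         # needs the point-by-point check


dense = classify_cell(3, 2, 0, m=4)
sparse = classify_cell(1, 0, 1, m=4)
pending = classify_cell(1, 2, 4, m=4)
```

Only cells in the third category need the more expensive distance-based check, which is where the cell-based approach saves work.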
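In the second embodiment, a cell side length is determined according to the first preset threshold, the data space in which the periodic payment fees lie is divided into a plurality of cells according to the cell side length, the periodic payment fees are mapped to data points in the cells, and it is determined whether the data points in a cell are outliers; if they are, the periodic payment fees corresponding to the outliers are deemed to exceed the first preset threshold. Because the cell side length is determined by the first preset threshold, the outliers detected with that side length are outliers relative to the first preset threshold, so the detected outliers indicate whether the corresponding periodic payment fees exceed the first preset threshold, which urges hospitals to charge reasonably and protects patients' interests.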
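Putting the steps of the second embodiment together, a one-dimensional sketch might look like the following. It rests on several assumptions that the text does not confirm: the cell side length is taken as D / 2 (the classic formula for one dimension), the first preset threshold D doubles as the neighborhood distance, and the distance-based check treats a point as an outlier when at most `m` points (itself included) lie within distance D. The function name `detect_outliers` and the sample fees are hypothetical.

```python
import math
from collections import defaultdict


def detect_outliers(fees, d, m):
    """Return the fees flagged as outliers by a 1-D cell-based sketch.

    d: first preset threshold, used here as the distance D (assumption).
    m: second preset threshold.
    """
    side = d / 2.0  # assumed L = D / (2 * sqrt(k)) with k = 1
    cells = defaultdict(list)
    for fee in fees:
        cells[math.floor(fee / side)].append(fee)

    def count(index, offsets):
        return sum(len(cells.get(index + off, [])) for off in offsets)

    outliers = []
    for index, points in cells.items():
        first = len(points)
        second = count(index, (-1, 1))
        third = count(index, (-3, -2, 2, 3))
        if first + second > m:
            continue  # dense neighborhood: no outliers in this cell
        if first + second + third <= m:
            outliers.extend(points)  # sparse neighborhood: all are outliers
            continue
        # Pending cell: distance-based check, point by point.
        for p in points:
            within = sum(1 for q in fees if abs(q - p) <= d)  # includes p itself
            if within <= m:
                outliers.append(p)
    return sorted(outliers)


flagged = detect_outliers([3000.0, 3200.0, 3500.0, 4100.0, 4300.0, 26000.0], 10000.0, 2)
```

Note that this flags fees that are isolated from the others; whether an isolated fee also exceeds a fixed ceiling depends on how the thresholds are chosen, which the sketch does not attempt to settle.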
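Referring to FIG. 4, FIG. 4 is a schematic flowchart of the third embodiment of the big data-based excess payment identification method of this application. The third embodiment is proposed on the basis of the second embodiment shown in FIG. 3.
In the third embodiment, step S10 specifically includes:
Step S101: obtain anti-rejection payment data of a patient, and segment the anti-rejection payment data into words to generate a word sequence.
Step S102: convert the words in the word sequence into word vectors to generate the corresponding word vector sequence.
It can be understood that, to standardize the anti-rejection payment data, the data must be converted into a form that a computer can process, such as vectors. In this embodiment, the anti-rejection payment data is segmented into words to generate a word sequence, which contains each word of the data and the order of the words. The words in the word sequence are converted into word vectors and, combined with the word order, yield a word vector sequence, which contains the word vectors of the data and their order.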
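Steps S101 and S102 can be illustrated with a toy example. The application does not name a word segmenter or an embedding model, so the sketch substitutes whitespace splitting for segmentation and a hash-derived lookup for the word vectors; both stand-ins, the function names `segment` and `word_vector`, and the sample record are assumptions made purely for illustration.

```python
import hashlib


def segment(text):
    """Toy word segmentation: split on whitespace.

    A real system would use a proper segmenter, especially for Chinese text.
    """
    return text.split()


def word_vector(word, dim=4):
    """Stand-in embedding: derive `dim` floats in [0, 1) from a hash of the word."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return [digest[i] / 256.0 for i in range(dim)]


# Made-up record: charge time, drug name, charge amount.
words = segment("2018-01-05 tacrolimus 3000.00")
vectors = [word_vector(w) for w in words]
```

The list `vectors` keeps the same order as `words`, which is the property the text relies on when it says the word vector sequence preserves the word order.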
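Step S103: encode the word vector sequence into a sentence matrix using a preset bidirectional recurrent neural network model.
It should be noted that the preset bidirectional recurrent neural network (BRNN) model is a neural network model with a feedback structure. The word vector sequence is input into the preset BRNN model, which encodes it and outputs a sentence matrix; each row of the sentence matrix represents the meaning of one word in its context.
Step S104: compress the sentence matrix into a sentence vector using a preset attention model, and use the sentence vector as the standardized payment data.
It can be understood that an attention model selects, from a large amount of information, the information most relevant to the current task. The preset attention model extracts the useful data from the sentence matrix and converts it into a sentence vector.
Further, in the third embodiment, step S103 includes:
Input the word vector sequence into the preset BRNN model, first in the forward direction and then in the reverse direction, so that the model encodes the word vector sequence and outputs a sentence matrix.
It should be noted that the word vector sequence is input into the preset BRNN model in the forward direction and then in the reverse direction. Forward input means feeding the word vectors of the sequence, in positional order, into the model at the corresponding time steps; reverse input means feeding the word vectors in reverse order into the model at the corresponding time steps. At each time step the model's input also includes its own output from the previous time step. After both the forward and the reverse input have finished, the recurrence stops and the sentence matrix is output.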
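The forward and reverse passes can be sketched with a very small recurrent cell. This is not the preset model of the application, whose architecture and weights are not given; it is a plain tanh recurrence with made-up fixed weights, and for brevity both directions share the same weights, whereas practical bidirectional networks usually learn separate ones. The names `rnn_pass` and `encode_bidirectional` are hypothetical.

```python
import math


def rnn_pass(vectors, w_in, w_rec):
    """Run a simple tanh recurrent cell over `vectors`; return one hidden state per step."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in vectors:
        # Each step combines the current input with the previous step's output.
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], hidden))
            )
            for j in range(len(w_rec))
        ]
        states.append(hidden)
    return states


def encode_bidirectional(vectors, w_in, w_rec):
    """Return a sentence matrix: row t = forward state at t followed by backward state at t."""
    forward = rnn_pass(vectors, w_in, w_rec)
    backward = rnn_pass(vectors[::-1], w_in, w_rec)[::-1]
    return [f + b for f, b in zip(forward, backward)]


# Made-up weights: input size 2, hidden size 2.
w_in = [[0.5, -0.3], [0.1, 0.8]]
w_rec = [[0.2, 0.0], [0.0, 0.2]]
sentence_matrix = encode_bidirectional([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], w_in, w_rec)
```

Each row of `sentence_matrix` concatenates what the forward pass has seen up to that word with what the reverse pass has seen after it, which is how a row comes to reflect the word's context on both sides.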
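Further, in the third embodiment, step S104 includes:
extracting a context vector from the sentence matrix using the preset attention model;
compressing the sentence matrix into a sentence vector according to the context vector, and using the sentence vector as the standardized payment data.
It can be understood that the context vector expresses the contextual relationships between the word vectors. Extracting the context vector from the sentence matrix with the preset attention model and compressing the sentence matrix into a sentence vector according to it makes the sentence vector more accurate and complete, which yields accurate standardized payment data.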
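One common way to compress a sentence matrix with a context vector is softmax-weighted pooling, sketched below. The application does not specify how its preset attention model scores rows or obtains the context vector, so the dot-product scoring, the function name `attention_pool` and the practice of passing the context vector in as a parameter are assumptions.

```python
import math


def attention_pool(sentence_matrix, context):
    """Compress a sentence matrix into one sentence vector with softmax attention.

    Returns (sentence_vector, weights); weights[t] is the attention on row t.
    """
    scores = [sum(h * c for h, c in zip(row, context)) for row in sentence_matrix]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract the max for stability
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(sentence_matrix[0])
    vector = [
        sum(w * row[j] for w, row in zip(weights, sentence_matrix))
        for j in range(dim)
    ]
    return vector, weights


matrix = [[1.0, 0.0], [0.0, 1.0]]
uniform_vector, uniform_weights = attention_pool(matrix, [0.0, 0.0])
focused_vector, focused_weights = attention_pool(matrix, [10.0, 0.0])
```

With a zero context vector every row gets the same weight and the result is a plain average; a context vector aligned with the first row shifts almost all the weight onto that row.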
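In the third embodiment, the anti-rejection payment data is segmented into words to generate a word sequence, the words in the word sequence are converted into word vectors to generate the corresponding word vector sequence, a preset bidirectional recurrent neural network model encodes the word vector sequence into a sentence matrix, and a preset attention model compresses the sentence matrix into a sentence vector, which is used as the standardized payment data. Relying on the context vector improves the efficiency and accuracy of generating the standardized payment data.
In addition, an embodiment of this application further proposes a storage medium, which may be a non-volatile readable storage medium.
The storage medium of this application stores computer-readable instructions which, when executed by a processor, implement the steps of the big data-based excess payment identification method described above.
For the method implemented when the computer-readable instructions are executed, refer to the embodiments of the big data-based excess payment identification method of this application; details are not repeated here.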
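In addition, referring to FIG. 5, an embodiment of this application further proposes a big data-based excess payment identification apparatus, which comprises:
a processing module 10, configured to obtain anti-rejection payment data of a patient and standardize the anti-rejection payment data to obtain standardized payment data.
It should be noted that the application scenario of this embodiment is as follows: when a patient is treated at a hospital and settles treatment fees in real time by swiping a medical insurance card, the card records the patient's anti-rejection payment data, which includes the charge time, drug name, charge amount and so on, and the data is uploaded to the human resources and social security core system. At fixed intervals, for example once a year, a user uses the user equipment to mine the patient's anti-rejection payment data for anomalies and determines whether the amount paid over a period (for example one month) exceeds a threshold, thereby avoiding unreasonable outpatient charges and protecting the patient's interests. The anti-rejection payment data recorded in the core system is generally non-standardized text. To make it easy to determine whether payments are excessive, the data is standardized in advance, converting it into standardized payment data that a computer can process.
In a specific implementation, the user equipment obtains the patient's anti-rejection payment data from the human resources and social security core system according to the patient's identity information, for the subsequent determination of whether the amount paid over a period exceeds a threshold. The identity information includes the patient's name, ID card number and similar information; the ID card number is used to confirm the patient's identity and manage the patient list. This embodiment uses natural language processing (NLP) technology to convert the anti-rejection payment data into standardized payment data: the words in the data are represented as vectors; to capture the relationships between the words, a preset bidirectional recurrent neural network model encodes the vectors into a sentence matrix; and an attention model compresses the sentence matrix into a sentence vector, which is the standardized payment data.
a statistics module 20, configured to compute periodic payment fees of the patient from the standardized payment data.
It should be noted that the periodic payment fee may be a monthly, quarterly or annual payment fee, which this embodiment does not limit. To determine whether a patient's periodic payment fee exceeds a threshold, the fee is computed for the required period. For example, before determining whether the monthly payment fee exceeds 10,000 yuan, the patient's payment fee for each month is computed from the standardized payment data; before determining whether the annual payment fee exceeds 100,000 yuan, the patient's payment fee for each year is computed from the standardized payment data.
a mining module 30, configured to determine, by means of a preset cell-based outlier detection algorithm, whether the periodic payment fees exceed a first preset threshold.
It should be noted that the first preset threshold is the maximum fee allowed within the period to which the periodic payment fee corresponds. If the periodic payment fee exceeds the first preset threshold, the hospital has charged unreasonably; if it does not, the hospital's charges are reasonable.
In a specific implementation, this embodiment aims to determine whether the periodic payment fees exceed the first preset threshold, and a cell-based outlier detection algorithm detects whether a data set contains outliers. Based on this property, the preset cell-based outlier detection algorithm is constructed to find outliers among the periodic payment fees; such an outlier is a fee that exceeds the first preset threshold.
In this embodiment, anti-rejection payment data of a patient is obtained and standardized to obtain standardized payment data, periodic payment fees of the patient are computed from the standardized payment data, and a preset cell-based outlier detection algorithm determines whether the periodic payment fees exceed a first preset threshold. Because the patient's anti-rejection payment data is standardized and aggregated into periodic payment fees, the preset cell-based outlier detection algorithm can accurately determine whether the periodic payment fees are excessive, which urges hospitals to charge reasonably and protects patients' interests.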
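In an embodiment, the mining module 30 is further configured to determine a cell side length according to the first preset threshold, divide the data space in which the periodic payment fees lie into a plurality of cells according to the cell side length, and map the periodic payment fees to data points in the cells;
determine whether the data points in a cell are outliers;
if the data points in the cell are outliers, deem that the periodic payment fees corresponding to the outliers exceed the first preset threshold.
In an embodiment, the mining module 30 is further configured to traverse the cells and, for each cell, count the first data point count in the cell, the second data point count in the cell's first-layer neighbors, and the third data point count in the cell's second-layer neighbors;
determine, from the first data point count, the second data point count and the third data point count, whether the data points in the cell are outliers.
In an embodiment, the mining module 30 is further configured to deem that the data points in the cell are not outliers if the sum of the first data point count and the second data point count is greater than a second preset threshold;
deem that the data points in the cell are outliers if the sum of the first data point count, the second data point count and the third data point count is not greater than the second preset threshold;
otherwise, determine one by one whether each data point in the cell is an outlier.
In an embodiment, the processing module 10 is further configured to obtain anti-rejection payment data of a patient, and segment the anti-rejection payment data into words to generate a word sequence;
convert the words in the word sequence into word vectors to generate the corresponding word vector sequence;
encode the word vector sequence into a sentence matrix using a preset bidirectional recurrent neural network model;
compress the sentence matrix into a sentence vector using a preset attention model, and use the sentence vector as the standardized payment data.
In an embodiment, the processing module 10 is further configured to input the word vector sequence into the preset bidirectional recurrent neural network model, first in the forward direction and then in the reverse direction, so that the model encodes the word vector sequence and outputs a sentence matrix.
In an embodiment, the processing module 10 is further configured to extract a context vector from the sentence matrix using the preset attention model;
compress the sentence matrix into a sentence vector according to the context vector, and use the sentence vector as the standardized payment data.
For other embodiments or specific implementations of the big data-based excess payment identification apparatus of this application, refer to the method embodiments above; details are not repeated here.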
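The three-module structure of the apparatus can be sketched as a simple composition. The class name `PaymentExcessDevice`, the three stand-in functions and the sample rows are hypothetical; in particular, `flag_excess` uses a plain comparison as a placeholder where the mining module would run the cell-based outlier detection.

```python
class PaymentExcessDevice:
    """Sketch of the apparatus: processing, statistics and mining modules in sequence."""

    def __init__(self, processing_module, statistics_module, mining_module):
        self.processing_module = processing_module  # module 10
        self.statistics_module = statistics_module  # module 20
        self.mining_module = mining_module          # module 30

    def run(self, raw_payment_data):
        standardized = self.processing_module(raw_payment_data)
        periodic = self.statistics_module(standardized)
        return self.mining_module(periodic)


def standardize(rows):
    """Stand-in for module 10: parse amounts given as text."""
    return [(month, float(amount)) for month, amount in rows]


def monthly_totals(rows):
    """Stand-in for module 20: sum amounts per month."""
    totals = {}
    for month, amount in rows:
        totals[month] = totals.get(month, 0.0) + amount
    return totals


def flag_excess(totals, limit=10000.0):
    """Placeholder for module 30: flag months whose total exceeds `limit`."""
    return sorted(month for month, total in totals.items() if total > limit)


device = PaymentExcessDevice(standardize, monthly_totals, flag_excess)
flagged_months = device.run([("2018-01", "7000"), ("2018-01", "4500"), ("2018-02", "2000")])
```

Passing the modules in as callables keeps each stage replaceable, so the placeholder detector could be swapped for a cell-based one without touching the other two stages.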
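It should be noted that, in this document, the terms "comprise", "include" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or system that includes the element.
The serial numbers of the above embodiments of this application are for description only and do not indicate the relative merits of the embodiments. In a unit claim enumerating several apparatuses, several of these apparatuses may be embodied by one and the same item of hardware. The use of the words first, second, third and so on does not indicate any order; these words may be interpreted as names.
From the above description of the implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as read-only memory (ROM)/random access memory (RAM), a magnetic disk or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to perform the methods described in the embodiments of this application.
The above are only preferred embodiments of this application and do not limit its patent scope. Any equivalent structural or procedural transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.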

Claims (20)

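  1. A big data-based excess payment identification method, wherein the method comprises the following steps:
    obtaining anti-rejection payment data of a patient, and standardizing the anti-rejection payment data to obtain standardized payment data;
    computing periodic payment fees of the patient from the standardized payment data;
    determining, by means of a preset cell-based outlier detection algorithm, whether the periodic payment fees exceed a first preset threshold.
  2. The big data-based excess payment identification method according to claim 1, wherein the determining, by means of a preset cell-based outlier detection algorithm, whether the periodic payment fees exceed a first preset threshold comprises:
    determining a cell side length according to the first preset threshold, dividing the data space in which the periodic payment fees lie into a plurality of cells according to the cell side length, and mapping the periodic payment fees to data points in the cells;
    determining whether the data points in a cell are outliers;
    if the data points in the cell are outliers, deeming that the periodic payment fees corresponding to the outliers exceed the first preset threshold.
  3. The big data-based excess payment identification method according to claim 2, wherein the determining whether the data points in a cell are outliers comprises:
    traversing the cells and, for each cell, counting the first data point count in the cell, the second data point count in the cell's first-layer neighbors, and the third data point count in the cell's second-layer neighbors;
    determining, from the first data point count, the second data point count and the third data point count, whether the data points in the cell are outliers.
  4. The big data-based excess payment identification method according to claim 3, wherein the determining, from the first data point count, the second data point count and the third data point count, whether the data points in the cell are outliers comprises:
    if the sum of the first data point count and the second data point count is greater than a second preset threshold, deeming that the data points in the cell are not outliers;
    if the sum of the first data point count, the second data point count and the third data point count is not greater than the second preset threshold, deeming that the data points in the cell are outliers;
    otherwise, determining one by one whether each data point in the cell is an outlier.
  5. The big data-based excess payment identification method according to claim 1, wherein the obtaining anti-rejection payment data of a patient, and standardizing the anti-rejection payment data to obtain standardized payment data comprises:
    obtaining anti-rejection payment data of a patient, and segmenting the anti-rejection payment data into words to generate a word sequence;
    converting the words in the word sequence into word vectors to generate the corresponding word vector sequence;
    encoding the word vector sequence into a sentence matrix using a preset bidirectional recurrent neural network model;
    compressing the sentence matrix into a sentence vector using a preset attention model, and using the sentence vector as the standardized payment data.
  6. The big data-based excess payment identification method according to claim 5, wherein the encoding the word vector sequence into a sentence matrix using a preset bidirectional recurrent neural network model comprises:
    inputting the word vector sequence into the preset bidirectional recurrent neural network model, first in the forward direction and then in the reverse direction, so that the model encodes the word vector sequence and outputs a sentence matrix.
  7. The big data-based excess payment identification method according to claim 6, wherein the compressing the sentence matrix into a sentence vector using a preset attention model, and using the sentence vector as the standardized payment data comprises:
    extracting a context vector from the sentence matrix using the preset attention model;
    compressing the sentence matrix into a sentence vector according to the context vector, and using the sentence vector as the standardized payment data.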
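  8. User equipment, wherein the user equipment comprises a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, the computer-readable instructions being configured to implement the following steps:
    obtaining anti-rejection payment data of a patient, and standardizing the anti-rejection payment data to obtain standardized payment data;
    computing periodic payment fees of the patient from the standardized payment data;
    determining, by means of a preset cell-based outlier detection algorithm, whether the periodic payment fees exceed a first preset threshold.
  9. The user equipment according to claim 8, wherein the computer-readable instructions are further configured to implement the following steps:
    determining a cell side length according to the first preset threshold, dividing the data space in which the periodic payment fees lie into a plurality of cells according to the cell side length, and mapping the periodic payment fees to data points in the cells;
    determining whether the data points in a cell are outliers;
    if the data points in the cell are outliers, deeming that the periodic payment fees corresponding to the outliers exceed the first preset threshold.
  10. The user equipment according to claim 9, wherein the computer-readable instructions are further configured to implement the following steps:
    traversing the cells and, for each cell, counting the first data point count in the cell, the second data point count in the cell's first-layer neighbors, and the third data point count in the cell's second-layer neighbors;
    determining, from the first data point count, the second data point count and the third data point count, whether the data points in the cell are outliers.
  11. The user equipment according to claim 10, wherein the computer-readable instructions are further configured to implement the following steps:
    if the sum of the first data point count and the second data point count is greater than a second preset threshold, deeming that the data points in the cell are not outliers;
    if the sum of the first data point count, the second data point count and the third data point count is not greater than the second preset threshold, deeming that the data points in the cell are outliers;
    otherwise, determining one by one whether each data point in the cell is an outlier.
  12. The user equipment according to claim 8, wherein the computer-readable instructions are further configured to implement the following steps:
    obtaining anti-rejection payment data of a patient, and segmenting the anti-rejection payment data into words to generate a word sequence;
    converting the words in the word sequence into word vectors to generate the corresponding word vector sequence;
    encoding the word vector sequence into a sentence matrix using a preset bidirectional recurrent neural network model;
    compressing the sentence matrix into a sentence vector using a preset attention model, and using the sentence vector as the standardized payment data.
  13. The user equipment according to claim 12, wherein the computer-readable instructions are further configured to implement the following steps:
    inputting the word vector sequence into the preset bidirectional recurrent neural network model, first in the forward direction and then in the reverse direction, so that the model encodes the word vector sequence and outputs a sentence matrix.
  14. The user equipment according to claim 13, wherein the computer-readable instructions are further configured to implement the following steps:
    extracting a context vector from the sentence matrix using the preset attention model;
    compressing the sentence matrix into a sentence vector according to the context vector, and using the sentence vector as the standardized payment data.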
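  15. A storage medium, wherein the storage medium stores computer-readable instructions which, when executed by a processor, implement the following steps:
    obtaining anti-rejection payment data of a patient, and standardizing the anti-rejection payment data to obtain standardized payment data;
    computing periodic payment fees of the patient from the standardized payment data;
    determining, by means of a preset cell-based outlier detection algorithm, whether the periodic payment fees exceed a first preset threshold.
  16. The storage medium according to claim 15, wherein the computer-readable instructions are further configured to implement the following steps:
    determining a cell side length according to the first preset threshold, dividing the data space in which the periodic payment fees lie into a plurality of cells according to the cell side length, and mapping the periodic payment fees to data points in the cells;
    determining whether the data points in a cell are outliers;
    if the data points in the cell are outliers, deeming that the periodic payment fees corresponding to the outliers exceed the first preset threshold.
  17. The storage medium according to claim 16, wherein the computer-readable instructions are further configured to implement the following steps:
    traversing the cells and, for each cell, counting the first data point count in the cell, the second data point count in the cell's first-layer neighbors, and the third data point count in the cell's second-layer neighbors;
    determining, from the first data point count, the second data point count and the third data point count, whether the data points in the cell are outliers.
  18. The storage medium according to claim 17, wherein the computer-readable instructions are further configured to implement the following steps:
    if the sum of the first data point count and the second data point count is greater than a second preset threshold, deeming that the data points in the cell are not outliers;
    if the sum of the first data point count, the second data point count and the third data point count is not greater than the second preset threshold, deeming that the data points in the cell are outliers;
    otherwise, determining one by one whether each data point in the cell is an outlier.
  19. The storage medium according to claim 15, wherein the computer-readable instructions are further configured to implement the following steps:
    obtaining anti-rejection payment data of a patient, and segmenting the anti-rejection payment data into words to generate a word sequence;
    converting the words in the word sequence into word vectors to generate the corresponding word vector sequence;
    encoding the word vector sequence into a sentence matrix using a preset bidirectional recurrent neural network model;
    compressing the sentence matrix into a sentence vector using a preset attention model, and using the sentence vector as the standardized payment data.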
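  20. A big data-based excess payment identification apparatus, wherein the apparatus comprises:
    a processing module, configured to obtain anti-rejection payment data of a patient and standardize the anti-rejection payment data to obtain standardized payment data;
    a statistics module, configured to compute periodic payment fees of the patient from the standardized payment data;
    a mining module, configured to determine, by means of a preset cell-based outlier detection algorithm, whether the periodic payment fees exceed a first preset threshold.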
PCT/CN2019/095412 2018-12-13 2019-07-10 Big data-based excess payment identification method, device, storage medium and apparatus WO2020119105A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811530549.6 2018-12-13
CN201811530549.6A CN109658265A (zh) 2018-12-13 2018-12-13 Big data-based excess payment identification method, device, storage medium and apparatus

Publications (1)

Publication Number Publication Date
WO2020119105A1 true WO2020119105A1 (zh) 2020-06-18

Family

ID=66113212

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/095412 WO2020119105A1 (zh) 2018-12-13 2019-07-10 Big data-based excess payment identification method, device, storage medium and apparatus

Country Status (2)

Country Link
CN (1) CN109658265A (zh)
WO (1) WO2020119105A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109658265A (zh) * 2018-12-13 2019-04-19 平安医疗健康管理股份有限公司 Big data-based excess payment identification method, device, storage medium and apparatus

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
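CN106445988A (zh) * 2016-06-01 2017-02-22 上海坤士合生信息科技有限公司 Intelligent big data processing method and system
CN108563626A (zh) * 2018-01-22 2018-09-21 北京颐圣智能科技有限公司 Medical text named entity recognition method and apparatus
CN109658265A (zh) * 2018-12-13 2019-04-19 平安医疗健康管理股份有限公司 Big data-based excess payment identification method, device, storage medium and apparatus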

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
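CN102708172B (zh) * 2012-05-02 2014-04-23 广州中大微电子有限公司 Method for mining outliers in RFID data
CN103514576A (zh) * 2013-09-06 2014-01-15 深圳民太安信息技术有限公司 Screening method for illegal cash-out in social security medical visits
CN104240251B (zh) * 2014-09-17 2017-04-12 中国测绘科学研究院 Multi-scale point cloud noise detection method based on density analysis
CN105117790A (zh) * 2015-07-29 2015-12-02 北京嘀嘀无限科技发展有限公司 Fare estimation method and apparatus
CN106126507B (zh) * 2016-06-22 2019-08-09 哈尔滨工业大学深圳研究生院 Character encoding-based deep neural translation method and system
CN107563400A (zh) * 2016-06-30 2018-01-09 中国矿业大学 Grid-based density peak clustering method and system
CN107092596B (zh) * 2017-04-24 2020-08-04 重庆邮电大学 Text sentiment analysis method based on attention CNNs and CCR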

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHAO, FENG ET AL.: "Improvement and Application of Cell-based Outliers Detection Algorithm", COMPUTER ENGINEERING, vol. 35, no. 19, 31 October 2009 (2009-10-31), DOI: 1,2.2,4.1,4.2 *

Also Published As

Publication number Publication date
CN109658265A (zh) 2019-04-19

Similar Documents

Publication Publication Date Title
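WO2019218699A1 (zh) Fraudulent transaction determination method, apparatus, computer device and storage medium
CN109920174B (zh) Book borrowing method, apparatus, electronic device and storage medium
CN109189769A (zh) Data standardization processing method, apparatus, computer device and storage medium
WO2019085064A1 (zh) Medical claim payment refusal method, apparatus, terminal device and storage medium
US9792484B2 (en) Biometric information registration apparatus and biometric information registration method
CN110610431A (zh) Big data-based intelligent claims settlement method and intelligent claims settlement system
CN113918884A (zh) Business volume prediction model construction method and business volume prediction method
CN110309930A (zh) Face recognition-based medical appointment method, apparatus, device and storage medium
CN113642639B (zh) Liveness detection method, apparatus, device and storage medium
CN109190925B (zh) Insurance policy recommendation method, apparatus, computer device and storage medium
WO2020119105A1 (zh) Big data-based excess payment identification method, device, storage medium and apparatus
CN113344437B (zh) Claims service processing method, apparatus, computer device and storage medium
US20230316416A1 (en) Determining Body Characteristics Based on Images
CN110489434B (zh) Information processing method and related device
CN112183380A (zh) Face recognition-based passenger flow analysis method and system, and electronic device
CN109542947B (zh) Data statistics method, apparatus, computer device and storage medium
CN111161088A (zh) Bill processing method, apparatus and device
CN111652433B (zh) Pension cost calculation apparatus
CN113469635A (zh) Attendance data-based personnel management method, apparatus, device and storage medium
CN106570576A (zh) Data prediction method and prediction apparatus
KR20210126408A (ko) AI loss assessment calculation apparatus and method using disease big data
KR102344848B1 (ko) Patient identification apparatus and method using facial recognition
CN113221762B (zh) Cost-balanced decision method, insurance claim decision method, apparatus and device
US20200294065A1 (en) System and Method for Identifying Suspicious Healthcare Behavior
CN116152936B (zh) Face identity authentication system with interactive liveness detection and method thereof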

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19896489

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 15/10/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19896489

Country of ref document: EP

Kind code of ref document: A1