WO2018171324A1 - Method and apparatus for determining tariff data - Google Patents

Method and apparatus for determining tariff data

Info

Publication number
WO2018171324A1
Authority
WO
WIPO (PCT)
Prior art keywords
group
tariff
bill
charging
billing
Prior art date
Application number
PCT/CN2018/073850
Other languages
English (en)
French (fr)
Inventor
李正兵
汪芳山
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP18770450.7A (patent EP3591894B1)
Publication of WO2018171324A1
Priority to US16/573,217 (patent US10750031B2)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/14 Charging, metering or billing arrangements for data wireline or wireless communications
    • H04L12/1403 Architecture for metering, charging or billing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M15/00 Arrangements for metering, time-control or time indication; Metering, charging or billing arrangements for voice wireline or wireless communications, e.g. VoIP
    • H04M15/80 Rating or billing plans; Tariff determination aspects
    • H04M15/8022 Determining tariff or charge band
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/14 Charging, metering or billing arrangements for data wireline or wireless communications
    • H04L12/1432 Metric aspects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/14 Charging, metering or billing arrangements for data wireline or wireless communications
    • H04L12/1485 Tariff-related aspects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M15/00 Arrangements for metering, time-control or time indication; Metering, charging or billing arrangements for voice wireline or wireless communications, e.g. VoIP
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M15/00 Arrangements for metering, time-control or time indication; Metering, charging or billing arrangements for voice wireline or wireless communications, e.g. VoIP
    • H04M15/41 Billing record details, i.e. parameters, identifiers, structure of call data record [CDR]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M15/00 Arrangements for metering, time-control or time indication; Metering, charging or billing arrangements for voice wireline or wireless communications, e.g. VoIP
    • H04M15/70 Administration or customization aspects; Counter-checking correct charges
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M15/00 Arrangements for metering, time-control or time indication; Metering, charging or billing arrangements for voice wireline or wireless communications, e.g. VoIP
    • H04M15/70 Administration or customization aspects; Counter-checking correct charges
    • H04M15/73 Validating charges
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M15/00 Arrangements for metering, time-control or time indication; Metering, charging or billing arrangements for voice wireline or wireless communications, e.g. VoIP
    • H04M15/80 Rating or billing plans; Tariff determination aspects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M15/00 Arrangements for metering, time-control or time indication; Metering, charging or billing arrangements for voice wireline or wireless communications, e.g. VoIP
    • H04M15/80 Rating or billing plans; Tariff determination aspects
    • H04M15/8083 Rating or billing plans; Tariff determination aspects involving reduced rates or discounts, e.g. time-of-day reductions or volume discounts
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/24 Accounting or billing

Definitions

  • The present invention relates to the field of communications, and in particular to a method and an apparatus for determining tariff data.
  • An operator charges users the corresponding fees through a billing system. As operating services keep developing, the functions of the old billing system can no longer meet their needs, so the old billing system must be upgraded to a new billing system.
  • The process of replacing the old billing system with the new billing system is as follows: 1. import the old billing system data into the new billing system; 2. compare the bills generated by the new billing system with the bills generated by the old billing system; 3. update the old billing system data. If the new billing system deducts the same fee as the old one under the same billing conditions, the old billing system can be replaced without affecting the user experience. However, because the data structure of the old billing system differs from that of the new billing system, importing the old data into the new system may cause data from the old billing system to be lost.
  • The old billing system can be replaced by the new billing system only when both systems generate the same deduction fees.
  • The correctness of the tariff data in the old billing system is therefore very important.
  • In the traditional approach, the fees in each round of bills must be compared, which requires a computer to process a large number of bills over multiple repeated rounds, and the operator must repeatedly confirm that the package tariff data in the old billing system is correct. This consumes a lot of computing and human resources and is very inefficient.
  • The embodiments of the present invention provide a method and an apparatus for determining tariff data.
  • The method is applied to an apparatus for determining tariff data. It analyzes a large number of bills and reversely derives the tariff data, which greatly reduces manual dependence, improves the efficiency of determining the data of the old billing system, and saves computing resources and manpower.
  • A method for determining tariff data, including:
  • A bill set is obtained. The bill set may consist of pre-processed bills in a standard data format, and each bill in the bill set includes a billing condition and, under that billing condition, a billing unit and a deduction fee. The bills in the bill set are then grouped according to a preset rule to obtain group bills. The bills are grouped because of how the value of the billing unit relates to the fee value of the deduction fee:
  • even for the same service under the same package, different billing elements in the billing condition produce different tariff rates, so various tariff rates are mixed together in the bill set. For example, in a voice service, several fee values may occur for the same billing duration.
  • Bill a is a calling bill from Shenzhen to Shenzhen; the call duration is 10 minutes and the corresponding fee value is 0.5 yuan.
  • Bill b is also a calling bill from Shenzhen to Shenzhen; the call duration is 10 minutes, but the corresponding fee value is 1 yuan.
  • That is, in different bills the same call duration may correspond to different fee values. The cause may be different rates for busy and idle hours, or different tariffs for family numbers and ordinary numbers. The purpose of grouping the bills in the bill set is therefore to separate bills with different tariff rates.
  • The obtained group bill must satisfy a grouping condition: within the group bill, each value of the billing unit corresponds to a unique fee value of the deduction fee.
  • When the group bill satisfies the grouping condition, the fee values of the deduction fee in the group bill can be analyzed to obtain a data feature. The data feature corresponds to a target tariff model, so the target tariff model corresponding to the data feature can be selected from the preset tariff models.
  • A tariff model is used to calculate the deduction fee from the billing unit. The tariff data includes the tariff model, the tariff parameters of the tariff model, and the billing condition.
  • After the target tariff model is determined, the parameter value of each tariff parameter in the target tariff model is determined from the billing unit values and the deduction fee values in the group bill. Finally, the tariff data is determined according to the parameter values, the target tariff model, and the billing condition.
  • In this way, the tariff data is reversely derived from the data in a large number of bills in the old billing system.
  • Unlike the traditional approach, bills do not have to be checked manually and repeatedly to correct the old billing system data and obtain its tariff data. This greatly reduces manual dependence, improves the efficiency of determining the data of the old billing system, and saves computing resources and manpower.
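The grouping condition described above can be checked mechanically. The following is a minimal sketch, assuming each bill is a dictionary with hypothetical keys `unit` (the billing unit value, here in seconds) and `fee` (the deduction fee, here as an integer number of fen); these names and units are illustrative and do not come from the patent.

```python
from collections import defaultdict


def satisfies_grouping_condition(bills):
    """Return True if every billing-unit value maps to exactly one fee value."""
    fees_by_unit = defaultdict(set)
    for bill in bills:
        fees_by_unit[bill["unit"]].add(bill["fee"])
    return all(len(fees) == 1 for fees in fees_by_unit.values())


# Bills a and b from the example: the same 10-minute call, different fees.
bill_a = {"unit": 600, "fee": 50}   # 600 s, 0.5 yuan written as 50 fen
bill_b = {"unit": 600, "fee": 100}  # 600 s, 1 yuan written as 100 fen

mixed_group = [bill_a, bill_b]      # violates the condition
clean_group = [bill_a, dict(bill_a)]  # satisfies the condition
```

A group that contains both bill a and bill b fails the check, which is the signal that the bills still mix several tariff rates.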
  • The billing condition includes multiple billing elements, and the bills in the bill set may be grouped according to the preset rule as follows:
  • The bills are grouped according to preset billing elements among the multiple billing elements to obtain group bills. The preset billing elements may be chosen in advance from experience, and within a group bill every bill has the same element value for each preset billing element.
  • For example, the preset billing elements may be the service type, service flow, calling home location, called home location, call type, and so on. Experience shows that when all bills in a group bill share the same element value for each preset billing element, the group bill is likely to satisfy the grouping condition. Grouping by the preset billing elements therefore improves grouping efficiency and the processing efficiency of the subsequent steps.
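Grouping by preset billing elements amounts to a group-by on a composite key. The sketch below assumes each bill carries a hypothetical `elements` dictionary; the element names `service_type` and `flow` are placeholders for whatever preset elements an operator would choose.

```python
from collections import defaultdict


def group_by_preset_elements(bills, preset_elements):
    """Group bills so that all bills in a group share the preset element values."""
    groups = defaultdict(list)
    for bill in bills:
        key = tuple(bill["elements"][name] for name in preset_elements)
        groups[key].append(bill)
    return dict(groups)


bills = [
    {"unit": 600, "fee": 50, "elements": {"service_type": "voice", "flow": "calling"}},
    {"unit": 300, "fee": 25, "elements": {"service_type": "voice", "flow": "calling"}},
    {"unit": 1, "fee": 10, "elements": {"service_type": "sms", "flow": "calling"}},
]
group_bills = group_by_preset_elements(bills, ["service_type", "flow"])
```

Each key of `group_bills` is one combination of preset element values, and each value is the corresponding group bill.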
  • If, in a group bill, the same billing unit value corresponds to at least two deduction fee values, the group bill does not satisfy the grouping condition and must be grouped further until the resulting bills satisfy the grouping condition. To do so, a target billing element in the group bill is determined.
  • The target billing element is a billing element, among all billing elements included in the bills, other than the preset billing elements, and it takes at least two different element values in the group bill.
  • For example, the billing element "hour" takes the two values "1" and "16".
  • Bills with the same target element value are placed in one group, yielding at least two target sub-group bills.
  • For example, bills whose "hour" value is "1" form one group and bills whose "hour" value is "16" form another, giving two target sub-group bills. When each target sub-group bill satisfies the grouping condition,
  • the target billing element is used as the splitting point of the group bill, that is, the group bill is split according to the target billing element.
  • The at least two target sub-group bills obtained by further grouping are themselves group bills, and the target element value is the element value corresponding to the target billing element. Optionally, the multiple billing elements included in a bill
  • are sorted by their degree of influence on the deduction fee. If the group bill does not satisfy the grouping condition, the target billing element is extracted from the billing elements in that order
  • and used as the splitting point, which improves the probability of extracting a correct splitting point.
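The search for a splitting point can be sketched as a loop over candidate billing elements in ranked order. This is an illustrative sketch under assumptions: the bill layout (`unit`, `fee`, `elements`) and the function names are hypothetical, and the ranking of elements is supplied by the caller.

```python
from collections import defaultdict


def satisfies_grouping_condition(bills):
    """True if every billing-unit value maps to exactly one fee value."""
    fees_by_unit = defaultdict(set)
    for bill in bills:
        fees_by_unit[bill["unit"]].add(bill["fee"])
    return all(len(fees) == 1 for fees in fees_by_unit.values())


def split_on(bills, element):
    """Put bills with the same value of `element` into the same sub-group."""
    subgroups = defaultdict(list)
    for bill in bills:
        subgroups[bill["elements"][element]].append(bill)
    return dict(subgroups)


def find_splitting_point(bills, ranked_elements):
    """Return the first element whose sub-groups all satisfy the condition."""
    for element in ranked_elements:
        subgroups = split_on(bills, element)
        if len(subgroups) >= 2 and all(
            satisfies_grouping_condition(group) for group in subgroups.values()
        ):
            return element, subgroups
    return None, {}


group = [
    {"unit": 600, "fee": 50, "elements": {"weekday": "Mon", "hour": 1}},
    {"unit": 600, "fee": 50, "elements": {"weekday": "Tue", "hour": 1}},
    {"unit": 600, "fee": 100, "elements": {"weekday": "Mon", "hour": 16}},
    {"unit": 600, "fee": 100, "elements": {"weekday": "Tue", "hour": 16}},
]
element, subgroups = find_splitting_point(group, ["weekday", "hour"])
```

In this toy data, splitting on `weekday` still leaves two fees for the same duration, so the search moves on and settles on `hour`.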
  • Optionally, similar group bills are merged.
  • Similar group bills are group bills that may differ in their preset billing elements but have the same tariff rate.
  • Group bills have the same tariff rate when each billing duration value corresponds to the same deduction fee value in both. For example, in one group bill a billing duration of 30 corresponds to a deduction fee of 20, and a billing duration of 40 corresponds to a deduction fee of 40.
  • If another group bill also maps a billing duration of 30 to a deduction fee of 20 and a billing duration of 40 to a deduction fee of 40, the two group bills are merged. In this embodiment, group bills with the same tariff rate are merged and the merged group bill is processed afterwards, which improves the efficiency of the subsequent processing steps.
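One way to sketch the merge is to give each group bill a signature made of its (billing unit, fee) pairs and merge groups whose signatures are equal. Treating "same tariff rate" as "identical set of pairs" is an assumption made for this sketch; a real system might instead merge groups whose mappings merely agree wherever they overlap.

```python
def rate_signature(bills):
    """Set of (billing unit, fee) pairs observed in a group bill."""
    return frozenset((bill["unit"], bill["fee"]) for bill in bills)


def merge_same_rate_groups(groups):
    """Merge group bills whose unit-to-fee mappings are identical."""
    merged = {}
    for bills in groups:
        merged.setdefault(rate_signature(bills), []).extend(bills)
    return list(merged.values())


group_1 = [{"unit": 30, "fee": 20}, {"unit": 40, "fee": 40}]
group_2 = [{"unit": 40, "fee": 40}, {"unit": 30, "fee": 20}]
group_3 = [{"unit": 30, "fee": 30}]
merged_groups = merge_same_rate_groups([group_1, group_2, group_3])
```

Here `group_1` and `group_2` carry the same mapping from the example (30 to 20, 40 to 40) and end up in one merged group, while `group_3` stays separate.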
  • The entropy of the data in the group bill is calculated by a formula whose parameters are redefined for this purpose; the entropy measures how mixed the tariff rates in the group bill are.
  • The group bill may be a group bill obtained by the initial grouping, or a group bill obtained after group bills are merged.
  • D is the number of bills in the group bill.
  • D_i is the number of times billing unit i appears.
  • p_ij is the probability that billing unit i has deduction fee j.
  • If the entropy is greater than a threshold, it is determined that the same billing unit value corresponds to at least two deduction fee values; if the entropy is less than or equal to the threshold, it is determined that each billing unit value corresponds to exactly one deduction fee value.
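The entropy formula itself is not reproduced in this text. The sketch below assumes a conditional-entropy form that is consistent with the definitions of D, D_i, and p_ij, namely H = sum over i of (D_i / D) * (-sum over j of p_ij * log2 p_ij). This form, the base-2 logarithm, and the bill keys `unit` and `fee` are assumptions, not the patent's stated formula.

```python
import math
from collections import Counter, defaultdict


def group_entropy(bills):
    """Weighted entropy of the fee distribution within each billing-unit value."""
    total = len(bills)  # D
    if total == 0:
        return 0.0
    fee_counts_by_unit = defaultdict(Counter)
    for bill in bills:
        fee_counts_by_unit[bill["unit"]][bill["fee"]] += 1
    entropy = 0.0
    for fee_counts in fee_counts_by_unit.values():
        d_i = sum(fee_counts.values())  # D_i
        # p_ij is estimated as the share of bills with unit i that have fee j.
        h_i = -sum((n / d_i) * math.log2(n / d_i) for n in fee_counts.values())
        entropy += (d_i / total) * h_i
    return entropy


pure_group = [{"unit": 600, "fee": 50}, {"unit": 600, "fee": 50}]
mixed_group = [{"unit": 600, "fee": 50}, {"unit": 600, "fee": 100}]
```

Under this assumed form, a group in which every billing unit has a single fee has entropy 0, and any mixing of fees for the same billing unit pushes the entropy above 0.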
  • When the target billing element includes a first billing element and a second billing element, grouping bills with the same target element value to obtain at least two target sub-group bills specifically includes the following.
  • Bills with the same first element value are placed in one group to obtain at least two first sub-group bills, where the first element value is the element value corresponding to the first billing element. Bills with the same second element value are placed in one group to obtain at least two second sub-group bills, where the second element value is the element value corresponding to the second billing element.
  • A first information gain of the first sub-group bills relative to the group bill and a second information gain of the second sub-group bills relative to the group bill are then calculated. The information gain is the entropy of the group bill before grouping minus the entropy of the bills after grouping. The larger the information gain, the smaller the entropy of the sub-group bills, and the less likely it is that the same billing unit corresponds to different deduction fees.
  • Calculating the information gain makes it possible to check how likely it is that the same billing unit in a group bill corresponds to different deduction fees, and improves the efficiency of verifying whether a billing element is a splitting point.
  • If the first information gain is greater than the second information gain, the second billing element is deleted from the multiple billing elements. Deleting a redundant billing element such as the second billing element (for example, the day of the week) from the billing condition reduces the amount of data to be processed and the storage needed for the final tariff data, and makes the billing elements of the tariff data clearer.
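The comparison of two candidate billing elements by information gain can be sketched as follows. The entropy form (conditional entropy of the fee given the billing unit) and the use of a size-weighted average for the "entropy after grouping" are assumptions of this sketch, as are the bill keys and the element names `hour` and `weekday`.

```python
import math
from collections import Counter, defaultdict


def group_entropy(bills):
    """Assumed form: sum_i (D_i / D) * entropy of fees seen for unit i."""
    total = len(bills)
    if total == 0:
        return 0.0
    fee_counts_by_unit = defaultdict(Counter)
    for bill in bills:
        fee_counts_by_unit[bill["unit"]][bill["fee"]] += 1
    entropy = 0.0
    for fee_counts in fee_counts_by_unit.values():
        d_i = sum(fee_counts.values())
        h_i = -sum((n / d_i) * math.log2(n / d_i) for n in fee_counts.values())
        entropy += (d_i / total) * h_i
    return entropy


def information_gain(bills, element):
    """Entropy before splitting minus size-weighted entropy after splitting."""
    subgroups = defaultdict(list)
    for bill in bills:
        subgroups[bill["elements"][element]].append(bill)
    after = sum(
        (len(group) / len(bills)) * group_entropy(group)
        for group in subgroups.values()
    )
    return group_entropy(bills) - after


group = [
    {"unit": 600, "fee": 50, "elements": {"weekday": "Mon", "hour": 1}},
    {"unit": 600, "fee": 50, "elements": {"weekday": "Tue", "hour": 1}},
    {"unit": 600, "fee": 100, "elements": {"weekday": "Mon", "hour": 16}},
    {"unit": 600, "fee": 100, "elements": {"weekday": "Tue", "hour": 16}},
]
gain_hour = information_gain(group, "hour")
gain_weekday = information_gain(group, "weekday")
```

In this toy data the fee depends only on the hour, so `hour` has the larger gain and `weekday` would be the redundant element to drop.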
  • Each bill in the bill set carries a package identifier, which indicates the tariff package to which the bill belongs. Further, according to the package identifier carried by the bills, the tariff data is integrated in a tree structure for user viewing and subsequent analysis.
  • The determined tariff data or package tariff data is saved to the storage server to support subsequent rating by the new billing server and comparison of the old and new deduction fees by the bill comparison server.
  • Analyzing the fee values of the deduction fee in the group bill to obtain the data feature may be implemented as follows: obtain the different fee values of the deduction fee in the group bill; sort the different fee values; determine the difference between each two adjacent fee values after sorting, where the difference is a hop deduction value; and determine the number of different hop deduction values from the distinct differences.
  • In this way, the number of hop deduction values is reversely derived from a large number of deduction fee values.
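The steps above (collect distinct fees, sort, take adjacent differences) can be sketched in a few lines. Fees are assumed to be integers (for example, fen) so that differences are exact; the function name is hypothetical. The sketch also assumes that every consecutive fee step actually appears in the group bill, otherwise a gap would look like a larger hop.

```python
def hop_deduction_values(fees):
    """Distinct differences between adjacent sorted distinct fee values."""
    distinct = sorted(set(fees))
    return {later - earlier for earlier, later in zip(distinct, distinct[1:])}


simple_fees = [10, 20, 30, 20, 10]      # one hop value: a simple tariff
tiered_fees = [10, 20, 30, 35, 40, 35]  # two hop values: a tiered tariff
```

The size of the returned set is the data feature used to choose between the preset tariff models.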
  • Selecting the target tariff model corresponding to the data feature from the preset tariff models may be implemented as follows.
  • If the number of hop deduction values is one, the first tariff model is selected from the preset tariff models as the target tariff model. The first tariff model is a simple tariff model that expresses y in terms of x, where y is the deduction fee; x is the billing duration; unitFee is the hop deduction value; pulse is the hop, that is, the minimum billing duration unit; and ceil is the ceiling (round-up) function. This model has a single unitFee.
  • If the number of hop deduction values is at least two, the second tariff model is selected from the preset tariff models as the target tariff model. The second tariff model is a tiered tariff model, with one form for billing durations up to the break point and another form for billing durations that exceed the break point.
  • unitFee1 is the hop deduction value when the billing duration is between 0 and breakPoint1; pulse1 is the hop when the billing duration is between 0 and breakPoint1; unitFee2 is the hop deduction value when the billing duration is greater than breakPoint1; pulse2 is the hop when the billing duration is greater than breakPoint1. The break point is the billing duration at the boundary between different hop deduction values.
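The model formulas are not reproduced in this text. The sketch below shows one plausible reading of the symbol definitions: a simple model y = unitFee * ceil(x / pulse), and a tiered model that charges the first tier up to breakPoint1 and then adds second-tier hops for the remainder. Both forms are assumptions inferred from the definitions of unitFee, pulse, ceil, and breakPoint1, not the patent's exact formulas.

```python
import math


def simple_tariff(x, unit_fee, pulse):
    """Assumed simple model: one hop fee per started pulse."""
    return unit_fee * math.ceil(x / pulse)


def tiered_tariff(x, unit_fee1, pulse1, break_point1, unit_fee2, pulse2):
    """Assumed tiered model with a single break point."""
    if x <= break_point1:
        return unit_fee1 * math.ceil(x / pulse1)
    first_tier = unit_fee1 * math.ceil(break_point1 / pulse1)
    second_tier = unit_fee2 * math.ceil((x - break_point1) / pulse2)
    return first_tier + second_tier


# 10 fen per started 60-second pulse.
fee_61_seconds = simple_tariff(61, unit_fee=10, pulse=60)

# 10 fen per minute for the first 180 seconds, then 5 fen per minute.
fee_181_seconds = tiered_tariff(181, 10, 60, 180, 5, 60)
```

With these assumed forms, a 61-second call starts a second pulse and is charged two hops, and a 181-second call pays three first-tier hops plus one second-tier hop.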
  • The billing unit includes the billing duration of a voice service, the billing traffic of a data service, and the number of billed messages of short message and multimedia message services.
  • An embodiment of the present invention provides an apparatus for determining tariff data, which has the functions performed by the determining apparatus in the above method.
  • These functions can be implemented by hardware, or by hardware executing corresponding software.
  • The hardware or software includes one or more modules corresponding to the functions described above.
  • The structure of the apparatus for determining tariff data includes a memory, a network interface, and a processor.
  • The memory is used to store computer-executable program code and is coupled to the network interface.
  • The program code includes instructions that, when executed by the processor, cause the determining apparatus to process the information or execute the instructions involved in the above method.
  • An embodiment of the present invention provides a computer-readable storage medium comprising instructions that, when executed on a computer, cause the computer to perform the method of the above first aspect.
  • An embodiment of the present invention provides a computer program product comprising instructions that, when run on a computer, cause the computer to perform the method of the above first aspect.
  • FIG. 1 is a schematic structural diagram of a system for determining tariff data according to an embodiment of the present invention
  • FIG. 2 is a flow chart of steps of an embodiment of a method for determining tariff data according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram of steps of preprocessing in an embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of grouping multiple bills in a bill set according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of an embodiment of a device for determining tariff data according to an embodiment of the present invention
  • FIG. 6 is a schematic structural diagram of another embodiment of a device for determining tariff data according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of another embodiment of a device for determining tariff data according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of another embodiment of a device for determining tariff data according to an embodiment of the present invention.
  • The embodiments of the invention provide a method and an apparatus for determining tariff data, which analyze a large number of bills and reversely derive the tariff data, thereby greatly reducing manual dependence, improving the efficiency of determining the tariff data of the old billing system, and saving computing resources and manpower.
  • Old billing system data: data stored in the old billing system, including user data, system data, and bill data.
  • User data: includes the package ordered by the user, the funds in the user account, the phone number of the user, and the area to which the user belongs.
  • System data: includes number analysis data, tax rate information, authentication data, etc.
  • The number analysis data is used to support the new system in analyzing the user's home location, visited location, roaming status, etc. according to the user's calling number or Home Location Register (HLR) number.
  • The tax rate information is used to support taxing the services consumed by the user, for example, adding a 10% tax to the fee deducted for every voice call.
  • Bill: the original communication record information, also called a detailed list or Call Detail Record (CDR).
  • Bills include bills of the SMS service, the MMS service, the voice service, and the network service.
  • Each call may generate two bills, such as a bill of the calling user and a bill of the called user. In the billing system, one user can correspond to multiple bills.
  • The bill mainly records the following information:
  • the billing condition and the billing rate, which can be understood as the billing rate under a certain billing condition. The billing condition includes multiple billing elements.
  • Billing elements: include the package ID, service type (voice, SMS, MMS, traffic), service flow (calling/called/forwarded), calling/called home location, calling/called visited location, calling/called operator, calling/called number, calling/called group account number, calling/called account balance, calling/called call duration, calling/called call charge, calling/called billing time (including information extracted from the billing time, such as the day of the week, hour, and minute), tax rate information, traffic consumed by the caller, number of SMS messages sent by the caller, and so on.
  • Billing rate: indicates the deduction fee corresponding to the billing unit.
  • The billing unit includes the billing duration of the voice service, the billing traffic of the network service, the number of billed messages of the short message and MMS services, and the like.
  • For the voice service, the billing duration can be measured in seconds and the deduction fee in fen (1/100 yuan).
  • For the message services, the billed quantity can be counted in messages and the deduction fee measured in fen.
  • In Table 3 above, the deduction fee can be measured in fen and the billing traffic in megabytes.
  • Group bill: the tens of thousands to hundreds of millions of bills in the old billing system are grouped to obtain grouped bills.
  • A group bill includes multiple bills.
  • The terms "group bill" and "sub-group bill" are both used in the embodiments of the present invention; a sub-group bill is also a form of group bill. The embodiments involve grouping bills more than once. Taking two rounds of grouping as an example, the bills obtained by the initial grouping are called "group bills", and the bills obtained by further grouping a group bill are called "sub-group bills". The distinction is made only for convenience of description. Multiple rounds of grouping may be involved, and both "group bill" and "sub-group bill" can be understood as grouped bills. A group bill can include sub-group bills.
  • Because the old billing system stores a large number of bills for users' various services, such as those in Tables 1 to 3 above, the embodiments of the present invention analyze and process a large number of bills in the old billing system to reversely derive the package tariff data. This solves the problem that the traditional method needs multiple repeated rounds to confirm the correctness of the tariff data in the old billing system, wasting computing and human resources.
  • FIG. 1 is a schematic structural diagram of a system for determining tariff data.
  • The system includes an apparatus 110 for determining tariff data (which may exist in the form of a server), a storage server 120, a billing server 130, and a bill comparison server 140.
  • The determining apparatus 110 is configured to acquire old billing system data, which includes a large amount of bill data, and to reversely derive tariff data, or further package tariff data, from the large amount of bill data.
  • The storage server 120 is used for storing the bill data and the tariff data in the embodiment of the present invention.
  • The storage server may be an Oracle database or a distributed file storage system (HADO).
  • The billing server 130, the point where the tariff data is used, generates bills for service usage and deducts fees for the various services provided by the operator according to the tariff data.
  • The bill comparison server 140 is configured to compare the deduction fee of the new billing system with the deduction fee of the old billing system under the same billing condition, and to determine from the bill comparison whether the two are the same.
  • Under the same billing condition, the deduction fee of the new billing system and the deduction fee of the old billing system are substantially the same.
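The role of the bill comparison server can be illustrated with a small sketch that pairs old and new bills and reports fee differences. The key `bill_id` used to pair the two bills for the same usage event is hypothetical, as is the bill layout; the patent does not specify how bills are matched.

```python
def compare_deductions(old_bills, new_bills):
    """Return (bill_id, old_fee, new_fee) for every bill whose fees differ."""
    new_fee_by_id = {bill["bill_id"]: bill["fee"] for bill in new_bills}
    mismatches = []
    for bill in old_bills:
        new_fee = new_fee_by_id.get(bill["bill_id"])  # None if missing
        if new_fee != bill["fee"]:
            mismatches.append((bill["bill_id"], bill["fee"], new_fee))
    return mismatches


old_bills = [{"bill_id": 1, "fee": 50}, {"bill_id": 2, "fee": 100}]
new_bills = [{"bill_id": 1, "fee": 50}, {"bill_id": 2, "fee": 90}]
mismatches = compare_deductions(old_bills, new_bills)
```

An empty result would indicate that the new billing system reproduces the old deduction fees for the compared bills.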
  • The method for determining tariff data in the embodiment of the present invention is performed by the determining apparatus 110.
  • The determining apparatus 110 obtains a bill set from the storage server 120. The bill set includes a large number of bills, and the tariff data is reversely derived from the tens of thousands or even hundreds of millions of bills in the set.
  • The bills in the bill set are grouped according to preset rules to obtain group bills that satisfy the grouping condition: within a group bill, each billing unit value corresponds to a unique deduction fee value.
  • Each bill includes a billing unit and a deduction fee corresponding to the billing unit.
  • For example, bill a is a calling bill from Shenzhen to Shenzhen; the call duration is 10 minutes, and the corresponding deduction fee is 0.5 yuan.
  • Bill b is also a calling bill from Shenzhen to Shenzhen; the call duration is 10 minutes, but the corresponding deduction fee is 1 yuan. That is, in different bills the same call duration may correspond to different deduction fees. The cause may be different rates for busy and idle hours, or different tariffs for family numbers and ordinary numbers.
  • The large number of bills in the bill set therefore need to be grouped so that, within a group bill, each billing unit value corresponds to a unique deduction fee value.
  • For example, the grouping yields group bill A and group bill B; group bill A includes bill a, and group bill B includes bill b. Group bill A includes four bills, and in all four the call duration is 10 minutes and the corresponding deduction fee is 0.5 yuan. Within group bill A, a call duration of 10 minutes does not correspond to any other deduction fee value. Bill a and bill b are thus placed in different groups by the preset rules.
  • The fee values of the deduction fee in a group bill can then be analyzed to obtain a data feature of the group bill.
  • The data feature is used to select the corresponding target tariff model from the preset tariff models; a tariff model is used to calculate the deduction fee from the billing unit. The parameter value of each tariff parameter in the target tariff model is then determined.
  • Finally, the tariff data is determined according to the parameter values, the target tariff model, and the billing condition.
  • In this way, the tariff data is reversely derived from the data in a large number of bills in the old billing system.
  • Unlike the traditional approach, bills do not have to be checked manually and repeatedly to correct the old billing system data and obtain its tariff data. This greatly reduces manual dependence, improves the efficiency of determining the data of the old billing system, and saves computing resources and manpower.
  • An embodiment of the method for determining tariff data provided in the embodiments of the present invention includes:
  • Step 201: Obtain a bill set, where each bill in the bill set includes a billing condition, and a billing unit and a deduction fee under that billing condition.
  • The bill set in this embodiment may consist of pre-processed bills in a standard data format. Refer to FIG. 3, which is a schematic diagram of the pre-processing steps in this embodiment of the present invention.
  • Because the data formats of old billing systems vary, a standard data structure is provided to accommodate the variation: the existing data of the old system is converted into the standard data structure, and the converted standard-format data is imported into the storage server for use in deriving the tariff data.
  • The pre-processing mainly converts the data format. The specific steps are: 1. acquire the old billing system data (including user data, system data and bill data); 2. convert the old billing system data into standard-format data; 3. store the standard-format data on a storage server.
  • After the pre-processing converts the format of the old billing data and the standard-format data is stored in the storage system, the steps of determining the tariff data are performed. If the old billing system data is already in the standard format, or has the same format as the new billing system data, the pre-processing step may be skipped.
  • The bill set is obtained from the storage server.
  • The bill set may consist of standard-format bills stored on the storage server.
  • The bill set includes a large number of bills.
  • Each bill may be as shown in Tables 1 to 3.
  • Step 202: Group the bills in the bill set according to the preset rule to obtain a plurality of group bills. The group bills need to satisfy the grouping condition: within a group bill, the same value of the billing unit corresponds to a unique fee value of the deduction fee.
  • If a group bill satisfies the grouping condition, step 203 is performed. If it does not, the group bill is grouped further until the resulting group bills satisfy the grouping condition.
  • FIG. 4 is a schematic flowchart of grouping the bills in the bill set according to an embodiment of the present invention.
  • The specific steps of grouping the bills in the bill set according to the preset rule may be:
  • Step a: Initial grouping. Group the bills in the bill set according to preset billing elements among the plurality of billing elements to obtain a plurality of group bills, where within each group bill the element values corresponding to the same preset billing element are the same.
  • The preset billing elements are set in advance according to empirical knowledge. The voice service is taken as an example: assume that the initial grouping conditions for a certain package are service type, service flow, calling home location, called home location, and call type.
  • Grouping by these preset billing elements yields three group bills: a local voice calling group bill for Shenzhen to Shenzhen, an inter-provincial (domestic) long-distance voice calling group bill for Shenzhen to Beijing, and an international long-distance voice calling group bill for Shenzhen to the United States.
  • One of the three group bills is taken as an example below, namely the voice calling group bill for Shenzhen.
  • The bills in this group are as shown in Table 4 below:
  • It should be noted that the number of group bills obtained by grouping the bills in the bill set (such as 3), the values of the billing duration and the fee values of the deduction fee in the example of Table 4, and the billing elements are all examples for ease of description and do not limit this application.
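  • The initial grouping of step a can be sketched as follows. This is a minimal illustration, not the patented implementation: the field names (`service_type`, `caller_home`, `called_home`, `call_type`) and the sample bills are hypothetical stand-ins for the layout of Tables 1 to 4, which is not reproduced here.

```python
from collections import defaultdict

# Hypothetical names for a subset of the preset billing elements.
PRESET_ELEMENTS = ("service_type", "caller_home", "called_home", "call_type")


def initial_grouping(bills, preset_elements=PRESET_ELEMENTS):
    """Group bills so that all bills in a group share the same preset element values."""
    groups = defaultdict(list)
    for bill in bills:
        key = tuple(bill[name] for name in preset_elements)
        groups[key].append(bill)
    return dict(groups)


def make_bill(called_home, duration, fee):
    """Build a hypothetical voice calling bill originating in Shenzhen."""
    return {
        "service_type": "voice",
        "caller_home": "Shenzhen",
        "called_home": called_home,
        "call_type": "caller",
        "duration": duration,
        "fee": fee,
    }


# Illustrative bills only; the durations and fees are made up.
bills = [
    make_bill("Shenzhen", 30, 20),
    make_bill("Shenzhen", 40, 40),
    make_bill("Beijing", 30, 60),
    make_bill("United States", 30, 300),
]
groups = initial_grouping(bills)
```

  • Each key of `groups` is one combination of preset element values, so the sample above produces three group bills, mirroring the local, domestic long-distance and international groups of the example.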
  • Step b: Determine whether the group bill satisfies the grouping condition, namely that within the group bill the same value of the billing unit corresponds to a unique fee value of the deduction fee.
  • The group bill in Table 4 above is taken as an example.
  • The group bill in Table 4 includes four bills, and the element values corresponding to the preset billing elements are the same in all four, that is, the four bills are all voice calling bills for Shenzhen.
  • In these four bills, each billing duration is paired with a single deduction fee; consistent with the mapping stated for Table 4 later in this embodiment, a billing duration of 30 corresponds to a deduction fee of 20 and a billing duration of 40 corresponds to a deduction fee of 40.
  • The deduction fee corresponding to each billing duration therefore has only one value. That is, the group bill in Table 4 satisfies the condition that the same value of the billing unit corresponds to a unique fee value of the deduction fee.
  • In the group bill of Table 5, by contrast, the billing duration 30 corresponds to two fee values of the deduction fee, 20 and 10.
  • The billing duration 40 likewise corresponds to two fee values of the deduction fee, 40 and 20. That is, the group bill of Table 5 does not satisfy the condition that the same value of the billing unit corresponds to a unique fee value of the deduction fee.
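  • The grouping condition of step b can be checked directly, as in the sketch below. The function name and the `(billing unit, fee)` pair representation are assumptions made for illustration; the violating pairs follow the Table 5 description (duration 30 with fees 20 and 10, duration 40 with fees 40 and 20), and the satisfying pairs are a hypothetical consistent group.

```python
from collections import defaultdict


def satisfies_grouping_condition(pairs):
    """Return True if every billing-unit value maps to exactly one fee value."""
    fees_by_unit = defaultdict(set)
    for unit, fee in pairs:
        fees_by_unit[unit].add(fee)
    return all(len(fees) == 1 for fees in fees_by_unit.values())


# Pairs described for Table 5: each duration has two different fees.
table_5_pairs = [(30, 20), (30, 10), (40, 40), (40, 20)]
# Hypothetical group in which each duration has a single fee.
consistent_pairs = [(30, 20), (30, 20), (40, 40), (40, 40)]

table_5_ok = satisfies_grouping_condition(table_5_pairs)
consistent_ok = satisfies_grouping_condition(consistent_pairs)
```

  • This exact check tolerates no exceptions; the entropy-based verification described in this embodiment differs in that it allows a threshold for a very small number of deviating values.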
  • The above describes the condition that the group bills obtained by grouping need to satisfy.
  • The following describes how to verify whether, within a group bill, the same value of the billing unit corresponds to a unique fee value of the deduction fee.
  • The verification calculates the entropy of the group bill (Formula 1), where D represents the number of bills in the group bill, D_i represents the number of bills in which the billing-unit value i appears, and p_ij represents the probability that the deduction fee j appears among the bills whose billing-unit value is i.
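  • A form of the entropy that is consistent with the symbol definitions above can be written as follows. This is a reconstruction under stated assumptions rather than the formula as printed: the symbol H and the base-2 logarithm are assumptions, the latter chosen because an even two-way split then gives an entropy of exactly 1, which matches the entropy values of 1 used in the information-gain example of this embodiment. The second line evaluates the same expression for the group bill described for Table 5.

```latex
H = \sum_{i} \frac{D_i}{D} \left( - \sum_{j} p_{ij} \log_2 p_{ij} \right)

H_{\text{Table 5}}
  = \frac{2}{4}\left(-\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{2}\log_2\tfrac{1}{2}\right)
  + \frac{2}{4}\left(-\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{2}\log_2\tfrac{1}{2}\right)
  = 1
```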
  • The entropy measures, for the group bill obtained by grouping, how disordered the mapping is from the same billing duration (x) to different deduction fees (y). The goal of grouping is to make the entropy 0 or close to 0, so that in the group bill the same value of the billing duration corresponds to only one fee value of the deduction fee.
  • For example, the billing duration 30 in the group bill then has only one deduction fee value (such as 30).
  • For the group bill in Table 5, the calculation has two parts: the first term is the entropy contributed by the bills with a billing duration of 30, and the second term is the entropy contributed by the bills with a billing duration of 40.
  • In the term for the billing duration 30, the factor 2/4 is the ratio of the number of bills with a billing duration of 30 to the total number of bills: the numerator 2 indicates that there are two bills with a billing duration of 30, and the denominator 4 indicates that the group contains four bills in total.
  • Within that term, the first probability is the ratio of the number of bills with billing duration (x) 30 and deduction fee (y) 20 to the number of bills with a billing duration of 30, and the second probability is the ratio of the number of bills with billing duration (x) 30 and deduction fee (y) 10 to the number of bills with a billing duration of 30.
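  • The entropy calculation can be sketched as below, assuming the weighted form suggested by the symbol definitions (each billing-unit value weighted by D_i / D) and a base-2 logarithm. The function name and the pair representation are illustrative assumptions, and the sample pairs follow the Table 5 description.

```python
import math
from collections import Counter, defaultdict


def group_entropy(pairs):
    """Weighted entropy of the fee distribution per billing-unit value.

    pairs: list of (billing_unit_value, fee_value) tuples, one per bill.
    """
    fee_counts_by_unit = defaultdict(Counter)
    for unit, fee in pairs:
        fee_counts_by_unit[unit][fee] += 1

    total = len(pairs)  # D
    entropy = 0.0
    for fee_counts in fee_counts_by_unit.values():
        d_i = sum(fee_counts.values())  # D_i
        # p_ij = n / d_i for each fee value j seen with this unit value
        inner = -sum((n / d_i) * math.log2(n / d_i) for n in fee_counts.values())
        entropy += (d_i / total) * inner
    return entropy


table_5_entropy = group_entropy([(30, 20), (30, 10), (40, 40), (40, 20)])
unique_entropy = group_entropy([(30, 20), (30, 20), (40, 40), (40, 40)])
```

  • Under these assumptions the Table 5 group evaluates to 1 and a group with a unique fee per duration evaluates to 0, so comparing the result with a small threshold distinguishes the two cases.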
  • If the entropy is greater than the threshold, then the number of fee values of the deduction fee corresponding to the same value of the billing duration is greater than or equal to two.
  • Here the number of such "same values" is at least one.
  • For example, the billing durations in a group bill include 1, 5, 10 and 15, where the billing duration 1 corresponds to deduction fees 1 and 2, the billing duration 5 corresponds to deduction fees 5 and 6, and the billing duration 10 corresponds to deduction fees 10 and 11. That is, three values of the billing duration in the group do not satisfy the grouping condition. It can also be understood that if the entropy is greater than the threshold, at least one value of the billing duration does not satisfy the grouping condition.
  • In practical applications, the threshold is set to a value greater than and close to 0 (such as 0.001).
  • If the entropy is less than or equal to the threshold, a very small number of values may still correspond to more than one deduction fee.
  • For example, the billing durations include 1, 5, 10 and 15, where the deduction fee corresponding to the billing duration 1 is 1, the deduction fee corresponding to the billing duration 5 is 5, the deduction fee corresponding to the billing duration 15 is 15, and the deduction fees corresponding to the billing duration 10 are 10 and 12. That is, among the values of the billing duration, the value 10 corresponds to two fee values.
  • This situation is within the allowable error range.
  • Only four values are used here as an example. In practical applications there may be tens of thousands of values; that is, among a large number of values of the billing duration, an extremely small number of values may be allowed to correspond to two or more fee values of the deduction fee.
  • The entropy of the group bill in Table 5 is calculated. If the entropy is greater than the threshold, the group bill contains a value of the billing unit that corresponds to at least two fee values of the deduction fee. As shown in Table 5, the billing duration 30 corresponds to the deduction fee 20 and also to the deduction fee 10.
  • If the entropy of a group bill is less than or equal to the threshold, step 203 can be performed directly.
  • If the bills in the group are as shown in the example of Table 5, the group bill does not satisfy the condition that the same value of the billing unit corresponds to a unique fee value of the deduction fee. The group bill therefore needs to be grouped further until that condition is satisfied.
  • In summary, the process of grouping the bills in the bill set in this embodiment may be: first group the bills according to the preset billing elements to obtain group bills, and then verify whether each group bill satisfies the grouping condition by calculating its entropy.
  • If a group bill does not satisfy the grouping condition, it needs to be grouped further to obtain sub-group bills.
  • In this embodiment the bills in the bill set are grouped twice: the first grouping is the initial grouping, and the second grouping splits the group bills into sub-group bills.
  • It should be understood that two groupings are used only as an example; the number of times the bills are grouped is not limited.
  • The initial grouping alone may satisfy the grouping condition, in which case no subsequent grouping is required. Alternatively, the group bills may still not satisfy the grouping condition after the second grouping, in which case a third and a fourth grouping are performed until the group bills satisfy the grouping condition. However many times the bills are grouped, the principle of each subsequent grouping is the same as that of the second grouping.
  • The specific process of a subsequent grouping can be understood by referring to the process of grouping the group bill in this embodiment.
  • One group bill is used as an example in this embodiment.
  • In practical applications every group bill needs to be verified, and the way of verifying whether the grouping condition is satisfied is the same for each group bill.
  • Optionally, similar group bills may be merged.
  • Similar group bills are group bills that may differ in the preset billing elements among the plurality of billing elements but have the same tariff rate.
  • For example, for the group bill corresponding to Table 4, the values of the billing duration and the fee values of the deduction fee in the group bill are obtained, and the mapping relationship between them is saved (for the group bill corresponding to Table 4, the billing duration 30 corresponds to the deduction fee 20, and the billing duration 40 corresponds to the deduction fee 40).
  • Assume that the mapping relationship between the billing duration and the deduction fee in another group bill C is the same as that of the group bill corresponding to Table 4, that is, the billing duration 30 corresponds to the deduction fee 20 and the billing duration 40 corresponds to the deduction fee 40.
  • The group bill corresponding to Table 4 and group bill C then have the same tariff rate, so the group bill corresponding to Table 4 is merged with group bill C.
  • Group bills with the same tariff rate can be merged and the merged group bill processed as a whole, which improves the efficiency of the subsequent processing steps.
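  • Merging group bills that share a tariff rate can be sketched as follows, under the assumption that "same tariff rate" means an identical set of (billing duration, deduction fee) pairs. The group names and the third group are hypothetical; the first two mappings follow the Table 4 and group bill C example.

```python
def merge_similar_groups(groups):
    """Merge group bills whose duration-to-fee mappings are identical.

    groups: dict mapping a group name to a list of (duration, fee) pairs.
    Returns a list of merged groups, each with its member names and pairs.
    """
    merged = {}
    for name, pairs in groups.items():
        key = frozenset(pairs)  # the saved duration -> fee mapping
        entry = merged.setdefault(key, {"names": [], "pairs": []})
        entry["names"].append(name)
        entry["pairs"].extend(pairs)
    return list(merged.values())


groups = {
    "table_4": [(30, 20), (40, 40)],
    "group_c": [(30, 20), (40, 40)],
    "other": [(30, 30), (40, 60)],  # hypothetical group with a different rate
}
merged_groups = merge_similar_groups(groups)
```

  • Exact equality of the mappings is a deliberately strict choice for the sketch; as the embodiment describes, a merged group bill is in any case verified again by the entropy check of step b.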
  • The merged group bill (such as the group bill obtained by merging the group bill corresponding to Table 4 with group bill C) is then verified.
  • Step b is repeated: the entropy of the merged group bill is calculated to determine whether the merged group bill satisfies the grouping condition. If it does, step 203 is performed; if it does not, step c is performed.
  • Step c: If the group bill does not satisfy the grouping condition, determine a target billing element in the group bill, where the target billing element corresponds to at least two different element values, and determine the information gain of the sub-group bills obtained with the target billing element as the split point.
  • Bills with the same target element value are grouped into one group to obtain at least two target sub-group bills.
  • The group bill in this step may be a group bill after the initial grouping, or a group bill after a later grouping.
  • In each target sub-group bill, the same value of the billing unit corresponds to a unique fee value of the deduction fee; the at least two target sub-group bills are obtained from the group bill; and the target element value is the element value corresponding to the target billing element.
  • The target billing element may be one billing element or may include at least two billing elements. The two cases are described below with examples.
  • First, the case in which the target billing element is one billing element is described.
  • The hour to which the calling billing time belongs is extracted from the plurality of billing elements as the target billing element; the target billing element can also be understood as the split point, as shown in Table 6:
  • It should be noted that in the tables the billing duration may be measured in seconds and the deduction fee in fen (0.01 yuan); this is not repeated below.
  • Entropy is calculated for the target sub-group bills shown in Table 7 and Table 8 to determine whether they satisfy the grouping condition. If the grouping condition is satisfied, no further grouping is required. If it is not satisfied, another billing element among the plurality of billing elements (such as the day to which the time belongs) is extracted as the split point, and the target sub-group bills continue to be grouped until the grouping condition is satisfied.
  • Next, the case in which the target billing element includes at least two billing elements is described.
  • For example, the target billing element includes a first billing element and a second billing element.
  • With the first billing element as the split point, bills with the same first element value are grouped into one group to obtain at least two first sub-group bills, where the first element value is the element value corresponding to the first billing element.
  • With the second billing element as the split point, bills with the same second element value are grouped into one group to obtain at least two second sub-group bills, where the second element value is the element value corresponding to the second billing element.
  • Table 9 below is an example in which the target billing element includes two billing elements: the first billing element is "hour" and the second billing element is "week".
  • Step d: When the target billing element includes at least two billing elements, use the billing element corresponding to the maximum information gain as the split point, and determine the sub-group bills obtained by splitting the group bill at that split point.
  • Entropy is calculated according to Formula 1 for the first sub-group bills in Table 10 and Table 11; the entropy of the first sub-group bills is 0.
  • The first information gain of the first sub-group bills is 1.
  • The information gain is obtained by subtracting the entropy after grouping (0) from the entropy before grouping (1), which gives a first information gain of 1.
  • The group bill before grouping is the one shown in Table 9, and the group bills after grouping are the ones shown in Table 10 and Table 11.
  • Entropy is calculated according to Formula 1 for the second sub-group bills in Table 12 and Table 13.
  • The entropy of the second sub-group bills is 1, and the second information gain of the second sub-group bills is 0.
  • The information gain is obtained by subtracting the entropy after grouping (1) from the entropy before grouping (1), which gives a second information gain of 0.
  • The first information gain is greater than the second information gain, so "hour" is used as the split point of the grouping.
  • The entropy of the first sub-group bills in Table 10 and Table 11 is 0, which is smaller than the threshold (for example, 0.001), so the first sub-group bills corresponding to Table 10 and Table 11 are not split further.
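  • Choosing the split point by information gain can be sketched as below. Tables 9 to 13 are not reproduced here, so the four bills are hypothetical values constructed so that splitting on "hour" gives a gain of 1 and splitting on "week" gives a gain of 0, as in the example. Combining the sub-group entropies as a size-weighted average is an assumption borrowed from decision-tree practice, and all function and field names are illustrative.

```python
import math
from collections import Counter, defaultdict


def group_entropy(bills):
    """Weighted base-2 entropy of fee values per billing duration."""
    fee_counts = defaultdict(Counter)
    for bill in bills:
        fee_counts[bill["duration"]][bill["fee"]] += 1
    total = len(bills)
    entropy = 0.0
    for counts in fee_counts.values():
        d_i = sum(counts.values())
        inner = -sum((n / d_i) * math.log2(n / d_i) for n in counts.values())
        entropy += (d_i / total) * inner
    return entropy


def information_gain(bills, element):
    """Entropy before the split minus the size-weighted entropy after it."""
    subgroups = defaultdict(list)
    for bill in bills:
        subgroups[bill[element]].append(bill)
    after = sum(len(s) / len(bills) * group_entropy(s) for s in subgroups.values())
    return group_entropy(bills) - after


# Hypothetical group bill: the fee depends on the hour, not on the weekday.
bills = [
    {"hour": 8, "week": "Mon", "duration": 30, "fee": 20},
    {"hour": 20, "week": "Mon", "duration": 30, "fee": 10},
    {"hour": 8, "week": "Tue", "duration": 40, "fee": 40},
    {"hour": 20, "week": "Tue", "duration": 40, "fee": 20},
]
gains = {element: information_gain(bills, element) for element in ("hour", "week")}
split_point = max(gains, key=gains.get)
```

  • With these values `split_point` is "hour", and the element with zero gain ("week") is the kind of redundant billing element that can be dropped from the billing conditions.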
  • The second billing element (week) is redundant and is deleted from the billing conditions. Deleting redundant billing elements effectively reduces the amount of data to be processed, since each removed billing element is one less item to handle; it also reduces the storage needed for the final tariff data and makes the billing elements of the tariff data clearer by removing interference. The billing conditions after grouping are then: service type, service flow, calling home location, called home location, domestic long distance, and the hour to which the calling billing time belongs.
  • Optionally, similar group bills can be merged: a group bill and a sub-group bill that satisfy the grouping condition, or at least two sub-group bills that satisfy the grouping condition, can be merged to obtain a merged group bill.
  • Similar group bills here are group bills with the same tariff rate, that is, with the same mapping relationship between the billing duration and the deduction fee.
  • Step b can be repeated on the merged group bill to verify whether it satisfies the grouping condition; if it does, step 203 is performed. Merging similar group bills effectively improves the efficiency of the subsequent processing of each group bill.
  • Step 203: Analyze the group bill to obtain the data feature.
  • The group bill in this step may be a group bill after the initial grouping, a group bill obtained by merging group bills after the initial grouping, or at least two sub-group bills.
  • First, the different fee values of the deduction fee in the group bill are obtained, as illustrated by Table 14 below.
  • Table 14
      Billing duration (x) | Deduction fee (y)
      1 | 20
      15 | 20
      30 | 20
      31 | 40
      60 | 40
      71 | 60
      72 | 60
      ... | ...
  • The different fee values of the deduction fee are then arranged in order, as illustrated by Table 15 below.
  • The order may be from large to small or from small to large.
  • Arranging the fee values from small to large is used as the example here.
  • The difference between adjacent fee values is then calculated; this difference is the jump deduction value.
  • In this example the difference has only one fixed value, 20, so the number of jump deduction values is determined to be one.
  • In another example, the differences between the fee values are calculated as the jump deduction values and are 20, 20, 10, 10 and 10 respectively; the number of distinct differences is 2 (20 and 10), so the number of jump deduction values is determined to be two.
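  • The data feature of step 203 can be sketched as follows: take the distinct fee values, sort them, and collect the distinct differences between adjacent values. The first sample uses the Table 14 values; the second fee list is hypothetical, chosen only so that its adjacent differences are 20, 20, 10, 10 and 10 as in the second example. The function name is an assumption.

```python
def jump_deduction_values(pairs):
    """Distinct differences between adjacent sorted fee values of a group bill."""
    fees = sorted({fee for _, fee in pairs})
    diffs = [b - a for a, b in zip(fees, fees[1:])]
    return sorted(set(diffs))


# Table 14: (billing duration, deduction fee)
table_14 = [(1, 20), (15, 20), (30, 20), (31, 40), (60, 40), (71, 60), (72, 60)]
single = jump_deduction_values(table_14)

# Hypothetical fees whose adjacent differences are 20, 20, 10, 10, 10.
tiered = [(d, f) for d, f in zip(range(1, 7), [20, 40, 60, 70, 80, 90])]
double = jump_deduction_values(tiered)
```

  • The length of the returned list is the number of jump deduction values, which is the feature later used to choose between the tariff models.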
  • The voice service is used as the example in this embodiment, so the billing unit is illustrated by the billing duration; this does not limit the application.
  • For other services the billing unit may be different: for example, the billing unit may be the billed count, or the billing unit may be the billed traffic.
  • Step 204: Select, according to the data feature, the target tariff model corresponding to the data feature from the preset tariff models, where a tariff model is used to calculate the deduction fee from the billing unit.
  • When the number of jump deduction values is one, the first tariff model is selected from the preset tariff models as the target tariff model.
  • The preset tariff models include a plurality of tariff models.
  • The simple tariff model and the split tariff model are taken as examples here. In practical applications, the specific format and number of the tariff models in the preset tariff models are not limited.
  • The first tariff model is a simple tariff model, whose format is as follows:
  • where y represents the deduction fee, x represents the billing duration, unitFee represents the jump deduction value, pulse represents the hop (the minimum billing duration unit), and ceil represents the ceiling function (rounding up).
  • In the simple tariff model the number of unitFee values is one.
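  • One reading of the simple tariff model that is consistent with the symbols above is y = unitFee × ceil(x / pulse). The sketch below implements that reading; the formula is a reconstruction from the symbol definitions rather than the printed formula, and the sample parameters (a unitFee of 20 and a pulse of 30) are illustrative values that happen to reproduce the Table 14 pairs.

```python
import math


def simple_tariff(x, unit_fee, pulse):
    """Deduction fee for billing duration x: one unit_fee per started pulse."""
    return unit_fee * math.ceil(x / pulse)


# Illustrative parameters: 20 per 30 units of duration, any part counted as 30.
fees = {x: simple_tariff(x, unit_fee=20, pulse=30) for x in (1, 15, 30, 31, 60, 71, 72)}
```

  • Because the ceiling rounds every started pulse up to a full one, a duration of 31 is already charged for two pulses.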
  • When the number of jump deduction values is at least two, the second tariff model is selected from the preset tariff models as the target tariff model.
  • The second tariff model is a split tariff model, whose format is as follows:
  • A split tariff model including only one breakpoint is taken as the example.
  • In practical applications the number of breakpoints in the split tariff model is not limited.
  • A breakpoint is the billing duration at the boundary between different jump deduction values.
  • For example, the jump deduction values are 20, 20, 10, 10 and 10 respectively.
  • If the billing duration at which the jump deduction value changes (from 20 to 10) is 60, the billing duration 60 is a breakpoint.
  • The tariff model is:
  • where unitFee1 represents the jump deduction value when the billing duration is between 0 and breakPoint1; pulse1 represents the hop when the billing duration is between 0 and breakPoint1; unitFee2 represents the jump deduction value when the billing duration is greater than breakPoint1; and pulse2 represents the hop when the billing duration is greater than breakPoint1.
  • For example, the tariff rate is: within 3 minutes, 50 fen per 60 seconds; beyond 3 minutes, 10 fen per 30 seconds, with any part of 30 seconds counted as 30 seconds. Here unitFee1 is 0.5 yuan, pulse1 is 60 seconds, breakPoint1 is 3 minutes, unitFee2 is 0.1 yuan, and pulse2 is 30 seconds.
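  • A single-breakpoint split tariff model can be sketched as below. The cumulative form (the first tier is charged in full up to breakPoint1, then the second tier applies to the remainder) is an assumption about how the two tiers combine, since the printed formula is not reproduced here. Fees are kept in fen to avoid floating-point rounding; the parameters follow the example of 50 fen per 60 seconds within 3 minutes and 10 fen per 30 seconds beyond.

```python
import math


def split_tariff(x, unit_fee1, pulse1, break_point1, unit_fee2, pulse2):
    """Deduction fee for duration x with one breakpoint (assumed cumulative tiers)."""
    if x <= break_point1:
        return unit_fee1 * math.ceil(x / pulse1)
    first_tier = unit_fee1 * math.ceil(break_point1 / pulse1)
    second_tier = unit_fee2 * math.ceil((x - break_point1) / pulse2)
    return first_tier + second_tier


# Example rate, in fen and seconds.
RATE = dict(unit_fee1=50, pulse1=60, break_point1=180, unit_fee2=10, pulse2=30)

fee_60s = split_tariff(60, **RATE)
fee_180s = split_tariff(180, **RATE)
fee_181s = split_tariff(181, **RATE)
fee_240s = split_tariff(240, **RATE)
```

  • Under the cumulative assumption a 181-second call costs the full three first-tier pulses plus one started second-tier pulse.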
  • Step 205: Determine, according to the data feature, the parameter value corresponding to each tariff parameter in the target tariff model.
  • A tariff rate may include a tariff model and the tariff parameters corresponding to the tariff model.
  • For the simple tariff model, the tariff parameters include the jump deduction value and the hop. For example, in one simple tariff model the tariff parameters and their values are: a jump deduction value of 100 and a hop of 60.
  • The hop is determined from the minimum and maximum values of the billing duration corresponding to the same fee value, as shown in Table 20 above. Because the hop is the minimum billing duration unit, a fee is deducted each time another minimum duration unit (hop) is entered; in the example of Table 20, the corresponding fee value is deducted for each additional 20-second duration.
  • A group bill may include a large number of bills, each including a billing duration and the deduction fee corresponding to that billing duration. Within such a large number of bills, the same fee value may correspond to multiple billing durations.
  • For example, the fee value 20 corresponds to the billing durations 1, 15 and 30; when the billing duration is 31 the fee value changes, so the billing duration 30 is the critical value at which the deduction fee changes. A minimum billing duration unit can therefore be determined from the maximum and minimum values of the billing duration corresponding to the same fee value (it can be understood that each additional unit of billing duration changes the fee value of the deduction fee). See Table 21 below:
  • The set of calculated hops is (30, 2).
  • The parameter value candidate set determined according to the above steps 1) and 2) includes the first parameter value and the second parameter value, as shown in Table 22 below:
  • The first parameter value and the second parameter value in the candidate set are each substituted into the selected target tariff model (such as the simple tariff model) and verified against the mapping relationship between the billing duration (x) and the deduction fee (y) saved in step 202.
  • If the first parameter value is consistent with the mapping relationship, the first parameter value is determined to be the parameter value of the final target tariff model.
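  • The verification of candidate parameter values can be sketched as follows. Table 22 is not reproduced here, so the two candidates are assumptions: they combine a unitFee of 20 with each hop from the calculated set (30, 2). The saved mapping reuses the Table 14 values as a sample, and the simple tariff model is taken as y = unitFee × ceil(x / pulse), which is itself a reconstruction.

```python
import math


def verify_candidate(mapping, unit_fee, pulse):
    """True if the simple tariff model reproduces every saved (x, y) pair."""
    return all(unit_fee * math.ceil(x / pulse) == y for x, y in mapping.items())


# Saved mapping from billing duration (x) to deduction fee (y), Table 14 values.
mapping = {1: 20, 15: 20, 30: 20, 31: 40, 60: 40, 71: 60, 72: 60}

# Hypothetical candidate set: (unitFee, pulse).
candidates = [(20, 30), (20, 2)]
accepted = [c for c in candidates if verify_candidate(mapping, *c)]
```

  • With a hop of 2 a duration of 15 would be charged for eight hops, which contradicts the saved fee of 20, so only the candidate with a hop of 30 survives.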
  • Step 206: Determine the tariff data according to the parameter value, the target tariff model and the billing condition.
  • For example, the tariff rate of the group bill has the physical meaning of 20 fen per 30 seconds, with any part of 30 seconds counted as 30 seconds.
  • The tariff data may be as shown in Table 23 below:
  • Step 207: Combine the tariff data under the same package identifier according to the package identifier carried by the bills to obtain the package tariff data.
  • Each bill in the bill set carries a package identifier, which indicates the tariff package to which the bill belongs.
  • To obtain the package tariff data, the tariffs corresponding to the respective billing conditions under the same package are integrated.
  • For example, the tariff rates corresponding to voice, data, short message and multimedia message services in package A are integrated in a tree structure, which is convenient for the user to view and analyze.
  • The tariff data under package A may include voice tariffs, data service tariffs, and short message and multimedia message service tariffs.
  • Voice package: monthly function fee of 20 yuan for up to 60 minutes of calls; time beyond that is billed at 0.5 yuan per minute.
  • SMS package: monthly function fee of 10 yuan for up to 200 text messages; messages beyond that are billed at 0.1 yuan per message.
  • Data package monthly function fee of 20 yuan, up to 50 megabytes of traffic, exceeding the traffic according to March / megabytes.
  • If the billing conditions corresponding to two different tariff rates differ only in their time intervals, the two may be combined, as shown by the tariff data in Table 24 below.
  • Step 207 is optional and may be skipped.
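  • The merging of step 207 can be sketched as a two-level grouping, first by package identifier and then by service, which gives the tree-like structure mentioned above. The record fields (`package_id`, `service`, `rate`) and the sample records are hypothetical; only the voice and SMS rates are taken from the package examples in this embodiment.

```python
from collections import defaultdict


def merge_by_package(tariff_records):
    """Build {package_id: {service: [rate, ...]}} from flat tariff records."""
    packages = defaultdict(lambda: defaultdict(list))
    for record in tariff_records:
        packages[record["package_id"]][record["service"]].append(record["rate"])
    return {package: dict(services) for package, services in packages.items()}


# Hypothetical tariff data produced by step 206.
tariff_records = [
    {"package_id": "A", "service": "voice", "rate": "0.5 yuan per minute beyond 60 minutes"},
    {"package_id": "A", "service": "sms", "rate": "0.1 yuan per message beyond 200 messages"},
    {"package_id": "B", "service": "voice", "rate": "20 fen per 30 seconds"},
]
package_tariffs = merge_by_package(tariff_records)
```

  • Looking up `package_tariffs["A"]` then returns all tariff rates of package A grouped by service.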
  • The determined package tariff data or tariff data is saved to the storage server to support subsequent rating by the new billing server and comparison against the old bills by the bill comparison server.
  • In this embodiment the tariff data is reverse-derived from the data in a large number of bills of the old billing system. Unlike the traditional approach, the bills of the old billing system do not need to be analyzed and corrected manually to obtain the tariff data of the old billing system, which greatly reduces manual dependency, improves the efficiency of determining the tariff data of the old billing system, and saves computing resources and manpower.
  • The method for determining tariff data in the embodiments of the present invention is described in detail above.
  • The following describes the apparatus for determining tariff data to which the method is applied.
  • The determining apparatus may exist in the form of a server.
  • An embodiment of the apparatus 500 for determining tariff data in the embodiments of the present invention includes:
  • the obtaining module 501 is configured to obtain a bill set, where each bill in the bill set includes a billing condition and a billing unit and a deduction fee under the billing condition;
  • the grouping module 502 is configured to group the plurality of bills in the bill set obtained by the obtaining module 501 according to a preset rule to obtain a group bill, where the same value of the billing unit in the group bill corresponds to a unique fee value of the deduction fee;
  • the data feature analysis module 503 is configured to analyze the group CDRs determined by the grouping module 502 to obtain data features.
  • the tariff model determining module 504 is configured to select a target tariff model corresponding to the data feature from the preset tariff model according to the data feature determined by the data feature analysis module 503, and the tariff model is configured to calculate the deduction fee according to the charging unit;
  • the parameter value determining module 505 is configured to determine a parameter value corresponding to each tariff parameter in the target tariff model according to the value corresponding to the charging unit in the group CDR determined by the grouping module 502 and the cost value corresponding to the charging fee;
  • the tariff data determining module 506 is configured to determine the tariff data according to the parameter value determined by the parameter value determining module 505, the target tariff model determined by the tariff model determining module 504, and the charging condition.
  • the grouping module 502 is further configured to group the plurality of bills in the bill set according to the preset billing elements among the plurality of billing elements to obtain the group bill, where the element values corresponding to the same preset billing element in the group bill are the same.
  • the grouping module 502 is further specifically configured to:
  • determine the target billing element in the group bill, where the target billing element corresponds to at least two different element values; and
  • group the bills with the same target element value into one group to obtain at least two target sub-group bills, where in each target sub-group bill the same value of the billing unit corresponds to a unique fee value of the deduction fee, the at least two target sub-group bills are obtained from the group bill, and the target element value is the element value corresponding to the target billing element.
  • the grouping module 502 is further specifically configured to calculate the entropy of the data in the group bill according to the following formula:
$$\mathrm{Entropy} = \sum_{i} \frac{D_i}{D} \left( -\sum_{j} p_{ij} \log_2 p_{ij} \right)$$
  • D represents the number of bills in the group bill;
  • D_i represents the number of occurrences of billing unit i;
  • p_ij represents the probability that deduction fee j occurs for billing unit i;
  • if the entropy is greater than the threshold, the number of fee values of the deduction fee corresponding to a same value of the billing unit is greater than or equal to two;
  • if the entropy is less than or equal to the threshold, the number of fee values of the deduction fee corresponding to a same value of the billing unit is one.
  • an embodiment of the present invention further provides an apparatus 600 for determining tariff data according to the embodiment of FIG.
  • the grouping module 502 is further configured to: group the bills with the same first element value into one group to obtain at least two first subgroup bills, where the first element value is the element value of the first billing element; and group the bills with the same second element value into one group to obtain at least two second subgroup bills, where the second element value is the element value of the second billing element;
  • the calculating module 507 is configured to calculate a first information gain, relative to the group bill, of the first subgroup bills determined by the grouping module 502, and a second information gain, relative to the group bill, of the second subgroup bills determined by the grouping module 502;
  • the deleting module 508 is configured to delete the second billing element from the multiple billing elements when the calculating module 507 determines that the first information gain is greater than the second information gain.
  • an embodiment of the present invention further provides an apparatus 700 for determining tariff data.
  • the embodiment of the present invention further includes:
  • each bill in the bill set carries a package identifier, which indicates the tariff package to which the bill belongs;
  • the apparatus for determining tariff data further includes a merging module 509;
  • the merging module 509 is configured to merge, according to the package identifiers carried by the bills, the tariff data determined by the tariff data determining module 506 under the same package identifier to obtain package tariff data.
  • a pulse is the smallest billing-unit increment, and the per-pulse deduction value is the fee value corresponding to one pulse.
  • the data feature analysis module 503 is further configured to: obtain the different fee values of the deduction fee in the group bill; sort the different fee values; determine the difference between each two consecutive fee values after sorting, where the difference is a per-pulse deduction value; and determine the number of distinct per-pulse deduction values.
  • the tariff model determining module 504 is further specifically configured to:
  • when the number of per-pulse deduction values is one, select the first tariff model from the preset tariff models as the target tariff model;
  • when the number of per-pulse deduction values is at least two, select the second tariff model from the preset tariff models as the target tariff model.
  • the apparatuses for determining tariff data in FIGS. 5 to 7 are presented in the form of functional modules.
  • a “module” herein may refer to an application-specific integrated circuit (ASIC), a circuit, a processor and memory that execute one or more software or firmware programs, an integrated logic circuit, and/or another device that can provide the foregoing functions.
  • the apparatuses for determining tariff data in FIGS. 5 to 7 may take the form shown in FIG. 8.
  • FIG. 8 is a schematic structural diagram of an apparatus for determining tariff data according to an embodiment of the present invention.
  • the apparatus for determining tariff data may take the form of a server.
  • the determining apparatus 800 is described by taking a server as an example. The server 800 may vary considerably depending on configuration or performance, and may include one or more processors 822, a memory 832, and one or more storage media 830 (for example, one or more mass storage devices) that store an application 842 or data 844. The memory 832 and the storage medium 830 may be transient or persistent storage.
  • the program stored in the storage medium 830 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Further, the processor 822 may be configured to communicate with the storage medium 830 and execute, on the server 800, the series of instruction operations in the storage medium 830.
  • the server 800 may also include one or more power supplies 826, one or more wired or wireless network interfaces 850, one or more input/output interfaces 858, and/or one or more operating systems 841, such as Windows Server, Mac OS X, Unix, Linux, or FreeBSD.
  • the network interface 850 is configured to obtain a bill set, where each bill in the bill set includes a billing condition and, under that billing condition, a billing unit and a deduction fee;
  • the processor 822 is configured to: group the multiple bills in the bill set obtained by the network interface 850 according to a preset rule to obtain a group bill, where, within the group bill, each value of the billing unit corresponds to a unique fee value of the deduction fee; analyze the group bill to obtain a data feature; select, according to the data feature, a target tariff model corresponding to the data feature from the preset tariff models, where a tariff model is used to calculate the deduction fee from the billing unit; determine the parameter value of each tariff parameter in the target tariff model according to the billing-unit values and the corresponding fee values of the deduction fee in the group bill; and determine the tariff data according to the parameter values, the target tariff model, and the billing condition.
  • the billing condition includes multiple billing elements, and the processor 822 is further configured to group the multiple bills in the bill set according to preset billing elements among the multiple billing elements to obtain the group bill, where within the group bill the element value of each preset billing element is the same.
  • the processor 822 is further configured to: when a same value of the billing unit corresponds to at least two fee values of the deduction fee, determine a target billing element in the group bill, where the target billing element corresponds to at least two different element values; group the bills with the same target element value into one group to obtain at least two target subgroup bills; and when, in each target subgroup bill, each value of the billing unit corresponds to a unique fee value of the deduction fee, treat the at least two target subgroup bills as group bills, where the target element value is the element value of the target billing element.
  • the processor 822 is further configured to calculate the entropy of the data in the group bill according to the following formula:
$$\mathrm{Entropy} = \sum_{i} \frac{D_i}{D} \left( -\sum_{j} p_{ij} \log_2 p_{ij} \right)$$
  • D represents the number of bills in the group bill;
  • D_i represents the number of occurrences of billing unit i;
  • p_ij represents the probability that deduction fee j occurs for billing unit i;
  • if the entropy is greater than the threshold, it is determined that the number of fee values of the deduction fee corresponding to a same value of the billing unit is greater than or equal to two; if the entropy is less than or equal to the threshold, it is determined that the number of fee values of the deduction fee corresponding to a same value of the billing unit is one.
  • the processor 822 is further configured to: group the bills with the same first element value into one group to obtain at least two first subgroup bills, where the first element value is the element value of the first billing element; group the bills with the same second element value into one group to obtain at least two second subgroup bills, where the second element value is the element value of the second billing element; calculate a first information gain of the first subgroup bills relative to the group bill and a second information gain of the second subgroup bills relative to the group bill; and, if the first information gain is greater than the second information gain, delete the second billing element from the multiple billing elements.
  • each bill in the bill set carries a package identifier, which indicates the tariff package to which the bill belongs.
  • the processor 822 is further configured to merge, according to the package identifiers carried by the bills, the tariff data under the same package identifier to obtain package tariff data.
  • a pulse is the smallest billing-unit increment, and the per-pulse deduction value is the fee value corresponding to one pulse;
  • the processor 822 is further configured to: obtain the different fee values of the deduction fee in the group bill; sort the different fee values; determine the difference between each two consecutive fee values after sorting, where the difference is a per-pulse deduction value; and determine the number of distinct per-pulse deduction values.
  • the processor 822 is further configured to: when the number of per-pulse deduction values is one, select the first tariff model from the preset tariff models as the target tariff model; and when the number of per-pulse deduction values is at least two, select the second tariff model from the preset tariff models as the target tariff model.
  • the embodiment of the present invention further provides a computer storage medium configured to store the computer software instructions used by the apparatus for determining tariff data shown in FIG. 8, including a program designed to execute the foregoing method embodiments.
  • the acquisition of resources can be achieved by executing a stored program.
  • a computer program product includes one or more computer instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave).
  • the computer-readable storage medium can be any usable medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more usable media.
  • the usable medium can be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division into units is only a division by logical function.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Meter Arrangements (AREA)

Abstract

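Embodiments of the present invention disclose a method and an apparatus for determining tariff data. The method includes: obtaining a bill set, where each bill in the bill set includes a billing condition and, under that billing condition, a billing unit and a deduction fee; grouping the multiple bills in the bill set according to a preset rule to obtain group bills, where within a group bill each value of the billing unit corresponds to a unique fee value of the deduction fee; analyzing the group bill to obtain a data feature; selecting, from preset tariff models, a target tariff model corresponding to the data feature, where a tariff model is used to calculate the deduction fee from the billing unit; determining the parameter value of each tariff parameter in the target tariff model according to the billing-unit values and the corresponding fee values in the group bill; and determining the tariff data according to the parameter values, the target tariff model, and the billing condition. Embodiments of the present invention further provide an apparatus for determining tariff data.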

Description

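Method and apparatus for determining tariff data
This application claims priority to Chinese Patent Application No. 201710173476.9, filed with the Chinese Patent Office on March 21, 2017 and entitled "Method and apparatus for determining tariff data", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the communications field, and in particular, to a method and an apparatus for determining tariff data.
Background
When a user uses a service provided by an operator, the operator charges the user a corresponding fee through a billing system. As operator services keep evolving, the functions of the old billing system can no longer keep up, so the old billing system needs to be upgraded to a new billing system.
The process of replacing the old billing system with the new one is as follows: 1. import the data of the old billing system into the new billing system; 2. compare the bills generated by the new billing system with those generated by the old billing system; 3. update the old billing system data. If the new and old billing systems deduct the same fee under the same billing condition, the new system can replace the old one without affecting user experience. However, because the data structure of the old billing system differs from that of the new one, data of the old billing system may be lost while being imported into the new billing system. The deduction fees of bills generated by the old and new billing systems under the same billing condition therefore have to be compared, with multiple rounds of data verification and continual correction of the old billing system's data. The old billing system can be replaced by the new one only when the deduction fees of the bills generated by the two systems are consistent.
The correctness of the tariff data in the old billing system is therefore very important. In the conventional approach, obtaining correct data of the old billing system requires comparing bill deduction fees in every round, processing a large number of bills by computer over multiple iterations, and then repeatedly confirming with operations staff whether the package tariff data in the old billing system is correct. This consumes substantial computing and human resources and is very inefficient.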
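Summary
Embodiments of the present invention provide a method and an apparatus for determining tariff data. The method is applied to an apparatus for determining tariff data and reverse-derives tariff data by analyzing a large number of bills, which greatly reduces manual dependence, improves the efficiency of determining the data of the old billing system, and saves computing resources and labor.
According to a first aspect, an embodiment of the present invention provides a method for determining tariff data, including:
A bill set is obtained from a storage server. The bill set may consist of bills preprocessed into a standard data format, and each bill in the bill set includes a billing condition and, under that billing condition, a billing unit and a deduction fee. The multiple bills in the bill set are then grouped according to a preset rule to obtain group bills.
The bill set is grouped for the following reason: for the billing-unit values and fee values recorded in the bills, even the tariff of the same service under the same package yields different tariff rates depending on the billing elements in the billing condition, so various tariff rates are mixed together in the bill set. For example, in the voice service, the same billing duration may correspond to multiple fee values. Bill a is a caller bill for a call from Shenzhen to Shenzhen with a call duration of 10 minutes and a fee value of 0.5 yuan; bill b is also a caller bill for a Shenzhen-to-Shenzhen call with a call duration of 10 minutes, but its fee value is 1 yuan. In other words, the same call duration may correspond to different fee values in different bills, possibly because busy-hour and idle-hour tariffs differ, or because family numbers and ordinary numbers are charged differently. The purpose of grouping the bills in the bill set is therefore to separate bills that have different tariff rates.
The resulting group bill must satisfy a grouping condition: within the group bill, each value of the billing unit corresponds to a unique fee value of the deduction fee. Once the group bill satisfies the grouping condition, the fee-value data of the deduction fee in the group bill can be analyzed to obtain a data feature, which corresponds to a target tariff model. A target tariff model corresponding to the data feature is then selected from the preset tariff models, where a tariff model is used to calculate the deduction fee from the billing unit. Tariff data includes a tariff model, the tariff parameters of that tariff model, and a billing condition. After the target tariff model is determined, the parameter value of each tariff parameter in the target tariff model is determined according to the billing-unit values and the corresponding fee values in the group bill. Finally, the tariff data is determined according to the parameter values, the target tariff model, and the billing condition.
In this embodiment, tariff data is reverse-derived from the data in the large number of bills of the old billing system. Unlike the conventional approach, the old billing system's bills need not be determined through continual manual correction in order to obtain the tariff data of the old billing system. This greatly reduces manual dependence, improves the efficiency of determining the old billing system's data, and saves computing resources and labor.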
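In a possible implementation, the billing condition includes multiple billing elements, and grouping the multiple bills in the bill set according to the preset rule to obtain group bills may specifically be: grouping the multiple bills in the bill set according to preset billing elements among the multiple billing elements to obtain group bills. The preset billing elements may be set in advance based on empirical knowledge, and within a group bill the element value of each preset billing element is the same. For example, the preset billing elements may be the service type, service flow, caller home location, callee home location, call type, and so on. Experience suggests that if every bill in a group bill has the same element value for each of these preset billing elements, grouping by the preset billing elements is likely to make the group bill satisfy the grouping condition, which improves grouping efficiency and the processing efficiency of subsequent steps.
In a possible implementation, when a same value of the billing unit corresponds to at least two fee values of the deduction fee, the group bill does not satisfy the grouping condition and must be grouped further until the resulting bills satisfy it. A target billing element in the group bill is determined; the target billing element is a billing element, among all billing elements included in the bills, other than the preset billing elements, and it corresponds to at least two different element values. For example, in the group bill, the billing element "hour" takes the two values "1" and "16". The bills with the same target element value are then put into one group to obtain at least two target subgroup bills: the bills whose element value is "1" form one group and those whose element value is "16" form another, yielding two target subgroup bills. When, in each target subgroup bill, each value of the billing unit corresponds to a unique fee value of the deduction fee, the target billing element is used as the split point of the group bill, that is, the group bill is grouped by that target billing element; the at least two target subgroup bills obtained by this further grouping are then group bills, and the target element value is the element value of the target billing element. Optionally, the multiple billing elements included in the bills may be sorted in advance by their degree of influence on the deduction fee; if a group bill does not satisfy the grouping condition, target billing elements are drawn from the billing elements in that order and used as split points, which raises the probability of drawing a correct split point.
In a possible implementation, similar group bills are merged. Similar group bills are group bills in which a preset number of the billing elements may differ but the tariff rate is the same. The same tariff rate means that the mapping between billing-duration values and deduction-fee values in the group bills is the same. For example, if in one group bill billing duration 30 corresponds to fee 20 and billing duration 40 corresponds to fee 40, and another group bill contains the same mapping (billing duration 30 to fee 20, billing duration 40 to fee 40), the two group bills are merged. In this embodiment, group bills with the same tariff rate can be merged and then processed together, which improves the efficiency of subsequent processing steps.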
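In a possible implementation, the entropy of the data in a group bill is calculated according to the following formula. In the embodiments of the present invention, the parameters of entropy are redefined, and the entropy calculation is used to judge how mixed the tariff rates in a group bill are. The group bill may be a group bill obtained after initial grouping, or a group bill obtained after merging group bills.
$$\mathrm{Entropy} = \sum_{i} \frac{D_i}{D} \left( -\sum_{j} p_{ij} \log_2 p_{ij} \right)$$
where D is the number of bills in the group bill, D_i is the number of occurrences of billing unit i, and p_ij is the probability that deduction fee j occurs for billing unit i. If the entropy is greater than a threshold, it is determined that a same value of the billing unit corresponds to two or more fee values of the deduction fee; if the entropy is less than or equal to the threshold, it is determined that each value of the billing unit corresponds to exactly one fee value of the deduction fee.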
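In a possible implementation, when the target billing element includes a first billing element and a second billing element, grouping the bills with the same target element value to obtain at least two target subgroup bills specifically includes: grouping the bills with the same first element value to obtain at least two first subgroup bills, where the first element value is the element value of the first billing element; and grouping the bills with the same second element value to obtain at least two second subgroup bills, where the second element value is the element value of the second billing element. Further, a first information gain of the first subgroup bills relative to the group bill and a second information gain of the second subgroup bills relative to the group bill are calculated. The information gain is the entropy of the group bill being split minus the entropy of the subgroup bills after splitting. A larger information gain means a smaller entropy for the subgroup bills obtained by grouping on the added target billing element, that is, fewer cases in which the same billing unit corresponds to different deduction fees in the target subgroup bills. Calculating the information gain thus verifies how likely it is that the same billing unit corresponds to different deduction fees, and makes it more efficient to verify whether a billing element is a split point. If the first information gain is greater than the second information gain, the second billing element is deleted from the multiple billing elements. Deleting the second billing element (for example, the day of the week) from the billing condition removes a redundant billing element: each billing element removed is one fewer data item to process, the storage required for the final tariff data is reduced, and the tariff data is clearer because interfering billing elements are gone.
In a possible implementation, every bill in the bill set carries a package identifier, which indicates the tariff package to which the bill belongs. Further, the tariff data under the same package identifier may be merged according to the package identifiers carried by the bills to obtain package tariff data. The tariffs for the various billing conditions under the same package are integrated; for example, the tariff rates for voice, data, SMS, and MMS under package A are integrated in a tree structure, which is convenient for users to view and for subsequent analysis.
In a possible implementation, the determined package tariff data or tariff data is saved to the storage server, to support subsequent rating by the new billing server and the comparison of old and new bill deductions by the bill comparison server.
In a possible implementation, the data feature includes the number of per-pulse deduction values of the deduction fee, where a pulse is the smallest billing-unit increment and the per-pulse deduction value is the fee value of one pulse. Analyzing the group bill to obtain the data feature may specifically be: obtaining the different fee values of the deduction fee in the group bill; sorting the different fee values; determining the difference between each two consecutive fee values after sorting, where the difference is a per-pulse deduction value; and determining the number of distinct per-pulse deduction values from the number of distinct differences. In this embodiment, the number of per-pulse deduction values can be reverse-derived from a large number of fee values.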
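In a possible implementation, selecting the target tariff model corresponding to the data feature from the preset tariff models is specifically: if the number of per-pulse deduction values is one, selecting a first tariff model from the preset tariff models as the target tariff model, where the first tariff model is a simple tariff model in the following form:
$$y = \operatorname{ceil}\left(\frac{x}{\mathit{pulse}}\right) \times \mathit{unitFee}$$
where y is the deduction fee; x is the billing duration; unitFee is the per-pulse deduction value; pulse is the pulse, that is, the smallest billing-duration increment; and ceil is the ceiling function, which rounds up. In this model there is one unitFee. If the number of per-pulse deduction values is at least two, a second tariff model is selected from the preset tariff models as the target tariff model, where the second tariff model is a tiered tariff model in the following form:
$$y = \operatorname{ceil}\left(\frac{x}{\mathit{pulse1}}\right) \times \mathit{unitFee1}$$
When the billing duration exceeds the breakpoint, the tariff model is:
$$y = \operatorname{ceil}\left(\frac{\mathit{breakPoint1}}{\mathit{pulse1}}\right) \times \mathit{unitFee1} + \operatorname{ceil}\left(\frac{x - \mathit{breakPoint1}}{\mathit{pulse2}}\right) \times \mathit{unitFee2}$$
where unitFee1 is the per-pulse deduction value for billing durations from 0 to breakPoint1; pulse1 is the pulse for billing durations from 0 to breakPoint1; unitFee2 is the per-pulse deduction value for billing durations greater than breakPoint1; pulse2 is the pulse for billing durations greater than breakPoint1; and a breakpoint is the billing duration at the boundary between different per-pulse deduction values.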
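In a possible implementation, the billing unit includes the billing duration of the voice service, the billing traffic of the data service, and the billing message count of the SMS and MMS services.
According to a second aspect, an embodiment of the present invention provides an apparatus for determining tariff data, which has the functions performed by the apparatus for determining tariff data in the foregoing method. The functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the foregoing functions.
According to a third aspect, the structure of the apparatus for determining tariff data includes a memory, a network interface, and a processor. The memory is configured to store computer-executable program code and is coupled to the network interface. The program code includes instructions; when the processor executes the instructions, the instructions cause the determining apparatus to execute the information or instructions involved in the foregoing method.
According to a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium including instructions that, when run on a computer, cause the computer to perform the method of the first aspect.
According to a fifth aspect, an embodiment of the present invention provides a computer program product including instructions that, when run on a computer, cause the computer to perform the method of the first aspect.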
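Brief Description of Drawings
FIG. 1 is a schematic architecture diagram of a system for determining tariff data according to an embodiment of the present invention;
FIG. 2 is a flowchart of the steps of an embodiment of a method for determining tariff data according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the preprocessing steps according to an embodiment of the present invention;
FIG. 4 is a schematic flowchart of grouping the multiple bills in a bill set according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for determining tariff data according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of another embodiment of an apparatus for determining tariff data according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of another embodiment of an apparatus for determining tariff data according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of another embodiment of an apparatus for determining tariff data according to an embodiment of the present invention.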
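Description of Embodiments
Embodiments of the present invention provide a method and an apparatus for determining tariff data, which reverse-derive tariff data by analyzing a large number of bills, greatly reducing manual dependence, improving the efficiency of determining the tariff data of the old billing system, and saving computing resources and labor.
The terms "first", "second", "third", "fourth", and so on (if any) in the specification, claims, and accompanying drawings of the present invention are used to distinguish between similar objects, and not necessarily to describe a specific order or sequence. It should be understood that data used in this way is interchangeable where appropriate, so that the embodiments described herein can be implemented in an order other than that illustrated or described herein. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.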
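For ease of understanding, the terms used in the embodiments of the present invention are first explained:
Old billing system data: the data stored in the old billing system, including user data, system data, and bill data.
User data: includes the packages subscribed to by the user, the funds in the user's account, the user's telephone number, the region to which the user belongs, and similar data.
System data: includes number analysis data, tax rate information, authentication data, and so on. The number analysis data enables the new system to determine, from the user's calling number or home location register (HLR) number, the user's home location, visited location, roaming status, and other related information. The tax rate information supports taxing the user's service consumption, for example, adding a 10% tax to all voice call charges.
Bill: the original communication record, also called a detailed bill or call detail record (CDR). Depending on the service, bills include SMS bills, MMS bills, voice bills, network-service bills, and so on. Taking the voice service as an example, each call may generate two bills: one for the caller and one for the callee. In the billing system, one user may correspond to multiple bills.
A bill mainly records the following information:
The billing condition and the billing rate; a bill can also be understood as the billing rate under a certain billing condition. The billing condition in turn includes multiple billing elements.
Billing elements: include the package ID, service type (voice, SMS, MMS, or data traffic), service flow (caller/callee/forwarding), caller/callee home location, caller/callee visited location, caller/callee operator, caller/callee number, caller/callee family number ID, caller/callee group account ID, caller/callee account balance, caller/callee billing duration, caller/callee call duration, caller/callee deduction, caller/callee billing time (including information extracted from the billing time, such as the day of the week, hour, and minute of billing), tax rate information, the traffic consumed by the caller, the number of SMS messages sent by the caller, and so on.
Billing rate: represents the deduction fee corresponding to a billing unit. The billing unit includes the billing duration of the voice service, the billing traffic of the network service, the billing message count of the SMS and MMS services, and so on.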
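The following uses a voice-service bill as an example; see Table 1:
Table 1
Figure PCTCN2018073850-appb-000005
Note that in Table 1 the billed duration may be measured in seconds and the fee in fen (0.01 yuan).
The following uses an SMS bill as an example; see Table 2:
Table 2
Figure PCTCN2018073850-appb-000006
In Table 2, the number of billed messages is counted in messages, and the deduction fee may be measured in fen.
The following uses a data-service bill as an example; see Table 3:
Table 3
Figure PCTCN2018073850-appb-000007
In Table 3, the deduction fee may be measured in fen and the billing traffic in megabytes.
Note that the bills in Tables 1 to 3 are only examples and do not limit this application. In the embodiments of the present invention, the voice service is used as the example service and the billing duration as the example billing unit.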
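Group bill: the bills obtained by grouping the tens of thousands to hundreds of millions of bills in the old billing system. A group bill in turn contains multiple bills. The embodiments refer to both "group bills" and "subgroup bills"; a subgroup bill is also a form of group bill. Taking two rounds of grouping as an example, the result of the initial grouping is called a "group bill", and the result of further grouping a group bill is called a "subgroup bill". The distinction is made only for convenience of description: the embodiments may involve several rounds of grouping, and both "group bill" and "subgroup bill" can be understood as bills after grouping. A group bill may include subgroup bills.
Tariff data: includes the billing condition and the billing rate, and records the billing rate under a certain billing condition.
In the embodiments of the present invention, because the old billing system stores a large number of bills for users' various services, such as those in Tables 1 to 3, these bills can be analyzed to reverse-derive the package tariff data. This addresses the waste of computing and human resources in the conventional method, which needs many rounds to confirm the correctness of the tariff data in the old billing system.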
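An embodiment of the present invention provides a method for determining tariff data, applied to a system for determining tariff data. FIG. 1 is a schematic architecture diagram of such a system. The system includes an apparatus 110 for determining tariff data (which may take the form of a server), a storage server 120, a billing server 130, and a bill comparison server 140.
Apparatus 110 for determining tariff data: obtains old billing system data, which includes a large amount of bill data, and reverse-derives tariff data, or further package tariff data, from that bill data.
Storage server 120: stores the bill data and tariff data of the embodiments. For example, the storage server may be an Oracle database or the Hadoop Distributed File System (HDFS).
Billing server 130: the tariff data collection point. It generates the bills for service usage and charges for the operator's services according to the tariff data.
Bill comparison server 140: compares the deduction fees of the new and old billing systems under the same billing condition to determine whether they are consistent. With the method provided in the embodiments of the present invention, the deduction fees of the new and old billing systems under the same billing condition are essentially consistent.
The method for determining tariff data in the embodiments of the present invention is performed by the apparatus 110. The apparatus 110 obtains a bill set from the storage server 120; the bill set contains a large number of bills, and tariff data is reverse-derived from the tens of thousands or even hundreds of millions of bills in the set. Specifically, the multiple bills in the bill set are grouped according to a preset rule to obtain group bills that satisfy a grouping condition: within a group bill, each value of the billing unit corresponds to a unique fee value of the deduction fee.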
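The bills in the bill set need to be grouped because each of the many bills includes a billing unit and the corresponding deduction fee. For example, bill a is a caller bill for a Shenzhen-to-Shenzhen call with a call duration of 10 minutes and a deduction fee of 0.5 yuan, while bill b is also a caller bill for a Shenzhen-to-Shenzhen call with a call duration of 10 minutes but a deduction fee of 1 yuan. In other words, the same call duration may correspond to different deduction fees in different bills, possibly because busy-hour and idle-hour tariffs differ, or because family numbers and ordinary numbers are charged differently. The embodiments of the present invention therefore group the bills so that, within each resulting group bill, each value of the billing unit corresponds to a unique fee value of the deduction fee. For example, after grouping into group bill A and group bill B, group bill A contains bill a and group bill B contains bill b. Group bill A contains four bills, in which a call duration of 10 minutes corresponds to a deduction fee of 0.5 yuan and to no other fee value. During grouping, bill a and bill b are placed in different groups according to the preset rule. Because the tariff rates within each group bill are then no longer mixed, the fee values of the deduction fee in the group bill can be analyzed to obtain the data feature of the group bill. The data feature is used to select the corresponding target tariff model from the preset tariff models, where a tariff model calculates the deduction fee from the billing unit. The parameter value of each tariff parameter in the target tariff model is then determined according to the data feature, and finally the tariff data is determined according to the parameter values, the target tariff model, and the billing condition.
In this embodiment, tariff data is reverse-derived from the data in the large number of bills of the old billing system. Unlike the conventional approach, the old billing system's bills need not be determined through continual manual correction in order to obtain the tariff data of the old billing system. This greatly reduces manual dependence, improves the efficiency of determining the old billing system's data, and saves computing resources and labor.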
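Referring to FIG. 2, the following describes in detail, from the side of the apparatus for determining tariff data, a method for determining tariff data provided in an embodiment of the present invention. An embodiment of the method includes:
Step 201: Obtain a bill set, where each bill in the bill set includes a billing condition and, under that billing condition, a billing unit and a deduction fee.
The bill set in this embodiment may consist of bills preprocessed into a standard data format. See FIG. 3, a schematic diagram of the preprocessing steps in the embodiments of the present invention.
Different old billing systems have different data structures, and the data structure of the new system also differs from that of the old billing system. A standard data structure is therefore provided to accommodate these differences: the existing data of the old system is converted into the standard data structure, and the converted standard-format data is imported into the storage server for use in deriving the tariff data.
Preprocessing mainly converts the data format. The steps are: 1. obtain the old billing system data (user data, system data, and bill data); 2. convert the obtained old billing system data into the standard format; 3. store the standard-format data in the storage server.
After preprocessing has converted the format of the old billing data, the standard-format data is stored in the storage system, ready for the steps of determining the tariff data. If the old billing system data is already in the standard format, or has the same data format as the new billing system data, preprocessing may be skipped.
The bill set is obtained from the storage server. It may consist of the standard-format bills in the storage server and contains a large number of bills, each of which may be as shown in Tables 1 to 3.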
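Step 202: Group the multiple bills in the bill set according to a preset rule to obtain multiple group bills. The group bills must satisfy a grouping condition: each value of the billing unit corresponds to a unique fee value of the deduction fee.
If a group bill satisfies the grouping condition, step 203 is performed; if it does not, the group bill is grouped further until the resulting subgroup bills satisfy the grouping condition.
See FIG. 4, a schematic flowchart of grouping the multiple bills in a bill set in the embodiments of the present invention.
Note that, for the data in the bills, even the tariff of the same service under the same package yields different tariff rates depending on the billing elements in the billing condition, so various tariff rates are mixed together in the bill set. For example, in the voice service, the same billing duration may correspond to multiple deduction fees: a user makes a voice call between 7 a.m. and 10 a.m., calls for 10 minutes as the caller, and is charged 3 yuan, whereas the same 10-minute call as the caller between 10 p.m. and 12 midnight is charged 1 yuan. The purpose of grouping the multiple bills in the bill set is to separate bills with different tariff rates.
The specific steps of grouping the multiple bills in the bill set according to the preset rule may be: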
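Step a: Initial grouping. Group the multiple bills in the bill set according to the preset billing elements among the multiple billing elements to obtain multiple group bills; within each group bill, the element value of each preset billing element is the same.
The preset billing elements are set in advance based on empirical knowledge. Taking the voice service as an example, suppose the initial grouping criteria for a package are: service type, service flow, caller home location, callee home location, and call type. Grouping by these preset billing elements yields, for example, three group bills: a local voice caller tariff group bill for calls from Shenzhen to Shenzhen, an inter-provincial (domestic) long-distance voice caller tariff group bill for calls from Shenzhen to Beijing, and an international long-distance voice caller tariff group bill for calls from Shenzhen to the United States. The following takes one of the three, the Shenzhen-to-Shenzhen caller tariff group bill, as an example; it is shown in Table 4:
Table 4
Figure PCTCN2018073850-appb-000008
Figure PCTCN2018073850-appb-000009
Note that the number of group bills obtained by grouping the bills in the bill set (for example, 3), the billing-duration values and fee values in the example group bill of Table 4, and the billing elements are all examples given for convenience of description and do not limit this application.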
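The initial grouping of step a can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation described in the source: the dictionary representation of a bill, the field names such as `service_type` and `caller_home`, and the helper `initial_grouping` are all assumptions introduced here, and the bill values are invented.

```python
from collections import defaultdict

# Hypothetical field names for the preset billing elements; the source only
# names the elements (service type, service flow, home locations, call type).
PRESET_ELEMENTS = ("service_type", "service_flow", "caller_home",
                   "callee_home", "call_type")


def initial_grouping(bills, preset_elements=PRESET_ELEMENTS):
    """Group bills so that every bill in a group shares the same value
    for each preset billing element."""
    groups = defaultdict(list)
    for bill in bills:
        key = tuple(bill[name] for name in preset_elements)
        groups[key].append(bill)
    return dict(groups)


# Invented example bills (duration in seconds, fee in fen).
bills = [
    {"service_type": "voice", "service_flow": "caller", "caller_home": "Shenzhen",
     "callee_home": "Shenzhen", "call_type": "local", "duration": 30, "fee": 30},
    {"service_type": "voice", "service_flow": "caller", "caller_home": "Shenzhen",
     "callee_home": "Beijing", "call_type": "domestic_long_distance",
     "duration": 30, "fee": 60},
    {"service_type": "voice", "service_flow": "caller", "caller_home": "Shenzhen",
     "callee_home": "Shenzhen", "call_type": "local", "duration": 40, "fee": 40},
]
group_bills = initial_grouping(bills)
```

Each key of `group_bills` is one combination of preset element values, so the two Shenzhen-to-Shenzhen local bills end up in the same group bill and the Shenzhen-to-Beijing bill in another.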
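Step b: Determine whether the group bill satisfies the grouping condition: within the group bill, each value of the billing unit corresponds to a unique fee value of the deduction fee.
All group bills need to be traversed, and each one verified against the grouping condition. For convenience of description, this embodiment uses a single group bill as an example.
Take the group bill in Table 4. It contains four bills whose preset billing elements all have the same element values; that is, all four are caller voice tariff bills for Shenzhen-to-Shenzhen calls.
In the example of Table 4, in the first bill of the group bill the billing duration 30 corresponds to a fee of 30; in the second bill, billing duration 20 corresponds to a fee of 10; in the third bill, billing duration 40 corresponds to a fee of 40; and in the fourth bill, billing duration 20 corresponds to a deduction fee of 20. Each billing duration has only one corresponding deduction fee; that is, the group bill in Table 4 satisfies the condition that each value of the billing unit in the group bill corresponds to a unique fee value of the deduction fee.
Another example of a group bill:
Table 5
Figure PCTCN2018073850-appb-000010
As the example in Table 5 shows, among the four bills in that group bill, billing duration 30 corresponds to two deduction-fee values, 20 and 10, and billing duration 40 corresponds to two deduction-fee values, 40 and 20. The group bill in Table 5 therefore does not satisfy the condition that each value of the billing unit in the group bill corresponds to a unique fee value of the deduction fee.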
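The grouping condition itself can be checked directly, before any entropy is computed. The sketch below is an assumption-laden illustration: the function name `satisfies_grouping_condition` and the representation of a group bill as (billing duration, fee) pairs are introduced here, while the Table 5 values are the ones quoted in the paragraph above.

```python
from collections import defaultdict


def satisfies_grouping_condition(pairs):
    """Return True if every billing-unit value maps to exactly one fee value.

    `pairs` is an iterable of (billing_unit_value, fee_value) tuples taken
    from the bills of one group bill.
    """
    fees_by_unit = defaultdict(set)
    for unit_value, fee_value in pairs:
        fees_by_unit[unit_value].add(fee_value)
    return all(len(fees) == 1 for fees in fees_by_unit.values())


# Values described in the text for Table 5: duration 30 maps to fees 20 and 10,
# and duration 40 maps to fees 40 and 20.
table5 = [(30, 20), (30, 10), (40, 40), (40, 20)]
# An invented group in which each duration has a single fee.
consistent = [(30, 20), (40, 40), (30, 20)]
```

With these inputs, `satisfies_grouping_condition(table5)` is false and `satisfies_grouping_condition(consistent)` is true. A strict check like this has no tolerance for noisy bills, which is one reason a thresholded entropy can be preferable in practice.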
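The above illustrated the condition that a group bill obtained by grouping must satisfy. The following describes how to verify whether a group bill satisfies the condition that each value of the billing unit corresponds to a unique fee value of the deduction fee.
First, the entropy of the data in the group bill is calculated according to Formula 1:
$$\mathrm{Entropy} = \sum_{i} \frac{D_i}{D} \left( -\sum_{j} p_{ij} \log_2 p_{ij} \right) \tag{1}$$
where D is the number of bills in the group bill, D_i is the number of occurrences of billing unit i, and p_ij is the probability that deduction fee j occurs for billing unit i.
The entropy measures how mixed the mapping from the same billing duration (x) to different deductions (y) is in a group bill after grouping. The goal of grouping is to bring the entropy to 0 or close to 0, so that each billing-duration value in the group bill corresponds to exactly one deduction-fee value. For example, in Table 4, billing duration 30 corresponds to only one deduction-fee value (for example, 30).
For example, the calculation is illustrated with the group bill in Table 5, using Formula 1.
The group bill has two distinct billing durations. The entropy calculation and its result are as follows: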
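$$\mathrm{Entropy} = \frac{2}{4}\left(-\frac{1}{2}\log_2\frac{1}{2} - \frac{1}{2}\log_2\frac{1}{2}\right) + \frac{2}{4}\left(-\frac{1}{2}\log_2\frac{1}{2} - \frac{1}{2}\log_2\frac{1}{2}\right) = 1$$
The calculation has two parts. The first term, $\frac{2}{4}\left(-\frac{1}{2}\log_2\frac{1}{2} - \frac{1}{2}\log_2\frac{1}{2}\right)$, is the entropy contributed to the group bill by the bills with billing duration 30, and the second term, of the same form, is the entropy contributed by the bills with billing duration 40.
Take the first term, which corresponds to billing duration 30. The factor $\frac{2}{4}$ is the ratio of the number of bills with billing duration 30 to the total number of bills: the numerator 2 means that there are two bills with billing duration 30, and the denominator 4 means that the group bill contains four bills in total. Inside the parentheses, the $\frac{1}{2}$ in the first $-\frac{1}{2}\log_2\frac{1}{2}$ is the ratio of the number of bills with billing duration (x) 30 and deduction fee (y) 20 to the number of bills with billing duration 30, and the $\frac{1}{2}$ in the second $-\frac{1}{2}\log_2\frac{1}{2}$ is the ratio of the number of bills with billing duration (x) 30 and deduction fee (y) 10 to the number of bills with billing duration 30.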
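If the entropy is greater than the threshold, some billing-duration value corresponds to two or more fee values of the deduction fee. There is at least one such billing-duration value.
For example, a group bill contains billing durations 1, 5, 10, and 15, where billing duration 1 corresponds to deduction fees 1 and 2, billing duration 5 to deduction fees 5 and 6, and billing duration 10 to deduction fees 10 and 11; that is, three billing-duration values in the group bill do not satisfy the grouping condition. In other words, when the entropy is greater than the threshold, at least one billing-duration value fails the grouping condition in practice.
If the entropy is less than or equal to the threshold, each billing-duration value is treated as corresponding to exactly one fee value of the deduction fee. In practice, to allow for errors or faults, the threshold may be set slightly greater than 0 (for example, 0.001). When the entropy is less than or equal to 0.001, a few billing-duration values may still correspond to two or more fee values; this is acceptable in practice and makes the setting better suited to real use. For example, in a group bill with billing durations 1, 5, 10, and 15, billing duration 1 corresponds to deduction fee 1, billing duration 5 to deduction fee 5, and billing duration 15 to deduction fee 15, but billing duration 10 corresponds to deduction fees 10 and 12. That is, one of the billing-duration values corresponds to two fee values, which is within the tolerated error. Four values are used here only as an illustration; in practice there may be thousands of values, and a very small number of them may be allowed to correspond to two or more fee values.
Note that the billing-duration values and fee values above are examples and do not limit this application.
If the threshold is 0.001, the entropy of the group bill in Table 5 is 1, which is greater than the threshold. This indicates that in that group bill a same billing-duration value corresponds to at least two fee values of the deduction fee; as Table 5 shows, billing duration 30 corresponds to both deduction fee 20 and deduction fee 10.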
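The entropy check described above can be sketched as follows. This is a minimal illustration that assumes the formula is the weighted sum, over billing-unit values, of the base-2 entropy of the fee values observed for each value, which is the form the worked Table 5 calculation implies. The function name `group_entropy` and the pair representation are assumptions made here.

```python
import math
from collections import Counter, defaultdict


def group_entropy(pairs):
    """Weighted entropy of fee values per billing-unit value.

    For each billing-unit value i, the weight D_i / D multiplies the base-2
    entropy of the fee values observed for i.
    """
    fees_by_unit = defaultdict(Counter)
    for unit_value, fee_value in pairs:
        fees_by_unit[unit_value][fee_value] += 1
    total = sum(sum(counts.values()) for counts in fees_by_unit.values())  # D
    entropy = 0.0
    for fee_counts in fees_by_unit.values():
        d_i = sum(fee_counts.values())  # D_i
        inner = -sum((n / d_i) * math.log2(n / d_i) for n in fee_counts.values())
        entropy += (d_i / total) * inner
    return entropy


THRESHOLD = 0.001  # example threshold mentioned in the text

# Table 5 values quoted in the text.
table5 = [(30, 20), (30, 10), (40, 40), (40, 20)]
table5_entropy = group_entropy(table5)
needs_further_split = table5_entropy > THRESHOLD
```

For the Table 5 values the result is 1, matching the worked calculation, so the group bill would be split further. A group in which every duration has a single fee yields 0.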
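The entropy calculation for the group bill in Table 4 is not repeated here.
The entropy of the group bill in Table 4 is 0, which indicates that it satisfies the condition that each value of the billing unit corresponds to a unique fee value of the deduction fee. No further grouping is needed, and step 203 can be performed directly.
If the group bill is like the example in Table 5, it does not satisfy the condition that each value of the billing unit corresponds to a unique fee value of the deduction fee, and grouping must continue until the condition is satisfied.
It can be understood that grouping the multiple bills in the bill set may proceed as follows: first, the bills are initially grouped by the preset billing elements to obtain group bills; then the entropy of each group bill is calculated to verify whether it satisfies the grouping condition. There are two possible outcomes. In the first, the group bill satisfies the grouping condition, no further grouping is needed, and step 203 can be performed directly. In the second, the group bill does not satisfy the grouping condition, which means that several tariff rates are mixed in it; the group bill needs to be grouped further to obtain subgroup bills.
In this embodiment, the bills in the bill set are grouped twice: the first round is the initial grouping, and the second round groups a group bill into subgroup bills. Two rounds are used only as an example; in practice the number of grouping rounds is not limited. The initial grouping may already satisfy the grouping condition so that no further grouping is needed, or the bills may still fail the condition after the second round, in which case a third and fourth round are needed, and so on until the condition is satisfied. However many rounds there are, the principle is the same as in the second round, and later rounds can be understood by reference to the process of grouping a group bill described in this embodiment.
The embodiments of the present invention use one group bill as an example, but in practice every group bill needs to be verified against the grouping condition.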
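Optionally, similar group bills may further be merged. Similar group bills are group bills in which a preset number of the billing elements may differ but the tariff rate is the same. For example, for the group bill of Table 4, the billing-duration values and the corresponding fee values of the grouped bills are obtained, and the mapping between them is saved (in the group bill of Table 4, billing duration 30 corresponds to deduction fee 20 and billing duration 40 corresponds to deduction fee 40). Suppose another group bill C has the same mapping between billing duration and deduction fee, that is, it also maps billing duration 30 to deduction fee 20 and billing duration 40 to deduction fee 40. The group bill of Table 4 and group bill C then use the same tariff rate and are merged. Merging group bills with the same tariff rate allows the merged group bill to be processed as one, which improves the efficiency of subsequent processing steps.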
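Merging group bills with the same tariff rate can be sketched by keying each group bill on its set of (billing duration, fee) pairs. This is an assumption-based illustration: the function `merge_similar`, the group names, and the requirement that the two mappings be exactly equal are choices made here, not details given in the source.

```python
def merge_similar(group_bills):
    """Merge group bills whose duration-to-fee mappings are identical.

    `group_bills` maps a group name to its list of (duration, fee) pairs.
    Returns a list of (merged_group_names, merged_pairs) tuples.
    """
    merged = {}
    for name, pairs in group_bills.items():
        signature = frozenset(pairs)  # the saved duration-to-fee mapping
        names, all_pairs = merged.setdefault(signature, ([], []))
        names.append(name)
        all_pairs.extend(pairs)
    return list(merged.values())


group_bills = {
    # Mapping quoted in the text for the Table 4 group bill and group bill C.
    "table4": [(30, 20), (40, 40)],
    "C": [(40, 40), (30, 20), (30, 20)],
    # An invented group bill with a different tariff rate.
    "other": [(30, 10), (40, 20)],
}
merged_groups = merge_similar(group_bills)
```

Here `table4` and `C` collapse into one merged group bill while `other` stays separate. Requiring identical mappings is the strictest reading of "same tariff rate"; a looser rule (for example, agreement on overlapping durations only) would need the merged result to be re-verified with step b, as the text goes on to describe.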
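Further, the merged group bill (for example, the merger of the Table 4 group bill and group bill C) is verified: step b is repeated on it, its entropy is calculated, and it is determined whether it satisfies the grouping condition. If the merged group bill satisfies the grouping condition, step 203 is performed; if not, step c is performed.
Step c: If the group bill does not satisfy the grouping condition, determine a target billing element in the group bill and the information gain of the subgroup bills obtained by using the target billing element as the split point, where the target billing element corresponds to at least two different element values. Put the bills with the same target element value into one group to obtain at least two target subgroup bills.
Note that the group bill in this step may be a group bill after initial grouping or a group bill after merging.
Further, the following uses the group bill in Table 5 as an example to describe how it is grouped further into subgroup bills:
When, in each target subgroup bill, each value of the billing unit corresponds to a unique fee value of the deduction fee, the at least two target subgroup bills are group bills, and the target element value is the element value of the target billing element.
The target billing element may be a single billing element or may include at least two billing elements. The two cases are illustrated separately below.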
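In one possible implementation, the target billing element is a single billing element.
For example, suppose the hour of the caller's billing time is drawn from the billing elements as the target billing element, which can also be understood as the split point. See Table 6:
Table 6
Figure PCTCN2018073850-appb-000022
The group bill shown in Table 6 is split: the bills whose target element "hour" is "1" form one group and those whose "hour" is "2" form another, giving the two target subgroup bills shown in Tables 7 and 8.
Table 7
Figure PCTCN2018073850-appb-000023
Note that in the voice-service bill examples of the embodiments of the present invention, where no unit is stated, the billing duration may be measured in seconds and the deduction fee in fen; this is not repeated below.
Table 8
Figure PCTCN2018073850-appb-000024
The entropy of the target subgroup bills in Tables 7 and 8 is then calculated with Formula 1 to determine whether they satisfy the grouping condition. If they do, no further grouping is needed; if not, another billing element (for example, the day of the week of the billing time) is drawn from the billing elements as the split point and the target subgroup bill is grouped further, until the grouping condition is satisfied.
Note that the target subgroup bills shown in Tables 7 and 8 satisfy the grouping condition, so no further grouping is needed.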
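Splitting a group bill on a single target billing element can be sketched as below. Because Tables 6 to 8 survive only as image placeholders, the bill values here are invented: they reuse the duration and fee values quoted for Table 5 and attach an assumed `hour` value to each bill. The helper names are likewise assumptions.

```python
from collections import defaultdict


def split_by_element(bills, element):
    """Put bills with the same value of `element` into one subgroup."""
    subgroups = defaultdict(list)
    for bill in bills:
        subgroups[bill[element]].append(bill)
    return dict(subgroups)


def satisfies_grouping_condition(bills):
    """True if each billing duration maps to exactly one fee value."""
    fees_by_duration = defaultdict(set)
    for bill in bills:
        fees_by_duration[bill["duration"]].add(bill["fee"])
    return all(len(fees) == 1 for fees in fees_by_duration.values())


# Invented bills: the "hour" values are assumptions for illustration.
group_bill = [
    {"hour": 1, "duration": 30, "fee": 20},
    {"hour": 2, "duration": 30, "fee": 10},
    {"hour": 1, "duration": 40, "fee": 40},
    {"hour": 2, "duration": 40, "fee": 20},
]
subgroups = split_by_element(group_bill, "hour")
all_clean = all(satisfies_grouping_condition(s) for s in subgroups.values())
```

Before the split the group bill fails the grouping condition; after splitting on `hour`, both subgroup bills satisfy it, so "hour" would be accepted as the split point and no further element needs to be drawn.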
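In another possible implementation, the target billing element includes at least two billing elements. When the target billing element includes a first billing element and a second billing element, the bills with the same first element value are put into one group to obtain at least two first subgroup bills, where the first element value is the element value of the first billing element; and the bills with the same second element value are put into one group to obtain at least two second subgroup bills, where the second element value is the element value of the second billing element. See Table 9, an example in which the target billing element includes two billing elements: the first billing element is "hour" and the second billing element is "day of the week".
Table 9
Figure PCTCN2018073850-appb-000025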
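Step d: When the target billing element includes at least two billing elements, use the target billing element with the largest information gain as the split point and determine the subgroup bills obtained by splitting the group bill.
Calculate a first information gain of the first subgroup bills relative to the group bill and a second information gain of the second subgroup bills relative to the group bill. If the first information gain is greater than the second information gain, delete the second billing element from the multiple billing elements.
For example, in Table 9, "hour" is used as the split point for further grouping, and the first subgroup bills are as shown in Tables 10 and 11.
Table 10
Figure PCTCN2018073850-appb-000026
Table 11
Figure PCTCN2018073850-appb-000027
The entropy of the first subgroup bills in Tables 10 and 11, calculated with Formula 1, is 0. The information gain is the entropy of the group bill being split (1) minus the entropy of the subgroup bills after splitting (0), so the first information gain of the first subgroup bills is 1. In this embodiment, the group bill being split is the one shown in Table 9, and the subgroup bills after splitting are those shown in Tables 10 and 11.
Likewise, in Table 9, "day of the week" is used as the split point for further grouping, and the second subgroup bills are as shown in Tables 12 and 13.
Table 12
Figure PCTCN2018073850-appb-000028
Table 13
Figure PCTCN2018073850-appb-000029
The entropy of the second subgroup bills in Tables 12 and 13, calculated with Formula 1, is 1. The information gain is the entropy of the group bill being split (1) minus the entropy of the subgroup bills after splitting (1), so the second information gain of the second subgroup bills is 0.
Note that a larger information gain means a smaller entropy for the subgroup bills obtained by grouping on the added target billing element; that is, in the target subgroup bills, the probability that the same billing unit (x) corresponds to different deduction fees (y) is lower.
The target billing element with the largest information gain is used as the split point to determine the subgroup bills. In this embodiment, the first information gain is greater than the second, so "hour" is used as the split point. The entropy of the first subgroup bills in Tables 10 and 11 is 0, which is below the threshold (for example, 0.001), so these first subgroup bills are not split further.
The information-gain calculation shows that the day of the week does not help in further grouping the group bill, so the second billing element (day of the week) is deleted from the billing condition. Deleting a redundant billing element effectively reduces the amount of data: each billing element removed is one fewer data item to process, the storage required for the final tariff data is reduced, and the tariff data is clearer because interfering billing elements are gone. The billing condition after grouping is thus: service type, service flow, caller home location, callee home location, domestic long distance, and the hour of the caller's billing time.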
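Choosing the split point by information gain can be sketched as follows. The source gives the gain as the entropy of the group bill minus the entropy of the subgroup bills, but does not say how the entropies of several subgroup bills are combined; the size-weighted average used here is an assumption. The bill values are invented so that "hour" separates the tariff rates and "weekday" does not, mirroring the gains of 1 and 0 in the text.

```python
import math
from collections import Counter, defaultdict


def group_entropy(pairs):
    """Weighted base-2 entropy of fee values per billing-duration value."""
    fees_by_unit = defaultdict(Counter)
    for unit_value, fee_value in pairs:
        fees_by_unit[unit_value][fee_value] += 1
    total = sum(sum(c.values()) for c in fees_by_unit.values())
    entropy = 0.0
    for fee_counts in fees_by_unit.values():
        d_i = sum(fee_counts.values())
        inner = -sum((n / d_i) * math.log2(n / d_i) for n in fee_counts.values())
        entropy += (d_i / total) * inner
    return entropy


def information_gain(bills, element):
    """Entropy of the group bill minus the entropy after splitting on `element`."""
    parent = group_entropy([(b["duration"], b["fee"]) for b in bills])
    subgroups = defaultdict(list)
    for b in bills:
        subgroups[b[element]].append((b["duration"], b["fee"]))
    # Assumption: subgroup entropies are combined as a size-weighted average.
    children = sum(len(p) / len(bills) * group_entropy(p)
                   for p in subgroups.values())
    return parent - children


# Invented bills for illustration.
bills = [
    {"hour": 1, "weekday": 1, "duration": 30, "fee": 20},
    {"hour": 1, "weekday": 2, "duration": 40, "fee": 40},
    {"hour": 2, "weekday": 1, "duration": 30, "fee": 10},
    {"hour": 2, "weekday": 2, "duration": 40, "fee": 20},
]
gains = {element: information_gain(bills, element) for element in ("hour", "weekday")}
split_point = max(gains, key=gains.get)
redundant = [element for element in gains if element != split_point]
```

With this data `gains` is `{"hour": 1.0, "weekday": 0.0}`, so "hour" becomes the split point and "weekday" is the element that would be deleted from the billing condition.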
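Further, similar group bills may be merged: a group bill and a subgroup bill that both satisfy the grouping condition, or at least two subgroup bills that satisfy the grouping condition, may be merged into one group bill. Similar group bills here are group bills with the same tariff rate, meaning the same mapping between billing duration and deduction fee. Step b may then be repeated on the merged group bill to verify that it satisfies the grouping condition; if it does, step 203 is performed. Merging similar group bills effectively improves the efficiency of subsequent processing of the group bills.
Step 203: Analyze the group bill to obtain a data feature.
In light of step 202, the group bill in this step, which satisfies the grouping condition, may be: a group bill after initial grouping; a group bill obtained by merging initially grouped group bills; a group bill obtained by merging at least two subgroup bills; or a group bill obtained by merging an initially grouped group bill with subgroup bills obtained by splitting.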
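When the data feature includes the per-pulse deduction values of the deduction fee, in a first example, the different fee values of the deduction fee in the group bill are obtained. See Table 14.
Table 14
Billing duration (x) Deduction fee (y)
1 20
15 20
30 20
31 40
60 40
71 60
72 60
Then the different fee values of the deduction fee are sorted; see Table 15. Sorting may be in descending or ascending order; this embodiment uses ascending order as the example.
Table 15
Deduction fee (y)
20
40
60
Then the difference between each two consecutive fee values after sorting is determined; see Table 16.
Table 16
Calculation Fee difference
20-0 20
40-20 20
60-40 20
The difference is the per-pulse deduction value.
Table 16 shows that the difference takes only one fixed value, 20, so the number of per-pulse deduction values is one.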
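In a second example:
The deduction fees corresponding to the obtained billing durations are shown in Table 17:
Table 17
Billing duration (x) Deduction fee (y)
1 20
15 20
30 20
31 40
60 40
61 50
72 50
90 50
91 60
120 70
Then the different fee values of the deduction fee are sorted; see Table 18:
Table 18
Deduction fee (y)
20
40
50
60
70
Then the difference between each two consecutive fee values after sorting is determined; see Table 19.
Table 19
Calculation Fee difference
20-0 20
40-20 20
50-40 10
60-50 10
70-60 10
Table 19 shows that, after the deduction fees (y) are sorted, the differences between the fee values, which are the per-pulse deduction values, are 20, 20, 10, 10, and 10. There are two distinct differences (20 and 10), so the number of per-pulse deduction values is two.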
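The derivation of the per-pulse deduction values in Tables 14 to 19 can be sketched as below, using the fee columns of Tables 14 and 17. The function names and the string labels `"simple"` and `"tiered"` are assumptions introduced for illustration; the leading 0 reproduces the "20-0" row of Tables 16 and 19.

```python
def per_pulse_deduction_values(fees):
    """Differences between consecutive distinct fee values, sorted ascending.

    A leading 0 is used as the starting point, as in the "20-0" row.
    """
    ordered = sorted(set(fees))
    return [current - previous
            for previous, current in zip([0] + ordered, ordered)]


def choose_model(fees):
    """Pick the simple model for one distinct difference, else the tiered model."""
    distinct = set(per_pulse_deduction_values(fees))
    return "simple" if len(distinct) == 1 else "tiered"


# Deduction-fee columns of Table 14 and Table 17.
table14_fees = [20, 20, 20, 40, 40, 60, 60]
table17_fees = [20, 20, 20, 40, 40, 50, 50, 50, 60, 70]

table14_diffs = per_pulse_deduction_values(table14_fees)
table17_diffs = per_pulse_deduction_values(table17_fees)
```

The Table 14 fees give differences 20, 20, 20 (one distinct value, so the simple model), and the Table 17 fees give 20, 20, 10, 10, 10 (two distinct values, so the tiered model). Note that this relies on the bills covering every fee level; a missing level would show up as a spurious larger difference.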
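Note that the embodiments of the present invention use the voice service as the example, so the billing unit in the examples is always the billing duration; this does not limit this application. The billing unit differs between services: in the SMS or MMS service the billing unit is the billing message count, and in the data service it is the billing traffic.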
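Step 204: Select, according to the data feature, the target tariff model corresponding to the data feature from the preset tariff models, where a tariff model is used to calculate the deduction fee from the billing unit.
In the first example of step 203, when the number of per-pulse deduction values is one, the first tariff model is selected from the preset tariff models as the target tariff model. The preset tariff models include multiple tariff models; this embodiment uses the simple tariff model and the tiered tariff model as examples, and in practice neither the specific form nor the number of the preset tariff models is limited.
The first tariff model is the simple tariff model, in the following form:
$$y = \operatorname{ceil}\left(\frac{x}{\mathit{pulse}}\right) \times \mathit{unitFee}$$
where y is the deduction fee; x is the billing duration; unitFee is the per-pulse deduction value; pulse is the pulse, that is, the smallest billing-duration increment; and ceil is the ceiling function, which rounds up. In this model there is one unitFee.
For example, billing systems commonly charge 1 yuan (equivalent to 100 fen) per 60 seconds, with any part of 60 seconds counted as 60 seconds. Substituting into the formula gives:
$$y = \operatorname{ceil}\left(\frac{x}{60}\right) \times 100$$
When x = 30, y = 100.
In the second example of step 203, when the number of per-pulse deduction values is at least two, the second tariff model is selected from the preset tariff models as the target tariff model.
The second tariff model is the tiered tariff model; its form is given below.
For convenience of description, this embodiment uses a tiered tariff model with only one breakpoint (breakPoint) as the example; in practice the number of breakpoints in a tiered tariff model is not limited.
A breakpoint is the billing duration at the boundary between different per-pulse deduction values.
See Tables 17 and 19: in Table 19 the per-pulse deduction values are 20, 20, 10, 10, and 10, and in Table 17 the billing duration at which the per-pulse deduction value changes (from 20 to 10) is 60, so billing duration 60 is the breakpoint.
When the billing duration is less than the breakpoint, the tariff model is:
$$y = \operatorname{ceil}\left(\frac{x}{\mathit{pulse1}}\right) \times \mathit{unitFee1}$$
When the billing duration exceeds the breakpoint, the tariff model is:
$$y = \operatorname{ceil}\left(\frac{\mathit{breakPoint1}}{\mathit{pulse1}}\right) \times \mathit{unitFee1} + \operatorname{ceil}\left(\frac{x - \mathit{breakPoint1}}{\mathit{pulse2}}\right) \times \mathit{unitFee2}$$
where unitFee1 is the per-pulse deduction value for billing durations from 0 to breakPoint1; pulse1 is the pulse for billing durations from 0 to breakPoint1; unitFee2 is the per-pulse deduction value for billing durations greater than breakPoint1; and pulse2 is the pulse for billing durations greater than breakPoint1.
For example, with the tariff rate "within 3 minutes: 50 fen per 60 seconds; beyond 3 minutes: 10 fen per 30 seconds, with any part of 30 seconds counted as 30 seconds", unitFee1 is 0.5 yuan, pulse1 is 60 seconds, breakPoint1 is 3 minutes, unitFee2 is 0.1 yuan, and pulse2 is 30 seconds.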
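The two tariff models can be sketched as plain functions. This is a hedged illustration: the tiered formula above the breakpoint (the fee accumulated up to the breakpoint plus the fee for the remainder) is assumed from the parameter definitions, because the original formula images are not available, and treating a duration exactly equal to the breakpoint as belonging to the first tier is also an assumption. Function and parameter names are introduced here.

```python
import math


def simple_fee(x, pulse, unit_fee):
    """Simple tariff model: y = ceil(x / pulse) * unitFee."""
    return math.ceil(x / pulse) * unit_fee


def tiered_fee(x, break_point1, pulse1, unit_fee1, pulse2, unit_fee2):
    """Tiered tariff model with a single breakpoint.

    Assumption: a duration equal to the breakpoint is charged at the first tier.
    """
    if x > break_point1:
        first_tier = math.ceil(break_point1 / pulse1) * unit_fee1
        second_tier = math.ceil((x - break_point1) / pulse2) * unit_fee2
        return first_tier + second_tier
    return math.ceil(x / pulse1) * unit_fee1


# Simple example from the text: 100 fen per 60 seconds, 30 seconds used.
simple_example = simple_fee(30, pulse=60, unit_fee=100)

# Tiered example from the text, in seconds and fen: 50 fen per 60 seconds
# within 3 minutes, then 10 fen per 30 seconds.
within_tier = tiered_fee(120, 180, 60, 50, 30, 10)
beyond_tier = tiered_fee(200, 180, 60, 50, 30, 10)
```

`simple_example` is 100, matching the text. For the tiered rate, a 120-second call costs 2 × 50 = 100 fen, and a 200-second call costs 3 × 50 + 1 × 10 = 160 fen under the assumed formula.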
步骤205、根据数据特征确定目标资费模型中的各资费参数对应的参数值。
资费费率可以包括资费模型及该资费模型对应的资费参数。该资费参数包括跳次和跳次扣费值。例如,
Figure PCTCN2018073850-appb-000034
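In this simple tariff model, the tariff parameters and their values are: per-pulse charge 100, pulse 60.
1) Determining the pulse: determine the minimum and maximum charging duration corresponding to the same fee value. The example shown in Table 14 above is used for illustration; see Table 20 below:
Table 20
y     x (minimum, maximum)
20    (1, 30)
40    (31, 60)
60    (71, 72)
The pulse is determined from the minimum and maximum charging durations corresponding to the same fee value, as shown in Table 20. The pulse is the smallest charging-duration unit, that is, a fee is deducted each time a further minimum duration unit (pulse) is started; in the example of Table 20, the corresponding fee is deducted for every 30-second unit. A CDR group may contain a large number of CDRs, each of which includes a charging duration and the fee deducted for that duration, so the same fee value may correspond to several charging-duration values. In Table 14, for example, the fee value 20 corresponds to the charging durations 1, 15 and 30, and when the charging duration is 31 the fee value changes; the charging duration 30 is therefore the threshold at which the deducted fee changes. A minimum charging-duration unit (that is, the duration beyond which the fee value changes) can thus very likely be determined from the maximum and minimum charging durations corresponding to the same fee value. See Table 21 below:
Table 21
y     Pulse
20    30 = 30 - 1 + 1
40    30 = 60 - 31 + 1
60    2 = 72 - 71 + 1
As Table 21 shows, the set of calculated pulses is (30, 2).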
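The pulse candidates of Tables 20 and 21 can be derived mechanically. The sketch below is illustrative only: the function name `pulse_candidates` is an assumption, and the sample records are assembled from the values quoted in the text for Tables 14 and 20.

```python
def pulse_candidates(records):
    """Derive candidate pulses from (charging_duration, fee) pairs.

    For each fee value, the candidate pulse is max(x) - min(x) + 1,
    as in Table 21. Duplicates are dropped, keeping first-seen order.
    """
    spans = {}
    for x, y in records:
        lo, hi = spans.get(y, (x, x))
        spans[y] = (min(lo, x), max(hi, x))
    widths = [hi - lo + 1 for lo, hi in spans.values()]
    return list(dict.fromkeys(widths))


# (charging duration in seconds, deducted fee in fen)
records = [(1, 20), (15, 20), (30, 20), (31, 40), (60, 40), (71, 60), (72, 60)]
print(pulse_candidates(records))  # [30, 2]
```

The candidate 2 arises only because the group happens to contain few samples for the fee value 60, which is why the candidates are verified against the saved mapping afterwards.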
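2) Determining the per-pulse charge: in the example of Table 15, the difference between the fee values of the deducted fees is a fixed value of 20, so this difference is taken as the per-pulse charge.
The candidate set of parameter values determined by steps 1) and 2) above includes a first parameter value and a second parameter value, as shown in Table 22 below:
Table 22
First parameter value     unitFee=20, pulse=30
Second parameter value    unitFee=20, pulse=2
The first and second parameter values in the candidate set are each substituted into the selected target tariff model (for example, the simple tariff model) and verified against the mapping between charging duration (x) and deducted fee (y) saved in step 202. The first parameter value (unitFee=20, pulse=30) matches the saved mapping between charging duration and deducted fee, so the first parameter value is finally determined to be the parameter value of the target tariff model.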
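The verification of the candidates in Table 22 can be sketched as follows, assuming the simple tariff model has the form y = unitFee * ceil(x / pulse). The helper name `fits` and the sample mapping (assembled from the values quoted for Tables 14 and 20) are assumptions.

```python
import math


def fits(unit_fee, pulse, mapping):
    """Check a candidate (unit_fee, pulse) against every saved (x, y) pair."""
    return all(unit_fee * math.ceil(x / pulse) == y for x, y in mapping.items())


# Saved mapping: charging duration (s) -> deducted fee (fen).
mapping = {1: 20, 15: 20, 30: 20, 31: 40, 60: 40, 71: 60, 72: 60}
candidates = [(20, 30), (20, 2)]  # (unitFee, pulse) pairs from Table 22
accepted = [c for c in candidates if fits(c[0], c[1], mapping)]
print(accepted)  # [(20, 30)]
```

The second candidate fails as soon as x = 15 is checked, because ceil(15 / 2) * 20 = 160 rather than 20.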
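Step 206: Determine the tariff data according to the parameter values, the target tariff model and the charging condition.
The tariff rate of this group of data is deduced, for example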
Figure PCTCN2018073850-appb-000035
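Its physical meaning is 20 fen per 30 seconds, with any period shorter than 30 seconds counted as 30 seconds.
For example, the tariff data may be as shown in Table 23 below:
Table 23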
Figure PCTCN2018073850-appb-000036
Figure PCTCN2018073850-appb-000037
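It should be noted that the above tariff data is merely an example given for ease of description and does not limit the present application.
Step 207: Merge, according to the package identifiers carried in the CDRs, the tariff data under the same package identifier to obtain package tariff data.
Every CDR in the CDR set carries a package identifier, which indicates the tariff package to which the CDR belongs. The tariffs for the individual charging conditions under the same package are combined to obtain the package tariff data; for example, the tariff rates for voice, data, SMS and MMS under package A are combined in a tree structure, which makes them easier for users to view and to analyze later.
For example, the tariff data under package A may include a voice tariff, a data service tariff, and SMS and MMS service tariffs. For example: voice package: monthly fee of 20 yuan, up to 60 minutes of calls, with additional time charged at 0.5 yuan per minute; SMS package: monthly fee of 10 yuan, up to 200 messages, with additional messages charged at 0.1 yuan each; data package: monthly fee of 20 yuan, up to 50 MB of traffic, with additional traffic charged at 3 yuan per MB.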
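One possible way to merge tariff data into a per-package tree is sketched below. The record layout (`package_id`, `service`, `rate`) and the function name `merge_by_package` are assumptions made for illustration; the source does not specify a data format.

```python
def merge_by_package(tariff_items):
    """Group tariff entries into a two-level tree: package -> service -> rates."""
    tree = {}
    for item in tariff_items:
        services = tree.setdefault(item["package_id"], {})
        services.setdefault(item["service"], []).append(item["rate"])
    return tree


# Hypothetical tariff data deduced for individual charging conditions.
items = [
    {"package_id": "A", "service": "voice", "rate": "0.5 yuan/minute beyond 60 minutes"},
    {"package_id": "A", "service": "sms", "rate": "0.1 yuan/message beyond 200 messages"},
    {"package_id": "A", "service": "data", "rate": "3 yuan/MB beyond 50 MB"},
    {"package_id": "B", "service": "voice", "rate": "20 fen/30 s"},
]
package_tariffs = merge_by_package(items)
print(sorted(package_tariffs["A"]))  # ['data', 'sms', 'voice']
```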
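Optionally, if the charging conditions corresponding to two different tariff rates differ only in their time interval, they may be merged; see the tariff data in Table 24 below.
Table 24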
Figure PCTCN2018073850-appb-000038
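It should be noted that step 207 is optional and need not be performed.
Further, the determined package tariff data or tariff data is saved to a storage server, to support subsequent rating by the new charging server and to allow the CDR comparison server to compare the fees deducted in the old and new CDRs.
In the embodiments of the present invention, the large number of CDRs in the old charging system include charging elements, charging durations and deducted fees, so the tariff data can be derived by working backwards from the data in those CDRs. There is no need, as in the conventional approach, to obtain the tariff data of the old charging system by manually and repeatedly correcting old charging CDRs. This greatly reduces the reliance on manual work, improves the efficiency of determining the old charging system's data, and saves computing resources and labor.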
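The method for determining tariff data in the embodiments of the present invention has been described in detail above. The apparatus for determining tariff data to which the method is applied is described below. The apparatus may take the form of a server. Referring to FIG. 5, one embodiment of an apparatus 500 for determining tariff data provided in the embodiments of the present invention includes:
an obtaining module 501, configured to obtain a CDR set, where each CDR in the CDR set includes a charging condition and, under that charging condition, a charging unit and a deducted fee;
a grouping module 502, configured to group the multiple CDRs in the CDR set obtained by the obtaining module 501 according to a preset rule to obtain a CDR group, where within the CDR group each value of the charging unit corresponds to a unique fee value of the deducted fee;
a data feature analysis module 503, configured to analyze the CDR group determined by the grouping module 502 to obtain a data feature;
a tariff model determining module 504, configured to select, according to the data feature determined by the data feature analysis module 503, a target tariff model corresponding to the data feature from the preset tariff models, where a tariff model is used to calculate the deducted fee from the charging unit;
a parameter value determining module 505, configured to determine the parameter value of each tariff parameter in the target tariff model according to the values of the charging unit and the fee values of the deducted fee in the CDR group determined by the grouping module 502; and
a tariff data determining module 506, configured to determine the tariff data according to the parameter values determined by the parameter value determining module 505, the target tariff model determined by the tariff model determining module 504, and the charging condition.
Optionally, the grouping module 502 is further configured to group the multiple CDRs in the CDR set according to preset charging elements among the multiple charging elements to obtain the CDR group, where the element values corresponding to the same preset charging element are identical within the CDR group.
Optionally, the grouping module 502 is further specifically configured to:
when the same value of the charging unit corresponds to at least two fee values of the deducted fee, determine a target charging element in the CDR group, where the target charging element has at least two different element values;
place CDRs with the same target element value in one group to obtain at least two target CDR subgroups; and
when, in each target CDR subgroup, each value of the charging unit corresponds to a unique fee value of the deducted fee, use the at least two target CDR subgroups as CDR groups, where the target element value is the element value of the target charging element.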
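The regrouping behaviour described for the grouping module can be sketched as follows. The CDR representation (dicts with keys `x`, `y` and a hypothetical charging element `zone`) and both function names are assumptions for illustration.

```python
def has_unique_fee(cdrs):
    """True if every charging-unit value x maps to exactly one fee value y."""
    seen = {}
    for cdr in cdrs:
        if seen.setdefault(cdr["x"], cdr["y"]) != cdr["y"]:
            return False
    return True


def split_by_element(cdrs, element):
    """Place CDRs with the same value of the given charging element in one subgroup."""
    groups = {}
    for cdr in cdrs:
        groups.setdefault(cdr[element], []).append(cdr)
    return groups


# Hypothetical group in which the same duration is charged two different fees.
group = [
    {"x": 30, "y": 20, "zone": "local"},
    {"x": 30, "y": 40, "zone": "roaming"},
    {"x": 60, "y": 40, "zone": "local"},
]
subgroups = split_by_element(group, "zone")
print(has_unique_fee(group))                                   # False
print(all(has_unique_fee(s) for s in subgroups.values()))      # True
```

Once every subgroup passes the uniqueness check, the subgroups can be treated as CDR groups in their own right.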
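Optionally, the grouping module 502 is further specifically configured to:
calculate the entropy of the data in the CDR group according to the following formula;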
Figure PCTCN2018073850-appb-000039
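where D denotes the number of CDRs in the CDR group, D_i denotes the number of occurrences of charging unit value i, and p_ij denotes the probability that the deducted fee j occurs for charging unit value i;
if the entropy is greater than a threshold, the number of fee values of the deducted fee corresponding to the same value of the charging unit is greater than or equal to 2; and
if the entropy is less than or equal to the threshold, the number of fee values of the deducted fee corresponding to the same value of the charging unit is 1.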
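The entropy formula itself is an image that is not reproduced here. Given the definitions of D, D_i and p_ij, a plausible reading is the conditional entropy H = sum over i of (D_i / D) * (-sum over j of p_ij * log2 p_ij). The sketch below implements that assumed form; the base-2 logarithm and the function name `group_entropy` are also assumptions.

```python
import math
from collections import Counter, defaultdict


def group_entropy(pairs):
    """Assumed conditional entropy of fee values given charging-unit values.

    pairs: list of (charging_unit_value, fee_value) tuples.
    """
    total = len(pairs)
    fees_by_unit = defaultdict(Counter)
    for x, y in pairs:
        fees_by_unit[x][y] += 1
    entropy = 0.0
    for counts in fees_by_unit.values():
        d_i = sum(counts.values())  # occurrences of this charging-unit value
        h_i = -sum((n / d_i) * math.log2(n / d_i) for n in counts.values())
        entropy += (d_i / total) * h_i
    return entropy


consistent = [(30, 20), (30, 20), (60, 40)]
ambiguous = [(30, 20), (30, 40), (60, 40), (60, 40)]
print(group_entropy(consistent))  # zero: every duration has a single fee
print(group_entropy(ambiguous))   # 0.5
```

Under this reading the entropy is zero exactly when every charging-unit value maps to a single fee value, which matches the threshold test described above.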
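Referring to FIG. 6, on the basis of the embodiment corresponding to FIG. 5, an embodiment of the present invention further provides an apparatus 600 for determining tariff data, which further includes a calculation module 507 and a deletion module 508.
When the target charging element includes a first charging element and a second charging element, the grouping module 502 is further specifically configured to:
place CDRs with the same first element value in one group to obtain at least two first CDR subgroups, where the first element value is the element value of the first charging element; and
place CDRs with the same second element value in one group to obtain at least two second CDR subgroups, where the second element value is the element value of the second charging element.
The calculation module 507 is configured to calculate a first information gain of the first CDR subgroups determined by the grouping module 502 relative to the CDR group, and a second information gain of the second CDR subgroups determined by the grouping module 502 relative to the CDR group.
The deletion module 508 is configured to delete the second charging element from the multiple charging elements when the calculation module 507 determines that the first information gain is greater than the second information gain.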
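The comparison of information gains can be illustrated as below. The text visible here does not define information gain, so the sketch uses the usual decision-tree definition (entropy of the group minus the size-weighted entropy of its subgroups) together with an assumed conditional-entropy formula. The charging elements `zone` and `weekday` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(cdrs):
    """Assumed conditional entropy of fee y given charging-unit value x."""
    by_unit = defaultdict(Counter)
    for cdr in cdrs:
        by_unit[cdr["x"]][cdr["y"]] += 1
    result = 0.0
    for counts in by_unit.values():
        d_i = sum(counts.values())
        h_i = -sum((n / d_i) * math.log2(n / d_i) for n in counts.values())
        result += (d_i / len(cdrs)) * h_i
    return result


def information_gain(cdrs, element):
    """Entropy reduction achieved by splitting the group on one charging element."""
    subgroups = defaultdict(list)
    for cdr in cdrs:
        subgroups[cdr[element]].append(cdr)
    remainder = sum(len(s) / len(cdrs) * entropy(s) for s in subgroups.values())
    return entropy(cdrs) - remainder


group = [
    {"x": 30, "y": 20, "zone": "local", "weekday": "mon"},
    {"x": 30, "y": 40, "zone": "roaming", "weekday": "mon"},
    {"x": 30, "y": 20, "zone": "local", "weekday": "tue"},
    {"x": 30, "y": 40, "zone": "roaming", "weekday": "tue"},
]
gain_zone = information_gain(group, "zone")
gain_weekday = information_gain(group, "weekday")
print(gain_zone > gain_weekday)  # True
```

In this sample, splitting on `zone` fully explains the fee differences while splitting on `weekday` explains nothing, so `weekday` would be the element removed.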
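Referring to FIG. 7, on the basis of the embodiment corresponding to FIG. 5, an embodiment of the present invention further provides an apparatus 700 for determining tariff data, in which:
every CDR in the CDR set carries a package identifier, which indicates the tariff package to which the CDR belongs;
the apparatus for determining tariff data further includes a merging module 509; and
the merging module 509 is configured to merge, according to the package identifiers carried in the CDRs, the tariff data determined by the tariff data determining module 506 under the same package identifier to obtain package tariff data.
Optionally, the pulse denotes the smallest unit of the charging unit, the per-pulse charge denotes the fee value corresponding to that pulse, and the data feature analysis module 503 is further specifically configured to:
obtain the distinct fee values of the deducted fees in the CDR group;
arrange the distinct fee values in order;
determine the difference between each pair of consecutive fee values after sorting, where the difference is a per-pulse charge; and
determine the number of distinct per-pulse charge values.
Optionally, the tariff model determining module 504 is further specifically configured to:
if the number of per-pulse charge values is 1, select the first tariff model from the preset tariff models as the target tariff model; and
if the number of per-pulse charge values is at least 2, select the second tariff model from the preset tariff models as the target tariff model.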
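The selection rule applied by the tariff model determining module can be sketched as follows. The function name `select_tariff_model` and the string labels are assumptions; the labels stand for the first (simple) and second (tiered) tariff models.

```python
def select_tariff_model(jump_charges):
    """Choose a model from the list of per-pulse charge values.

    One distinct value selects the simple model; two or more select the
    tiered model.
    """
    distinct = set(jump_charges)
    if not distinct:
        raise ValueError("at least one per-pulse charge value is required")
    return "simple" if len(distinct) == 1 else "tiered"


print(select_tariff_model([20, 20, 20]))          # simple
print(select_tariff_model([20, 20, 10, 10, 10]))  # tiered
```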
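Further, the apparatuses for determining tariff data in FIG. 5 to FIG. 7 are presented in the form of functional modules. A "module" here may be an application-specific integrated circuit (ASIC), a circuit, a processor and memory that execute one or more software or firmware programs, an integrated logic circuit, and/or another device that can provide the functions described above. In a simple embodiment, the apparatuses for determining tariff data in FIG. 5 to FIG. 7 may take the form shown in FIG. 8.
FIG. 8 shows an apparatus 800 for determining tariff data provided in an embodiment of the present invention. The apparatus may take the form of a server; in this embodiment the apparatus 800 is described using a server as an example. The server 800 may vary considerably with configuration or performance, and may include one or more processors 822 and a memory 832, and one or more storage media 830 (for example, one or more mass storage devices) storing application programs 842 or data 844. The memory 832 and the storage medium 830 may provide transient or persistent storage. A program stored in the storage medium 830 may include one or more modules (not shown), each of which may include a series of instruction operations on the server. Further, the processor 822 may be configured to communicate with the storage medium 830 and to execute, on the server 800, the series of instruction operations in the storage medium 830.
The server 800 may further include one or more power supplies 826, one or more wired or wireless network interfaces 850, one or more input/output interfaces 858, and/or one or more operating systems 841, such as Windows Server, Mac OS X, Unix, Linux or FreeBSD.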
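The network interface 850 is configured to obtain a CDR set, where each CDR in the CDR set includes a charging condition and, under that charging condition, a charging unit and a deducted fee.
The processor 822 is configured to: group the multiple CDRs in the CDR set obtained by the network interface 850 according to a preset rule to obtain a CDR group, where within the CDR group each value of the charging unit corresponds to a unique fee value of the deducted fee; analyze the CDR group to obtain a data feature; select, according to the data feature, a target tariff model corresponding to the data feature from the preset tariff models, where a tariff model is used to calculate the deducted fee from the charging unit; determine the parameter value of each tariff parameter in the target tariff model according to the values of the charging unit and the fee values of the deducted fee in the CDR group; and determine the tariff data according to the parameter values, the target tariff model and the charging condition.
Optionally, the charging condition includes multiple charging elements.
The processor 822 is further configured to group the multiple CDRs in the CDR set according to preset charging elements among the multiple charging elements to obtain the CDR group, where the element values corresponding to the same preset charging element are identical within the CDR group.
Optionally, the processor 822 is further configured to: when the same value of the charging unit corresponds to at least two fee values of the deducted fee, determine a target charging element in the CDR group, where the target charging element has at least two different element values; place CDRs with the same target element value in one group to obtain at least two target CDR subgroups; and when, in each target CDR subgroup, each value of the charging unit corresponds to a unique fee value of the deducted fee, use the at least two target CDR subgroups as CDR groups, where the target element value is the element value of the target charging element.
Optionally, the processor 822 is further configured to calculate the entropy of the data in the CDR group according to the following formula;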
Figure PCTCN2018073850-appb-000040
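where D denotes the number of CDRs in the CDR group, D_i denotes the number of occurrences of charging unit value i, and p_ij denotes the probability that the deducted fee j occurs for charging unit value i;
if the entropy is greater than a threshold, it is determined that the number of fee values of the deducted fee corresponding to the same value of the charging unit is greater than or equal to 2; if the entropy is less than or equal to the threshold, it is determined that the number of fee values of the deducted fee corresponding to the same value of the charging unit is 1.
Optionally, when the target charging element includes a first charging element and a second charging element, the processor 822 is further configured to: place CDRs with the same first element value in one group to obtain at least two first CDR subgroups, where the first element value is the element value of the first charging element; place CDRs with the same second element value in one group to obtain at least two second CDR subgroups, where the second element value is the element value of the second charging element; calculate a first information gain of the first CDR subgroups relative to the CDR group and a second information gain of the second CDR subgroups relative to the CDR group; and if the first information gain is greater than the second information gain, delete the second charging element from the multiple charging elements.
Optionally, every CDR in the CDR set carries a package identifier, which indicates the tariff package to which the CDR belongs.
The processor 822 is further configured to merge, according to the package identifiers carried in the CDRs, the tariff data under the same package identifier to obtain package tariff data.
Optionally, when the data feature includes the number of per-pulse charge values of the deducted fee, the pulse denotes the smallest unit of the charging unit, and the per-pulse charge denotes the fee value corresponding to that pulse.
The processor 822 is further configured to: obtain the distinct fee values of the deducted fees in the CDR group; arrange the distinct fee values in order; determine the difference between each pair of consecutive fee values after sorting, where the difference is a per-pulse charge; and determine the number of distinct per-pulse charge values.
The processor 822 is further configured to: when the number of per-pulse charge values is 1, select the first tariff model from the preset tariff models as the target tariff model; and when the number of per-pulse charge values is at least 2, select the second tariff model from the preset tariff models as the target tariff model.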
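An embodiment of the present invention further provides a computer storage medium for storing the computer software instructions used by the apparatus for determining tariff data shown in FIG. 8, including a program designed to perform the foregoing method embodiments. By executing the stored program, the tariff data can be determined.
The foregoing embodiments may be implemented wholly or partly in software, hardware, firmware or any combination thereof. When software is used, the embodiments may be implemented wholly or partly in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are wholly or partly produced. The computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (for example, coaxial cable, optical fiber or digital subscriber line (DSL)) or wireless means (for example, infrared, radio or microwave). The computer-readable storage medium may be any usable medium that a computer can store, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system, apparatus and units described above may be found in the corresponding processes of the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses or units, and may be electrical, mechanical or of another form.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of an embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The foregoing embodiments are merely intended to describe the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.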

Claims (20)

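  1. A method for determining tariff data, comprising:
    obtaining a CDR set, wherein each CDR in the CDR set comprises a charging condition and, under the charging condition, a charging unit and a deducted fee;
    grouping multiple CDRs in the CDR set according to a preset rule to obtain a CDR group, wherein within the CDR group each value of the charging unit corresponds to a unique fee value of the deducted fee;
    analyzing the fee values of the deducted fees in the CDR group to obtain a data feature;
    selecting, according to the data feature, a target tariff model corresponding to the data feature from preset tariff models, wherein a tariff model is used to calculate the deducted fee from the charging unit;
    determining a parameter value of each tariff parameter in the target tariff model according to the values of the charging unit and the fee values of the deducted fee in the CDR group; and
    determining the tariff data according to the parameter values, the target tariff model and the charging condition.
  2. The method for determining tariff data according to claim 1, wherein the charging condition comprises multiple charging elements, and the grouping of multiple CDRs in the CDR set according to a preset rule to obtain a CDR group comprises:
    grouping the multiple CDRs in the CDR set according to preset charging elements among the multiple charging elements to obtain the CDR group, wherein the element values corresponding to the same preset charging element are identical within the CDR group.
  3. The method for determining tariff data according to claim 2, wherein the method further comprises:
    when the same value of the charging unit corresponds to at least two fee values of the deducted fee, determining a target charging element in the CDR group, wherein the target charging element has at least two different element values;
    placing CDRs with the same target element value in one group to obtain at least two target CDR subgroups; and
    when, in each target CDR subgroup, each value of the charging unit corresponds to a unique fee value of the deducted fee, using the at least two target CDR subgroups as the CDR group, wherein the target element value is the element value of the target charging element.
  4. The method for determining tariff data according to claim 3, wherein the method further comprises:
    calculating the entropy of the data in the CDR group according to the following formula;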
    Figure PCTCN2018073850-appb-100001
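    wherein D denotes the number of CDRs in the CDR group, D_i denotes the number of occurrences of charging unit value i, and p_ij denotes the probability that the deducted fee j occurs for charging unit value i;
    if the entropy is greater than a threshold, determining that the number of fee values of the deducted fee corresponding to the same value of the charging unit is greater than or equal to 2; and
    if the entropy is less than or equal to the threshold, determining that the number of fee values of the deducted fee corresponding to the same value of the charging unit is 1.
  5. The method for determining tariff data according to claim 3, wherein when the target charging element comprises a first charging element and a second charging element, the placing of CDRs with the same target element value in one group to obtain at least two target CDR subgroups comprises:
    placing CDRs with the same first element value in one group to obtain at least two first CDR subgroups, wherein the first element value is the element value of the first charging element; and
    placing CDRs with the same second element value in one group to obtain at least two second CDR subgroups, wherein the second element value is the element value of the second charging element;
    and the method further comprises:
    calculating a first information gain of the first CDR subgroups relative to the CDR group, and calculating a second information gain of the second CDR subgroups relative to the CDR group; and
    if the first information gain is greater than the second information gain, deleting the second charging element from the multiple charging elements.
  6. The method for determining tariff data according to any one of claims 1 to 5, wherein every CDR in the CDR set carries a package identifier, the package identifier indicates the tariff package to which the CDR belongs, and after the determining of the tariff data according to the parameter values, the target tariff model and the charging condition, the method further comprises:
    merging, according to the package identifiers carried in the CDRs, the tariff data under the same package identifier to obtain package tariff data.
  7. The method for determining tariff data according to any one of claims 1 to 5, wherein when the data feature comprises the number of per-pulse charge values of the deducted fee, the pulse denotes the smallest unit of the charging unit, the per-pulse charge denotes the fee value corresponding to that pulse, and the analyzing of the CDR group to obtain a data feature comprises:
    obtaining the distinct fee values of the deducted fees in the CDR group;
    arranging the distinct fee values in order;
    determining the difference between each pair of consecutive fee values after sorting, wherein the difference is the per-pulse charge; and
    determining the number of distinct per-pulse charge values.
  8. The method for determining tariff data according to claim 7, wherein the selecting, according to the data feature, of a target tariff model corresponding to the data feature from preset tariff models comprises:
    if the number of per-pulse charge values is 1, selecting a first tariff model from the preset tariff models as the target tariff model; and
    if the number of per-pulse charge values is at least 2, selecting a second tariff model from the preset tariff models as the target tariff model.
  9. The method for determining tariff data according to any one of claims 1 to 8, wherein the charging unit comprises a charging duration of a voice service, a charged traffic volume of a data service, and a number of charged messages of an SMS or MMS service.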
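  10. An apparatus for determining tariff data, comprising:
    an obtaining module, configured to obtain a CDR set, wherein each CDR in the CDR set comprises a charging condition and, under the charging condition, a charging unit and a deducted fee;
    a grouping module, configured to group multiple CDRs in the CDR set obtained by the obtaining module according to a preset rule to obtain a CDR group, wherein within the CDR group each value of the charging unit corresponds to a unique fee value of the deducted fee;
    a data feature analysis module, configured to analyze the fee values of the deducted fees in the CDR group determined by the grouping module to obtain a data feature;
    a tariff model determining module, configured to select, according to the data feature determined by the data feature analysis module, a target tariff model corresponding to the data feature from preset tariff models, wherein a tariff model is used to calculate the deducted fee from the charging unit;
    a parameter value determining module, configured to determine a parameter value of each tariff parameter in the target tariff model according to the values of the charging unit and the fee values of the deducted fee in the CDR group; and
    a tariff data determining module, configured to determine the tariff data according to the parameter values determined by the parameter value determining module, the target tariff model determined by the tariff model determining module, and the charging condition.
  11. The apparatus for determining tariff data according to claim 10, wherein
    the grouping module is further configured to group the multiple CDRs in the CDR set according to preset charging elements among the multiple charging elements to obtain the CDR group, wherein the element values corresponding to the same preset charging element are identical within the CDR group.
  12. The apparatus for determining tariff data according to claim 11, wherein the grouping module is further specifically configured to:
    when the same value of the charging unit corresponds to at least two fee values of the deducted fee, determine a target charging element in the CDR group, wherein the target charging element has at least two different element values;
    place CDRs with the same target element value in one group to obtain at least two target CDR subgroups; and
    when, in each target CDR subgroup, each value of the charging unit corresponds to a unique fee value of the deducted fee, use the at least two target CDR subgroups as the CDR group, wherein the target element value is the element value of the target charging element.
  13. The apparatus for determining tariff data according to claim 12, wherein the grouping module is further specifically configured to:
    calculate the entropy of the data in the CDR group according to the following formula;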
    Figure PCTCN2018073850-appb-100002
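    wherein D denotes the number of CDRs in the CDR group, D_i denotes the number of occurrences of charging unit value i, and p_ij denotes the probability that the deducted fee j occurs for charging unit value i;
    if the entropy is greater than a threshold, determine that the number of fee values of the deducted fee corresponding to the same value of the charging unit is greater than or equal to 2; and
    if the entropy is less than or equal to the threshold, determine that the number of fee values of the deducted fee corresponding to the same value of the charging unit is 1.
  14. The apparatus for determining tariff data according to claim 12, wherein when the target charging element comprises a first charging element and a second charging element, the grouping module is further specifically configured to:
    place CDRs with the same first element value in one group to obtain at least two first CDR subgroups, wherein the first element value is the element value of the first charging element; and
    place CDRs with the same second element value in one group to obtain at least two second CDR subgroups, wherein the second element value is the element value of the second charging element;
    and the apparatus further comprises a calculation module and a deletion module;
    the calculation module is configured to calculate a first information gain of the first CDR subgroups determined by the grouping module relative to the CDR group, and a second information gain of the second CDR subgroups determined by the grouping module relative to the CDR group; and
    the deletion module is configured to delete the second charging element from the multiple charging elements when the calculation module determines that the first information gain is greater than the second information gain.
  15. The apparatus for determining tariff data according to any one of claims 10 to 14, wherein every CDR in the CDR set carries a package identifier, the package identifier indicates the tariff package to which the CDR belongs, and the apparatus further comprises a merging module;
    the merging module is configured to merge, according to the package identifiers carried in the CDRs, the tariff data determined by the tariff data determining module under the same package identifier to obtain package tariff data.
  16. The apparatus for determining tariff data according to any one of claims 10 to 14, wherein the pulse denotes the smallest unit of the charging unit, the per-pulse charge denotes the fee value corresponding to that pulse, and the data feature analysis module is further specifically configured to:
    obtain the distinct fee values of the deducted fees in the CDR group;
    arrange the distinct fee values in order;
    determine the difference between each pair of consecutive fee values after sorting, wherein the difference is the per-pulse charge; and
    determine the number of distinct per-pulse charge values.
  17. The apparatus for determining tariff data according to claim 16, wherein the tariff model determining module is further specifically configured to:
    if the number of per-pulse charge values is 1, select a first tariff model from the preset tariff models as the target tariff model; and
    if the number of per-pulse charge values is at least 2, select a second tariff model from the preset tariff models as the target tariff model.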
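  18. An apparatus for determining tariff data, comprising:
    a memory, configured to store computer-executable program code;
    a network interface; and
    a processor, coupled to the memory and the network interface;
    wherein the program code comprises instructions which, when executed by the processor, cause the apparatus to perform the following operations:
    obtaining a CDR set, wherein each CDR in the CDR set comprises a charging condition and, under the charging condition, a charging unit and a deducted fee;
    grouping multiple CDRs in the CDR set according to a preset rule to obtain a CDR group, wherein within the CDR group each value of the charging unit corresponds to a unique fee value of the deducted fee;
    analyzing the CDR group to obtain a data feature;
    selecting, according to the data feature, a target tariff model corresponding to the data feature from preset tariff models, wherein a tariff model is used to calculate the deducted fee from the charging unit;
    determining a parameter value of each tariff parameter in the target tariff model according to the values of the charging unit and the fee values of the deducted fee in the CDR group; and
    determining the tariff data according to the parameter values, the target tariff model and the charging condition.
  19. A computer-readable storage medium comprising instructions which, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 9.
  20. A computer program product comprising instructions which, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 9.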
PCT/CN2018/073850 2017-03-21 2018-01-23 一种资费数据的确定方法及装置 WO2018171324A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP18770450.7A EP3591894B1 (en) 2017-03-21 2018-01-23 Tariff data determination method and device
US16/573,217 US10750031B2 (en) 2017-03-21 2019-09-17 Tariff data determining method and apparatus for creating the same

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710173476.9A CN108632047B (zh) 2017-03-21 2017-03-21 一种资费数据的确定方法及装置
CN201710173476.9 2017-03-21

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/573,217 Continuation US10750031B2 (en) 2017-03-21 2019-09-17 Tariff data determining method and apparatus for creating the same

Publications (1)

Publication Number Publication Date
WO2018171324A1 true WO2018171324A1 (zh) 2018-09-27

Family

ID=63585893

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/073850 WO2018171324A1 (zh) 2017-03-21 2018-01-23 一种资费数据的确定方法及装置

Country Status (4)

Country Link
US (1) US10750031B2 (zh)
EP (1) EP3591894B1 (zh)
CN (1) CN108632047B (zh)
WO (1) WO2018171324A1 (zh)


Also Published As

Publication number Publication date
EP3591894B1 (en) 2021-04-07
EP3591894A1 (en) 2020-01-08
CN108632047A (zh) 2018-10-09
EP3591894A4 (en) 2020-03-04
US10750031B2 (en) 2020-08-18
CN108632047B (zh) 2020-10-09
US20200014803A1 (en) 2020-01-09


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18770450

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018770450

Country of ref document: EP

Effective date: 20191001