CN110888864A - Automatic user data collection method and device - Google Patents

Publication number
CN110888864A
Authority
CN
China
Prior art keywords
collection
dimension
feedback data
target
confidence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911183543.0A
Other languages
Chinese (zh)
Other versions
CN110888864B (en)
Inventor
郭啸
张君
史岩
陈琦
龙佩
杨荟生
范闯
苏星康
陆康
尹淇翰
任慧琛
林玉鑫
李甲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Guotenglianxin Technology Co Ltd
Original Assignee
Beijing Guotenglianxin Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Guotenglianxin Technology Co Ltd
Priority to CN201911183543.0A
Publication of CN110888864A
Application granted
Publication of CN110888864B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a method and a device for automatically collecting user data. At least one collection-type problem corresponding to each target collection dimension is output, and a corresponding confidence is set for the feedback data of each collection-type problem. An overall confidence for each target collection dimension is calculated from the feedback data of the collection-type problems corresponding to that dimension and the confidence of that feedback data. A verification problem corresponding to each target collection dimension is then output, and the feedback data of the verification problem is used to verify the confidence of the feedback data of the collection-type problems corresponding to that dimension, yielding a verification result. The overall confidence of the target collection dimension is corrected based on the verification result. Finally, the collected feedback data of the collection-type problems and the corrected overall confidence are output.

Description

Automatic user data collection method and device
Technical Field
The present invention relates to the field of data collection, and in particular, to a method and an apparatus for automatically collecting user data.
Background
In many fields, such as financial risk control, work must be carried out on the basis of user data, so collecting user data has become an essential process in these fields.
In some fields, the prior art collects user data by manually questioning the user, so that data on aspects such as risk can be collected in a targeted, in-depth way while the reliability of the collected data is evaluated in real time and the direction of data collection is adjusted. A human interviewer can judge the reliability of the information and the related risk factors from the user's answers, ask targeted follow-up questions, collect data on multiple dimensions, and verify the reliability of the user's data in a timely manner through those follow-up questions.
However, the manual method is inefficient, places high demands on the professional competence of the interviewers, and yields poor consistency, so neither the reliability of the collected user data nor the assessed degree of that reliability can be effectively guaranteed, and the process is difficult to standardize and replicate at scale.
Disclosure of Invention
In view of the above defects of the prior art, the invention provides a method and a device for automatically collecting user data, to solve the problem that collecting user data by manual inquiry cannot effectively ensure standardized and targeted acquisition of user data, the reliability of the user data, or an assessment of the degree of that reliability.
In order to achieve the purpose, the invention provides the following technical scheme:
one aspect of the present invention provides a method for automatically collecting user data, including:
when a data collection request is received, outputting at least one collection type problem corresponding to each target collection dimension, and setting corresponding confidence coefficient aiming at the feedback data of each collection type problem corresponding to each target collection dimension; wherein the confidence level is used for explaining the credibility of the feedback data of the collection type problem;
calculating to obtain an overall confidence corresponding to each target collection dimension according to the feedback data of the collection problem corresponding to each target collection dimension and the confidence of the feedback data;
outputting a verification problem corresponding to each target collection dimension;
receiving feedback data of the verification problem corresponding to each target collection dimension input by a user, verifying the confidence of the feedback data of the collection problem corresponding to the target collection dimension by using the feedback data of the verification problem corresponding to each target collection dimension, and obtaining a verification result of the feedback data of the collection problem corresponding to each target collection dimension;
correcting the overall confidence corresponding to the target collection dimension based on the verification result of the feedback data of the collection problem corresponding to each target collection dimension;
and outputting feedback data of the collection type problem corresponding to each target collection dimension and the corrected overall confidence corresponding to each target collection dimension.
Optionally, in the above method, when a data collection request is received, outputting at least one collection class problem corresponding to each target collection dimension, and setting a corresponding confidence level for feedback data of each collection class problem corresponding to each target collection dimension, includes:
selecting a plurality of collection dimensions when a data collection request is received, and taking each selected collection dimension as a target collection dimension;
screening at least one basic collection problem from a plurality of collection problems in a local question bank aiming at each target collection dimension;
outputting the basic collection problems by respectively adopting the output mode of each basic collection problem;
acquiring feedback data of each basic collection type problem, and setting the confidence coefficient of the feedback data of each basic collection type problem according to a confidence coefficient determination method corresponding to each basic collection type problem;
screening a plurality of deep collection problems which have an association relation with any one basic collection problem from a plurality of collection problems in the local question bank based on the confidence of the feedback data of each basic collection problem;
respectively adopting an output mode corresponding to each deep collection problem to output the deep collection problems;
and acquiring the feedback data of each deep collection type problem, and setting the confidence coefficient of the feedback data of each deep collection type problem according to the confidence coefficient determining method corresponding to each deep collection type problem.
Optionally, in the above method, the outputting the verification problem corresponding to each target collection dimension includes:
for each target collection dimension, screening a plurality of verification problems in a local question bank to obtain a verification problem corresponding to the target collection dimension, or determining a verification problem corresponding to the target collection dimension from a cloud database to output; and the verification problem corresponding to each target collection dimension is obtained by screening based on the feedback data of the collection type problem corresponding to each target collection dimension.
Optionally, in the above method, the verifying, by using the feedback data of the verification problem corresponding to each target collection dimension, the confidence of the feedback data of the collection problem corresponding to the target collection dimension to obtain the verification result of the feedback data of the collection problem corresponding to each target collection dimension includes:
verifying the correctness of the feedback data of the verification problem corresponding to each target collection dimension respectively;
if the correctness of the feedback data of the verification problem corresponding to the target collection dimension cannot be verified, determining that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is unverifiable;
if the feedback data of the verification problem corresponding to the target collection dimension is verified to be wrong, determining that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is low in confidence;
and if the feedback data of the verification problem corresponding to the target collection dimension is verified to be correct, determining that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is high in confidence.
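The three-way verification outcome described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `VerificationResult` and `classify_verification` are hypothetical, and the use of `None` to represent "correctness cannot be verified" is an assumption.

```python
from enum import Enum
from typing import Optional


class VerificationResult(Enum):
    """Possible verification results for a target collection dimension."""
    UNVERIFIABLE = "unverifiable"
    LOW_CONFIDENCE = "low confidence"
    HIGH_CONFIDENCE = "high confidence"


def classify_verification(answer_is_correct: Optional[bool]) -> VerificationResult:
    """Map the check of a verification answer to a verification result.

    Assumption: None means the correctness of the answer could not be
    established at all; True/False mean it was verified correct/wrong.
    """
    if answer_is_correct is None:
        return VerificationResult.UNVERIFIABLE
    if answer_is_correct:
        return VerificationResult.HIGH_CONFIDENCE
    return VerificationResult.LOW_CONFIDENCE
```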
Optionally, in the above method, the modifying, based on the verification result of the feedback data of the collection-class problem corresponding to each target collection dimension, the overall confidence degree corresponding to the target collection dimension includes:
if the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is unverifiable, calculating the product of a preset first correction coefficient corresponding to the verification problem and the overall confidence corresponding to the target collection dimension to obtain the corrected overall confidence corresponding to the target collection dimension; wherein the first correction coefficient is less than 1;
if the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is low confidence, calculating the product of a preset second correction coefficient corresponding to the verification problem and the overall confidence corresponding to the target collection dimension to obtain the corrected overall confidence corresponding to the target collection dimension; wherein the second correction coefficient is less than 1;
if the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is high confidence, calculating the product of a preset third correction coefficient corresponding to the verification problem and the overall confidence corresponding to the target collection dimension to obtain the corrected overall confidence corresponding to the target collection dimension; wherein the third correction coefficient is equal to 1.
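The correction step is a single multiplication per target collection dimension. The sketch below illustrates it under stated assumptions: the coefficient values 0.8 and 0.5 are hypothetical examples (the text only requires the first and second coefficients to be less than 1 and the third to equal 1), and the function and key names are illustrative.

```python
# Hypothetical example values. The source only requires that the first and
# second correction coefficients are < 1 and the third is exactly 1.
DEFAULT_COEFFICIENTS = {
    "unverifiable": 0.8,     # first correction coefficient (assumed value)
    "low_confidence": 0.5,   # second correction coefficient (assumed value)
    "high_confidence": 1.0,  # third correction coefficient
}


def correct_overall_confidence(overall, verification_result,
                               coefficients=DEFAULT_COEFFICIENTS):
    """Return the corrected overall confidence of one target dimension.

    The corrected value is the product of the overall confidence and the
    preset coefficient that matches the verification result.
    """
    if coefficients["unverifiable"] >= 1 or coefficients["low_confidence"] >= 1:
        raise ValueError("first and second correction coefficients must be < 1")
    if coefficients["high_confidence"] != 1:
        raise ValueError("third correction coefficient must equal 1")
    return overall * coefficients[verification_result]
```

With these example values, an overall confidence of 0.9 would be reduced for an unverifiable or low-confidence result and left unchanged for a high-confidence result.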
Optionally, in the above method, the screening, for each target collection dimension, of at least one basic collection problem from a plurality of collection problems in a local question bank includes:
determining the total number of the basic collection problems to be selected;
determining the selected number of the basic collection problems corresponding to each target collection dimension according to the total number of the basic collection problems to be screened and the importance coefficient of each collection problem corresponding to each target collection dimension; the larger the importance coefficient of the collection problem corresponding to the target collection dimension is, the larger the selection number of the basic collection problems corresponding to the target collection dimension is;
and screening the plurality of collection problems corresponding to each target collection dimension to obtain basic collection problems meeting the selection quantity of the basic collection problems corresponding to each target collection dimension.
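The text fixes only the direction of the relationship (a larger importance coefficient gives a larger selection number), not a formula. The following sketch assumes one importance coefficient per target collection dimension, a guaranteed minimum of one basic problem per dimension, and proportional allocation of the remainder with largest-remainder rounding; all names are hypothetical.

```python
from typing import Dict


def allocate_basic_questions(total: int, importance: Dict[str, float]) -> Dict[str, int]:
    """Split `total` basic collection problems across target dimensions.

    Assumptions (not from the source): every dimension gets at least one
    problem, and the rest are shared in proportion to importance, using
    largest-remainder rounding so the counts sum exactly to `total`.
    """
    if total < len(importance):
        raise ValueError("total must allow at least one problem per dimension")
    counts = {dim: 1 for dim in importance}
    remaining = total - len(importance)
    weight_sum = sum(importance.values())
    shares = {dim: remaining * w / weight_sum for dim, w in importance.items()}
    for dim, share in shares.items():
        counts[dim] += int(share)  # integer part of the proportional share
    leftover = total - sum(counts.values())
    # Hand out what is left to the largest fractional parts first;
    # ties are broken in favour of the more important dimension.
    order = sorted(
        importance,
        key=lambda dim: (shares[dim] - int(shares[dim]), importance[dim]),
        reverse=True,
    )
    for dim in order[:leftover]:
        counts[dim] += 1
    return counts


example = allocate_basic_questions(
    10, {"income": 0.5, "assets": 0.3, "marital_status": 0.2}
)
```

Under this scheme a dimension with a larger importance coefficient never receives fewer basic problems than a dimension with a smaller one.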
Optionally, in the above method, the screening, based on the confidence of the feedback data of each basic collection problem, a plurality of deep collection problems associated with any one of the basic collection problems from a plurality of collection problems in the local question bank includes:
for each basic collection problem, calculating, according to the confidence of the feedback data of the basic collection problem, the number to be selected of deep collection problems that have an association relation with the basic collection problem; wherein the lower the confidence of the feedback data of the basic collection problem, the larger the calculated number to be selected of deep collection problems that have an association relation with the basic collection problem;
and screening the deep collection problems meeting the requirement of the number to be selected from the collection problems which are associated with each basic collection problem respectively.
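One way to realise "lower confidence, more deep collection problems" is a linear mapping, sketched below. The linear rule, the `max_deep` cap of 4, and the function names are assumptions for illustration; the text only requires that the number grows as confidence falls.

```python
import math
from typing import List


def deep_question_count(confidence: float, max_deep: int = 4) -> int:
    """Number of deep collection problems to select for one basic problem.

    Assumed rule: ceil((1 - confidence) * max_deep), so a confidence of 1.0
    triggers no follow-up and a confidence of 0.0 triggers `max_deep`.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return math.ceil((1.0 - confidence) * max_deep)


def select_deep_questions(confidence: float, associated: List[str],
                          max_deep: int = 4) -> List[str]:
    """Pick deep problems from those associated with a basic problem.

    Takes the first n associated problems; the count is capped by how many
    associated problems the question bank actually holds.
    """
    n = min(deep_question_count(confidence, max_deep), len(associated))
    return associated[:n]
```

For example, an answer with confidence 0.3 would trigger three follow-up problems under this rule, while an answer with confidence 1.0 would trigger none.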
Another aspect of the present invention provides an apparatus for automatically collecting user data, including:
the collecting unit is used for outputting at least one collection type problem corresponding to each target collection dimension when a data collection request is received, and setting a corresponding confidence for the feedback data of each collection type problem corresponding to each target collection dimension; wherein the confidence is used for explaining the credibility of the feedback data of the collection type problem;
the first calculation unit is used for calculating and obtaining an overall confidence coefficient corresponding to each target collection dimension according to the feedback data of the collection type problem corresponding to each target collection dimension and the confidence coefficient of the feedback data;
the first output unit is used for outputting the verification problem corresponding to each target collection dimension;
the verification unit is used for receiving feedback data of the verification problem corresponding to each target collection dimension, which is input by a user, verifying the confidence coefficient of the feedback data of the collection problem corresponding to the target collection dimension by using the feedback data of the verification problem corresponding to each target collection dimension, and obtaining the verification result of the feedback data of the collection problem corresponding to each target collection dimension;
the correction unit is used for correcting the overall confidence corresponding to the target collection dimension based on the verification result of the feedback data of the collection problem corresponding to each target collection dimension;
and the second output unit is used for outputting feedback data of the collection type problem corresponding to each target collection dimension and the corrected overall confidence corresponding to each target collection dimension.
Optionally, in the above apparatus, the collecting unit includes:
the selecting unit is used for selecting a plurality of collection dimensions when a data collection request is received, and taking each selected collection dimension as a target collection dimension;
the first screening unit is used for screening at least one basic collection problem from a plurality of collection problems in a local question bank aiming at each target collection dimension;
the third output unit is used for outputting the basic collection problems by respectively adopting the output mode of each basic collection problem;
the first setting unit is used for acquiring the feedback data of each basic collection type problem and setting the confidence coefficient of the feedback data of each basic collection type problem according to the confidence coefficient determining method corresponding to each basic collection type problem;
the second screening unit is used for screening a plurality of deep collection problems which have an association relation with any one basic collection problem from a plurality of collection problems in the local question bank based on the confidence degree of the feedback data of each basic collection problem;
the fourth output unit is used for outputting the deep collection problems by respectively adopting the output mode corresponding to each deep collection problem;
and the second setting unit is used for acquiring the feedback data of each deep collection type problem and setting the confidence coefficient of the feedback data of each deep collection type problem according to the confidence coefficient determining method corresponding to each deep collection type problem.
Optionally, in the above apparatus, the first output unit includes:
the first output subunit is configured to, for each target collection dimension, screen a plurality of verification problems in a local question bank to obtain a verification problem corresponding to the target collection dimension, or determine a verification problem corresponding to the target collection dimension from a cloud database, and output the verification problem;
and the verification problem corresponding to each target collection dimension is obtained by screening based on the feedback data of the collection type problem corresponding to each target collection dimension.
Optionally, in the above apparatus, the verification unit includes:
the verification subunit is used for respectively verifying the correctness of the feedback data of the verification problem corresponding to each target collection dimension;
when the correctness of the feedback data of the verification problem corresponding to the target collection dimension cannot be verified, the verification subunit determines that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is unverifiable; when the feedback data of the verification problem corresponding to the target collection dimension is verified to be wrong, determining that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is low in confidence; and when the feedback data of the verification problem corresponding to the target collection dimension is verified to be correct, determining that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is high in confidence.
Optionally, in the above apparatus, the correction unit includes:
the first correcting unit is used for calculating the product of a preset first correction coefficient corresponding to the verification problem and the overall confidence corresponding to the target collection dimension when the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is unverifiable, so as to obtain the corrected overall confidence corresponding to the target collection dimension; wherein the first correction coefficient is less than 1;
the second correcting unit is used for calculating the product of a preset second correction coefficient corresponding to the verification problem and the overall confidence corresponding to the target collection dimension when the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is low confidence, so as to obtain the corrected overall confidence corresponding to the target collection dimension; wherein the second correction coefficient is less than 1;
the third correcting unit is used for calculating the product of a preset third correction coefficient corresponding to the verification problem and the overall confidence corresponding to the target collection dimension when the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is high confidence, so as to obtain the corrected overall confidence corresponding to the target collection dimension; wherein the third correction coefficient is equal to 1.
Optionally, in the above apparatus, the first screening unit includes:
the first determining unit is used for determining the total number of the basic collection problems to be selected;
a second determining unit, configured to determine, according to the total number of the basic collection problems to be screened and the importance coefficient of each collection problem corresponding to each target collection dimension, the selected number of the basic collection problems corresponding to each target collection dimension; the larger the importance coefficient of the collection problem corresponding to the target collection dimension is, the larger the selection number of the basic collection problems corresponding to the target collection dimension is;
the first screening subunit is configured to screen, from the multiple collection problems corresponding to each target collection dimension, basic collection problems that satisfy the selected number of the basic collection problems corresponding to each target collection dimension.
Optionally, in the above apparatus, the second screening unit includes:
the second calculation unit is used for calculating, for each basic collection problem and according to the confidence of the feedback data of the basic collection problem, the number to be selected of deep collection problems that have an association relation with the basic collection problem; wherein the lower the confidence of the feedback data of the basic collection problem, the larger the calculated number to be selected of deep collection problems that have an association relation with the basic collection problem;
and the second screening subunit is used for screening, from the collection problems that have an association relation with each basic collection problem, the deep collection problems that meet the number to be selected.
The invention provides a method and a device for automatically collecting user data. Because a plurality of problems are preset, at least one collection problem corresponding to each target collection dimension can be output when a data collection request is received, so that feedback data of collection problems is collected on a plurality of target collection dimensions. A confidence is set for the feedback data of each collection type problem, which makes the credibility of each item of feedback data explicit. By outputting the verification problem corresponding to each target collection dimension in a timely manner, the confidence of the feedback data of the collection type problems corresponding to that dimension is verified, and the overall confidence corresponding to the target collection dimension is corrected based on the verification result. This effectively ensures that accurate user data is collected together with a confidence for each item of user data.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a schematic flow chart of a method for automatically collecting user data according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart illustrating another method for automated collection of user data according to an embodiment of the present invention;
FIG. 3 is a flow chart illustrating another method for automated collection of user data according to an embodiment of the present invention;
FIG. 4 is a flow chart illustrating another method for automated collection of user data according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an apparatus for automatically collecting user data according to another embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a collecting unit according to another embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a first screening unit according to another embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a second screening unit according to another embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In this application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The embodiment of the invention provides an automatic user data collection method, as shown in fig. 1, comprising the following steps:
S101, when a data collection request is received, outputting at least one collection type problem corresponding to each target collection dimension, and setting a corresponding confidence for the feedback data of each collection type problem corresponding to each target collection dimension.
The confidence indicates the credibility of the feedback data of the collection type problem. A collection type problem is a question posed to the user in order to collect user data.
Specifically, a plurality of collection dimensions may be preset according to the data that needs to be collected. For example, if the user's data is to be collected in order to evaluate the user's credit risk, collection dimensions such as income, assets, and marital status may be set. When a data collection request is received, some of the preset collection dimensions can be selected and used as the target collection dimensions for this round of data collection. Of course, all preset collection dimensions may also be used as target collection dimensions. After the target collection dimensions are determined, at least one collection type problem corresponding to each target collection dimension is output to collect the user's data.
Optionally, the collection-type question corresponding to each target collection dimension may be output as voice or as text, so that the user receives the question and can provide feedback data for it. The collection-type question may therefore be voice data or another type of data such as text.
Optionally, the feedback data of a collection-type question may be obtained by capturing the user's voice, by receiving text that the user enters on a user interface, or from a third-party channel after the user grants authorization. For example, if the collection-type question asks the user for the annual tax amount, a microphone may capture the user's spoken reply, the user may be prompted to enter the annual tax amount on the user interface, or the user may grant authorization on the user interface so that the annual tax amount can be obtained from the tax bureau.
Specifically, after the user's feedback data for the collection-type questions is acquired, a corresponding confidence is set for each item of feedback data. Optionally, the confidence of an item of feedback data may be determined from one or a combination of factors such as the source channel of the feedback data, the time the user took to give the feedback, the user's intonation, the user's speech rate, and the movement of the user's point of gaze.
Optionally, in another implementation of the present invention, as shown in fig. 2, a specific implementation manner of step S101 includes:
S201, when a data collection request is received, selecting a plurality of collection dimensions, and taking each selected collection dimension as a target collection dimension.
S202, aiming at each target collection dimension, screening at least one basic collection problem from a plurality of collection problems in the local question bank.
It should be noted that, in the embodiment of the present invention, a local question bank including a plurality of questions needs to be established in advance. Specifically, a plurality of questions are set for each target collection dimension, the attributes of each question are set, and the questions and their attributes are then stored in the local question bank. Optionally, the attributes of a question include its output mode, feedback mode, purpose, corresponding target collection dimension, importance coefficient, support coefficient, association relationships with other questions, confidence determination mode, correction coefficient, and the like.
The purpose of a question is either collection or verification, and all questions can be divided accordingly into collection-type questions and verification questions. A collection-type question collects the user's data, for example by directly asking the user for his or her income. A verification question verifies the user's feedback data, for example by asking for the user's zodiac sign when the user's birth date is already known. The importance coefficient of a question is determined according to the importance of its feedback data. For example, income reflects the user's credit risk better than marital status does, so the importance coefficient of an income-related question is larger than that of a question related to marital status. The support coefficient is determined by how strongly the question supports the confidence of the collected data in the corresponding target collection dimension. For example, the user's monthly salary supports the income dimension more strongly than the nature of the user's employer or the user's position does, so a question about monthly salary has a larger support coefficient than a question about the nature of the employer or the position.
It should be noted that only verification questions need a correction coefficient, which is set according to the degree to which the question influences the confidence of the verified data. Specifically, this degree of influence can be determined from historical big data. That is, if the verified data is rarely wrong when the question is answered correctly, or the verified data is frequently wrong when the question is answered incorrectly, the question is determined to have a high degree of influence on the verified data.
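The question attributes described above could be held in a simple record type. The following is a minimal sketch only: the class and field names (`Question`, `Purpose`, `related_ids`, and so on) are hypothetical and are not taken from the embodiment, and the rule that a collection-type question carries no correction coefficient simply mirrors the statement that only verification questions need one.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Purpose(Enum):
    COLLECTION = "collection"      # collects user data directly
    VERIFICATION = "verification"  # checks previously collected feedback data


@dataclass
class Question:
    """One entry of the local question bank (hypothetical field names)."""
    text: str
    dimension: str                 # target collection dimension, e.g. "income"
    purpose: Purpose
    output_mode: str = "text"      # e.g. "voice" or "text"
    feedback_mode: str = "text"    # e.g. "voice", "text", "third_party"
    importance: float = 1.0        # importance coefficient
    support: float = 1.0           # support coefficient
    related_ids: List[str] = field(default_factory=list)  # associated questions
    correction: Optional[float] = None  # set only for verification questions

    def __post_init__(self) -> None:
        # Only verification questions carry a correction coefficient.
        if self.purpose is Purpose.COLLECTION and self.correction is not None:
            raise ValueError("collection-type questions have no correction coefficient")


salary = Question("What is your monthly salary?", "income",
                  Purpose.COLLECTION, importance=2.0, support=3.0)
```

Keeping the purpose as an explicit attribute lets the same bank serve both the collection steps and the later verification step.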
Specifically, when a data collection request is received, at least one collection problem can be obtained by screening from a plurality of collection problems in the local question bank as a basic collection problem according to the attribute of the problem for each target collection dimension.
Optionally, in another embodiment of the present invention, as shown in fig. 3, a specific implementation manner of step S202 includes:
S301, determining the total number of basic collection-type questions to be selected.
It should be noted that, so that at least one basic collection-type question can be output for each target collection dimension, the total number of basic collection-type questions to be selected must be not less than the number of target collection dimensions.
S302, determining the selected number of the basic collection problems corresponding to each target collection dimension according to the total number of the basic collection problems to be screened and the importance coefficient of each collection problem corresponding to each target collection dimension.
The larger the importance coefficient of the collection problem corresponding to the target collection dimension is, the larger the selection number of the basic collection problems corresponding to the target collection dimension is.
Since the importance coefficient of the question indicates the importance of the question feedback data, the number of questions with high importance coefficients screened should be larger. Therefore, in the embodiment of the present invention, the number of the selected basic collection problems corresponding to each target collection dimension is determined by the importance coefficient of the collection problems corresponding to the target collection dimension. That is, the importance coefficient of the collection problem corresponding to the target collection dimension may also be understood as determining the ratio of the selected number of the basic collection problems corresponding to the target collection dimension to the total number.
Specifically, for example, k questions are selected in sequence from m target collection dimensions; the probability that the ith question is selected from the jth target collection dimension is then:
Figure BDA0002291873630000111
where α_j is the importance coefficient of the jth target collection dimension, α_r is the importance coefficient of each target collection dimension, b is a preset weight adjustment parameter for the target collection dimensions, n is the maximum number of collection-type questions that may be selected in the jth target collection dimension, and
Figure BDA0002291873630000121
is the number of times the jth target collection dimension was selected for the 1st to (i-1)th questions.
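The sequential selection of dimensions can be illustrated with a short sketch. Because the formula itself is only available as an image, the weighting used here is an assumption rather than the formula of the embodiment: each dimension is drawn with probability proportional to α_j raised to the power b, a dimension that has already reached its cap n is excluded, and every dimension is given one question up front so that each target collection dimension is covered. The function name `allocate_counts` is hypothetical.

```python
import random
from typing import List, Optional


def allocate_counts(importance: List[float], k: int, n: int,
                    b: float = 1.0,
                    rng: Optional[random.Random] = None) -> List[int]:
    """Return how many basic questions to draw from each dimension.

    importance: importance coefficient per dimension (assumed positive)
    k: total number of basic questions, n: cap per dimension
    """
    rng = rng or random.Random()
    m = len(importance)
    if not m <= k <= m * n:
        raise ValueError("need m <= k <= m * n")
    counts = [1] * m  # assumption: every dimension gets at least one question
    for _ in range(k - m):
        # Assumed weighting: alpha_j ** b, or 0 once the cap n is reached.
        weights = [a ** b if c < n else 0.0
                   for a, c in zip(importance, counts)]
        j = rng.choices(range(m), weights=weights)[0]
        counts[j] += 1
    return counts


counts = allocate_counts([3.0, 1.0, 1.0], k=6, n=3, rng=random.Random(0))
```

With these inputs the first dimension, whose importance coefficient is largest, tends to receive the most questions, while the cap keeps any single dimension from absorbing all of them.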
And S303, screening the plurality of collection problems corresponding to each target collection dimension to obtain the basic collection problems meeting the selected number of the basic collection problems corresponding to each target collection dimension.
Specifically, for each target collection dimension, the selected number of basic collection-type questions is drawn at random from the plurality of collection-type questions corresponding to that dimension.
Optionally, in the embodiment of the present invention, the random selection of collection-type questions is not uniform, that is, the questions do not all have the same probability of being selected. The probability of a question being selected is related to the support coefficient among its attributes: the larger the support coefficient, the larger the probability that the question is selected.
Specifically, continuing the above example, after one target collection dimension has been determined from the m target collection dimensions, one collection-type question is selected from the n collection-type questions of that dimension; the probability of selecting the sth collection-type question belonging to the target collection dimension is:
Figure BDA0002291873630000122
where β_s is the support coefficient of the sth collection-type question, β_r is the support coefficient of each collection-type question, and
Figure BDA0002291873630000123
is the support-strength weight adjustment parameter of the target collection dimension.
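A non-uniform draw of this kind can be sketched as weighted sampling without replacement. The exact probability formula is only available as an image, so the sketch assumes a weight of β_s raised to an adjustment exponent (called `c` here) and normalized over the questions still available; the function name, the exponent name, and the question identifiers are all hypothetical.

```python
import random
from typing import Dict, List, Optional


def pick_questions(support: Dict[str, float], count: int, c: float = 1.0,
                   rng: Optional[random.Random] = None) -> List[str]:
    """Draw `count` distinct question ids, favouring large support coefficients.

    support maps a question id to its support coefficient (assumed positive).
    """
    rng = rng or random.Random()
    if count > len(support):
        raise ValueError("not enough questions in this dimension")
    remaining = dict(support)
    chosen: List[str] = []
    for _ in range(count):
        ids = list(remaining)
        # Assumed weighting: beta_s ** c, normalized implicitly by choices().
        weights = [remaining[q] ** c for q in ids]
        picked = rng.choices(ids, weights=weights)[0]
        chosen.append(picked)
        del remaining[picked]  # a question is asked at most once
    return chosen


income_bank = {"monthly_salary": 3.0, "employer_type": 1.0, "job_title": 1.0}
picked = pick_questions(income_bank, count=2, rng=random.Random(42))
```

Removing each drawn question before the next draw keeps the selection free of duplicates while preserving the bias toward questions with larger support coefficients.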
And S203, outputting the basic collection problems by respectively adopting the output mode of each basic collection problem.
That is to say, in the embodiment of the present invention, the output mode of each basic collection-type question may be different. Using different output modes for different questions can help prevent fraud to a certain extent, improve the user experience, and so on.
Optionally, when each basic collection type problem is output, a corresponding feedback mode of the basic collection type problem may be output accordingly.
S204, obtaining the feedback data of each basic collection type problem, and setting the confidence coefficient of the feedback data of each basic collection type problem according to the confidence coefficient determining method corresponding to each basic collection type problem.
Because the target collection dimensions corresponding to the basic collection-type questions differ and the feedback data is acquired in different ways, the confidence of the feedback data of each basic collection-type question should be determined using the confidence determination method corresponding to that question, which ensures the accuracy of the resulting confidence. For example, if the basic collection-type question concerns the user's income, the acquisition channel of the feedback data carries a larger weight when the confidence is determined. For questions that the user must answer subjectively, the user's response time or speech rate carries a larger weight.
S205, based on the confidence of the feedback data of each basic collection type problem, a plurality of deep collection type problems which have an association relation with any basic collection type problem are obtained through screening from a plurality of collection type problems in the local question bank.
That is to say, in the embodiment of the present invention, the output collection-type questions are divided into basic collection-type questions and deep collection-type questions, and the deep collection-type questions are screened from the local question bank and output after the feedback data of the basic collection-type questions is acquired. After the feedback data of the basic collection-type questions is acquired, the deep collection-type questions are determined according to the confidence of that feedback data, so that further user data can be collected through the deep collection-type questions, which further reduces the chance of collecting false or accidental feedback data from the user.
Optionally, in another embodiment of the present invention, as shown in fig. 4, a specific implementation manner of step S205 includes:
S401, for each basic collection-type question, calculating, according to the confidence of the feedback data of the basic collection-type question, the number to be selected of deep collection-type questions that have an association relationship with the basic collection-type question.
The lower the confidence of the feedback data of a basic collection-type question, the larger the calculated number of associated deep collection-type questions to be selected. A lower confidence means that the feedback data is less credible and more likely to be false, so more associated collection-type questions need to be selected to collect further data from the user on this aspect.
Specifically, for each basic collection-type question, the confidence set for its feedback data is used as a weight parameter that, together with the total number of deep collection-type questions to be selected, determines how many deep collection-type questions are selected from the collection-type questions associated with that basic collection-type question. Assume that the confidence of the feedback data of a certain basic collection-type question is ψ_i; then
Figure BDA0002291873630000131
is its selection quantity weight, where
Figure BDA0002291873630000132
is a preset confidence reliability weight. Then, according to the total number of deep collection-type questions to be selected and the selection quantity weight corresponding to the basic collection-type question, the number of collection-type questions selected from the relationship graph of the basic collection-type question is determined, that is, the number of collection-type questions associated with the basic collection-type question that are selected as deep collection-type questions.
More specifically, for example, after the confidences of the k basic collection-type questions in the above example are obtained, suppose it is determined that l deep collection-type questions are to be selected. If the confidence of each basic collection-type question is ψ_r and the confidence of the ith basic collection-type question is ψ_i, then the number to be selected of deep collection-type questions from the collection-type questions in the relationship graph of the ith basic collection-type question is:
Figure BDA0002291873630000141
where θ_i is the integer obtained by rounding
Figure BDA0002291873630000142
down, and the difference between that value and its integer part, that is, the fractional part, is:
Figure BDA0002291873630000143
Therefore, when
Figure BDA0002291873630000144
is less than 1, θ_i = 0. That is, when the confidence of the feedback data of a basic collection-type question is sufficiently large, the number of deep collection-type questions selected from the collection-type questions associated with that basic collection-type question is 0.
Because the numbers to be selected are rounded down, the sum of the numbers to be selected of the deep collection-type questions corresponding to the basic collection-type questions will, with high probability, be less than l, that is,
Figure BDA0002291873630000145
and l' is not equal to 0. In this case, one more deep collection-type question is selected from the relationship graph of each of the basic collection-type questions corresponding to the l' largest values of ε_i.
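The round-down-then-top-up procedure described above is a largest-remainder allocation, and it can be sketched as follows. The selection quantity weight is only available as an image, so the sketch assumes a weight of (1 − ψ_i) raised to a reliability exponent `gamma`, normalized over all basic questions; that weighting, the name `gamma`, and the function name are assumptions, while the flooring to θ_i, the fractional parts ε_i, and the distribution of the l' leftover questions follow the text.

```python
import math
from typing import List


def deep_question_counts(confidences: List[float], total: int,
                         gamma: float = 1.0) -> List[int]:
    """Split `total` deep questions across basic questions by low confidence."""
    # Assumed selection weight: lower confidence gives a larger weight.
    weights = [(1.0 - psi) ** gamma for psi in confidences]
    weight_sum = sum(weights)
    if weight_sum == 0:
        return [0] * len(confidences)  # every answer is fully trusted
    shares = [total * w / weight_sum for w in weights]
    counts = [math.floor(s) for s in shares]              # theta_i
    remainders = [s - c for s, c in zip(shares, counts)]  # epsilon_i
    leftover = total - sum(counts)                        # l'
    # Give one extra question to the l' largest fractional parts.
    order = sorted(range(len(shares)), key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


counts = deep_question_counts([0.9, 0.5, 0.2], total=4)
```

In this usage the shares are roughly 0.29, 1.43, and 2.29, so rounding down gives 0, 1, and 2, and the single leftover question goes to the second basic question, whose fractional part is largest.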
S402, screening the collection problems which are associated with the basic collection problems respectively to obtain the deep collection problems meeting the quantity requirements to be selected.
For each basic collection problem, randomly selecting the collection problem which meets the corresponding number to be selected from the collection problems which have an association relation with the basic collection problem as a deep collection problem.
And S206, outputting the deep collection problems by respectively adopting an output mode corresponding to each deep collection problem.
Similarly, each deep collection problem is output in the corresponding output mode, instead of adopting a uniform output mode, and the specific implementation process of this step may refer to the specific implementation mode of step S203, which is not described herein again.
S207, obtaining the feedback data of each deep collection type problem, and setting the confidence coefficient of the feedback data of each deep collection type problem according to the confidence coefficient determining method corresponding to each deep collection type problem.
Both the deep collection-type questions and the basic collection-type questions are obtained by screening the collection-type questions, so the two belong to the same type. Therefore, the specific implementation of step S207 may refer to the specific implementation process of step S204, which is not described herein again.
S102, calculating to obtain an overall confidence degree corresponding to each target collection dimension according to the feedback data of the collection type problem corresponding to each target collection dimension and the confidence degree of the feedback data.
The overall confidence corresponding to the target collection dimension may be understood as the overall confidence of all data of the user collected through the collection-class problem corresponding to the target collection dimension.
Specifically, for each target collection dimension, the overall confidence of the feedback data of all collection-type questions corresponding to that dimension is calculated from the feedback data of the output collection-type questions corresponding to that dimension and the confidence of that feedback data. All collection-type questions corresponding to a target collection dimension include both the basic collection-type questions and the deep collection-type questions obtained by screening.
Optionally, the overall confidence corresponding to each target collection dimension may be calculated as follows. The feedback data of the collection-type questions corresponding to the target collection dimension, together with the confidences of that feedback data, is taken as a set. Pairs of feedback data whose similarity meets a preset condition are identified in the set in turn; each such pair is merged into one piece of feedback data, and the two confidences are merged into one confidence. Finally, the geometric mean of all confidences in each set is calculated to obtain the overall confidence corresponding to each target collection dimension.
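The merge-then-average calculation can be sketched as below. The embodiment does not specify the similarity condition or how two confidences are merged, so this sketch assumes that two answers are similar when they are equal after trimming whitespace and lowercasing, and that the merged confidence is the larger of the two; both choices, and the function name, are assumptions. The final geometric mean follows the text.

```python
import math
from typing import Dict, List, Tuple


def overall_confidence(answers: List[Tuple[str, float]]) -> float:
    """Overall confidence of one dimension from (feedback, confidence) pairs."""
    merged: Dict[str, float] = {}
    for feedback, confidence in answers:
        key = feedback.strip().lower()   # assumed similarity condition
        # Assumed merge rule: keep the larger confidence of similar answers.
        merged[key] = max(confidence, merged.get(key, 0.0))
    if not merged:
        raise ValueError("no feedback data for this dimension")
    # Geometric mean of the confidences that remain after merging.
    return math.prod(merged.values()) ** (1.0 / len(merged))


income_answers = [("10000", 0.81), ("10000 ", 0.50), ("12000", 0.36)]
income_confidence = overall_confidence(income_answers)
```

Here the two "10000" answers merge into one entry with confidence 0.81, and the geometric mean of 0.81 and 0.36 is 0.54.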
And S103, outputting the verification problem corresponding to each target collection dimension.
It should be noted that a verification question is a question used to verify the feedback data of a collection-type question. For example, if the feedback data is the name of the university where the user studied as an undergraduate, the verification question may ask the user for that university's motto. Verification questions therefore need to be screened according to the feedback data of the collection-type questions.
Therefore, after step S102 is executed, the verification questions corresponding to each target collection dimension are obtained by screening based on the feedback data of the collection-type questions corresponding to that dimension, and are output to the user.
It should be noted that the lower the overall confidence corresponding to a target collection dimension, the more the feedback data of the collection-type questions corresponding to that dimension needs to be verified; therefore, more verification questions corresponding to that target collection dimension should be output.
Optionally, the number of verification questions output for each target collection dimension may be determined, based on the overall confidence corresponding to each target collection dimension, in the same way as the number to be selected of deep collection-type questions is determined in step S401 of the embodiment corresponding to fig. 4. In this way, the lower the overall confidence corresponding to a target collection dimension, the greater the number of verification questions determined to be output for that dimension.
Accordingly, when the overall confidence corresponding to a target collection dimension is high enough, the number of verification questions determined for that dimension may be zero. This depends specifically on the size of the preset confidence reliability weight:
Figure BDA0002291873630000161
Optionally, another embodiment of the present invention provides a specific manner of step S103, including: and aiming at each target collection dimension, screening the verification problems corresponding to the target collection dimension from a plurality of verification problems in the local question bank, or determining the verification problems corresponding to the target collection dimension from the cloud database and outputting the verification problems.
And screening the verification problem corresponding to each target collection dimension based on the feedback data of the collection problem corresponding to each target collection dimension.
That is to say, in the embodiment of the present invention, the verification problem may be obtained by screening from a local question bank, or may be obtained by screening from a cloud database. Specifically, if the verification problem is obtained by screening from the local question bank, a plurality of verification problems need to be set in the local question bank in advance according to the collection problem, and an association relationship between the verification problem and the collection problem is established. Subsequently, according to the feedback data of the collected problems, the verification problem of the feedback data of the collected problems can be obtained by screening from the local question bank.
Specifically, if the verification questions are screened from the cloud database, then after the feedback data of the collection-type questions is obtained, a plurality of verification questions corresponding to the feedback data are collected from the internet to form the cloud database, and verification questions are then screened from the cloud database.
Alternatively, the number of verification questions screened from the local question bank and the number of verification questions screened from the cloud database may be respectively the same as the total number of verification questions. Screening verification questions from the local question bank or from the cloud database may correspondingly use the screening method for deep collection-type questions provided in the embodiment corresponding to fig. 4. Specifically, reference may be made to the specific implementation process of step S401 and step S402, which is not described herein again.
S104, receiving feedback data of the verification problems corresponding to each target collection dimension input by a user, verifying the confidence degree of the feedback data of the collection problems corresponding to the target collection dimension by using the feedback data of the verification problems corresponding to each target collection dimension, and obtaining the verification result of the feedback data of the collection problems corresponding to each target collection dimension.
Specifically, the received feedback data of the verification questions corresponding to each target collection dimension, as input by the user, is used to verify whether the feedback data of the collection-type questions corresponding to that target collection dimension is sufficiently credible.
And S105, correcting the overall confidence corresponding to the target collection dimension based on the verification result of the feedback data of the collection type problem corresponding to each target collection dimension.
Specifically, the correction of the overall confidence corresponding to the target collection dimension is to increase or decrease the overall confidence corresponding to the target collection dimension based on the verification result of the feedback data of the collection-type problem corresponding to each target collection dimension, or not change the overall confidence corresponding to the target collection dimension. When the verification result shows that the credibility of the feedback data of the collection problems corresponding to the target collection dimension is high, the overall confidence corresponding to the target collection dimension is not changed or increased correspondingly, and when the verification result shows that the credibility of the feedback data of the collection problems corresponding to the target collection dimension is low, the overall confidence corresponding to the target collection dimension is reduced.
Optionally, in another embodiment of the present invention, a specific implementation manner of step S104 includes: and verifying the correctness of the feedback data of the verification problem corresponding to each target collection dimension respectively.
Specifically, the correctness of the feedback data of the verification question corresponding to each target collection dimension is determined by searching the correct answer of the verification question and comparing the correct answer of the verification question with the feedback data input by the user.
Optionally, for each target collection dimension: if the feedback data of every verification question corresponding to the dimension is verified to be correct, the feedback data of the verification questions corresponding to the dimension is determined to be correct; if the feedback data of any verification question is wrong, the feedback data of the verification questions corresponding to the dimension is determined to be wrong; and if the feedback data of a preset number of verification questions cannot be verified, the feedback data of the verification questions corresponding to the dimension is determined to be unverifiable. Of course, this is only one optional way, and the correctness of the feedback data of the verification questions corresponding to a dimension may also be determined by other rules.
For example, the correctness of the feedback data of the verification questions corresponding to a dimension may be determined by setting a required accuracy rate. That is, when the accuracy rate of the feedback data of the verification questions corresponding to the target collection dimension meets the requirement, the feedback data of the verification questions corresponding to the dimension is determined to be correct; when the accuracy rate does not meet the requirement, the feedback data is determined to be wrong; and when the feedback data of a preset number of verification questions cannot be verified, the feedback data of the verification questions corresponding to the dimension is determined to be unverifiable.
It should be noted that, in the embodiment of the present invention, if the correctness of the feedback data of the verification problem corresponding to the target collection dimension cannot be verified, it is determined that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is unverifiable; if the feedback data of the verification problem corresponding to the target collection dimension is verified to be wrong, determining that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is low in confidence; and if the feedback data of the verification problem corresponding to the target collection dimension is verified to be correct, determining that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is high in confidence.
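The first optional rule above can be sketched as a small function that maps the per-question outcomes of one target collection dimension to a verification result. The text does not say which condition wins when several apply, so the order used here (the unverifiable check first, then any wrong answer, otherwise high confidence) is an assumption, as are the names `Answer`, `Result`, `verify_dimension`, and `unverifiable_threshold`.

```python
from enum import Enum
from typing import List


class Answer(Enum):
    CORRECT = "correct"
    WRONG = "wrong"
    UNVERIFIABLE = "unverifiable"


class Result(Enum):
    HIGH_CONFIDENCE = "high"
    LOW_CONFIDENCE = "low"
    UNVERIFIABLE = "unverifiable"


def verify_dimension(answers: List[Answer],
                     unverifiable_threshold: int = 1) -> Result:
    """Combine per-question outcomes into one result for a dimension."""
    unverifiable = sum(a is Answer.UNVERIFIABLE for a in answers)
    # Assumed precedence: too many unverifiable answers decides first.
    if unverifiable >= unverifiable_threshold:
        return Result.UNVERIFIABLE
    if any(a is Answer.WRONG for a in answers):
        return Result.LOW_CONFIDENCE
    return Result.HIGH_CONFIDENCE


result = verify_dimension([Answer.CORRECT, Answer.WRONG])
```

Under this sketch a dimension for which no verification question was output yields a high-confidence result, which leaves its overall confidence unchanged in the subsequent correction step.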
Accordingly, in the embodiment of the present invention, a specific implementation manner of step S105 includes:
and if the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is that the feedback data cannot be verified, calculating the product of a first correction coefficient corresponding to a preset verification problem and the overall confidence corresponding to the target collection dimension to obtain the overall confidence corresponding to the corrected target collection dimension.
And if the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is low in confidence, calculating the product of a second correction coefficient corresponding to the preset verification problem and the overall confidence corresponding to the target collection dimension to obtain the overall confidence corresponding to the corrected target collection dimension.
And if the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is higher in confidence, calculating the product of a third correction coefficient corresponding to the preset verification problem and the overall confidence corresponding to the target collection dimension to obtain the overall confidence corresponding to the corrected target collection dimension.
Wherein the first correction coefficient and the second correction coefficient are both smaller than 1, and the first correction coefficient is larger than the second correction coefficient; the third correction factor is equal to 1.
That is to say, in the embodiment of the present invention, when the verification result is either unverifiable or wrong, the overall confidence corresponding to the target collection dimension is reduced, and the reduction when the verification result is wrong is greater than the reduction when the verification result is unverifiable. When the verification result is correct, the overall confidence corresponding to the target collection dimension is not changed. Of course, this is only one optional way, and other correction coefficients may also be set to correct the overall confidence corresponding to the target collection dimension accordingly, all of which fall within the protection scope of the present invention.
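The correction step is a single multiplication, which the following sketch illustrates. The coefficient values 0.9 and 0.6 are hypothetical examples chosen only to satisfy the stated constraints (second coefficient smaller than the first, both smaller than 1, third equal to 1), and the result labels and the function name are likewise assumptions.

```python
def correct_overall_confidence(overall: float, result: str,
                               first: float = 0.9,
                               second: float = 0.6,
                               third: float = 1.0) -> float:
    """Multiply the overall confidence by the coefficient for the result.

    result is one of "unverifiable", "low", or "high" (hypothetical labels).
    """
    # Constraints from the text: second < first < 1 and third == 1.
    if not (second < first < 1.0 and third == 1.0):
        raise ValueError("coefficients violate second < first < 1, third == 1")
    coefficients = {"unverifiable": first, "low": second, "high": third}
    if result not in coefficients:
        raise ValueError(f"unknown verification result: {result}")
    return overall * coefficients[result]


corrected = correct_overall_confidence(0.8, "low")
```

With an overall confidence of 0.8, an unverifiable result gives 0.72, a low-confidence result gives 0.48, and a high-confidence result leaves the value at 0.8.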
And S106, outputting feedback data of the collection type problems corresponding to each target collection dimension and the corrected overall confidence corresponding to each target collection dimension.
Finally, the feedback data of the collection-type questions corresponding to each target collection dimension and the corrected overall confidence corresponding to each target collection dimension are output.
According to the automatic user data collection method provided by the embodiment of the present invention, a plurality of questions are preset, so that when a data collection request is received, at least one collection-type question corresponding to each target collection dimension can be output and feedback data for collection-type questions in a plurality of target collection dimensions is collected. A confidence is set for the feedback data of each collection-type question, which makes the credibility of each piece of feedback data explicit. In addition, the verification questions corresponding to each target collection dimension are output in a timely manner to verify the confidence of the feedback data of the collection-type questions corresponding to that dimension, and the overall confidence corresponding to the dimension is corrected based on the verification result. This effectively ensures that accurate user data is collected together with a confidence for each piece of user data.
Another embodiment of the present invention provides an apparatus for automatically collecting user data, as shown in fig. 5, including:
The collection unit 501 is configured to, when a data collection request is received, output at least one collection-type question corresponding to each target collection dimension, and set a corresponding confidence for the feedback data of each collection-type question corresponding to each target collection dimension.
The confidence indicates how credible the feedback data for a collection-type question is.
It should be noted that, the specific working process of the collection unit 501 may refer to step S101 in the foregoing method embodiment accordingly, and is not described herein again.
The first calculating unit 502 is configured to calculate an overall confidence corresponding to each target collection dimension according to the feedback data of the collection-type problem corresponding to each target collection dimension and the confidence of the feedback data.
It should be noted that, the specific working process of the first calculating unit 502 may refer to step S102 in the foregoing method embodiment accordingly, and is not described herein again.
A first output unit 503, configured to output a verification problem corresponding to each target collection dimension.
It should be noted that, the specific working process of the first output unit 503 may refer to step S103 in the foregoing method embodiment accordingly, and is not described herein again.
The verification unit 504 is configured to receive feedback data of the verification problem corresponding to each target collection dimension, which is input by a user, and verify a confidence of the feedback data of the collection problem corresponding to the target collection dimension by using the feedback data of the verification problem corresponding to each target collection dimension, so as to obtain a verification result of the feedback data of the collection problem corresponding to each target collection dimension.
It should be noted that, the specific working process of the verification unit 504 may refer to step S104 in the above method embodiment accordingly, and is not described herein again.
A correcting unit 505, configured to correct the overall confidence corresponding to the target collection dimension based on the verification result of the feedback data of the collection-type problem corresponding to each target collection dimension.
It should be noted that, the specific working process of the correcting unit 505 may refer to step S105 in the foregoing method embodiment accordingly, which is not described herein again.
A second output unit 506, configured to output the feedback data of the collection-class problem corresponding to each target collection dimension and the corrected overall confidence corresponding to each target collection dimension.
It should be noted that, the specific working process of the second output unit 506 may refer to step S106 in the foregoing method embodiment accordingly, and is not described herein again.
Optionally, in another embodiment of the present invention, as shown in fig. 6, the collecting unit 501 specifically includes:
the selecting unit 601 is configured to select multiple collection dimensions as target collection dimensions when a data collection request is received.
The first screening unit 602 is configured to screen, for each target collection dimension, at least one basic collection problem from a plurality of collection problems in the local question bank.
It should be noted that, the specific working process of the first screening unit 602 may refer to step S202 in the foregoing method embodiment accordingly, and details are not described here again.
A third output unit 603, configured to output the basic collection type questions by using the output manner of each basic collection type question.
It should be noted that, the specific working process of the third output unit 603 may refer to step S203 in the foregoing method embodiment accordingly, and details are not repeated here.
The first setting unit 604 is configured to obtain feedback data of each basic collection type problem, and set a confidence level of the feedback data of each basic collection type problem according to a confidence level determination method corresponding to each basic collection type problem.
It should be noted that, the specific working process of the first setting unit 604 may refer to step S204 in the foregoing method embodiment accordingly, and details are not described here again.
The second screening unit 605 is configured to screen a plurality of deep collection problems associated with any one basic collection problem from the plurality of collection problems in the local question bank based on the confidence of the feedback data of each basic collection problem.
It should be noted that, the specific working process of the second screening unit 605 may refer to step S205 in the foregoing method embodiment accordingly, and details are not described here again.
A fourth output unit 606, configured to output the deep collection type problem by using the output mode corresponding to each deep collection type problem.
It should be noted that, the specific working process of the fourth output unit 606 may refer to step S206 in the foregoing method embodiment accordingly, and is not described herein again.
The second setting unit 607 is configured to obtain the feedback data of each deep-collecting type problem, and set the confidence level of the feedback data of each deep-collecting type problem according to the confidence level determination method corresponding to each deep-collecting type problem.
It should be noted that, the specific working process of the second setting unit 607 may refer to step S207 in the foregoing method embodiment accordingly, and is not described herein again.
Optionally, in another embodiment of the present invention, the first output unit 503 includes:
a first output subunit, configured to, for each target collection dimension, screen the verification problem corresponding to the target collection dimension from a plurality of verification problems in the local question bank, or determine the verification problem corresponding to the target collection dimension from the cloud database, and output it.
The verification problem corresponding to each target collection dimension is obtained by screening based on the feedback data of the collection type problem corresponding to that target collection dimension.
It should be noted that, for the specific working process of the first output subunit, a specific implementation manner of step S103 in the foregoing method embodiment may be referred to accordingly, and details are not described here again.
Optionally, in another embodiment of the present invention, the verification unit 504 includes:
and the verification subunit is used for respectively verifying the correctness of the feedback data of the verification problem corresponding to each target collection dimension.
When the correctness of the feedback data of the verification problem corresponding to the target collection dimension cannot be verified, the verification subunit determines that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is unverifiable; when the feedback data of the verification problem corresponding to the target collection dimension is verified to be wrong, the verification subunit determines that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is low in confidence; and when the feedback data of the verification problem corresponding to the target collection dimension is verified to be correct, the verification subunit determines that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is high in confidence.
It should be noted that, for the specific working process of the verification subunit, reference may be made to a specific implementation manner of step S104 in the foregoing method embodiment, and details are not described here again.
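The three-way outcome described for the verification subunit can be sketched as follows. This is only an illustrative sketch: the names `VerificationResult` and `classify_verification` are hypothetical, and representing "cannot be verified" as `None` is an assumption made for the example, not something the embodiment prescribes.

```python
from enum import Enum
from typing import Optional


class VerificationResult(Enum):
    """The three verification results named in the embodiment."""
    UNVERIFIABLE = "unverifiable"
    LOW_CONFIDENCE = "low in confidence"
    HIGH_CONFIDENCE = "high in confidence"


def classify_verification(is_correct: Optional[bool]) -> VerificationResult:
    """Map the check of a verification answer to a verification result.

    Assumption: is_correct is None when the correctness of the answer
    to the verification problem could not be checked at all.
    """
    if is_correct is None:
        return VerificationResult.UNVERIFIABLE
    if is_correct:
        return VerificationResult.HIGH_CONFIDENCE
    return VerificationResult.LOW_CONFIDENCE
```

Keeping "unverifiable" separate from "wrong" matters because the two cases are corrected with different coefficients later on.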
Optionally, in another embodiment of the present invention, the correcting unit 505 includes:
a first correcting unit, configured to calculate, when the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is unverifiable, the product of a preset first correction coefficient corresponding to the verification problem and the overall confidence corresponding to the target collection dimension, so as to obtain the corrected overall confidence corresponding to the target collection dimension. The first correction coefficient is smaller than 1.
A second correcting unit, configured to calculate, when the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is low in confidence, the product of a preset second correction coefficient corresponding to the verification problem and the overall confidence corresponding to the target collection dimension, so as to obtain the corrected overall confidence corresponding to the target collection dimension. The second correction coefficient is smaller than 1.
A third correcting unit, configured to calculate, when the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is high in confidence, the product of a preset third correction coefficient corresponding to the verification problem and the overall confidence corresponding to the target collection dimension, so as to obtain the corrected overall confidence corresponding to the target collection dimension. The third correction coefficient is equal to 1.
It should be noted that, for the specific working process of the above units, reference may be made to the specific implementation manner of step S105 in the foregoing method embodiment, and details are not described here again.
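As a worked illustration of the correction, the sketch below multiplies the overall confidence of a target collection dimension by the coefficient that matches the verification result. The coefficient values 0.8 and 0.5 are hypothetical; the embodiment only requires that the first and second coefficients be smaller than 1 and the third be equal to 1. Using one shared coefficient table, rather than a table preset per verification problem, is a simplifying assumption, as are the function and key names.

```python
from typing import Dict

# Hypothetical values. The text only requires that the first two
# coefficients be smaller than 1 and that the third be equal to 1.
CORRECTION_COEFFICIENTS: Dict[str, float] = {
    "unverifiable": 0.8,     # first correction coefficient (< 1)
    "low_confidence": 0.5,   # second correction coefficient (< 1)
    "high_confidence": 1.0,  # third correction coefficient (== 1)
}


def correct_overall_confidence(
    overall_confidence: float,
    verification_result: str,
    coefficients: Dict[str, float] = CORRECTION_COEFFICIENTS,
) -> float:
    """Return the corrected overall confidence for one target dimension.

    The corrected value is the product of the preset coefficient for the
    verification result and the uncorrected overall confidence.
    """
    return coefficients[verification_result] * overall_confidence
```

With these assumed values, a dimension whose verification answer was wrong has its overall confidence halved, while a correctly verified dimension keeps its overall confidence unchanged.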
Optionally, in another embodiment of the present invention, the first screening unit 602, as shown in fig. 7, includes:
a first determining unit 701, configured to determine the total number of the basic collection problems to be selected.
It should be noted that, for the specific working process of the first determining unit 701, a specific implementation manner of the step S301 in the foregoing method embodiment may be referred to accordingly, and details are not described here again.
A second determining unit 702, configured to determine, according to the total number of the basic collection problems to be selected and the importance coefficient of each collection problem corresponding to each target collection dimension, the selected number of the basic collection problems corresponding to each target collection dimension; the larger the importance coefficient of the collection problem corresponding to the target collection dimension is, the larger the selected number of the basic collection problems corresponding to the target collection dimension is.
It should be noted that, for the specific working process of the second determining unit 702, reference may be made to the specific implementation manner of step S302 in the foregoing method embodiment, and details are not described here again.
The first screening subunit 703 is configured to screen, from the multiple collection problems corresponding to each target collection dimension, basic collection problems whose number equals the selected number of the basic collection problems corresponding to that target collection dimension.
It should be noted that, for the specific working process of the first screening subunit 703, reference may be made to the specific implementation manner of step S303 in the foregoing method embodiment, and details are not described here again.
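One possible way to turn a total number and importance coefficients into per-dimension counts is sketched below. The embodiment only states that a larger importance coefficient yields a larger selected number; the proportional split with largest-remainder rounding, the guarantee of one problem per dimension, the single coefficient per dimension, and all names here are assumptions made for illustration.

```python
from typing import Dict


def allocate_basic_counts(total: int, importance: Dict[str, float]) -> Dict[str, int]:
    """Split `total` basic collection problems across target dimensions.

    Assumed rule: every dimension gets one problem, and the remaining
    problems are shared in proportion to the importance coefficients,
    using largest-remainder rounding so the counts add up to `total`.
    """
    if total < len(importance):
        raise ValueError("total must allow at least one problem per dimension")
    spare = total - len(importance)
    weight_sum = sum(importance.values())
    # Exact (fractional) share of the spare problems for each dimension.
    exact = {dim: spare * w / weight_sum for dim, w in importance.items()}
    counts = {dim: 1 + int(share) for dim, share in exact.items()}
    leftover = total - sum(counts.values())
    # Hand the problems lost to rounding to the largest fractional parts.
    ranked = sorted(exact, key=lambda d: exact[d] - int(exact[d]), reverse=True)
    for dim in ranked[:leftover]:
        counts[dim] += 1
    return counts
```

For example, with hypothetical dimensions `income`, `health` and `hobby` weighted 3, 2 and 1, a total of 10 basic problems would be split so that the more important dimension never receives fewer problems than a less important one.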
Optionally, in another embodiment of the present invention, the second screening unit 605, as shown in fig. 8, includes:
the second calculating unit 801 is configured to calculate, for each basic collection problem, the number to be selected of deep collection problems having an association relationship with the basic collection problem according to the confidence of the feedback data of the basic collection problem.
The lower the confidence of the feedback data of the basic collection problem is, the larger the calculated number to be selected of the deep collection problems having an association relationship with the basic collection problem is.
It should be noted that, for the specific working process of the second calculating unit 801, a specific implementation manner of the step S401 in the foregoing method embodiment may be referred to accordingly, and details are not described here again.
A second screening subunit 802, configured to screen, from the collection problems that have an association relationship with each basic collection problem, deep collection problems whose number equals the number to be selected for that basic collection problem.
It should be noted that, for the specific working process of the second screening subunit 802, reference may be made to the specific implementation manner of step S402 in the foregoing method embodiment, and details are not described here again.
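The inverse relationship between answer confidence and the number of follow-up problems can be illustrated with a simple rule. The embodiment does not give a formula, so the linear rule below, the assumption that confidence lies in the range 0 to 1, the cap `max_count`, and the function names are all hypothetical.

```python
import math
from typing import List


def deep_question_count(confidence: float, max_count: int = 4) -> int:
    """Number of deep collection problems to select for one basic problem.

    Assumed rule: the count falls linearly from max_count (confidence 0)
    to zero (confidence 1), so lower confidence yields more problems.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence is assumed to lie in [0, 1]")
    return math.ceil(max_count * (1.0 - confidence))


def screen_deep_questions(associated: List[str], confidence: float,
                          max_count: int = 4) -> List[str]:
    """Pick the required number of problems associated with a basic problem."""
    return associated[:deep_question_count(confidence, max_count)]
```

Taking the first entries of the associated list is only a placeholder for whatever screening order the question bank defines.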
Those of skill in the art would further appreciate that the various illustrative units and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (14)

1. A method for automated collection of user data, comprising:
when a data collection request is received, outputting at least one collection type problem corresponding to each target collection dimension, and setting corresponding confidence coefficient aiming at the feedback data of each collection type problem corresponding to each target collection dimension; wherein the confidence level is used for explaining the credibility of the feedback data of the collection type problem;
calculating to obtain an overall confidence corresponding to each target collection dimension according to the feedback data of the collection problem corresponding to each target collection dimension and the confidence of the feedback data;
outputting a verification problem corresponding to each target collection dimension;
receiving feedback data of the verification problem corresponding to each target collection dimension input by a user, verifying the confidence of the feedback data of the collection problem corresponding to the target collection dimension by using the feedback data of the verification problem corresponding to each target collection dimension, and obtaining a verification result of the feedback data of the collection problem corresponding to each target collection dimension;
correcting the overall confidence corresponding to the target collection dimension based on the verification result of the feedback data of the collection problem corresponding to each target collection dimension;
and outputting feedback data of the collection type problem corresponding to each target collection dimension and the corrected overall confidence corresponding to each target collection dimension.
2. The method according to claim 1, wherein the outputting at least one collection class problem corresponding to each target collection dimension when receiving a data collection request, and setting a corresponding confidence for the feedback data of each collection class problem corresponding to each target collection dimension comprises:
selecting a plurality of collection dimensions when a data collection request is received, and taking each selected collection dimension as the target collection dimension;
screening at least one basic collection problem from a plurality of collection problems in a local question bank aiming at each target collection dimension;
outputting the basic collection problems by respectively adopting the output mode of each basic collection problem;
acquiring feedback data of each basic collection type problem, and setting the confidence coefficient of the feedback data of each basic collection type problem according to a confidence coefficient determination method corresponding to each basic collection type problem;
screening a plurality of deep collection problems which have an association relation with any one basic collection problem from a plurality of collection problems in the local question bank based on the confidence of the feedback data of each basic collection problem;
respectively adopting an output mode corresponding to each deep collection problem to output the deep collection problems;
and acquiring the feedback data of each deep collection type problem, and setting the confidence coefficient of the feedback data of each deep collection type problem according to the confidence coefficient determining method corresponding to each deep collection type problem.
3. The method of claim 1, wherein outputting the verification problem corresponding to each of the target collection dimensions comprises:
for each target collection dimension, screening a plurality of verification problems in a local question bank to obtain a verification problem corresponding to the target collection dimension, or determining a verification problem corresponding to the target collection dimension from a cloud database to output; and the verification problem corresponding to each target collection dimension is obtained by screening based on the feedback data of the collection type problem corresponding to each target collection dimension.
4. The method according to claim 1, wherein the verifying the confidence of the feedback data of the collection-class problem corresponding to the target collection dimension by using the feedback data of the verification problem corresponding to each target collection dimension to obtain the verification result of the feedback data of the collection-class problem corresponding to each target collection dimension comprises:
verifying the correctness of the feedback data of the verification problem corresponding to each target collection dimension respectively;
if the correctness of the feedback data of the verification problem corresponding to the target collection dimension cannot be verified, determining that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is unverifiable;
if the feedback data of the verification problem corresponding to the target collection dimension is verified to be wrong, determining that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is low in confidence;
and if the feedback data of the verification problem corresponding to the target collection dimension is verified to be correct, determining that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is high in confidence.
5. The method according to claim 4, wherein the correcting the overall confidence corresponding to the target collection dimension based on the verification result of the feedback data of the collection class problem corresponding to each target collection dimension comprises:
if the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is unverifiable, calculating the product of a preset first correction coefficient corresponding to the verification problem and the overall confidence corresponding to the target collection dimension to obtain the corrected overall confidence corresponding to the target collection dimension; wherein the first correction coefficient is less than 1;
if the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is low in confidence, calculating the product of a preset second correction coefficient corresponding to the verification problem and the overall confidence corresponding to the target collection dimension to obtain the corrected overall confidence corresponding to the target collection dimension; wherein the second correction coefficient is less than 1;
if the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is high in confidence, calculating the product of a preset third correction coefficient corresponding to the verification problem and the overall confidence corresponding to the target collection dimension to obtain the overall confidence corresponding to the corrected target collection dimension; wherein the third correction factor is equal to 1.
6. The method of claim 2, wherein the screening at least one basic collection class problem from a plurality of collection class problems in a local question bank for each of the target collection dimensions comprises:
determining the total number of the basic collection problems to be selected;
determining the selected number of the basic collection problems corresponding to each target collection dimension according to the total number of the basic collection problems to be screened and the importance coefficient of each collection problem corresponding to each target collection dimension; the larger the importance coefficient of the collection problem corresponding to the target collection dimension is, the larger the selection number of the basic collection problems corresponding to the target collection dimension is;
and screening the plurality of collection problems corresponding to each target collection dimension to obtain basic collection problems meeting the selection quantity of the basic collection problems corresponding to each target collection dimension.
7. The method according to claim 2, wherein the screening a plurality of deep collection problems which have an association relation with any one basic collection problem from the plurality of collection problems in the local question bank based on the confidence of the feedback data of each basic collection problem comprises:
for each basic collection problem, calculating the number to be selected of deep collection problems which have an association relation with the basic collection problem according to the confidence of the feedback data of the basic collection problem; the lower the confidence of the feedback data of the basic collection problem is, the larger the calculated number to be selected of the deep collection problems which have an association relation with the basic collection problem is;
and screening, from the collection problems which have an association relation with each basic collection problem, deep collection problems whose number equals the number to be selected.
8. An apparatus for automated collection of user data, comprising:
a collecting unit, configured to output at least one collection type problem corresponding to each target collection dimension when a data collection request is received, and to set a corresponding confidence for the feedback data of each collection type problem corresponding to each target collection dimension; wherein the confidence is used for explaining the credibility of the feedback data of the collection type problem;
the first calculation unit is used for calculating and obtaining an overall confidence coefficient corresponding to each target collection dimension according to the feedback data of the collection type problem corresponding to each target collection dimension and the confidence coefficient of the feedback data;
the first output unit is used for outputting the verification problem corresponding to each target collection dimension;
the verification unit is used for receiving feedback data of the verification problem corresponding to each target collection dimension, which is input by a user, verifying the confidence coefficient of the feedback data of the collection problem corresponding to the target collection dimension by using the feedback data of the verification problem corresponding to each target collection dimension, and obtaining the verification result of the feedback data of the collection problem corresponding to each target collection dimension;
the correction unit is used for correcting the overall confidence corresponding to the target collection dimension based on the verification result of the feedback data of the collection problem corresponding to each target collection dimension;
and the second output unit is used for outputting feedback data of the collection type problem corresponding to each target collection dimension and the corrected overall confidence corresponding to each target collection dimension.
9. The apparatus of claim 8, wherein the collection unit comprises:
a selecting unit, configured to select a plurality of collection dimensions when a data collection request is received and to take each selected collection dimension as a target collection dimension;
the first screening unit is used for screening at least one basic collection problem from a plurality of collection problems in a local question bank aiming at each target collection dimension;
the third output unit is used for outputting the basic collection problems by respectively adopting the output mode of each basic collection problem;
the first setting unit is used for acquiring the feedback data of each basic collection type problem and setting the confidence coefficient of the feedback data of each basic collection type problem according to the confidence coefficient determining method corresponding to each basic collection type problem;
the second screening unit is used for screening a plurality of deep collection problems which have an association relation with any one basic collection problem from a plurality of collection problems in the local question bank based on the confidence degree of the feedback data of each basic collection problem;
the fourth output unit is used for outputting the deep collection problems by respectively adopting the output mode corresponding to each deep collection problem;
and the second setting unit is used for acquiring the feedback data of each deep collection type problem and setting the confidence coefficient of the feedback data of each deep collection type problem according to the confidence coefficient determining method corresponding to each deep collection type problem.
10. The apparatus of claim 8, wherein the first output unit comprises:
the first output subunit is configured to, for each target collection dimension, screen a plurality of verification problems in a local question bank to obtain a verification problem corresponding to the target collection dimension, or determine a verification problem output corresponding to the target collection dimension from a cloud database;
and the verification problem corresponding to each target collection dimension is obtained by screening based on the feedback data of the collection type problem corresponding to each target collection dimension.
11. The apparatus of claim 8, wherein the verification unit comprises:
the verification subunit is used for respectively verifying the correctness of the feedback data of the verification problem corresponding to each target collection dimension;
when the correctness of the feedback data of the verification problem corresponding to the target collection dimension cannot be verified, the verification subunit determines that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is unverifiable; when the feedback data of the verification problem corresponding to the target collection dimension is verified to be wrong, determining that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is low in confidence; and when the feedback data of the verification problem corresponding to the target collection dimension is verified to be correct, determining that the verification result of the feedback data of the collection problem corresponding to the target collection dimension is high in confidence.
12. The apparatus of claim 11, wherein the correction unit comprises:
a first correcting unit, configured to calculate, when the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is unverifiable, the product of a preset first correction coefficient corresponding to the verification problem and the overall confidence corresponding to the target collection dimension, so as to obtain the corrected overall confidence corresponding to the target collection dimension; wherein the first correction coefficient is less than 1;
a second correcting unit, configured to calculate, when the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is low in confidence, the product of a preset second correction coefficient corresponding to the verification problem and the overall confidence corresponding to the target collection dimension, so as to obtain the corrected overall confidence corresponding to the target collection dimension; wherein the second correction coefficient is less than 1;
a third correcting unit, configured to calculate, when the verification result of the feedback data of the collection type problem corresponding to the target collection dimension is high in confidence, the product of a preset third correction coefficient corresponding to the verification problem and the overall confidence corresponding to the target collection dimension, so as to obtain the corrected overall confidence corresponding to the target collection dimension; wherein the third correction coefficient is equal to 1.
13. The apparatus of claim 9, wherein the first screening unit comprises:
the first determining unit is used for determining the total number of the basic collection problems to be selected;
a second determining unit, configured to determine, according to the total number of the basic collection problems to be screened and the importance coefficient of each collection problem corresponding to each target collection dimension, the selected number of the basic collection problems corresponding to each target collection dimension; the larger the importance coefficient of the collection problem corresponding to the target collection dimension is, the larger the selection number of the basic collection problems corresponding to the target collection dimension is;
the first screening subunit is configured to screen, from the multiple collection problems corresponding to each target collection dimension, basic collection problems that satisfy the selected number of the basic collection problems corresponding to each target collection dimension.
14. The apparatus of claim 9, wherein the second screening unit comprises:
a second calculation unit, configured to calculate, for each basic collection problem and according to the confidence of the feedback data of that basic collection problem, the number of deep collection problems to be selected that are associated with that basic collection problem; wherein the lower the confidence of the feedback data of the basic collection problem, the larger the calculated number of associated deep collection problems to be selected;
a second screening subunit, configured to screen, for each basic collection problem, deep collection problems meeting the number to be selected from the collection problems associated with that basic collection problem.
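The relationship in claim 14 between confidence and the number of deep collection problems can be sketched as follows. The claim specifies only that lower confidence leads to more deep problems; the linear formula, the cap `max_deep`, the assumption that confidence lies in [0, 1], and the function names are hypothetical.

```python
import math
from typing import Dict, List


def deep_problem_count(confidence: float, max_deep: int = 4) -> int:
    """Map a confidence in [0, 1] to a number of deep problems (assumed formula)."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    # Lower confidence gives a larger count; rounding guards against float noise.
    return math.ceil(round((1.0 - confidence) * max_deep, 9))


def select_deep_problems(basic_confidence: Dict[str, float],
                         associated: Dict[str, List[str]],
                         max_deep: int = 4) -> Dict[str, List[str]]:
    """For each basic problem, take the required number of associated problems."""
    selected = {}
    for basic_id, confidence in basic_confidence.items():
        wanted = deep_problem_count(confidence, max_deep)
        # Taking the first `wanted` entries is an assumed screening rule.
        selected[basic_id] = associated.get(basic_id, [])[:wanted]
    return selected
```

Under these assumptions, a basic collection problem answered with full confidence triggers no follow-up, while one answered with confidence 0.25 triggers three of its associated deep collection problems.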
CN201911183543.0A 2019-11-27 2019-11-27 Automatic user data collection method and device Active CN110888864B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911183543.0A CN110888864B (en) 2019-11-27 2019-11-27 Automatic user data collection method and device

Publications (2)

Publication Number Publication Date
CN110888864A (en) 2020-03-17
CN110888864B (en) 2022-08-23

Family

ID=69749066

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911183543.0A Active CN110888864B (en) 2019-11-27 2019-11-27 Automatic user data collection method and device

Country Status (1)

Country Link
CN (1) CN110888864B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982088A (en) * 2012-11-01 2013-03-20 北京百度网讯科技有限公司 Method for providing feedback information of user on destination page
CN104333530A (en) * 2013-07-22 2015-02-04 深圳市腾讯计算机系统有限公司 Information credibility verifying method and apparatus
CN104572972A (en) * 2014-12-31 2015-04-29 百度在线网络技术(北京)有限公司 Method and device for verifying user
US20150261859A1 (en) * 2014-03-11 2015-09-17 International Business Machines Corporation Answer Confidence Output Mechanism for Question and Answer Systems
US20160104200A1 (en) * 2014-10-08 2016-04-14 Microsoft Corporation User directed information collections
CN106355414A (en) * 2015-07-15 2017-01-25 阿里巴巴集团控股有限公司 Method and apparatus for processing user feedback information
CN107369034A (en) * 2017-06-14 2017-11-21 广东数相智能科技有限公司 Method and apparatus for judging the sincerity of user surveys
CN107909376A (en) * 2017-12-05 2018-04-13 国网山东省电力公司济南供电公司 Power system customer satisfaction response system
CN108415938A (en) * 2018-01-24 2018-08-17 中电科华云信息技术有限公司 Method and system for automatic data labeling based on intelligent pattern recognition
CN108804682A (en) * 2018-06-12 2018-11-13 北京顶象技术有限公司 Analyze method, apparatus, electronic equipment and the storage medium of video comments authenticity
CN108829839A (en) * 2018-06-19 2018-11-16 精硕科技(北京)股份有限公司 Method, device, storage medium and processor for verifying sample credibility
CN109344176A (en) * 2018-09-05 2019-02-15 浙江工业大学 False comment detection method based on Two-way Cycle figure
CN110070333A (en) * 2019-03-19 2019-07-30 平安普惠企业管理有限公司 Intelligent questionnaire method, device, computer equipment and storage medium
CN110311788A (en) * 2019-06-28 2019-10-08 京东数字科技控股有限公司 Auth method, device, electronic equipment and readable medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
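Yang Mei (杨梅): "Research on questionnaire-based data collection and analysis" (基于问卷的数据收集与分析的研究), China Master's Theses Full-text Database (Social Sciences) *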

Also Published As

Publication number Publication date
CN110888864B (en) 2022-08-23

Similar Documents

Publication Publication Date Title
CN107945015B (en) Man-machine question and answer auditing method, device, equipment and computer readable storage medium
US20180260891A1 (en) Systems and methods for generating and using optimized ensemble models
CN110458697A (en) Method and apparatus for assessing risk
US9384231B2 (en) Data lineage management operation procedures
US9892106B1 (en) Methods, systems and articles for correcting errors in electronic government forms
US20090150166A1 (en) Hiring process by using social networking techniques to verify job seeker information
US20130046661A1 (en) Accounting system and management methods of transaction classifications that is simple, accurate and self-adapting
US20040158512A1 (en) System and method for coordinating the collection, analysis and storage of payroll information provided to government agencies by government contractors
US20100274708A1 (en) Apparatus and method for creating a collateral risk score and value tolerance for loan applications
CN106295351B (en) A kind of Risk Identification Method and device
US20120109834A1 (en) Automated business and individual risk management and validation process
WO2008042246A1 (en) Process and system for the automated collection of business information directly from a business entity's accounting system
WO2008068630A2 (en) Intelligent collections models
CN111160737A (en) Adaptation method of resource allocation scheme and related equipment
CN111369006B (en) Recall model generation method and device
CN110866209A (en) Online education data pushing method and system and computer equipment
US11127082B1 (en) Virtual assistant for recommendations on whether to arbitrate claims
CN116629456A (en) Method, system and storage medium for predicting overdue risk of service
CN115860280B (en) Shale gas yield prediction method, device, equipment and storage medium
US20150310545A1 (en) System and method for progress account opening by means of risk-based context analysis
Padilla et al. The effect of assembly bias on redshift-space distortions
CN110888864B (en) Automatic user data collection method and device
US9548996B2 (en) Hybrid engine for generating a recommended security tier
CN111047146B (en) Risk identification method, device and equipment for enterprise users
CN117541401A (en) Information pushing method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant