WO2019243876A1 - Method, system and computer program for determining weights of representativeness in individual-level data - Google Patents

Method, system and computer program for determining weights of representativeness in individual-level data

Info

Publication number
WO2019243876A1
WO2019243876A1 PCT/IB2018/054587 IB2018054587W WO2019243876A1 WO 2019243876 A1 WO2019243876 A1 WO 2019243876A1 IB 2018054587 W IB2018054587 W IB 2018054587W WO 2019243876 A1 WO2019243876 A1 WO 2019243876A1
Authority
WO
WIPO (PCT)
Prior art keywords
person
representativeness
panel
weight
search queries
Prior art date
Application number
PCT/IB2018/054587
Other languages
French (fr)
Inventor
Mathieu TREPANIER
Mikael BOURQUI
Original Assignee
Tsquared Insights Sa
Priority date
Filing date
Publication date
Application filed by Tsquared Insights Sa
Priority to PCT/IB2018/054587
Publication of WO2019243876A1

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0203 Market surveys; Market polls
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Definitions

  • the present invention concerns a method, system and computer program for determining weights of representativeness in individual-level data and for performing behavioral analysis in individual-level data.
  • characteristics e.g. socio-demographic characteristics
  • the weights applied to the respondents to a survey are normally determined using their demographic attributes such as age, gender, and geographical location. This assumes that if the data is adjusted (through reweighting) in such a way that each demographic group is represented in proportion to its size in the general population, measurements on the behaviour of the sample will accurately represent the behaviour of the population.
  • the weights of representativeness are determined based on the occurrences of the search queries of the persons of the panel and the occurrences of the search queries of a population which should be represented by the panel. This allows, first, a fully automatic determination of the weights of representativeness. Second, the representativeness of the results of the panel for the population is significantly improved, because the representativeness is determined on data representing the behaviour. This allows the representativeness of each individual to be determined significantly better than with demographic attributes. Third, this also allows the analysis of data for which demographic attributes of the persons of the panel are missing or incomplete. Fourth, this method is particularly well suited for representing a population which is limited to the internet-using part of a general population.
  • the dependent claims refer to advantageous embodiments. The way of calculating the weights of representativeness defined in the dependent claims is particularly well suited to improve the representativeness of the panel.
  • An alternative embodiment of the invention is a computerized method for determining weights of representativeness in individual-level data, wherein the individual-level data comprise for a panel of persons e-commerce shopping events performed by each person of the panel, wherein the panel of persons of the individual-level data is a subset of a predetermined population of persons, wherein reference data comprises the occurrences of the e-commerce shopping events performed by the persons of the population, the method comprising the step of determining, in a processing means, for each person of the panel the weight of representativeness of this person based on the occurrences of the e-commerce shopping events for this person of a defined set of e-commerce shopping events of the individual-level data and based on the occurrence of the e-commerce shopping events of the same set of e-commerce shopping events in the reference data.
  • the weight of representativeness of each person is further based on the occurrences of the e-commerce shopping events for all persons of the panel of the defined set of e-commerce shopping events of the individual-level data.
  • the weights of representativeness are determined iteratively.
  • the weight of representativeness of each person in a second or higher iteration is further based on the weight of representativeness of this person from the previous iteration.
  • the weight of representativeness of each person in a second or higher iteration is further based on the weight of representativeness of all persons of the panel from the previous iteration.
  • the following steps are performed in the processing means in each iteration: selecting the defined set of e-commerce shopping events in the individual-level data for this iteration; determining for each person of the panel an intermediate weight of representativeness of this person based on the occurrences of the e-commerce shopping events for this person, preferably for all persons of the panel of the set of e-commerce shopping events selected in this iteration and based on the occurrences of the e-commerce shopping events of the same set of e-commerce shopping events in the reference data, and determining for each person of the panel the weight of representativeness of this person based on the intermediate weight of representativeness of this person.
  • the weight of representativeness of each person of the panel is based on the intermediate weight of representativeness of this person of this iteration and the weight of representativeness of this person from the previous iteration.
  • the weight of representativeness of each person or the intermediate weight of representativeness of each person is determined, in the processing means, based on a weighted sum over all e-commerce shopping events of the defined set of e-commerce shopping events of the multiplication of the occurrence of each e-commerce shopping event of the defined set of e-commerce shopping events for this person with a weight of each e-commerce shopping event of the defined set of e-commerce shopping events.
  • the weight of representativeness of each person or the intermediate weight of representativeness of each person is further determined, in the processing means, based on a sum over all e-commerce shopping events of the defined set of e-commerce shopping events of the weight of each e-commerce shopping event of the defined set of e-commerce shopping events.
  • the weight of representativeness of each person or the intermediate weight of representativeness of each person is determined, in the processing means, based on a ratio of a sum of an offset and of the sum over all e-commerce shopping events of the defined set of e-commerce shopping events of the weight of each e-commerce shopping event of the defined set of e-commerce shopping events divided by a sum of the offset and of the weighted sum.
  • the weight of each e-commerce shopping event is determined based on the ratio of the cumulative occurrence of this e-commerce shopping event of all persons of the panel divided by the occurrence of this e-commerce shopping event in the population.
  • the cumulative occurrence of this e-commerce shopping event of all persons of the panel is based on the sum over all persons of the panel of the multiplication of the occurrence of this e-commerce shopping event of each person of the panel with the weight of representativeness of the respective person from the previous iteration.
  • Fig. 1 shows a view of a schematic embodiment of a system according to the invention.
  • Fig. 2 shows a view of a schematic embodiment of a method for behavioural analysis in individual-level data according to the invention.
  • Fig. 3 shows a view of a schematic embodiment of a method for determining weights of representativeness in individual-level data according to the invention.
  • the term "person" is used herein to define any distinguishable individual.
  • the distinguishable individual is normally detected automatically, e.g. by device tracking, etc., such that the person can also refer to an identified device. This automatic detection of the persons might not always be correct, such that two persons might in fact refer to two humans.
  • the term person shall not be interpreted as the human behind the person, but as the distinguishable individual defined in the individual-level data (see definition below) deemed to be a different human than the other persons.
  • the person is distinguishable in the individual-level data by an identifier. This identifier could be simply a code, a number, name, etc.
  • the identifier is anonymous such that the person cannot be identified.
  • the term "population" defines a set of persons with at least one common demographic attribute.
  • This at least one common demographic attribute is preferably the geographical region, e.g. the country.
  • the population could be all persons of a country region comprising at least one country, e.g. Switzerland, European Union, etc. It is also possible that the at least one common demographic attribute comprises (in addition) the age.
  • the population could thus comprise all persons of a certain age group (in a certain geographical region). Due to the method defined later, the population is restricted to the internet-using persons with the at least one common demographic attribute. For example, the population comprises all internet-using persons of a certain country region.
  • search query refers to any identifier defining a search in the internet by a person in a search engine.
  • the search query is preferably a keyword.
  • a keyword can comprise one or more words.
  • a word can be any concatenation of characters like letters, numbers or other signs defined by character encodings like American Standard Code for
  • the search query could refer also to complex search query identifiers like images, sounds, locations, etc.
  • the term "individual-level data" refers to any data which are associated with different persons, in particular with the identifiers of the persons.
  • the individual-level data comprise thus for each person (or its identifier) data associated to the person. This data for some of those persons can be empty, if this person was maybe not active for example in a relevant period.
  • the individual-level data comprise at least the search queries performed by the persons.
  • the individual-level data comprise for each person the search queries performed.
  • the individual-level data comprise for each search query the keyword searched (the term entered into the search engine by the user) and the identifier identifying (maybe anonymously) the person performing the search.
  • the search queries performed by each person could be stored in different ways.
  • individual-level data could comprise directly the occurrence of each search query performed by each user.
  • the individual-level data could comprise the search queries performed by each user over time (with or without a time stamp). This allows the occurrence to be determined from the individual-level data.
  • the occurrence of the search queries performed by each person might be stored directly or indirectly in the individual-level data. In the latter case, the occurrence of the search queries can be retrieved from the individual-level data.
  • the individual-level data comprise preferably further data associated to each person, e.g. location data, visited web pages, etc.
  • the individual-level data can be already pre-processed for the below described methods.
  • the pre-processing could comprise anonymization and/or categorization.
  • the categorization could for example group the activity of each person (e.g. search queries) in different time slots (hours, days, weeks, months, years) or in only one time slot to be analysed.
  • the individual-level data can however also be rather in a raw format such that the below method could comprise the steps of anonymization or categorization or of retrieving the
  • a person of the individual-level data indicates a person in relation to whom data, in particular performed search queries are stored in the individual-level data.
  • the individual-level data can also be device-level data. In this case, each person corresponds to a device and/or each device corresponds to a person.
  • the individual-level data preferably comprise records of actions of the persons. The actions can be search queries or other actions like visiting web-sites, opening applications, visiting locations, etc. Many records of actions about one person constitute the data about the behaviour of the person.
  • the term "reference data" shall comprise any data which indicate the occurrence of different search queries in the population. This can be directly the list of search queries with their respective occurrence. However, it is also possible that the reference data contains the information on the occurrence of different search queries in the population only indirectly, and that this information must be retrieved from the reference data.
  • the term "panel" refers to a defined set of persons of the individual-level data which is a (strict) subset of the population.
  • the panel preferably comprises all persons who fulfill the at least one common demographic attribute of the population. It is however possible to define the set of persons of the panel to be smaller, e.g. because some persons were not active during the complete time window of analysis or for other reasons.
  • occurrence of a search query can be any information which indicates the occurrence of a search query. It could be the absolute number of times the search query was performed (by a person, a panel or a population). In this case, it is important that the absolute number is always taken for the same period of time, preferably but not necessarily the same period of time as used for the behavioural analysis described later. It is however also possible that the occurrence of a search query is indicated as a frequency, i.e. normalized by a time period. This has the advantage that data with different time periods can be used. It has however also the disadvantage that different time periods might distort the results.
  • Fig. 1 shows an embodiment of a system for performing the below described methods.
  • the system comprises a storage means 1 and a processing means 2.
  • the system can be a computer.
  • the system can also comprise two or more interconnected computers.
  • the interconnected computers could be located in the same location as in a data center or in remote locations as for cloud computing.
  • the system can also be realised in a specialised processing chip. Many realizations of the system are possible.
  • the storage means 1 stores the individual-level data and the reference data.
  • the storage means 1 can comprise a first storage section for storing the individual-level data.
  • the storage means 1 can comprise a second storage section for storing the reference data.
  • the storage means 1 can comprise a third storage section for storing a computer program with instructions which perform the below described methods, when executed on the processing means 2.
  • the first, second and/or third storage section can be arranged in separate storage devices forming together the storage means or (as logical sections) in the same storage device.
  • the storage means 1 can comprise one, two or more storages devices.
  • the storage devices can be located in the same location or in remote locations.
  • the storage means 1 can be (completely or in part) in the same location as the processing means 2 or in a remote location to the processing means 2.
  • the processing means 2 is configured to execute the method described below. Any kind of processing means 2 which allows to execute the below described method can be used.
  • the processing means 2 can be for example at least one processor.
  • the processing means 2 can comprise one or more processors.
  • the system comprises further an interface for outputting the result of the below described method.
  • the interface could be a display, a socket for a display, a communication interface like a network or peripheral interface/socket.
  • Fig. 2 shows an embodiment of a method for an analysis of the individual-level data. The method is performed on the persons of a defined panel. The panel can be the same for different analyses or can be defined each time depending on the analysis. The method can be performed on a defined time window in the individual-level data. However, it is also possible that the method is applied without considering a time window or using the complete individual-level data (with respect to time).
  • In a second step S2, the behavior of each person of the panel is analyzed on the basis of the individual-level data.
  • This analysis results in an analysis result for each person.
  • the analysis is preferably a behavioral analysis.
  • the records of each person of the panel are analyzed for certain criteria to obtain an analysis result for each person. Such a criterion could be, for example, "has the individual visited football related content on media websites?" to answer the question of whether a person is a football fan.
  • the analysis result would be a binary variable meaning for each person either "yes, this person is a football fan" or "no, this person is not a football fan".
  • In step S3, the analysis result of the panel is reweighted for representativeness. This is done by weighting the analysis result of each person with the weight of representativeness of this person and combining the weighted results of each person to obtain an analysis result of the population, preferably a behavioural result of the population. If the same panel and the same population are used for different analyses, step S1 must be performed only once. The same weights of representativeness can be used in step S3 for different analyses so that the step S1 does not need to be repeated for each analysis. Just when a new panel or a new time period is selected, the new weights of
  • Fig. 3 shows an embodiment for determining the weights of representativeness based on the individual-level data.
  • In step S11, the method is initialized. This could comprise: 1) the selection of some parameters such as the at least one common
  • the initialization step S11 can also be omitted, if those parameters do not change.
  • the method is iterative such that the steps S12 to S18 are performed at least twice and/or are performed until a certain stop criterion is fulfilled.
  • at least one of the method steps of the r-th iteration is based on the weight of representativeness $\omega_i^{(r-1)}$ of the previous iteration r-1.
  • a set of search queries is defined.
  • J search queries are selected.
  • the number J is equal for each iteration, preferably as selected in the initialization step. However, it is also possible to select the number J different in each iteration r.
  • the J search queries of the individual-level data are preferably selected randomly.
  • the J search queries can also be selected by other criteria, e.g. their occurrence (e.g. the J most used search queries in the individual-level data or in the population) or their order (e.g. the J first search queries).
  • the best results are achieved, if the J (distinct) search queries are selected randomly.
  • the J search queries are distinct to each other.
  • a set of $J$ distinct search queries $\{K_j^{(r)}\}_{j=1,\dots,J}$ is selected (in the r-th iteration) from the individual-level data as described above.
  • In step S13, the occurrences of the J search queries are retrieved.
  • the occurrence $x_{ij}^{(r)}$, i.e. how often each user $i$ has searched each search query $K_j^{(r)}$, is retrieved from the individual-level data.
  • $x_{ij}^{(r)}$ is preferably the number of times that each user has searched each of the search queries $K_j^{(r)}$.
  • $x_{ij}^{(r)}$ can be already stored in the individual-level data and just be read, or the individual-level data can be processed to determine the occurrence $x_{ij}^{(r)}$.
  • In step S14, a relative weight of each search query in the panel is determined.
  • the relative weight $R_j^{(r)}$ for the search query j is determined based on the ratio of the combination of the occurrences $x_j^{(r)}$ of the search
  • the relative weight $R_j^{(r)}$ for the search query j is determined based on the ratio of the sample volume of the search query j and the population volume of the search query j.
  • the sample volume and/or the combination of the occurrences $x_j^{(r)}$ of the search query j is determined based on or is equal to the sum $\sum_i x_{ij}^{(r)}$ over the persons i of the panel of the occurrences of the search query j of the person i.
  • an intermediate weight of representativeness is determined for each person of the panel.
  • the intermediate weight of representativeness for the person i is determined based on the weighted sum over all J search queries j of the multiplication of the occurrence $x_{ij}^{(r)}$ of the search query j for this person i with the relative weight $R_j^{(r)}$ of the search query j of the defined set.
  • the intermediate weight of representativeness for the person i is determined based on a ratio of a sum of an offset and of the sum over all search queries of the defined set of search queries of the weight of each search query of the defined set of search queries divided by a sum of the offset and of the weighted sum.
  • the offset is preferably selected as 1, but can be another number.
  • the offset is preferably non-zero. The non-zero offset is added to the
  • the intermediate weight of representativeness for the person i is determined based on the inverse of this ratio
  • the intermediate weight of representativeness corresponds to the final weight of representativeness. Otherwise, in step S16, the weight of representativeness of each person i is calculated based on the intermediate weight of representativeness of the respective person i combined, preferably multiplied, with the weight of representativeness of this person i of the previous iteration r-1. If the weight of representativeness is calculated iteratively, it is checked in step S17 whether a stopping condition is fulfilled. If the stopping condition is fulfilled, the method ends in step S18; otherwise the steps S12 to S17 are repeated as described above. Many stopping conditions are possible.
  • a preferred stopping condition is that the total difference between the weights of representativeness at the end of the iteration r and their corresponding values at the end of the previous iteration r-1 falls below a set threshold value for n successive iterations. n is preferably at least two.
  • other stopping conditions are also possible.
  • e-commerce shopping event shall be an event indicating an e-commerce shopping activity of a person or of persons.
  • the e-commerce shopping event (of a person) comprises search queries and/or acquisitions (of this person) on one or multiple e-commerce shopping website(s)/platform(s).
  • the search queries and/or acquisitions of a person on e-commerce shopping websites like Amazon (registered trademark), eBay (registered trademark) or any other e-commerce shop could be considered to be an e-commerce shopping activity.
  • the e-commerce shopping event could comprise a certain product (identified for example by an electronic product code or any other identifier of the product) or a product category.
  • the individual-level data comprise at least the e-commerce shopping events performed by the persons (of the panel).
  • the individual-level data comprise for each person the e-commerce shopping events performed (maybe instead of the search queries).
  • the individual-level data comprise for each e-commerce shopping event the e-commerce shopping event itself and the identifier identifying (maybe anonymously) the person performing the event.
  • reference data shall comprise any data which indicate the occurrence of different e-commerce shopping events in the population (maybe instead of the occurrences of the search queries in the population).
  • occurrence of an e-commerce shopping event can be any information which indicates the occurrence of a defined e-commerce shopping event, e.g. the search and/or acquisition of a certain product or of a product from a product category. It could be the absolute number of times the e-commerce shopping event was performed or a frequency of the e-commerce shopping event.
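
The weighted-sum and offset ratio described in the list above, together with the reweighting of step S3, can be sketched for a single pass as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function names, the toy numbers, and the use of a normalized weighted mean to combine the weighted results in step S3 are all hypothetical choices, since the text does not fix how the weighted results are combined.

```python
def relative_weights(panel_occ, reference_occ, queries):
    """R_j = panel volume of query j / population volume of query j."""
    return {
        q: sum(person.get(q, 0) for person in panel_occ.values()) / reference_occ[q]
        for q in queries
    }


def intermediate_weights(panel_occ, rel, offset=1.0):
    """w_i = (offset + sum_j R_j) / (offset + sum_j x_ij * R_j)."""
    total = sum(rel.values())
    return {
        pid: (offset + total)
        / (offset + sum(occ.get(q, 0) * r for q, r in rel.items()))
        for pid, occ in panel_occ.items()
    }


def reweighted_share(weights, results):
    """Combine per-person results as a normalized weighted mean (an assumption)."""
    return sum(weights[p] * results[p] for p in weights) / sum(weights.values())


# Hypothetical toy data: occurrences of each search query per person of the panel.
panel_occ = {
    "p1": {"football": 4, "weather": 1},
    "p2": {"weather": 1},
    "p3": {"weather": 2},
}
# Hypothetical reference data: occurrences of the same queries in the population.
reference_occ = {"football": 10, "weather": 40}

rel = relative_weights(panel_occ, reference_occ, ["football", "weather"])
weights = intermediate_weights(panel_occ, rel)

# Binary analysis result per person ("is this person a football fan?").
is_football_fan = {"p1": 1, "p2": 0, "p3": 0}
share = reweighted_share(weights, is_football_fan)
```

In this toy panel, "football" is over-represented relative to the reference data, so the person who searched it most receives the smallest weight and the reweighted share of football fans falls below the unweighted share of one third.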

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Physics & Mathematics (AREA)
  • Game Theory and Decision Science (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Computerized method for determining weights of representativeness in individual-level data, wherein the individual-level data comprise for a panel of persons search queries performed by each person of the panel, wherein the panel of persons of the individual-level data is a subset of a predetermined population of persons, wherein reference data comprises the occurrences of the search queries performed by the persons of the population, the method comprising the step of: determining, in a processing means, for each person of the panel the weight of representativeness of this person based on the occurrences of the search queries for this person of a defined set of search queries of the individual-level data and based on the occurrence of the search queries of the same set of search queries in the reference data.
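
As an illustration of the inputs named in the abstract, the following sketch shows one possible way to hold individual-level data as (person identifier, keyword) records and to derive per-person occurrences from them. The record layout, the function name `count_occurrences`, and all values are assumptions made for this example only.

```python
from collections import Counter, defaultdict


def count_occurrences(records):
    """Return {person_id: Counter({keyword: count})} from (person_id, keyword) records."""
    occurrences = defaultdict(Counter)
    for person_id, keyword in records:
        occurrences[person_id][keyword] += 1
    return dict(occurrences)


# Hypothetical individual-level data: one record per search query performed.
panel_records = [
    ("p1", "weather"),
    ("p1", "football"),
    ("p1", "football"),
    ("p2", "weather"),
    ("p3", "recipes"),
]

# Hypothetical reference data: occurrence of each search query in the population.
reference_occurrences = {"weather": 1000, "football": 400, "recipes": 600}

panel_occurrences = count_occurrences(panel_records)
```

A `Counter` returns zero for keywords a person never searched, which is convenient when the same defined set of search queries is looked up for every person of the panel.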

Description

Method, system and computer program for determining weights of representativeness in individual-level data
Field of the invention
[0001] The present invention concerns a method, system and computer program for determining weights of representativeness in individual-level data and for performing behavioral analysis in individual-level data.
Description of related art
[0002] Surveys try to estimate the opinion, needs, wants, and habits of a population on a certain question. This question could be the outcome of an election, a market study, etc. The quality of the result of the survey depends largely on the representativeness of the panel (also called sample) which is a subset of the population. The same problem arises if individual-level data of a panel are analysed for a question which should be answered at the population level. The ability to interpret the results of any sample-based research as relevant at the population level relies on the degree to which the sample is representative of the population. Knowledge about an unrepresentative sample cannot be generalized to the population and is of limited value to anyone who needs to base decisions on data.
[0003] Reweighting for representativeness is common in survey research. The standard approach involves assigning to each panel participant a weight to compensate for the fact that the participant's observable characteristics (e.g. socio-demographic characteristics) are over- or under-represented in the panel as compared to the population. In survey research, the need for reweighting arises when the characteristics of the sample of respondents diverge from those of the population, because of factors like lower or higher response rates among particular groups of people. The weights applied to the respondents to a survey are normally determined using their demographic attributes such as age, gender, and geographical location. This assumes that if the data is adjusted (through reweighting) in such a way that each demographic group is represented in proportion to its size in the general population, measurements on the behaviour of the sample will accurately represent the behaviour of the population.
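
The demographic reweighting described in this paragraph can be illustrated with a small sketch in which each participant's weight is the population share of the participant's demographic group divided by that group's share in the panel. The function name, the group labels and the shares are hypothetical; the text does not prescribe a particular formula for the standard approach.

```python
from collections import Counter


def demographic_weights(panel_groups, population_shares):
    """Weight = population share of the person's group / panel share of that group."""
    counts = Counter(panel_groups.values())
    panel_size = len(panel_groups)
    return {
        person: population_shares[group] / (counts[group] / panel_size)
        for person, group in panel_groups.items()
    }


# Hypothetical panel in which the younger age group is over-represented.
panel_groups = {"p1": "18-34", "p2": "18-34", "p3": "18-34", "p4": "35+"}
# Hypothetical shares of the same groups in the population.
population_shares = {"18-34": 0.5, "35+": 0.5}

weights = demographic_weights(panel_groups, population_shares)
```

Members of the over-represented group receive a weight below one and the single member of the under-represented group a weight above one, so that each group counts in proportion to its size in the population.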
[0004] This approach has the obvious limitation that differences in behaviour do not usually coincide exactly with demographic categories. Indeed, it is impossible to know whether a sample accurately represents the behaviour of the population by looking at demographic attributes alone. A sample might be 'representative' in terms of its age, gender, geography, or even income distribution while still over- or under-sampling groups with certain hard-to-measure attributes such as personality traits (extraversion, curiosity, conscientiousness, ...) or personal history (experience with computers, travel abroad, exposure to media, ...). Many survey designs attempt to take such factors into account, but can only do so to the extent that the factors to correct for have been anticipated by the researcher and can be measured. Therefore, this approach cannot yet be fully automatized and still needs to be designed by human interactions to define or correct the weights of the panel participants. This is also due to the fact that the demographic attributes and the non-demographic attributes are normally not available for all persons. It is thus problematic to use this method for the automatized analysis of individual-level data as for example in the behavioural analysis of individual-level data.
Brief summary of the invention
[0005] It is an object to find an improved method for determining weights for representativeness for participants of individual-level data. In particular, the method should be fully automatized and/or improve the accuracy of the estimates achieved by weighting for representativeness with the improved weights of representativeness.
[0006] It is a further object to fully automatize and/or improve the behavioural analysis of individual-level data.
[0007] This object is solved by the independent claims.
[0008] The weights of representativeness are determined based on the occurrences of the search queries of the persons of the panel and the occurrences of the search queries of a population which should be represented by the panel. This allows, first, a fully automatic determination of the weights of representativeness. Second, the representativeness of the results of the panel for the population is significantly improved, because the representativeness is determined on data representing the behaviour. This allows the representativeness of each individual to be determined significantly better than with demographic attributes. Third, this also allows the analysis of data for which demographic attributes of the persons of the panel are missing or incomplete. Fourth, this method is particularly well suited for representing a population which is limited to the internet-using part of a general population.
[0009] The dependent claims refer to advantageous embodiments.
[0010] The ways of calculating the weights of representativeness defined in the dependent claims are particularly well suited to improve the representativeness of the panel.
[0011] An alternative embodiment of the invention is a computerized method for determining weights of representativeness in individual-level data, wherein the individual-level data comprise for a panel of persons e-commerce shopping events performed by each person of the panel, wherein the panel of persons of the individual-level data is a subset of a predetermined population of persons, wherein reference data comprises the occurrences of the e-commerce shopping events performed by the persons of the population, the method comprising the step of determining, in a processing means, for each person of the panel the weight of representativeness of this person based on the occurrences of the e-commerce shopping events for this person of a defined set of e-commerce shopping events of the individual-level data and based on the occurrence of the e-commerce shopping events of the same set of e-commerce shopping events in the reference data.
[0012] In one embodiment, the weight of representativeness of each person is further based on the occurrences of the e-commerce shopping events for all persons of the panel of the defined set of e-commerce shopping events of the individual-level data. Preferably, the weights of representativeness are determined iteratively. Preferably, the weight of representativeness of each person in a second or higher iteration is further based on the weight of representativeness of this person from the previous iteration. Preferably, the weight of representativeness of each person in a second or higher iteration is further based on the weight of representativeness of all persons of the panel from the previous iteration. Preferably, the following steps are performed in the processing means in each iteration: selecting the defined set of e-commerce shopping events in the individual-level data for this iteration; determining for each person of the panel an intermediate weight of representativeness of this person based on the occurrences of the e-commerce shopping events for this person, preferably for all persons of the panel of the set of e-commerce shopping events selected in this iteration and based on the occurrences of the e-commerce shopping events of the same set of e-commerce shopping events in the reference data, and determining for each person of the panel the weight of representativeness of this person based on the intermediate weight of representativeness of this person. Preferably, for the second or higher iteration, the weight of representativeness of each person of the panel is based on the intermediate weight of representativeness of this person of this iteration and the weight of representativeness of this person from the previous iteration.
[0013] In one embodiment, the weight of representativeness of each person or the intermediate weight of representativeness of each person is determined, in the processing means, based on a weighted sum over all e-commerce shopping events of the defined set of e-commerce shopping events of the multiplication of the occurrence of each e-commerce shopping event of the defined set of e-commerce shopping events for this person with a weight of each e-commerce shopping event of the defined set of e-commerce shopping events. Preferably, the weight of representativeness of each person or the intermediate weight of representativeness of each person is further determined, in the processing means, based on a sum over all e-commerce shopping events of the defined set of e-commerce shopping events of the weight of each e-commerce shopping event of the defined set of e-commerce shopping events. Preferably, the weight of representativeness of each person or the intermediate weight of representativeness of each person is determined, in the processing means, based on a ratio of a sum of an offset and of the sum over all e-commerce shopping events of the defined set of e-commerce shopping events of the weight of each e-commerce shopping event of the defined set of e-commerce shopping events divided by a sum of the offset and of the weighted sum. Preferably, the weight of each e-commerce shopping event is determined based on the ratio of the cumulative occurrence of this e-commerce shopping event of all persons of the panel divided by the occurrence of this e-commerce shopping event in the population.
[0014] In one embodiment, the cumulative occurrence of this e-commerce shopping event of all persons of the panel is based on the sum over all persons of the panel of the multiplication of the occurrence of this e-commerce shopping event of each person of the panel with the weight of representativeness of the respective person from the previous iteration.
Brief Description of the Drawings
[0015] The invention will be better understood with the aid of the description of an embodiment given by way of example and illustrated by the figures, in which:
Fig. 1 shows a view of a schematic embodiment of a system according to the invention.
Fig. 2 shows a view of a schematic embodiment of a method for behavioural analysis in individual-level data according to the invention.
Fig. 3 shows a view of a schematic embodiment of a method for determining weights of representativeness in individual-level data according to the invention.
Detailed Description of possible embodiments of the Invention
[0016] The term "person" is used herein to define any distinguishable individual. The distinguishable individual is normally automatically detected, e.g. by device tracking, etc. such that the person can also refer to an identified device. This automatic detection of the persons might not always be correct, such that two persons might in fact refer to the same human. The term person shall not be interpreted as the human behind the person, but as the distinguishable individual defined in the individual-level data (see definition below) deemed to be a different human than the other persons. Preferably, the person is distinguishable in the individual-level data by an identifier. This identifier could be simply a code, a number, name, etc. Preferably, the identifier is anonymous such that the person cannot be identified.
[0017] The term "population" defines a set of persons with at least one common demographic attribute. This at least one common demographic attribute is preferably the geographical region, e.g. the country. The population could be all persons of a country region comprising at least one country, e.g. Switzerland, European Union, etc. It is also possible that the at least one common demographic attribute comprises (in addition) the age. The population could thus comprise all persons of a certain age group (in a certain geographical region). Due to the later defined method, the population is restricted to the internet using persons of the at least one common demographic attribute. For example, the population comprises all internet-using persons of a certain country region.
[0018] The term "search query" refers to any identifier defining a search in the internet by a person in a search engine. The search query is preferably a keyword. A keyword can comprise one or more words. A word can be any concatenation of characters like letters, numbers or other signs defined by character encodings like American Standard Code for Information Interchange (ASCII) or Unicode transformation format (UTF). However, the search query could refer also to complex search query identifiers like images, sounds, locations, etc.
[0019] The term "individual-level data" refers to any data which are associated to different persons, in particular to the identifiers of the persons. The individual-level data comprise thus for each person (or its identifier) data associated to the person. This data can be empty for some of those persons, for example if the person was not active in a relevant period. The individual-level data comprise at least the search queries performed by the persons. Thus, the individual-level data comprise for each person the search queries performed. Preferably, the individual-level data comprise for each search query the keyword searched (the term entered into the search engine by the user) and the identifier identifying (maybe anonymously) the person performing the search. The search queries performed by each person could be stored in different ways. The individual-level data could comprise directly the occurrence of each search query performed by each user. Alternatively or in addition, the individual-level data could comprise the search queries performed by each user over the time (with or without a time stamp). This allows the occurrence to be determined from the individual-level data. In other words, the occurrence of the search queries performed by each person might be stored directly or indirectly in the individual-level data. In the latter case, the occurrence of the search queries can be retrieved from the individual-level data. The individual-level data comprise preferably further data associated to each person, e.g. location data, visited web pages, etc. The individual-level data can be already pre-processed for the below described methods. The pre-processing could comprise an anonymization and/or a categorization. The categorization could for example group the activity of each person (e.g. search queries) in different time slots (hours, days, weeks, months, years) or in only one time slot to be analysed. The individual-level data can however also be in a rather raw format such that the below method could comprise the steps of anonymization or categorization or of retrieving the information necessary for the below mentioned methods. A person of the individual-level data indicates a person in relation to whom data, in particular performed search queries, are stored in the individual-level data. The individual-level data can also be device-level data. In this case, each person corresponds to a device and/or each device corresponds to a person. The individual-level data preferably comprise records of actions of the persons. The actions can be search queries or other actions like visiting web-sites, opening applications, visiting locations, etc. Many records of actions about one person constitute the data about the behaviour of the person.
[0020] The term "reference data" shall comprise any data which indicate the occurrence of different search queries in the population. This can be directly the list of search queries with their respective occurrence. However, it is also possible that the reference data contains only indirectly the information of the occurrence of different search queries in the population and that this information must be retrieved from the reference data.
[0021] The term "panel" refers to a defined set of persons of the individual-level data which is a (strict) subset of the population. The panel comprises preferably all persons which fulfill the at least one common demographic attribute of the population. It is however possible to define the set of persons of the panel smaller, e.g. because some persons were not active during the complete time window of analysis or for other reasons.
[0022] The term "occurrence of a search query" can be any information which indicates the occurrence of a search query. It could be the absolute number of times the search query was performed (by a person, a panel or a population). In this case, it is important that the absolute number is always taken for the same period of time, preferably but not necessarily the same period of time as used for the behavioural analysis described later. It is however also possible that the occurrence of a search query is indicated as a frequency, i.e. normalized by a time period. This has the advantage that data with different time periods can be used. This has however also the disadvantage that different time periods might distort the results.
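The two ways of expressing an occurrence can be illustrated with a minimal sketch that derives absolute counts and per-day frequencies from a raw query log. The record layout (person identifier, keyword), the function names and the sample values are assumptions made for this illustration only and are not taken from the description.

```python
from collections import Counter, defaultdict


def count_occurrences(records):
    """Absolute counts: {person_id: Counter({keyword: number of searches})}.

    `records` is an iterable of (person_id, keyword) pairs (hypothetical layout).
    """
    counts = defaultdict(Counter)
    for person_id, keyword in records:
        counts[person_id][keyword] += 1
    return dict(counts)


def to_frequency(counts, period_days):
    """Normalise absolute counts by the length of the observation period in days."""
    return {
        person: {keyword: n / period_days for keyword, n in per_person.items()}
        for person, per_person in counts.items()
    }


# Hypothetical log covering a two-day window.
log = [("p1", "football"), ("p1", "football"), ("p2", "weather"), ("p1", "weather")]
occurrences = count_occurrences(log)
frequencies = to_frequency(occurrences, period_days=2)
```

Absolute counts are only comparable when every source covers the same window, whereas the frequency form can mix windows of different length at the cost noted above.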
[0023] The terms like time window, period of time, time slot, etc. are used interchangeably.
[0024] Fig. 1 shows an embodiment of a system for performing the below described methods. The system comprises a storage means 1 and a processing means 2. The system can be a computer. The system can also comprise two or more interconnected computers. The interconnected computers could be located in the same location as in a data center or in remote locations as for cloud computing. The system can also be realised in a specialised processing chip. Many realizations of the system are possible.
[0025] The storage means 1 stores the individual-level data and the reference data. The storage means 1 can comprise a first storage section for storing the individual-level data. The storage means 1 can comprise a second storage section for storing the reference data. The storage means 1 can comprise a third storage section for storing a computer program with instructions which perform the below described methods, when executed on the processing means 2. The first, second and/or third storage section can be arranged in separate storage devices forming together the storage means or (as logical sections) in the same storage device. The storage means 1 can comprise one, two or more storages devices. The storage devices can be located in the same location or in remote locations. The storage means 1 can be (completely or in part) in the same location as the processing means 2 or in a remote location to the processing means 2.
[0026] The processing means 2 is configured to execute the method described below. Any kind of processing means 2 which allows the below described method to be executed can be used. The processing means 2 can be for example at least one processor. The processing means 2 can comprise one or more processors.
[0027] Preferably, the system further comprises an interface for outputting the result of the below described method. The interface could be a display, a socket for a display, or a communication interface like a network or peripheral interface/socket.
[0028] Fig. 2 shows an embodiment of a method for an analysis of the individual-level data. The method is performed on the persons of a defined panel. The panel can be the same for different analyses or can be defined each time in dependence of the analysis. The method can be performed on a defined time window in the individual-level data. However, it is also possible that the method is applied without considering a time window or using the complete individual-level data (in respect to the time).
[0029] In a first step S1, the weights of representativeness are determined for the panel according to the invention based on the individual-level data, in particular based on the search queries of the panel in the individual-level data. The details of the step are described below with the help of Fig. 3.
[0030] In a second step S2, the behavior of each person of the panel is analyzed on the basis of the individual-level data. This analysis results in an analysis result for each person. The analysis is preferably a behavioral analysis. Preferably, the records of each person of the panel are analyzed for certain criteria to obtain an analysis result for each person. Such a criterion could be for example, "has the individual visited football related content on media websites?" to answer the question of whether a person is a football fan. The analysis result would be a binary variable meaning for each person either "yes, this person is a football fan" or "no, this person is no football fan".
[0031] In a third step S3, the analysis result of the panel is reweighted for representativeness. This is done by weighting the analysis result of each person with the weight of representativeness of this person and combining the weighted results of each person to obtain an analysis result of the population, preferably a behavioural result of the population.
[0032] If the same panel and the same population are used for different analyses, step S1 must be performed only once. The same weights of representativeness can be used in step S3 for different analyses so that the step S1 does not need to be repeated for each analysis. Only when a new panel or a new time period is selected do the weights of representativeness need to be determined again in step S1.
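The reweighting of step S3 can be sketched for the binary football-fan example as a weighted share. The description only states that the weighted results are combined; dividing by the sum of the weights is an assumption of this sketch, and the function name and sample values are hypothetical.

```python
def population_estimate(results, weights):
    """Weighted share of the panel: sum(w_i * y_i) / sum(w_i).

    results: {person_id: analysis result, e.g. 1 for "fan" and 0 for "no fan"}
    weights: {person_id: weight of representativeness from step S1}
    Normalising by the weight total is an assumption, not taken from the text.
    """
    total_weight = sum(weights[person] for person in results)
    if total_weight == 0:
        raise ValueError("weights of representativeness sum to zero")
    weighted = sum(weights[person] * results[person] for person in results)
    return weighted / total_weight


# Hypothetical panel of three persons.
results = {"p1": 1, "p2": 0, "p3": 1}
weights = {"p1": 0.5, "p2": 2.0, "p3": 1.5}
share = population_estimate(results, weights)
```

Because the weights enter only in this last step, the same weights can be reused for any number of analyses on the same panel and period.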
[0033] Fig. 3 shows an embodiment for determining the weights of representativeness based on the individual-level data.
[0034] In a step S11, the method is initialized. This could comprise: 1) the selection of some parameters such as the at least one common demographic attribute for defining the population, 2) the identification of the persons in the individual-level data corresponding to the population or with the at least one common demographic attribute for the panel, 3) the number of persons $I$ included in the panel (preferably all persons of the individual-level data belonging to the population) and/or 4) the number of search queries $J$ to be considered in each iteration. The number of search queries $J$ is adjusted based on tests of algorithm performance and is preferably larger than 50, preferably larger than 100, preferably larger than 200, preferably larger than 500, preferably larger than 1000. The persons of the panel are labelled $i = 1, \dots, I$. This labelling is just an arbitrary identifier used for distinguishing the persons. This labelling shall not be limitative for the invention. The initialization step S11 can also be omitted, if those parameters do not change.
[0035] In a preferred embodiment, the method is iterative such that the steps S12 to S18 are performed at least twice and/or are performed until a certain stop criterion is fulfilled. In this case, at least one of the method steps of the r-th iteration is based on the weight of representativeness $\omega_i^{(r-1)}$ of the previous iteration r-1. In this case, the weight of representativeness $\omega_i^{(0)}$ of the zeroth iteration used for the first iteration r=1 is set for all persons $i = 1, \dots, I$ of the panel to the same initial value, which is preferably one: $\omega_i^{(0)} = 1$ for all persons $i = 1, \dots, I$. However, it is also possible to determine the weight of representativeness in one run, i.e. not iteratively, such that the step S17 is not necessary and the steps S12 to S15 are performed just once.
[0036] In a step S12, a set of search queries is defined. Preferably, J search queries are selected. Preferably, the number J is equal for each iteration, preferably as selected in the initialization step. However, it is also possible to select the number J different in each iteration r. The J search queries of the individual-level data are preferably selected randomly.
However, it is also possible to select the J search queries by other criteria, e.g. their occurrence (e.g. the J most used search queries in the individual-level data or in the population) or their order (e.g. the J first search queries). However, the best results are achieved, if the J (distinct) search queries are selected randomly. Preferably, the J search queries are distinct from each other. A set of $J$ distinct search queries $k_1^{(r)}, \dots, k_J^{(r)}$ is selected (in the r-th iteration) from the individual-level data as described above.
[0037] In step S13, the occurrences of the J search queries are retrieved. The occurrence $x_{ij}^{(r)}$ of the search query $k_j^{(r)}$ for each user $i$ and for each of the search queries $k_1^{(r)}, \dots, k_J^{(r)}$ is retrieved from the individual-level data. The occurrence $x_{ij}^{(r)}$ is preferably the number of times that each user $i$ has searched each of the search queries $k_j^{(r)}$. To retrieve the occurrence $x_{ij}^{(r)}$ from the individual-level data means that the occurrence $x_{ij}^{(r)}$ can be already stored in the individual-level data and just be read or the individual-level data can be processed to determine the occurrence $x_{ij}^{(r)}$.
Further, the occurrence $N_j$ of each search query j in the population is retrieved from the reference data.
[0038] In step S14, a relative weight of each search query in the panel is determined. The relative weight $R_j^{(r)}$ for the search query j is determined based on the combination of the occurrences $x_{ij}^{(r)}$ of the search query j for all users $i = 1, \dots, I$ and on the occurrence $N_j$ of the search query j in the population (retrieved from the reference data). Preferably, the relative weight $R_j^{(r)}$ for the search query j is determined based on the ratio $R_j^{(r)} = x_j^{(r)} / N_j$ of the combination $x_j^{(r)}$ of the occurrences of the search query j for all users $i = 1, \dots, I$ and of the occurrence $N_j$ of the search query j in the population. In other words, the relative weight $R_j^{(r)}$ for the search query j is determined based on the ratio of the sample volume of the search query j and the population volume of the search query j. The sample volume and/or the combination $x_j^{(r)}$ of the occurrences of the search query j is determined based on or is equal to the sum
$$x_j^{(r)} = \sum_{i=1}^{I} x_{ij}^{(r)}$$
over the persons i of the panel of the occurrences of the search query j of the person i. Preferably, the combination $x_j^{(r)}$ of the occurrences of the search query j for all users $i = 1, \dots, I$ and/or the sum over the persons i of the panel of the occurrences of the search query j of the person i is based on the corrected occurrences of the search query j of the i-th person, $\omega_i^{(r-1)} x_{ij}^{(r)}$, each of which is obtained by a combination, preferably the multiplication, of the occurrence $x_{ij}^{(r)}$ of the j-th search query of the i-th user and the i-th user's weight of representativeness $\omega_i^{(r-1)}$ of the last iteration r-1. However, it is also possible to determine the relative weight on the basis of the non-corrected occurrences.
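The relative weight of step S14 can be sketched as the ratio of the (optionally weight-corrected) sample volume to the population volume of each search query. The function name, the list-of-lists layout and the sample numbers are assumptions made for this illustration; the sketch also assumes that every population occurrence is strictly positive.

```python
def relative_weights(occurrences, population_counts, previous_weights=None):
    """Relative weight of each search query j: sample volume / population volume.

    occurrences[i][j]   : occurrence of query j for person i (panel data)
    population_counts[j]: occurrence of query j in the population (reference data),
                          assumed to be strictly positive
    previous_weights[i] : weight of person i from the previous iteration; if omitted,
                          all weights are 1, i.e. the non-corrected occurrences are used
    """
    n_persons = len(occurrences)
    if previous_weights is None:
        previous_weights = [1.0] * n_persons
    weights = []
    for j, population_count in enumerate(population_counts):
        sample_volume = sum(
            previous_weights[i] * occurrences[i][j] for i in range(n_persons)
        )
        weights.append(sample_volume / population_count)
    return weights


# Hypothetical panel of two persons and two search queries.
occurrences = [[2, 0], [1, 3]]
population_counts = [30, 60]
uncorrected = relative_weights(occurrences, population_counts)
corrected = relative_weights(occurrences, population_counts, previous_weights=[2.0, 1.0])
```

A query that is over-represented in the panel relative to the population receives a larger relative weight, so persons who use it heavily are adjusted more strongly in the following step.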
[0039] In step S15, an intermediate weight of representativeness is determined for each person of the panel. Preferably, the intermediate weight of representativeness for the person i is determined based on the weighted sum over all J search queries j of the multiplication of the occurrence $x_{ij}^{(r)}$ of the search query j for this person i with the relative weight $R_j^{(r)}$ of the search query j of the defined set,
$$\sum_{j=1}^{J} R_j^{(r)} x_{ij}^{(r)},$$
and based on a sum over the J search queries of the weight $R_j^{(r)}$ of each search query j,
$$\sum_{j=1}^{J} R_j^{(r)}.$$
Preferably, the intermediate weight of representativeness for the person i is determined based on a ratio of a sum of an offset and of the sum over all search queries of the defined set of search queries of the weight of each search query of the defined set of search queries divided by a sum of the offset and of the weighted sum:
$$\frac{\text{offset} + \sum_{j=1}^{J} R_j^{(r)}}{\text{offset} + \sum_{j=1}^{J} R_j^{(r)} x_{ij}^{(r)}}.$$
The offset is preferably selected as 1, but can be another number. The offset is preferably non-zero. The non-zero offset is added to the numerator and denominator of the weighted average computation so as to ensure that the adjustment value is nonzero and definite. Preferably, the intermediate weight of representativeness $\tilde{\omega}_i^{(r)}$ for the person i is thus determined as the inverse of the offset-corrected weighted average of the occurrences:
$$\tilde{\omega}_i^{(r)} = \left( \frac{\text{offset} + \sum_{j=1}^{J} R_j^{(r)} x_{ij}^{(r)}}{\text{offset} + \sum_{j=1}^{J} R_j^{(r)}} \right)^{-1}.$$
Recall that for each individual, we have the occurrences $x_{ij}^{(r)}$, $j = 1, \dots, J$, recording how many times that individual has searched each of the current round's $J$ randomly-selected keywords. The adjustment to that individual's weight for the current round is the inverse of the average of these counts weighted by the relative weights (possibly corrected by the mentioned offset in the numerator and the denominator).
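The adjustment of step S15 can be sketched as the inverse of the offset-corrected weighted average of a person's counts. The function name and the sample numbers are assumptions made for this illustration; the default offset of 1 follows the preferred value stated in the description.

```python
def intermediate_weight(person_occurrences, relative_weights, offset=1.0):
    """Inverse of the offset-corrected weighted average of one person's counts.

    person_occurrences[j]: how often this person searched query j
    relative_weights[j]  : relative weight of query j (from step S14)
    offset               : added to numerator and denominator (preferably 1)
    """
    weighted_sum = sum(r * x for r, x in zip(relative_weights, person_occurrences))
    weight_sum = sum(relative_weights)
    return (offset + weight_sum) / (offset + weighted_sum)


# Hypothetical relative weights for two search queries.
rel = [0.1, 0.05]
heavy_user = intermediate_weight([2, 0], rel)     # searches an over-represented query
inactive_user = intermediate_weight([0, 0], rel)  # no searches in this round
```

With a non-zero offset the result stays finite and positive even for a person with no searches among the selected queries, which is the case the offset is meant to cover.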
[0040] If the weight of representativeness is calculated non-iteratively, the intermediate weight of representativeness corresponds to the final weight of representativeness. Otherwise, in step S16 the weight of representativeness of each person i is calculated based on the intermediate weight of representativeness of the respective person i combined, preferably multiplied, with the weight of representativeness of this person i of the previous iteration r-1:
$$\omega_i^{(r)} = \tilde{\omega}_i^{(r)} \cdot \omega_i^{(r-1)},$$
where $\tilde{\omega}_i^{(r)}$ denotes the intermediate weight of representativeness of the person i in the iteration r.
[0041] If the weight of representativeness is calculated iteratively, it is checked in step S17, if a stopping condition is fulfilled. If the stopping condition is fulfilled, the method ends in step S18, otherwise the steps S12 to S17 are repeated as described above. Many stopping conditions are possible. A preferred stopping condition is that the total difference between the weights of representativeness at the end of the iteration r and their corresponding values at the end of the previous iteration r-1 falls below a set threshold for n successive iterations:
$$\sum_{i=1}^{I} \left| \omega_i^{(r)} - \omega_i^{(r-1)} \right| < \text{threshold value}.$$
n is preferably at least two. However, it is also possible that n = 1. This stopping condition can also be combined with a maximum number of iterations: $r = r_{MAX}$. However, other stopping conditions are also possible.
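The iteration over steps S12 to S17 can be put together in a minimal sketch. This is an illustration under stated assumptions and not a definitive implementation: the dictionary layout, the function name `fit_weights`, all parameter names and default values, the use of absolute differences in the stopping test and the restriction of candidate queries to those present in the reference data are choices made for this sketch.

```python
import random


def fit_weights(panel, reference, n_queries, offset=1.0,
                threshold=1e-6, n_successive=2, max_iter=50, seed=0):
    """Iteratively estimate a weight of representativeness for each person.

    panel    : {person_id: {query: occurrence}} (individual-level data)
    reference: {query: occurrence in the population}, values assumed positive
    """
    rng = random.Random(seed)
    persons = sorted(panel)
    # Candidate queries: seen in the panel and covered by the reference data.
    candidates = sorted({q for counts in panel.values() for q in counts} & set(reference))
    weights = {p: 1.0 for p in persons}  # zeroth iteration: all weights equal one
    below = 0
    for _ in range(max_iter):
        # S12: random selection of distinct queries for this iteration.
        queries = rng.sample(candidates, min(n_queries, len(candidates)))
        # S13/S14: corrected sample volume divided by population volume.
        rel = {}
        for q in queries:
            volume = sum(weights[p] * panel[p].get(q, 0) for p in persons)
            rel[q] = volume / reference[q]
        rel_sum = sum(rel.values())
        new_weights = {}
        for p in persons:
            # S15: inverse of the offset-corrected weighted average.
            weighted_sum = sum(rel[q] * panel[p].get(q, 0) for q in queries)
            adjustment = (offset + rel_sum) / (offset + weighted_sum)
            # S16: combine with the weight of the previous iteration.
            new_weights[p] = weights[p] * adjustment
        # S17: total absolute change must stay below the threshold
        # for n_successive iterations in a row.
        change = sum(abs(new_weights[p] - weights[p]) for p in persons)
        weights = new_weights
        below = below + 1 if change < threshold else 0
        if below >= n_successive:
            break
    return weights


# Hypothetical data: person "a" searches "football" far more often than the
# population volume suggests, person "b" does not.
panel = {"a": {"football": 5, "weather": 1}, "b": {"weather": 1}}
reference = {"football": 100, "weather": 100}
weights = fit_weights(panel, reference, n_queries=2)
```

In this toy example the person whose searches are over-represented in the panel is weighted down and the other person is weighted up; the cap on the number of iterations guarantees termination even if the threshold is never reached.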
[0042] In an alternative embodiment, it is also possible to determine the weights of representativeness based on the occurrences of e-commerce shopping events for this person of a defined set of e-commerce shopping events of the individual-level data and based on the occurrence of the e-commerce shopping events of the same set of e-commerce shopping events in the reference data. The above-described applies analogously for this alternative embodiment, wherein the search queries above are replaced by e-commerce shopping events.
[0043] The term e-commerce shopping event shall be an event indicating an e-commerce shopping activity of a person or of persons. Preferably, the e-commerce shopping event (of a person) comprises search queries and/or acquisitions (of this person) on one or multiple e-commerce shopping website(s)/platform(s). For example, the search queries and/or acquisitions of a person on e-commerce shopping websites like amazon (registered trademark), ebay (registered trademark) or any other e-commerce shop could be considered to be an e-commerce shopping activity. The e-commerce shopping event could comprise a certain product (identified for example by an electronic product code or any other identifier of the product) or a product category.
[0044] In this alternative embodiment, the individual-level data comprise at least the e-commerce shopping events performed by the persons (of the panel). Thus, the individual-level data comprise for each person the e-commerce shopping events performed (maybe instead of the search queries). Preferably, the individual-level data comprise for each e-commerce shopping event the event itself and the identifier identifying (maybe anonymously) the person performing the event.
[0045] The term "reference data" shall comprise any data which indicate the occurrence of different e-commerce shopping events in the population (maybe instead of the occurrences of the search queries in the population).
[0046] The term "occurrence of an e-commerce shopping event" can be any information which indicates the occurrence of a defined e-commerce shopping event, e.g. the search and/or acquisition of a certain product or of a product from a product category. It could be the absolute number of times the e-commerce shopping event was performed or a frequency of the e-commerce shopping event.

Claims

1. Computerized method for determining weights of representativeness in individual-level data, wherein the individual-level data comprise for a panel of persons search queries performed by each person of the panel, wherein the panel of persons of the individual-level data is a subset of a predetermined population of persons, wherein reference data comprises the occurrences of the search queries performed by the persons of the population, the method comprising the step of:
determining, in a processing means (2), for each person of the panel the weight of representativeness of this person based on the occurrences of the search queries for this person of a defined set of search queries of the individual-level data and based on the occurrence of the search queries of the same set of search queries in the reference data.
2. Method according to claim 1, wherein the weight of representativeness of each person is further based on the occurrences of the search queries for all persons of the panel of the defined set of search queries of the individual-level data.
3. Method according to one of the previous claims, wherein the weights of representativeness are determined iteratively.
4. Method according to the previous claim, wherein the weight of representativeness of each person in a second or higher iteration is further based on the weight of representativeness of this person from the previous iteration.
5. Method according to the previous claim, wherein the weight of representativeness of each person in a second or higher iteration is further based on the weight of representativeness of all persons of the panel from the previous iteration.
6. Method according to one of claims 4 to 5, wherein the following steps are performed in the processing means (2) in each iteration:
selecting the defined set of search queries in the individual-level data for this iteration;
determining for each person of the panel an intermediate weight of representativeness of this person based on the occurrences of the search queries for this person, preferably for all persons of the panel of the set of search queries selected in this iteration and based on the occurrences of the search queries of the same set of search queries in the reference data, and
determining for each person of the panel the weight of representativeness of this person based on the intermediate weight of representativeness of this person.
7. Method according to the previous claim, wherein for the second or higher iteration, the weight of representativeness of each person of the panel is based on the intermediate weight of representativeness of this person of this iteration and the weight of representativeness of this person from the previous iteration.
8. Method according to one of the previous claims, wherein the weight of representativeness of each person or the intermediate weight of representativeness of each person is determined, in the processing means (2), based on a weighted sum over all search queries of the defined set of search queries of the multiplication of the occurrence of each search query of the defined set of search queries for this person with a weight of each search query of the defined set of search queries.
9. Method according to the previous claim, wherein the weight of representativeness of each person or the intermediate weight of representativeness of each person is further determined, in the processing means (2), based on a sum over all search queries of the defined set of search queries of the weight of each search query of the defined set of search queries.
10. Method according to claim 8 or 9, wherein the weight of representativeness of each person or the intermediate weight of representativeness of each person is determined, in the processing means (2), based on a ratio of a sum of an offset and of the sum over all search queries of the defined set of search queries of the weight of each search query of the defined set of search queries divided by a sum of the offset and of the weighted sum.
11. Method according to one of claims 8 to 10, wherein the weight of each search query is determined based on the ratio of the cumulative occurrence of this search query of all persons of the panel divided by the occurrence of this search query in the population.
12. Method according to claim 11 and one of claims 4 to 7, wherein the cumulative occurrence of this search query of all persons of the panel is based on the sum over all persons of the panel of the multiplication of the occurrence of this search query of each person of the panel with the weight of representativeness of the respective person from the previous iteration.
13. Computerized method for behavioural analysis in individual-level data comprising the following steps:
determining weights of representativeness of each person of a panel of the individual-level data for a population according to the method according to one of the previous claims;
analyzing the behavior of each person of the panel on the basis of the individual-level data to obtain a behavioral result for each person;
determining a behavioral result of the population based on the combination of the results of the analyzed behavior of each person of the panel weighted by the determined weight of representativeness of the respective person.
14. Computer program comprising a set of instructions configured to perform the steps of the method according to one of the previous claims, when executed on a processing means (2).
15. System comprising:
storage means (1) storing individual-level data and reference data, wherein the individual-level data comprise for a panel of persons search queries performed by each person of the panel, wherein the panel of persons of the individual-level data is a subset of a predetermined population of persons, wherein reference data comprises the occurrences of the search queries performed by the persons of the population; and processing means (2) configured to determine for each person of the panel the weight of representativeness of this person based on the occurrences of the search queries for this person of a defined set of search queries of the individual-level data and based on the occurrence of the search queries of the same set of search queries in the reference data.
16. System according to claim 15, wherein the processing means (2) is further configured to:
analyze the behavior of each person of the panel on the basis of the individual-level data to obtain a behavioral result for each person; and
determine a behavioral result of the population based on the combination of the results of the analyzed behavior of each person of the panel weighted by the determined weight of representativeness of the respective person.
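
As an illustration only, the computation recited in claims 8 to 13 can be sketched in Python. The sketch reads claim 10 as W_p = (c + sum_q w_q) / (c + sum_q o_pq * w_q), with c the offset, o_pq the occurrence of search query q for person p and w_q the weight of search query q, and reads claims 11 and 12 as w_q = (sum_p o_pq * W_p) / r_q, with r_q the occurrence of q in the reference data and W_p taken from the previous iteration. The function names, the dictionary layout, the initial weights of 1, the default offset, the fixed number of iterations and the use of a weighted mean as the "combination" of claim 13 are assumptions made for this example and are not taken from the claims.

```python
from typing import Dict

# Hypothetical data layout: person -> search query -> occurrence count.
Panel = Dict[str, Dict[str, float]]


def query_weights(panel: Panel, reference: Dict[str, float],
                  person_weights: Dict[str, float]) -> Dict[str, float]:
    """Claims 11-12: weighted cumulative panel occurrence of each query
    divided by its occurrence in the population (reference data).
    Assumes every reference occurrence is strictly positive."""
    weights = {}
    for query, ref_occurrence in reference.items():
        cumulative = sum(person_weights[person] * occ.get(query, 0.0)
                         for person, occ in panel.items())
        weights[query] = cumulative / ref_occurrence
    return weights


def person_weights_step(panel: Panel, q_weights: Dict[str, float],
                        offset: float) -> Dict[str, float]:
    """Claims 8-10: (offset + sum of query weights) divided by
    (offset + weighted sum of this person's query occurrences)."""
    total = sum(q_weights.values())
    result = {}
    for person, occ in panel.items():
        weighted_sum = sum(occ.get(query, 0.0) * w
                           for query, w in q_weights.items())
        result[person] = (offset + total) / (offset + weighted_sum)
    return result


def representativeness_weights(panel: Panel, reference: Dict[str, float],
                               offset: float = 1.0,
                               iterations: int = 10) -> Dict[str, float]:
    """Iterates the two steps above. Starting from weights of 1 and
    stopping after a fixed number of iterations are assumptions."""
    weights = {person: 1.0 for person in panel}
    for _ in range(iterations):
        q_weights = query_weights(panel, reference, weights)
        weights = person_weights_step(panel, q_weights, offset)
    return weights


def population_result(results: Dict[str, float],
                      weights: Dict[str, float]) -> float:
    """Claim 13: combines per-person behavioral results, weighted by the
    weights of representativeness. A weighted mean is one possible
    combination, chosen here as an assumption."""
    total = sum(weights[person] for person in results)
    return sum(results[person] * weights[person] for person in results) / total
```

A strictly positive offset keeps the denominator in `person_weights_step` non-zero for a person who performed none of the search queries of the defined set; with an offset of zero such a person would cause a division by zero in this sketch.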
PCT/IB2018/054587 2018-06-21 2018-06-21 Method, system and computer program for determining weights of representativeness in individual-level data WO2019243876A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/IB2018/054587 WO2019243876A1 (en) 2018-06-21 2018-06-21 Method, system and computer program for determining weights of representativeness in individual-level data

Publications (1)

Publication Number Publication Date
WO2019243876A1 true WO2019243876A1 (en) 2019-12-26

Family

ID=63145142

Country Status (1)

Country Link
WO (1) WO2019243876A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022056741A1 (en) * 2020-09-16 2022-03-24 Motorola Solutions, Inc. Device, system and method for modifying electronic workflows

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080256056A1 (en) * 2007-04-10 2008-10-16 Yahoo! Inc. System for building a data structure representing a network of users and advertisers
US20100257171A1 (en) * 2009-04-03 2010-10-07 Yahoo! Inc. Techniques for categorizing search queries
WO2011014905A1 (en) * 2009-08-04 2011-02-10 Zebra Research Pty Ltd Method for undertaking market research of a target population
US20150006547A1 (en) * 2013-06-28 2015-01-01 1World Online, Inc. Dynamic research panel
US20150201031A1 (en) * 2012-01-27 2015-07-16 Compete, Inc. Dynamic normalization of internet traffic
US20150363802A1 (en) * 2013-11-20 2015-12-17 Google Inc. Survey amplification using respondent characteristics

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18752622

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 11/02/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 18752622

Country of ref document: EP

Kind code of ref document: A1