CN110738396B - Feature extraction method, device and equipment for equipment - Google Patents

Info

Publication number
CN110738396B
Authority
CN
China
Prior art keywords
equipment
information
feature vector
generate
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910881085.1A
Other languages
Chinese (zh)
Other versions
CN110738396A (en)
Inventor
王骏
杨陆毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd filed Critical Advanced New Technologies Co Ltd
Priority to CN201910881085.1A priority Critical patent/CN110738396B/en
Publication of CN110738396A publication Critical patent/CN110738396A/en
Application granted granted Critical
Publication of CN110738396B publication Critical patent/CN110738396B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0635Risk analysis of enterprise or organisation activities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0207Discounts or incentives, e.g. coupons or rebates
    • G06Q30/0222During e-commerce, i.e. online transactions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0207Discounts or incentives, e.g. coupons or rebates
    • G06Q30/0225Avoiding frauds

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Game Theory and Decision Science (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A feature extraction method, apparatus, and device for a device are disclosed. In the scheme provided by the embodiments of this specification, device information and environment information that are not easily tampered with are collected and combined to form a feature vector of the device, and the service processing count associated with that feature vector is taken as its feature value. The feature value can then serve as the feature value of the training sample corresponding to the device, for model training and risk identification.

Description

Feature extraction method, device and equipment for equipment
Technical Field
Embodiments of the present disclosure relate to the field of information technologies, and in particular, to a method, an apparatus, and a device for extracting features of a device.
Background
E-commerce websites often run marketing campaigns to acquire new customers. Such campaigns typically identify user devices (e.g., mobile phones, tablets, etc.) so that the same device can receive rewards only a limited number of times.
However, fraud rings (the "black industry") often use tools to modify phone parameters, so that a small number of physical devices can pose as an unlimited number of devices and claim marketing rewards without limit. To counter this, the risk control system needs to perform risk control based on suitable device features.
Based on this, a feature extraction scheme for a device is required.
Disclosure of Invention
The embodiments of the present application aim to provide a scheme for device feature extraction in risk control.
In order to solve the technical problems, the embodiment of the application is realized as follows:
In one aspect, an embodiment of the present disclosure provides a feature extraction method for a device, including:
acquiring device information of a device and environment information of the device;
performing numerical conversion on the device information to generate corresponding device information values;
aggregating the device information values to generate a device identifier, wherein the device identifier is used to identify devices of the same class;
generating a feature vector containing the device identifier and the environment information, and determining the service processing count corresponding to the feature vector within the same time window;
and determining the service processing count corresponding to the feature vector as the feature value of the training sample corresponding to the device, so as to train a device risk identification model.
On the other hand, the embodiment of the specification also provides a device risk identification method based on a risk identification model, which comprises the following steps:
acquiring device information of a device to be detected and environment information of the device;
performing numerical conversion on the device information to generate corresponding device information values;
aggregating the device information values to generate a device identifier, wherein the device identifier is used to identify devices of the same class;
generating a feature vector containing the device identifier and the environment information, acquiring the service processing count corresponding to the feature vector within the same time window, and determining the service processing count as the feature value of the feature vector of the device to be detected;
and evaluating the risk degree of the device to be detected using the device risk identification model, based on the feature value of the feature vector of the device to be detected.
Correspondingly, the embodiments of this specification also provide a feature extraction apparatus for a device, which includes:
an acquisition module, which acquires device information of a device and environment information of the device;
a conversion module, which performs numerical conversion on the device information to generate corresponding device information values;
an aggregation module, which aggregates the device information values to generate a device identifier, wherein the device identifier is used to identify devices of the same class;
a generation module, which generates a feature vector containing the device identifier and the environment information, and determines the service processing count corresponding to the feature vector within the same time window;
and a feature value determining module, which determines the service processing count corresponding to the feature vector as the feature value of the training sample corresponding to the device, so as to train a device risk identification model.
Corresponding to the other aspect, an embodiment of the present disclosure further provides a device risk identification apparatus based on a device risk identification model, including:
an acquisition module, which acquires device information of a device to be detected and environment information of the device;
a conversion module, which performs numerical conversion on the device information to generate corresponding device information values;
an aggregation module, which aggregates the device information values to generate a device identifier, wherein the device identifier is used to identify devices of the same class;
a generation module, which generates a feature vector containing the device identifier and the environment information, acquires the service processing count corresponding to the feature vector within the same time window, and determines the service processing count as the feature value of the feature vector of the device to be detected;
and a risk identification module, which evaluates the risk degree of the device to be detected using the device risk identification model, based on the feature value of the feature vector of the device to be detected.
In the scheme provided by the embodiments of this specification, device information and environment information that are generally difficult to tamper with are collected and combined to form a feature vector of the device, and the service processing count is taken as the feature value of that feature vector. The feature value of the training sample can then be used for model training and risk identification, which improves the accuracy with which the risk identification model identifies devices, prevents the device-dimension information from being defeated through any single point, and improves the overall stability and accuracy of device risk identification.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the embodiments of the disclosure.
Further, not all of the effects described above need be achieved in any of the embodiments of the present specification.
Drawings
In order to more clearly illustrate the embodiments of the present description or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the embodiments of the present description, and other drawings may be obtained according to these drawings for a person having ordinary skill in the art.
Fig. 1 is a schematic flow chart of a feature extraction method for a device according to an embodiment of the present disclosure;
Fig. 2 is a schematic flow chart of a device risk identification method according to an embodiment of the present disclosure;
Fig. 3 is a schematic structural diagram of a feature extraction apparatus for a device according to an embodiment of the present disclosure;
Fig. 4 is a schematic structural diagram of a device risk identification apparatus according to an embodiment of the present disclosure;
Fig. 5 is a schematic structural diagram of a device for configuring the method of the embodiments of the present specification.
Detailed Description
In order for those skilled in the art to better understand the technical solutions in the embodiments of the present specification, the technical solutions in the embodiments of the present specification will be described in detail below with reference to the drawings in the embodiments of the present specification, and it is apparent that the described embodiments are only some embodiments of the present specification, not all embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification shall fall within the scope of protection.
When merchants run business promotions today, they often hold a large number of promotional campaigns, such as giving out red envelopes, cash, vouchers, and the like. In this process, a common way for the risk control system to identify a device is to use a device fingerprint. For example, the local MAC address, International Mobile Equipment Identity (IMEI), International Mobile Subscriber Identity (IMSI), baseband, version number, and so on are used together as a device fingerprint to locate a device.
With this approach, fraud rings often modify key parameters of a mobile phone using tampering tools, which changes the device fingerprint and destroys its uniqueness. By continually tampering with a small number of devices, a fraud ring can pose as an unlimited number of devices, bypass interception by the risk control system, and claim marketing rewards without limit, causing losses. The core problem is that the device features in the risk identification model are too one-dimensional and can easily be bypassed.
Based on this, the embodiments of this specification provide a feature extraction method for a device, for use in training a risk identification model. Fig. 1 is a schematic flow chart of the feature extraction method for a device according to an embodiment of the present disclosure; the flow specifically includes the following steps:
S101, acquiring device information of the device and environment information of the device.
As previously mentioned, the device may be a user terminal such as a cell phone, tablet, personal computer, or the like.
In this embodiment of the present disclosure, the device information may include strong device information such as the aforementioned IMEI, IMSI, baseband, and version number, and may also include weak device information such as the device brand, device model, processor frequency, ring volume, call volume, alarm volume, remaining battery power, remaining device memory capacity, or remaining device memory card capacity. In other words, the acquired device information is device information that is not easily or frequently modified by the user.
The environment information of the device may include the Internet Protocol (IP) address of the device, the Media Access Control (MAC) address of the device, or the real physical address of the device (e.g., latitude and longitude coordinates obtained by a positioning module of the device).
S103, performing numerical conversion on the equipment information to generate a corresponding equipment information numerical value.
Numerical conversion can be performed in a variety of ways. In particular, for invariable device information (here, "invariable" means that the information does not change naturally during use by a user), such as the device brand, device model, and processor frequency, a one-to-one mapping may be performed using a preset mapping table. Table 1 is a mapping table of device brands and device information values provided in the embodiments of this specification.
Device brand    Device information value
Apple           1
Huawei          2
Honor           3
Vivo            4
……              ……
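The table lookup for invariable device information could be sketched as follows. This is only an illustration: the dictionary name, the function name, and the fallback value for brands missing from the table are assumptions, since the specification does not say how unlisted brands are handled.

```python
# Hypothetical mapping table mirroring Table 1; only these four brands appear in the text.
BRAND_TO_VALUE = {"Apple": 1, "Huawei": 2, "Honor": 3, "Vivo": 4}

# Assumption: brands missing from the table map to 0 (not specified in the source).
UNKNOWN_VALUE = 0


def brand_to_value(brand: str) -> int:
    """Return the preset device information value for a device brand."""
    return BRAND_TO_VALUE.get(brand, UNKNOWN_VALUE)


print(brand_to_value("Huawei"))
```

The same pattern would apply to other invariable fields such as device model or processor frequency, each with its own preset table.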
For variable device information, such as ring volume, call volume, alarm volume, remaining battery power, remaining device memory capacity, or remaining device memory card capacity, the corresponding numerical conversion can be performed with a preset algorithm based on the current value of the device information.
For example, for the aforementioned variable device information, a percentage coefficient is obtained from the device; the percentage coefficient generally describes the remaining available proportion of that device information. The coefficient interval to which the percentage coefficient belongs is then determined, and the device information value corresponding to that interval is determined according to a preset interval-to-value correspondence.
For example, it is assumed that five intervals are divided from 0 to 100% equidistantly for the remaining proportion of the battery charge in advance, and sequentially correspond to the values 1 to 5. Assuming that the remaining battery power of one device is 50%, it can be known that the percentage coefficient 50% corresponds to the coefficient interval [0.4,0.6], and further it can be known that the corresponding device information value is 3.
By performing interval numerical mapping on the variable device information, the deviation caused by micro fluctuation of the device information can be reduced, and the stability of the sample characteristics can be improved.
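The interval mapping in the battery example can be sketched as below. The function name is hypothetical, and the treatment of interval boundaries (half-open intervals, with 100% placed in the last interval) is an assumption, because the text only gives the example of 50% falling in the third interval.

```python
def percentage_to_value(fraction: float, num_intervals: int = 5) -> int:
    """Map a remaining proportion in [0, 1] to an interval number 1..num_intervals."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    # Assumption: equal-width, half-open intervals; 1.0 is clamped into the last one.
    index = int(fraction * num_intervals)
    return min(index, num_intervals - 1) + 1


# 50% battery falls into the third of five equal intervals.
print(percentage_to_value(0.50))
# Small fluctuations (48% vs. 52%) land in the same interval.
print(percentage_to_value(0.48) == percentage_to_value(0.52))
```

Because nearby readings share an interval, a few percent of battery drain between two requests does not change the resulting device information value.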
S105, aggregating the equipment information values to generate equipment identifiers, wherein the equipment identifiers are used for identifying the same type of equipment.
The aggregation may consist of concatenating the device information values in a specified order and using the resulting character string as the device identifier. For example, the concatenated string "112141336" represents, in order, device brand 1, device model 1, processor frequency 2, ring volume 1, call volume 4, alarm volume 1, remaining battery power 3, remaining device memory capacity 3, and remaining device memory card capacity 6.
Alternatively, other operations may be performed on the different device information values; for example, the device information values of the invariable device information may be encoded separately, and the device information values of the variable device information may be further generalized and aggregated, to obtain the device identifier.
It will be readily appreciated that although the resulting device identifier already characterizes the device, in practice the devices of other users may often have the same or similar device information and therefore the same device identifier. In other words, the device identifier identifies a class of devices that have the same or similar device information, as with the aforementioned device identifier "112141336".
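The concatenation variant of the aggregation step can be sketched as follows. The field names and the constant `FIELD_ORDER` are hypothetical; only the ordering of the nine fields and the example string "112141336" come from the text.

```python
# Assumed field order, following the "112141336" example in the text.
FIELD_ORDER = [
    "brand", "model", "cpu_frequency",
    "ring_volume", "call_volume", "alarm_volume",
    "battery", "memory", "memory_card",
]


def build_device_identifier(values: dict) -> str:
    """Concatenate device information values in the specified order."""
    return "".join(str(values[name]) for name in FIELD_ORDER)


example = {
    "brand": 1, "model": 1, "cpu_frequency": 2,
    "ring_volume": 1, "call_volume": 4, "alarm_volume": 1,
    "battery": 3, "memory": 3, "memory_card": 6,
}
print(build_device_identifier(example))
```

Plain concatenation is unambiguous only while every value is a single digit; with larger values a separator or fixed-width encoding would be needed, which the specification leaves open.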
S107, generating a feature vector containing the equipment identifier and the environment information, and determining the service processing times corresponding to the feature vector under the same time window.
Once the device identifier has been generated, it can be combined with the environment information to obtain a feature vector that characterizes the device. Specifically, the Internet Protocol (IP) address of the device may be obtained to generate a first feature vector devicetag_ip_variable_category containing the device identifier and the IP address; or the Media Access Control (MAC) address of the device may be obtained to generate a second feature vector devicetag_Mac_variable_category containing the device identifier and the MAC address; or the real physical address of the device may be obtained to generate a third feature vector devicetag_lbs_variable_category containing the device identifier and the real physical address.
The three feature vectors may be used alone or together. That is, for a device, one or more of the three feature vectors may be included in the training samples corresponding to the device.
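One way to picture the three feature vectors is as small (kind, device identifier, environment value) records, built only for the environment information that is actually available. The class and function names below are hypothetical and are not taken from the specification.

```python
from typing import List, NamedTuple, Optional


class FeatureVector(NamedTuple):
    kind: str         # "ip", "mac", or "lbs" (real physical address)
    device_id: str    # device identifier from the aggregation step
    environment: str  # the corresponding environment value


def build_feature_vectors(
    device_id: str,
    ip: Optional[str] = None,
    mac: Optional[str] = None,
    lbs: Optional[str] = None,
) -> List[FeatureVector]:
    """Build one feature vector per available piece of environment information."""
    candidates = [("ip", ip), ("mac", mac), ("lbs", lbs)]
    return [
        FeatureVector(kind, device_id, value)
        for kind, value in candidates
        if value is not None
    ]


print(build_feature_vectors("112141336", ip="ip1"))
```

Passing only `ip` yields the first feature vector alone; passing all three yields all three, matching the statement that the vectors may be used alone or together.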
Further, for each determined feature vector, the service processing count corresponding to the feature vector within the same time window can be obtained. The service processing count may be the number of transactions, the number of rewards claimed, the number of accounts, and the like. The time window may be preset, for example, the 24 hours preceding the current time.
For example, for the first feature vector devicetag_ip_variable_category, if the feature vector takes the form "(112141336, ip1)", then the number of rewards N1, the number of transactions N2, or the number of accounts N3 for which the device identifier is "112141336" and the IP address is "ip1" needs to be obtained from the full sample data (typically, historical data over a certain period of time).
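Counting N1, N2, and N3 for one feature vector within a 24-hour window might look like the sketch below. The event record layout (dictionary keys `device_id`, `ip`, `type`, `account`, `time`) and the function name are assumptions made for illustration; the specification does not describe how the historical data is stored.

```python
from datetime import datetime, timedelta


def count_events(events, device_id, ip, now, window=timedelta(hours=24)):
    """Count rewards (N1), transactions (N2), and distinct accounts (N3)
    for the feature vector (device_id, ip) within the window ending at `now`."""
    start = now - window
    rewards = 0
    transactions = 0
    accounts = set()
    for event in events:
        if event["device_id"] != device_id or event["ip"] != ip:
            continue  # belongs to a different feature vector
        if not (start <= event["time"] <= now):
            continue  # outside the time window
        if event["type"] == "reward":
            rewards += 1
        elif event["type"] == "transaction":
            transactions += 1
        accounts.add(event["account"])
    return {"N1": rewards, "N2": transactions, "N3": len(accounts)}


now = datetime(2019, 9, 18, 12, 0, 0)
events = [
    {"device_id": "112141336", "ip": "ip1", "type": "reward", "account": "a", "time": now - timedelta(hours=1)},
    {"device_id": "112141336", "ip": "ip1", "type": "reward", "account": "b", "time": now - timedelta(hours=2)},
    {"device_id": "112141336", "ip": "ip1", "type": "transaction", "account": "a", "time": now - timedelta(hours=3)},
    {"device_id": "112141336", "ip": "ip1", "type": "reward", "account": "c", "time": now - timedelta(hours=30)},
    {"device_id": "112141336", "ip": "ip2", "type": "reward", "account": "d", "time": now - timedelta(hours=1)},
]
print(count_events(events, "112141336", "ip1", now))
```

In the sample data, the event from 30 hours ago and the event from "ip2" are excluded, so only three events contribute to the counts.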
S109, determining the service processing count corresponding to the feature vector as the feature value of the training sample corresponding to the device, so as to train the device risk identification model.
As described above, the corresponding feature value N1 may be determined, so that the feature value N1 corresponding to the first feature vector "(112141336, ip1)" is used as the first feature value of the training sample corresponding to the device. Similarly, the sample may further include other feature values, such as another feature value N2 of the first feature vector, or a second feature value corresponding to the second feature vector, and so on.
As previously described, the device identifier identifies a class of devices; once the environment information is added, however, the resulting feature vector can be used to identify a specific device. In practice, under a given environment condition, different devices are considered to be sufficiently distinguishable by the device information obtained in the embodiments of this specification. For example, at one IP address or one pair of latitude and longitude coordinates, two devices will generally not have identical device information. Thus, the feature vector can be used as a sample feature in risk identification to participate in model training and scoring. Of course, it should be noted that including the foregoing feature vector in model training does not exclude other conventional feature variables from participating in training and risk identification as sample features.
In the scheme provided by the embodiments of this specification, device information and environment information that are generally difficult to tamper with are collected and combined to form a feature vector of the device, and the service processing count is taken as the feature value of that feature vector. The feature value of the training sample can then be used for model training and risk identification, which improves the accuracy with which the risk identification model identifies devices, prevents the device-dimension information from being defeated through any single point, and improves the overall stability and accuracy of device risk identification.
Further, in the embodiments of the present disclosure, model training may be performed based on the feature values of the training samples corresponding to the foregoing devices. In particular, the model may be trained by supervised training or by unsupervised clustering.
In other words, the embodiments of this specification may determine in advance whether a device is risky (i.e., whether it is a fraudulent device, or "black machine"), and may extract feature values from fraudulent-device samples using the foregoing steps to serve as negative samples in training. In supervised learning, each training sample can thus be given a corresponding label (whether or not the device is a fraudulent device), and a first device risk identification model can be trained from the feature values of the training samples and used to evaluate whether a device is a fraudulent device.
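The specification does not name a particular supervised algorithm. Purely as an illustration, the sketch below substitutes the simplest possible learner, a single-threshold rule on one feature value (for example N1), chosen to minimize training errors on labeled samples. All names are hypothetical, and a real system would likely use a richer model over many features.

```python
def fit_threshold(samples):
    """samples: list of (feature_value, is_fraudulent) pairs.
    Returns the threshold t minimizing training errors for the rule
    'predict fraudulent when feature_value >= t'."""
    if not samples:
        raise ValueError("at least one labeled sample is required")
    values = sorted({value for value, _ in samples})
    # Candidate thresholds: every observed value, plus one above the maximum
    # (which corresponds to never predicting 'fraudulent').
    candidates = values + [values[-1] + 1]
    best_threshold, best_errors = None, None
    for threshold in candidates:
        errors = sum((value >= threshold) != bool(label) for value, label in samples)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def predict(threshold, feature_value):
    """Apply the learned rule to a new feature value."""
    return feature_value >= threshold


# Illustrative labeled samples: reward counts per feature vector in 24 hours.
training = [(1, False), (2, False), (3, False), (40, True), (55, True)]
threshold = fit_threshold(training)
print(threshold, predict(threshold, 50), predict(threshold, 2))
```

The sample counts above are invented for the example and are not data from the specification.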
Alternatively, when labels are not available for the sample devices, the embodiments of this specification may train an unsupervised clustering model based on the feature values of the sample features. Clustering groups devices with similar features together, yielding a second device risk identification model that is used to evaluate whether a device is a fraudulent device.
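The clustering algorithm is likewise not specified. As a stand-in, the sketch below runs a two-cluster k-means on a single feature value in pure Python. The function names are hypothetical, and treating the cluster with the higher center as the suspicious one is an assumption rather than something the specification states.

```python
def two_means(values, iterations=20):
    """One-dimensional k-means with k=2, initialized at the min and max.
    Returns (low_center, high_center)."""
    low, high = float(min(values)), float(max(values))
    for _ in range(iterations):
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        if not high_group:  # all values identical: only one cluster exists
            break
        new_low = sum(low_group) / len(low_group)
        new_high = sum(high_group) / len(high_group)
        if (new_low, new_high) == (low, high):
            break  # converged
        low, high = new_low, new_high
    return low, high


def in_high_cluster(value, centers):
    """True when the value is closer to the high center than to the low one."""
    low, high = centers
    return abs(value - high) < abs(value - low)


centers = two_means([1, 2, 3, 40, 55])
print(centers, in_high_cluster(48, centers))
```

The input counts are invented for illustration. With unlabeled data, an analyst would still need to inspect the clusters to decide which one corresponds to risky devices.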
In another aspect, once the foregoing device risk identification model has been trained, the embodiments of the present disclosure further provide a device risk identification method based on that model. Fig. 2 is a schematic flow chart of the device risk identification method provided in an embodiment of the present disclosure, including:
S201, acquiring device information of a device to be detected and environment information of the device;
S203, performing numerical conversion on the device information to generate corresponding device information values;
S205, aggregating the device information values to generate a device identifier, wherein the device identifier is used to identify devices of the same class;
S207, generating a feature vector containing the device identifier and the environment information, acquiring the service processing count corresponding to the feature vector within the same time window, and determining the service processing count as the feature value of the feature vector of the device to be detected;
S209, evaluating the risk degree of the device to be detected using the device risk identification model, based on the feature value of the feature vector of the device to be detected.
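Steps S205 to S209 can be strung together as in the minimal sketch below, assuming that S201 and S203 have already produced the list of device information values. The function name, the `counts_lookup` dictionary (precomputed counts per feature vector within the time window), and the `model` callable are all hypothetical placeholders.

```python
def assess_device(device_values, ip, counts_lookup, model):
    """Score one device to be detected.

    device_values: device information values in the specified order (output of S203)
    ip:            environment information used for the feature vector
    counts_lookup: dict mapping (device_id, ip) to a service processing count
                   within the time window (assumed to be precomputed)
    model:         any callable mapping a feature value to a risk result
    """
    # S205: aggregate the values into a device identifier.
    device_id = "".join(str(value) for value in device_values)
    # S207: build the feature vector and look up its feature value.
    feature_vector = (device_id, ip)
    feature_value = counts_lookup.get(feature_vector, 0)
    # S209: evaluate the risk with the trained model.
    return model(feature_value)


counts = {("112141336", "ip1"): 57}


def simple_model(count):
    # Assumed threshold, for illustration only.
    return count >= 40


print(assess_device([1, 1, 2, 1, 4, 1, 3, 3, 6], "ip1", counts, simple_model))
```

A feature vector with no history in the window gets a count of 0 here, which is a design choice of this sketch.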
Correspondingly, the embodiments of the present disclosure further provide a feature extraction apparatus for a device. Fig. 3 is a schematic structural diagram of the feature extraction apparatus for a device provided in an embodiment of the present disclosure, including:
an acquisition module 301, which acquires device information of a device and environment information of the device;
a conversion module 303, which performs numerical conversion on the device information to generate corresponding device information values;
an aggregation module 305, which aggregates the device information values to generate a device identifier, where the device identifier is used to identify devices of the same class;
a generating module 307, which generates a feature vector containing the device identifier and the environment information, and determines the service processing count corresponding to the feature vector within the same time window;
and a feature value determining module 309, which determines the service processing count corresponding to the feature vector as the feature value of the training sample corresponding to the device, so as to train the device risk identification model.
Further, the device information includes at least one of a device brand, a device model, a processor frequency, a ring volume, a call volume, an alarm volume, a remaining battery power, a remaining device memory capacity, or a remaining device memory card capacity; correspondingly, the conversion module 303 obtains the percentage coefficient of the ring volume, call volume, alarm volume, remaining battery power, remaining device memory capacity, or remaining device memory card capacity, determines the coefficient interval to which the percentage coefficient belongs, and determines the device information value corresponding to the coefficient interval according to a preset interval-to-value correspondence.
Further, the aggregation module 305 concatenates the device information values in a specified order to generate a character string containing the device information values, and determines the character string as the device identifier.
Further, the generating module 307 obtains the IP address of the device and generates a first feature vector containing the device identifier and the IP address; or obtains the Media Access Control (MAC) address of the device and generates a second feature vector containing the device identifier and the MAC address; or obtains the real physical address of the device and generates a third feature vector containing the device identifier and the real physical address.
Further, the apparatus also includes a model training module 311, which performs supervised or unsupervised model training according to the feature values of the training samples corresponding to the devices, so as to generate a device risk identification model.
Corresponding to the other aspect, the embodiments of the present disclosure further provide a device risk identification apparatus based on the foregoing device risk identification model. Fig. 4 is a schematic structural diagram of the device risk identification apparatus provided in an embodiment of the present disclosure, including:
an acquisition module 401, which acquires device information of a device to be detected and environment information of the device;
a conversion module 403, which performs numerical conversion on the device information to generate corresponding device information values;
an aggregation module 405, which aggregates the device information values to generate a device identifier, where the device identifier is used to identify devices of the same class;
a generating module 407, which generates a feature vector containing the device identifier and the environment information, acquires the service processing count corresponding to the feature vector within the same time window, and determines the service processing count as the feature value of the feature vector of the device to be detected;
and a risk identification module 409, which evaluates the risk degree of the device to be detected using the device risk identification model, based on the feature value of the feature vector of the device to be detected.
The embodiments of the present disclosure also provide a computer device at least including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the feature extraction method shown in fig. 1 when executing the program.
The embodiments of the present disclosure also provide a computer device, which at least includes a memory, a processor, and a computer program stored on the memory and capable of running on the processor, where the processor implements the device risk identification method shown in fig. 2 when executing the program.
FIG. 5 illustrates a more specific hardware architecture of a computing device provided by the embodiments of the present specification. The device may include a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. The processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 are communicatively connected to one another within the device via the bus 1050.
The processor 1010 may be implemented as a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and executes the relevant programs to implement the technical solutions provided in the embodiments of the present specification.
The memory 1020 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), a static storage device, a dynamic storage device, etc. The memory 1020 may store an operating system and other application programs; when the technical solutions provided in the embodiments of the present specification are implemented in software or firmware, the associated program code is stored in the memory 1020 and executed by the processor 1010.
The input/output interface 1030 is used to connect an input/output module for inputting and outputting information. The input/output module may be built into the device as a component (not shown) or attached externally to provide the corresponding functionality. Input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and output devices may include a display, speaker, vibrator, indicator lights, etc.
The communication interface 1040 is used to connect a communication module (not shown) to enable communication between this device and other devices. The communication module may communicate in a wired manner (such as USB or a network cable) or in a wireless manner (such as a mobile network, Wi-Fi, or Bluetooth).
Bus 1050 includes a path for transferring information between components of the device (e.g., processor 1010, memory 1020, input/output interface 1030, and communication interface 1040).
It should be noted that although the above-described device only shows processor 1010, memory 1020, input/output interface 1030, communication interface 1040, and bus 1050, in an implementation, the device may include other components necessary to achieve proper operation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the embodiments of the present description, and not all the components shown in the drawings.
The present embodiment also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the feature extraction method shown in fig. 1.
The present embodiment also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the device risk identification method shown in fig. 2.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
From the foregoing description of the embodiments, it will be apparent to those skilled in the art that the embodiments of the present specification may be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the embodiments of the present specification, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in the embodiments, or in some parts of the embodiments, of the present specification.
The system, method, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. A typical implementation device is a computer, which may be in the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email device, game console, tablet computer, wearable device, or a combination of any of these devices.
In this specification, the embodiments are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, since the apparatus embodiments are substantially similar to the method embodiments, they are described relatively briefly, and the description of the method embodiments may be consulted for the relevant points. The above-described apparatus embodiments are merely illustrative: the modules described as separate components may or may not be physically separate, and the functions of the modules may be implemented in one or more pieces of software and/or hardware when implementing the embodiments of the present specification. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The foregoing is merely a specific implementation of the embodiments of the present specification. It should be noted that a person skilled in the art may make several improvements and modifications without departing from the principles of the embodiments of the present specification, and these improvements and modifications should also be regarded as falling within the scope of protection of the embodiments of the present specification.

Claims (13)

1. A feature extraction method for a device, comprising:
acquiring device information and environment information of a device, wherein the environment information is device address information;
performing numerical conversion on the device information according to the coefficient interval corresponding to the device information, to generate corresponding device information values;
aggregating the device information values to generate a device identifier, wherein the device identifier is used to identify devices of the same type;
generating a feature vector containing the device identifier and the environment information, and determining the number of service processing operations corresponding to the feature vector within the same time window;
and determining the number of service processing operations corresponding to the feature vector as the feature value of the training sample corresponding to the device, so as to train a device risk identification model.
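The last two steps of claim 1 amount to counting how many service-processing events share one feature vector inside one time window. A minimal sketch follows, assuming fixed, non-overlapping windows aligned to the Unix epoch and a hypothetical event tuple layout; the claim itself only says "the same time window" and does not specify either.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable, Tuple

FeatureVector = Tuple[str, str]    # (device identifier, device address)
Event = Tuple[datetime, str, str]  # hypothetical: (timestamp, device identifier, address)


def count_per_window(events: Iterable[Event], window: timedelta) -> Counter:
    """Count service-processing events per (window index, feature vector)."""
    epoch = datetime(1970, 1, 1)
    counts: Counter = Counter()
    for timestamp, device_id, address in events:
        # Integer index of the fixed-width window that contains the timestamp.
        window_index = (timestamp - epoch) // window
        counts[(window_index, (device_id, address))] += 1
    return counts


# Illustrative events: the identifier strings and addresses are made up.
events = [
    (datetime(2019, 9, 18, 10, 5), "8-5", "203.0.113.7"),
    (datetime(2019, 9, 18, 10, 40), "8-5", "203.0.113.7"),
    (datetime(2019, 9, 18, 11, 10), "8-5", "203.0.113.7"),
    (datetime(2019, 9, 18, 10, 15), "3-2", "203.0.113.7"),
]
hourly = count_per_window(events, timedelta(hours=1))
```

Each value in `hourly` is the feature value for one feature vector in one window; a sliding window would be an equally valid reading of the claim and would need a different data structure.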
2. The method of claim 1, wherein the device information comprises at least one of a device brand, a device model, a processor frequency, a ring tone volume, a call volume, an alarm clock volume, a remaining battery power, a remaining device memory capacity, or a remaining device memory card capacity;
correspondingly, performing numerical conversion on the device information to generate corresponding device information values comprises:
acquiring a percentage coefficient of the ring tone volume, call volume, alarm clock volume, remaining battery power, remaining device memory capacity, or remaining device memory card capacity;
and determining the coefficient interval to which the percentage coefficient belongs, and determining the device information value corresponding to that coefficient interval according to a preset interval-to-value correspondence.
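The conversion in claim 2 can be illustrated with a short sketch. The interval boundaries, the values assigned to each interval, and the function names below are assumptions made for illustration; the claim only requires that some preset correspondence between intervals and values exists.

```python
from bisect import bisect_right
from typing import Sequence

# Hypothetical preset correspondence: four intervals
# [0, 0.25), [0.25, 0.5), [0.5, 0.75), [0.75, 1.0] mapped to the values 1..4.
BOUNDARIES: Sequence[float] = (0.25, 0.5, 0.75)
INTERVAL_VALUES: Sequence[int] = (1, 2, 3, 4)


def percentage_coefficient(current: float, maximum: float) -> float:
    """Express a raw reading (e.g. a volume level) as a fraction of its maximum."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return min(max(current / maximum, 0.0), 1.0)


def to_device_info_value(coefficient: float) -> int:
    """Look up the value of the interval that contains the coefficient."""
    return INTERVAL_VALUES[bisect_right(BOUNDARIES, coefficient)]


# A ring volume of 7 on a 15-step scale falls in [0.25, 0.5).
ring_value = to_device_info_value(percentage_coefficient(7, 15))       # -> 2
# A battery at 83 percent falls in [0.75, 1.0].
battery_value = to_device_info_value(percentage_coefficient(83, 100))  # -> 4
```

Coarse intervals make the resulting value stable against small fluctuations, so two readings of the same device taken minutes apart tend to map to the same value.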
3. The method of claim 1, wherein aggregating the device information values to generate a device identifier comprises:
concatenating the device information values in a designated order to generate a character string containing the device information values;
and determining the character string as the device identifier.
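Claim 3 describes plain string concatenation in a fixed order. A minimal sketch, assuming a hypothetical field order and a `|` separator, neither of which is given in the claim:

```python
from typing import Mapping, Sequence

# Hypothetical designated order of the device information values.
FIELD_ORDER: Sequence[str] = ("brand", "model", "ring_volume", "battery")


def build_device_identifier(
    values: Mapping[str, object],
    order: Sequence[str] = FIELD_ORDER,
    sep: str = "|",
) -> str:
    """Concatenate the device information values in the designated order."""
    missing = [name for name in order if name not in values]
    if missing:
        raise KeyError(f"missing device information values: {missing}")
    return sep.join(str(values[name]) for name in order)


# The input mapping may arrive in any order; the output order is fixed.
device_id = build_device_identifier(
    {"battery": 4, "brand": 17, "model": 203, "ring_volume": 2}
)  # -> "17|203|2|4"
```

The separator is a design choice of this sketch: without one, the value pairs (1, 23) and (12, 3) would both yield "123" and collide.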
4. The method of claim 1, wherein generating a feature vector containing the device identifier and the environment information comprises:
acquiring an Internet Protocol (IP) address of the device, and generating a first feature vector containing the device identifier and the IP address; or
acquiring a Media Access Control (MAC) address of the device, and generating a second feature vector containing the device identifier and the MAC address; or
acquiring a real physical address of the device, and generating a third feature vector containing the device identifier and the real physical address.
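The three alternatives in claim 4 differ only in which kind of address is paired with the device identifier. A minimal sketch, assuming a hypothetical `FeatureVector` tuple that tags the address kind and treating the real physical address as a free-form string (the claim does not define its format):

```python
from typing import List, NamedTuple, Optional


class FeatureVector(NamedTuple):
    device_id: str
    address_kind: str  # "ip", "mac", or "physical"
    address: str


def build_feature_vectors(
    device_id: str,
    ip: Optional[str] = None,
    mac: Optional[str] = None,
    physical: Optional[str] = None,
) -> List[FeatureVector]:
    """Build one feature vector per address that is actually available."""
    vectors = []
    for kind, value in (("ip", ip), ("mac", mac), ("physical", physical)):
        if not value:
            continue
        normalized = value.strip()
        if kind == "mac":
            # MAC addresses are case-insensitive; lowercase them so that
            # "AA:BB:..." and "aa:bb:..." are counted together.
            normalized = normalized.lower()
        vectors.append(FeatureVector(device_id, kind, normalized))
    return vectors


vectors = build_feature_vectors(
    "17|203|2|4", ip="203.0.113.7", mac="AA:BB:CC:00:11:22"
)
```

Tagging the address kind keeps counts for the first, second, and third feature vectors separate even if two address strings happen to coincide.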
5. The method of claim 1, further comprising:
performing supervised or unsupervised model training according to the feature values of the training samples corresponding to the device, to generate the device risk identification model.
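Claim 5 leaves the model type open. As one hedged illustration of the unsupervised case, the sketch below fits the mean and standard deviation of the training feature values and flags counts that lie far above the mean. This z-score rule, the class name `CountThresholdModel`, and the default threshold of 3.0 are stand-ins chosen for brevity, not the model described in the patent.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Sequence


@dataclass(frozen=True)
class CountThresholdModel:
    """Stand-in risk model: z-score of a count against the training counts."""
    mean_count: float
    std_count: float
    z_threshold: float = 3.0

    def score(self, feature_value: int) -> float:
        if self.std_count == 0:
            # Degenerate training set: anything above the mean is an outlier.
            return 0.0 if feature_value <= self.mean_count else float("inf")
        return (feature_value - self.mean_count) / self.std_count

    def is_risky(self, feature_value: int) -> bool:
        return self.score(feature_value) > self.z_threshold


def train(feature_values: Sequence[int], z_threshold: float = 3.0) -> CountThresholdModel:
    """Fit the stand-in model on per-window counts of the training samples."""
    if not feature_values:
        raise ValueError("at least one training sample is required")
    return CountThresholdModel(mean(feature_values), pstdev(feature_values), z_threshold)


# Illustrative training counts (made up): most feature vectors see 2 to 4
# service-processing events per window.
model = train([2, 3, 2, 4, 3, 2, 3, 3])
```

With these counts, `model.is_risky(120)` is true and `model.is_risky(3)` is false; a supervised variant would instead fit a classifier on labeled counts.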
6. A device risk identification method based on the device risk identification model of claim 5, comprising:
acquiring device information of a device to be detected and environment information of the device, wherein the environment information is device address information;
performing numerical conversion on the device information according to the coefficient interval corresponding to the device information, to generate corresponding device information values;
aggregating the device information values to generate a device identifier, wherein the device identifier is used to identify devices of the same type;
generating a feature vector containing the device identifier and the environment information, acquiring the number of service processing operations corresponding to the feature vector within the same time window, and determining that number as the feature value of the feature vector of the device to be detected;
and evaluating the risk degree of the device to be detected using the device risk identification model, based on the feature value of the feature vector of the device to be detected.
7. A feature extraction apparatus for a device, comprising:
an acquisition module, configured to acquire device information and environment information of a device, wherein the environment information is device address information;
a conversion module, configured to perform numerical conversion on the device information according to the coefficient interval corresponding to the device information, to generate corresponding device information values;
an aggregation module, configured to aggregate the device information values to generate a device identifier, wherein the device identifier is used to identify devices of the same type;
a generation module, configured to generate a feature vector containing the device identifier and the environment information, and to determine the number of service processing operations corresponding to the feature vector within the same time window;
and a feature value determining module, configured to determine the number of service processing operations corresponding to the feature vector as the feature value of the training sample corresponding to the device, so as to train a device risk identification model.
8. The apparatus of claim 7, wherein the device information comprises at least one of a device brand, a device model, a processor frequency, a ring tone volume, a call volume, an alarm clock volume, a remaining battery power, a remaining device memory capacity, or a remaining device memory card capacity;
correspondingly, the conversion module acquires a percentage coefficient of the ring tone volume, call volume, alarm clock volume, remaining battery power, remaining device memory capacity, or remaining device memory card capacity; determines the coefficient interval to which the percentage coefficient belongs; and determines the device information value corresponding to that coefficient interval according to a preset interval-to-value correspondence.
9. The apparatus of claim 7, wherein the aggregation module concatenates the device information values in a designated order to generate a character string containing the device information values,
and determines the character string as the device identifier.
10. The apparatus of claim 7, wherein the generation module acquires an Internet Protocol (IP) address of the device and generates a first feature vector containing the device identifier and the IP address; or acquires a Media Access Control (MAC) address of the device and generates a second feature vector containing the device identifier and the MAC address; or acquires a real physical address of the device and generates a third feature vector containing the device identifier and the real physical address.
11. The apparatus of claim 7, further comprising a model training module, configured to perform supervised or unsupervised model training according to the feature values of the training samples corresponding to the device, to generate the device risk identification model.
12. A device risk identification apparatus based on the device risk identification model of claim 11, comprising:
an acquisition module, configured to acquire device information of a device to be detected and environment information of the device, wherein the environment information is device address information;
a conversion module, configured to perform numerical conversion on the device information according to the coefficient interval corresponding to the device information, to generate corresponding device information values;
an aggregation module, configured to aggregate the device information values to generate a device identifier, wherein the device identifier is used to identify devices of the same type;
a generation module, configured to generate a feature vector containing the device identifier and the environment information, acquire the number of service processing operations corresponding to the feature vector within the same time window, and determine that number as the feature value of the feature vector of the device to be detected;
and a risk identification module, configured to evaluate the risk degree of the device to be detected using the device risk identification model, based on the feature value of the feature vector of the device to be detected.
13. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 6 when executing the program.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910881085.1A CN110738396B (en) 2019-09-18 2019-09-18 Feature extraction method, device and equipment for equipment

Publications (2)

Publication Number Publication Date
CN110738396A CN110738396A (en) 2020-01-31
CN110738396B true CN110738396B (en) 2024-06-14

Family

ID=69268030

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910881085.1A Active CN110738396B (en) 2019-09-18 2019-09-18 Feature extraction method, device and equipment for equipment

Country Status (1)

Country Link
CN (1) CN110738396B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111653271B (en) * 2020-05-26 2023-09-05 大众问问(北京)信息科技有限公司 Sample data acquisition and model training method and device and computer equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107302527A (en) * 2017-06-09 2017-10-27 北京奇安信科技有限公司 A kind of unit exception detection method and device
CN109492378A (en) * 2018-11-26 2019-03-19 平安科技(深圳)有限公司 A kind of auth method based on EIC equipment identification code, server and medium
CN109544190A (en) * 2018-11-28 2019-03-29 北京芯盾时代科技有限公司 A kind of fraud identification model training method, fraud recognition methods and device
CN109600362A (en) * 2018-11-26 2019-04-09 平安科技(深圳)有限公司 Zombie host recognition methods, identification equipment and medium based on identification model

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110609941B (en) * 2015-04-14 2023-07-21 创新先进技术有限公司 Risk identification method and device for Internet operation event
CN105678544A (en) * 2015-12-31 2016-06-15 深圳前海微众银行股份有限公司 Risk monitoring method of remote account opening and server
CN107046516B (en) * 2016-02-05 2020-04-14 上海行邑信息科技有限公司 Wind control method and device for identifying mobile terminal identity
CN106713288A (en) * 2016-12-08 2017-05-24 同盾科技有限公司 Fraud risk identification and prevention method and system
CN108322427A (en) * 2017-01-18 2018-07-24 阿里巴巴集团控股有限公司 A kind of method and apparatus carrying out air control to access request
CN107800678B (en) * 2017-02-16 2020-04-03 平安科技(深圳)有限公司 Method and device for detecting abnormal registration of terminal
CN107392121B (en) * 2017-07-06 2023-05-09 同济大学 Self-adaptive equipment identification method and system based on fingerprint identification
CN108122163A (en) * 2017-11-14 2018-06-05 阿里巴巴集团控股有限公司 Risk monitoring and control method, apparatus and equipment based on internet credit
CN108600414B (en) * 2018-05-09 2022-04-26 中国平安人寿保险股份有限公司 Equipment fingerprint construction method and device, storage medium and terminal
CN109978033B (en) * 2019-03-15 2020-08-04 第四范式(北京)技术有限公司 Method and device for constructing same-operator recognition model and method and device for identifying same-operator

Also Published As

Publication number Publication date
CN110738396A (en) 2020-01-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200925

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20200925

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

GR01 Patent grant