Disclosure of Invention
To solve the above problems, an object of the present invention is to provide a machine learning online feature production system and method that do not need to store features during feature production, offer high efficiency and low storage cost, support multiple production modes for the same feature, and reduce the latency of acquiring large volumes of features, thereby improving feature acquisition efficiency.
The embodiment of the invention provides a machine learning online feature production system, which comprises:
the remote mode service application cluster comprises a plurality of first application nodes, and each first application node calls the feature service cluster through a remote mode;
the feature service cluster comprises a plurality of feature service nodes, each feature service node acquires a first feature from a feature repository through a feature acquisition module according to an application calling request of the remote mode service application cluster, performs feature processing and feature operator operation on the first feature through a first feature processing plug-in and a first feature operator execution module, and outputs a processing result conforming to feature metadata definition;
the feature proxy master node cluster comprises a plurality of feature proxy master nodes, wherein each feature proxy master node maintains the features in its first local repository by feature subscription and periodically generates a full feature snapshot of the full features in that first local repository, the first local repository being the local repository of the feature proxy master node;
the embedded mode service application cluster comprises a plurality of second application nodes, each second application node encapsulating a feature proxy SDK, wherein, when initializing features, each second application node downloads a full feature snapshot from the feature proxy master node through the feature proxy SDK and builds the full features in its second local repository; each second application node further acquires a second feature from its second local repository through the feature proxy SDK according to an application call request of the embedded mode service application cluster, performs feature processing and feature operator operations on the second feature through a second feature processing plug-in and a second feature operator execution module, and outputs a processing result conforming to the feature metadata definition, the second local repository being the local repository of the second application node.
As a further improvement of the present invention, each feature service node acquiring a first feature from a feature repository through a feature acquisition module according to an application call request of the remote mode service application cluster includes:
each feature service node loads first feature metadata from a feature metadata repository through the feature acquisition module according to an application call request of the remote mode service application cluster;
determining from the first feature metadata whether the first feature in the feature repository depends on other features;
upon determining that the first feature depends on other features, acquiring, by the feature acquisition module, all features on which the first feature depends from the feature repository;
upon determining that the first feature does not depend on other features, acquiring, by the feature acquisition module, the first feature directly from the feature repository.
As a further development of the invention, the first feature processing plug-in and the first feature operator execution module are loaded in the respective feature service node in accordance with the first feature metadata.
As a further improvement of the invention, the feature operators executed in the first feature operator execution module comprise one or more of a feature type conversion operator, a feature mapping operator, a feature segmentation operator, a feature discretization operator and a feature normalization operator.
As a further improvement of the present invention, the feature acquisition module encapsulates support for heterogeneous data sources, and
the feature repository comprises one or more repositories; where there are a plurality of repositories, they are of the same data type or of different data types.
As a further improvement of the present invention, the obtaining, by each second application node, a second feature from each second local repository according to the application invocation request of the embedded mode service application cluster includes:
each second application node loads second feature metadata from a feature metadata repository through a feature proxy SDK according to the application call request of the embedded mode service application cluster;
determining, from the second feature metadata, whether the second feature in a target local repository is dependent on other features, the target local repository being one or more of a plurality of second local repositories;
upon determining that the second feature depends on other features, acquiring all features on which the second feature depends from the target local repository through the feature proxy SDK;
upon determining that the second feature is not dependent on other features, directly obtaining the second feature from the target local repository through a feature proxy SDK.
As a further development of the invention, the second feature processing plug-in and the second feature operator execution module are loaded in respective second application nodes in accordance with the second feature metadata.
As a further improvement of the present invention, the feature operators executed in the second feature operator execution module include one or more of a feature type conversion operator, a feature mapping operator, a feature segmentation operator, a feature discretization operator, and a feature normalization operator.
As a further improvement of the present invention, the full features in each first local repository comprise subscribed features and first changed features that are persistently stored in the local repository of each feature proxy master node,
the subscribed features being loaded from the feature repository when each feature proxy master node is initialized, and the first changed features being obtained by each feature proxy master node consuming messages from the feature message queue starting from the set message queue offset.
As a further improvement of the present invention, when initializing features, each second application node further sets a message queue offset through the feature proxy SDK and consumes messages from the feature message queue, so as to acquire second changed features from the feature message queue and synchronize them to the second local repository.
The embodiment of the invention also provides a machine learning online feature production method, which comprises the following steps:
S1, initializing each first application node in the remote mode service application cluster, each feature service node in the feature service cluster, each feature proxy master node in the feature proxy master node cluster and each second application node in the embedded mode service application cluster;
S2, determining one of the remote mode service application cluster and the embedded mode service application cluster as a target service application cluster according to an application call request of a caller;
S3, when the target service application cluster is the remote mode service application cluster, each feature service node in the feature service cluster acquires a first feature from a feature repository through a feature acquisition module, performs feature processing and feature operator operations on the first feature through a first feature processing plug-in and a first feature operator execution module, and outputs a processing result conforming to the feature metadata definition;
S4, when the target service application cluster is the embedded mode service application cluster, each second application node in the embedded mode service application cluster acquires a second feature from its second local repository through a feature proxy SDK, performs feature processing and feature operator operations on the second feature through a second feature processing plug-in and a second feature operator execution module, and outputs a processing result conforming to the feature metadata definition, the second local repository being the local repository of the second application node.
As a further improvement of the present invention, the initializing each feature service node in the feature service cluster includes:
S11, creating connections between the feature service node and the feature repository and the feature metadata repository;
S12, the feature service node loads feature metadata from the feature metadata repository;
S13, constructing a feature processing plug-in and a feature operator execution module in the feature service node, and caching them in the memory of the feature service node;
S14, creating a feature metadata change monitoring task in the feature service node, which monitors change events and addition events of the feature metadata and caches the changed and newly added feature metadata in the memory of the feature service node;
S15, starting a feature remote call service for the first application nodes to call;
S16, the initialization is completed, and the flow ends.
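The initialization steps of a feature service node can be illustrated with a minimal sketch. It is not the implementation of the invention: `FeatureMetadata`, `InMemoryMetadataRepository`, `FeatureServiceNode` and all field names are hypothetical, the metadata repository is replaced by an in-memory stand-in, and the construction of plug-ins and operator modules (S13) is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FeatureMetadata:
    # Hypothetical metadata record; the field names are assumptions.
    name: str
    depends_on: List[str] = field(default_factory=list)


class InMemoryMetadataRepository:
    """Stand-in for the feature metadata repository (illustration only)."""

    def __init__(self) -> None:
        self._records: Dict[str, FeatureMetadata] = {}
        self._listeners: List[Callable[[FeatureMetadata], None]] = []

    def load_all(self) -> Dict[str, FeatureMetadata]:
        return dict(self._records)

    def subscribe(self, listener: Callable[[FeatureMetadata], None]) -> None:
        self._listeners.append(listener)

    def upsert(self, meta: FeatureMetadata) -> None:
        # Store the record, then notify listeners of the change or addition.
        self._records[meta.name] = meta
        for listener in self._listeners:
            listener(meta)


class FeatureServiceNode:
    def __init__(self, metadata_repo: InMemoryMetadataRepository) -> None:
        self.metadata_repo = metadata_repo  # S11: connection created
        self.metadata_cache: Dict[str, FeatureMetadata] = {}
        self.serving = False

    def initialize(self) -> None:
        self.metadata_cache = self.metadata_repo.load_all()    # S12
        # S13 (building plug-ins and operator modules) is omitted here.
        self.metadata_repo.subscribe(self._on_metadata_event)  # S14
        self.serving = True                                    # S15

    def _on_metadata_event(self, meta: FeatureMetadata) -> None:
        # Cache changed or newly added metadata in node memory.
        self.metadata_cache[meta.name] = meta


repo = InMemoryMetadataRepository()
repo.upsert(FeatureMetadata("age"))
node = FeatureServiceNode(repo)
node.initialize()
repo.upsert(FeatureMetadata("age_bucket", depends_on=["age"]))
```

In this sketch, metadata added after initialization reaches the node's in-memory cache through the listener, without the node having to reload the whole metadata set.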
As a further improvement of the present invention, initializing each feature proxy master node in the feature proxy master node cluster includes:
S21, creating connections between the feature proxy master node and the feature repository and the feature message queue;
S22, determining whether features exist in the first local repository of the feature proxy master node; if so, executing S25, and if not, executing S23;
S23, setting the current consumption offset of the feature message queue and suspending consumption of the feature message queue;
S24, loading the subscribed features from the feature repository and persistently storing them in the first local repository until data loading is completed;
S25, starting to consume the feature message queue;
S26, starting execution of a checkpoint task, wherein the checkpoint task is a scheduled task that generates a full feature snapshot from the features in the first local repository and the offset up to which the feature proxy master node has consumed the feature message queue;
S27, starting a feature data download service interface so that the second application node can download the full feature snapshot;
S28, the initialization is completed, and the flow ends.
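Steps S21 to S28 can be sketched as follows. This is only an illustration under stated assumptions: the feature repository is a plain dictionary, the feature message queue is a list of `(key, value)` messages indexed by offset, and `FeatureProxyMaster` and its methods are hypothetical names. The sketch also assumes that messages before the offset set in S23 are already reflected in the feature repository; the source does not spell this out.

```python
import copy
from typing import Any, Dict, List, Optional, Tuple

Message = Tuple[str, Any]  # (feature key, new value); assumed message shape


class FeatureProxyMaster:
    def __init__(self, repository: Dict[str, Any], queue: List[Message]) -> None:
        self.repository = repository              # S21: repository connection
        self.queue = queue                        # S21: message queue connection
        self.local_store: Dict[str, Any] = {}     # first local repository
        self.offset = 0                           # next queue position to consume
        self.snapshot: Optional[Dict[str, Any]] = None

    def initialize(self, subscribed: List[str]) -> None:
        if not self.local_store:                  # S22: cold start
            self.offset = len(self.queue)         # S23: set offset, do not consume yet
            for key in subscribed:                # S24: bulk-load subscribed features
                if key in self.repository:
                    self.local_store[key] = self.repository[key]
        self.consume()                            # S25
        self.checkpoint()                         # S26 (periodic in a real system)

    def consume(self) -> None:
        # Apply every message from the current offset onward.
        while self.offset < len(self.queue):
            key, value = self.queue[self.offset]
            self.local_store[key] = value
            self.offset += 1

    def checkpoint(self) -> Dict[str, Any]:
        # A snapshot pairs the full features with the consumed offset.
        self.snapshot = {
            "features": copy.deepcopy(self.local_store),
            "offset": self.offset,
        }
        return self.snapshot


repository = {"a": 1, "b": 2, "c": 3}
queue: List[Message] = [("a", 0)]  # old message, assumed already in the repository
master = FeatureProxyMaster(repository, queue)
master.initialize(["a", "b"])
queue.append(("a", 5))             # a change arriving after initialization
master.consume()
snapshot = master.checkpoint()
```

Storing the offset inside the snapshot is what lets a downloader continue from the right queue position instead of replaying the queue from the beginning.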
As a further improvement of the present invention, the initializing each second application node in the embedded mode service application cluster includes:
S31, creating connections between the second application node and the feature metadata repository, the feature proxy master node and the feature message queue;
S32, the second application node loads feature metadata from the feature metadata repository;
S33, constructing a feature processing plug-in and a feature operator execution module in the second application node, and caching them in the memory of the second application node;
S34, determining whether features exist in the second local repository of the second application node; if so, loading the features in the second local repository, starting to consume messages from the feature message queue, and ending the flow; if not, executing S35;
S35, downloading a full feature snapshot from the feature proxy master node and writing it into the second local repository, while verifying whether the loaded feature metadata is valid; if not, ending the flow, and if so, executing S36;
S36, the second application node loads the current consumption offset of the feature message queue and starts to consume the feature message queue;
S37, the initialization is completed, and the flow ends.
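The cold-start and warm-start branches of S34 to S36 can be sketched as below. The sketch is illustrative only: `FeatureProxySDK`, the snapshot layout (`features` plus `offset`) and the list-based queue are assumptions, and the metadata validity check of S35 is left out because the source does not define what it verifies.

```python
from typing import Any, Callable, Dict, List, Tuple

Message = Tuple[str, Any]  # (feature key, new value); assumed message shape


class FeatureProxySDK:
    def __init__(self, queue: List[Message]) -> None:
        self.queue = queue
        self.local_store: Dict[str, Any] = {}  # second local repository
        self.offset = 0

    def initialize(self, download_snapshot: Callable[[], Dict[str, Any]]) -> str:
        if self.local_store:
            # S34: features already exist locally, so only catch up on the queue.
            self.consume()
            return "warm"
        # S35: download the full feature snapshot and write it locally.
        snapshot = download_snapshot()
        self.local_store = dict(snapshot["features"])
        # S36: resume consumption from the offset recorded in the snapshot.
        self.offset = snapshot["offset"]
        self.consume()
        return "cold"

    def consume(self) -> None:
        while self.offset < len(self.queue):
            key, value = self.queue[self.offset]
            self.local_store[key] = value
            self.offset += 1


queue: List[Message] = [("a", 1), ("a", 2), ("b", 7)]
snapshot = {"features": {"a": 1}, "offset": 1}  # taken after the first message
sdk = FeatureProxySDK(queue)
first = sdk.initialize(lambda: snapshot)
queue.append(("b", 8))
second = sdk.initialize(lambda: snapshot)
```

On the cold path the node replays only the messages after the snapshot offset, which is why a new node can join without reading the full feature repository.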
As a further improvement of the present invention, said S3 includes:
each feature service node monitors an application calling request of the remote mode service application cluster;
after receiving a feature acquisition request, loading first feature metadata from the feature metadata repository through the feature acquisition module;
determining whether a first feature in the feature repository depends on other features according to the first feature metadata, if so, acquiring all features of which the first feature depends from the feature repository through the feature acquisition module, otherwise, directly acquiring the first feature from the feature repository through the feature acquisition module;
loading the first feature processing plug-in and the first feature operator execution module corresponding to the first feature according to the first feature metadata;
sequentially executing feature processing and feature operator operation through the first feature processing plug-in and the first feature operator execution module;
and outputting the processing result conforming to the feature metadata definition, and ending the flow.
As a further improvement of the present invention, said S4 includes:
each second application node monitors an application calling request of the embedded mode service application cluster;
after receiving a feature acquisition request, loading second feature metadata from the feature metadata repository through the feature proxy SDK;
determining whether a second feature in a target local repository depends on other features according to the second feature metadata, if so, acquiring all features of which the second feature depends from the target local repository through a feature proxy SDK, otherwise, directly acquiring the second feature from the target local repository through the feature proxy SDK, wherein the target local repository is one or more of a plurality of second local repositories;
loading the second feature processing plug-in and the second feature operator execution module corresponding to the second feature according to the second feature metadata;
sequentially executing feature processing and feature operator operation through the second feature processing plug-in and the second feature operator execution module;
and outputting the processing result conforming to the feature metadata definition, and ending the flow.
Embodiments of the present invention also provide an electronic device, which includes a memory and a processor, where the memory is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the method.
Embodiments of the present invention also provide a computer-readable storage medium, on which a computer program is stored, the computer program being executed by a processor to implement the method.
The invention has the beneficial effects that:
the feature processing plug-in is provided for producing dependent features; running the plug-in performs the feature computation without storing the dependent features, which improves the efficiency of feature production. The feature operator execution module is provided for basic operations on features, so that the same feature can be produced in multiple ways and feature production supports a diversity of operations. The feature proxy SDK is integrated with the application, which reduces the latency of acquiring large volumes of features and improves feature acquisition efficiency. The feature proxy master node maintains full feature snapshots of the subscribed features, so that the full features can be loaded quickly, providing data support for the feature proxy SDK.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that, if directional indications (such as up, down, left, right, front, back, and so on) are involved in the embodiments of the present invention, the directional indications are only used to explain the relative positional relationship, movement and the like of the components in a specific posture (as shown in the drawings), and if the specific posture changes, the directional indications change accordingly.
In addition, the terms used in the description of the present invention are for illustrative purposes only and are not intended to limit the scope of the present invention. The terms "comprises" and/or "comprising" specify the presence of stated elements, steps, operations, and/or components, but do not preclude the presence or addition of one or more other elements, steps, operations, and/or components. The terms "first," "second," and the like may be used to describe various elements; they do not denote order and do not limit the elements, and are only used to distinguish one element from another. Unless otherwise specified, "a plurality" means two or more. These and other aspects will become apparent to those of ordinary skill in the art from the following drawings and the description of the embodiments of the present invention. The drawings are only for the purpose of illustrating the described embodiments of the invention. One skilled in the art will readily recognize from the following description that alternative embodiments of the structures and methods illustrated in the present application may be employed without departing from the principles described in the present application.
As shown in Fig. 1, the machine learning online feature production system according to the embodiment of the present invention includes:
the remote mode service application cluster comprises a plurality of first application nodes, and each first application node calls the feature service cluster through a remote mode;
the feature service cluster comprises a plurality of feature service nodes, each feature service node acquires a first feature from a feature repository through a feature acquisition module according to an application calling request of the remote mode service application cluster, performs feature processing and feature operator operation on the first feature through a first feature processing plug-in and a first feature operator execution module, and outputs a processing result conforming to feature metadata definition;
the feature proxy master node cluster comprises a plurality of feature proxy master nodes, wherein each feature proxy master node maintains the features in its first local repository by feature subscription and periodically generates a full feature snapshot of the full features in that first local repository, the first local repository being the local repository of the feature proxy master node;
the embedded mode service application cluster comprises a plurality of second application nodes, each second application node encapsulating a feature proxy SDK, wherein, when initializing features, each second application node downloads a full feature snapshot from the feature proxy master node through the feature proxy SDK and builds the full features in its second local repository; each second application node further acquires a second feature from its second local repository through the feature proxy SDK according to an application call request of the embedded mode service application cluster, performs feature processing and feature operator operations on the second feature through a second feature processing plug-in and a second feature operator execution module, and outputs a processing result conforming to the feature metadata definition, the second local repository being the local repository of the second application node.
The system of the invention provides the feature processing plug-in for producing dependent features. When features are computed, and dependent features in particular, the plug-in is run to perform the computation without storing the dependent features, which improves the efficiency of feature production and reduces storage cost. The feature operator execution module is provided for basic operations on features; it can execute one or more feature operators, so that the same feature can be produced in multiple ways and feature production supports a diversity of operations. The feature proxy SDK is integrated with the service application, and features are acquired locally through the feature proxy SDK, which reduces the latency of acquiring large volumes of features and improves feature acquisition efficiency. The feature proxy master node maintains full feature snapshots of the subscribed features, so that the full features can be loaded quickly, providing data support for the feature proxy SDK.
The system supports two feature acquisition modes, a Remote Procedure Call (RPC) mode and an embedded mode (local mode), and can therefore support a variety of service application scenarios and meet a variety of service requirements. In the remote mode service application cluster, each first application node acquires features remotely from the feature repository by calling the feature service cluster. In the embedded mode service application cluster, each second application node acquires features from its local repository through local calls. In other words, the system of the present invention unifies the calling modes of multiple services: the feature service nodes in the feature service cluster acquire features remotely from the feature repository, while the second application nodes in the embedded mode service application cluster acquire features from their local repositories. Compared with remote mode calls, local mode calls acquire features more efficiently, especially for large volumes of features.
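The idea of one calling convention over two acquisition modes can be sketched as follows. This is a hedged illustration, not the system's API: `FeatureClient`, `RemoteFeatureClient`, `EmbeddedFeatureClient` and `make_client` are hypothetical names, and the RPC call is replaced by a plain callable.

```python
from typing import Any, Callable, Dict, Optional, Protocol


class FeatureClient(Protocol):
    def get_feature(self, name: str) -> Any: ...


class RemoteFeatureClient:
    """Remote mode: delegates to the feature service (here a plain callable)."""

    def __init__(self, rpc_call: Callable[[str], Any]) -> None:
        self._rpc_call = rpc_call

    def get_feature(self, name: str) -> Any:
        return self._rpc_call(name)


class EmbeddedFeatureClient:
    """Embedded mode: reads from the node's local repository."""

    def __init__(self, local_store: Dict[str, Any]) -> None:
        self._local_store = local_store

    def get_feature(self, name: str) -> Any:
        return self._local_store[name]


def make_client(
    mode: str,
    rpc_call: Optional[Callable[[str], Any]] = None,
    local_store: Optional[Dict[str, Any]] = None,
) -> FeatureClient:
    # Select the acquisition mode; callers use the same interface either way.
    if mode == "remote" and rpc_call is not None:
        return RemoteFeatureClient(rpc_call)
    if mode == "embedded" and local_store is not None:
        return EmbeddedFeatureClient(local_store)
    raise ValueError(f"unsupported mode or missing backend: {mode!r}")


feature_repository = {"ctr_7d": 0.12}
remote = make_client("remote", rpc_call=feature_repository.__getitem__)
embedded = make_client("embedded", local_store=dict(feature_repository))
```

Both clients return the same value for the same feature name; only the path to the data differs, which is where the latency difference between the two modes comes from.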
The remote mode service application cluster is composed of a plurality of first application nodes, such as first application node-1, ..., first application node-n shown in Fig. 1. Each first application node is stateless and peer-to-peer, so the cluster supports horizontal scaling well.
The feature service cluster is responsible for reading features from the feature repository, performing operations such as feature processing and feature operator operations, and outputting results conforming to the feature metadata definition to the caller. The caller of the feature service cluster may be one or more first application nodes in the remote mode service application cluster. The feature service cluster is composed of a plurality of feature service nodes, such as feature service node-1, feature service node-2, ..., feature service node-n shown in Fig. 1. Each feature service node is stateless and peer-to-peer, so the cluster supports horizontal scaling well. Each feature service node encapsulates the details of feature acquisition and computation and provides a simple, easy-to-use API to the first application nodes of the remote mode service application cluster; that is, the first application nodes call the feature service nodes through the API, which makes it possible to reuse the feature service in more service scenarios.
The feature proxy master node cluster is responsible for maintaining the features subscribed to by service scenarios and for periodically executing a checkpoint task. The checkpoint task is a scheduled task that generates a full feature snapshot from the features in the local repository (the local repository of the feature proxy master node, i.e. the first local repository) and the offset up to which the feature proxy master node (Master node) has consumed the feature message queue. The feature proxy master node provides the full feature snapshot for the feature proxy SDK to download, so that the feature proxy SDK can build features from the snapshot. When the feature proxy SDK is initialized, it downloads the full feature snapshot from the feature proxy master node and rebuilds the full features in the local repository (the local repository of the second application node, i.e. the second local repository), which facilitates horizontal scaling of the feature proxy SDK nodes (the second application nodes).
The embedded mode service application cluster is used for embedded mode calls, i.e. local mode calls, and is composed of a plurality of second application nodes, such as second application node-1, ..., second application node-n shown in Fig. 1. Each second application node is stateless and peer-to-peer, so the cluster supports scaling well. Each second application node encapsulates a feature proxy SDK, which gives the second application nodes of the embedded mode service application cluster a simple, easy-to-use SDK. The feature proxy SDK is responsible for maintaining subscribed, newly added and modified features in the local repository, and provides an efficient way of acquiring features for service applications with high concurrency, large feature volumes and low-latency requirements, reducing feature acquisition latency. After the feature proxy SDK starts, it initializes features: it downloads the full feature snapshot from the feature proxy master node, loads the full features into the local repository, and sets the message queue offset to consume the feature message queue. This ensures that the feature data in the local repository of the feature proxy SDK is complete and consistent.
In an optional implementation manner, each feature service node acquiring a first feature from a feature repository through a feature acquisition module according to an application call request of the remote mode service application cluster includes:
each feature service node loads first feature metadata from a feature metadata repository through the feature acquisition module according to an application call request of the remote mode service application cluster;
determining from the first feature metadata whether the first feature in the feature repository depends on other features;
upon determining that the first feature depends on other features, acquiring, by the feature acquisition module, all features on which the first feature depends from the feature repository;
upon determining that the first feature does not depend on other features, acquiring, by the feature acquisition module, the first feature directly from the feature repository.
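The dependency check described above can be sketched as a small function. The metadata layout (a `depends_on` list per feature) and the name `acquire_inputs` are assumptions made for illustration; the sketch resolves only direct dependencies, since the source does not say whether dependencies can be nested.

```python
from typing import Any, Dict, Mapping


def acquire_inputs(
    name: str,
    metadata: Mapping[str, Mapping[str, Any]],
    repository: Mapping[str, Any],
) -> Dict[str, Any]:
    """Return the values needed to produce `name`.

    If the metadata lists dependencies, fetch every feature it depends on;
    otherwise fetch the feature itself directly from the repository.
    """
    depends_on = metadata[name].get("depends_on", [])
    if depends_on:
        return {dep: repository[dep] for dep in depends_on}
    return {name: repository[name]}


metadata = {
    "age": {},
    "ctr": {"depends_on": ["clicks", "impressions"]},
}
repository = {"age": 31, "clicks": 5, "impressions": 20}

ctr_inputs = acquire_inputs("ctr", metadata, repository)
age_inputs = acquire_inputs("age", metadata, repository)
```

Here `ctr` itself is never read from the repository; only its basic features are, which matches the idea that a dependent feature need not be stored.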
In an alternative embodiment, the first feature processing plug-in and the first feature operator execution module are loaded in respective feature service nodes according to the first feature metadata.
In an optional implementation manner, the feature operators executed in the first feature operator execution module include one or more of a feature type conversion operator, a feature mapping operator, a feature segmentation operator, a feature discretization operator, and a feature normalization operator.
In an alternative embodiment, the feature acquisition module encapsulates support for heterogeneous data sources, and
the feature repository comprises one or more repositories; where there are a plurality of repositories, they are of the same data type or of different data types.
Each feature service node in the feature service cluster of the invention has three core modules:
(1) Feature acquisition module
This module abstracts the feature repository and encapsulates the differences among heterogeneous data sources, so that new data source types can be added flexibly. It supports multiple repository types, such as redis, pika, hbase, and KV storage engines that support the redis protocol.
The feature repository may be one or more repositories, and the plurality of repositories may be of the same type or of different types. For multiple repositories, the type of each repository may be redis, pika, hbase, and KV storage engines supporting the redis protocol, etc.
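One way to picture this abstraction is a common interface with one adapter per repository type. The sketch below is an assumption-laden illustration: `FeatureSource`, `KeyValueSource`, `WideColumnSource` and `FeatureAcquisitionModule` are hypothetical names, and both adapters are in-memory stand-ins rather than real redis or hbase clients.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class FeatureSource(ABC):
    """Common interface hiding the details of each data source type."""

    @abstractmethod
    def get(self, key: str) -> Optional[Any]:
        """Return the stored value for key, or None if it is absent."""


class KeyValueSource(FeatureSource):
    # In-memory stand-in for a key-value store.
    def __init__(self, data: Dict[str, Any]) -> None:
        self._data = data

    def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)


class WideColumnSource(FeatureSource):
    # In-memory stand-in for a row/column store; keys look like "row:column".
    def __init__(self, rows: Dict[str, Dict[str, Any]]) -> None:
        self._rows = rows

    def get(self, key: str) -> Optional[Any]:
        row_key, _, column = key.partition(":")
        return self._rows.get(row_key, {}).get(column)


class FeatureAcquisitionModule:
    def __init__(self) -> None:
        self._sources: Dict[str, FeatureSource] = {}

    def register(self, source_name: str, source: FeatureSource) -> None:
        # A new data source type is added by registering another adapter.
        self._sources[source_name] = source

    def get(self, source_name: str, key: str) -> Optional[Any]:
        return self._sources[source_name].get(key)


module = FeatureAcquisitionModule()
module.register("kv", KeyValueSource({"user:1:age": 31}))
module.register("wide", WideColumnSource({"user1": {"clicks": 5}}))
```

Callers address every repository through the same `get` call, so supporting another repository type only requires a new adapter class.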
(2) First feature processing plug-in
A plug-in for processing features. In some service scenarios, certain features (for example, the first feature) depend on other basic features in the feature repository and must be derived from them. When the first feature processing plug-in computes such a dependent feature, the dependent feature does not need to be stored; it is computed directly at run time. When a dependent feature is acquired, the first feature processing plug-in is invoked to produce it from the basic features listed as dependencies in the first feature metadata. The plug-in thus implements the processing of the feature to be produced according to service requirements. The plug-in mechanism allows features to be extended flexibly, suits the production of many kinds of features, and improves the efficiency of feature production.
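A plug-in mechanism of this kind can be sketched with a registry of functions keyed by feature name. Everything here is hypothetical: the registry `PLUGINS`, the decorator `register_plugin`, and the example feature `ctr` computed from `clicks` and `impressions` are illustrations, not part of the described system.

```python
from typing import Any, Callable, Dict, List, Mapping

Plugin = Callable[[Mapping[str, Any]], Any]
PLUGINS: Dict[str, Plugin] = {}


def register_plugin(feature_name: str) -> Callable[[Plugin], Plugin]:
    # Register a function as the plug-in that produces `feature_name`.
    def decorator(func: Plugin) -> Plugin:
        PLUGINS[feature_name] = func
        return func
    return decorator


@register_plugin("ctr")
def compute_ctr(base: Mapping[str, Any]) -> float:
    # Dependent feature derived from two basic features at request time.
    impressions = base["impressions"]
    return base["clicks"] / impressions if impressions else 0.0


def produce_feature(name: str, depends_on: List[str], store: Mapping[str, Any]) -> Any:
    # Fetch the basic features named in the metadata, then run the plug-in.
    base = {dep: store[dep] for dep in depends_on}
    return PLUGINS[name](base)


store = {"clicks": 5, "impressions": 20}
ctr = produce_feature("ctr", ["clicks", "impressions"], store)
```

The computed value is returned to the caller and never written back to `store`, which is the sense in which the dependent feature is produced without storage.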
(3) First feature operator execution module
This module performs operator operations on features to meet the algorithm requirements of different scenarios. For example, in some service application scenarios, features must be processed by conventional operators to obtain features that meet the algorithm requirements. Conventional feature operators include, for example, a feature type conversion operator, a feature mapping operator, a feature segmentation operator, a feature discretization operator and a feature normalization operator. When the feature service node receives a feature acquisition request, it loads the feature operators specified in the feature metadata, executes them, and returns the result. The first feature operator execution module can execute one or more feature operators, so that the same feature can be produced in multiple ways and feature production supports a diversity of operations.
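The five operator kinds named above can be sketched as small composable functions. This is a minimal illustration under assumptions: the function names, the bucket boundaries and the min-max form of normalization are choices made here, not definitions taken from the source.

```python
from bisect import bisect_right
from functools import reduce
from typing import Any, Callable, Dict, List, Sequence

Operator = Callable[[Any], Any]


def type_convert(target: Callable[[Any], Any]) -> Operator:
    # Feature type conversion, e.g. str -> float.
    return lambda value: target(value)


def mapping(table: Dict[Any, Any], default: Any = None) -> Operator:
    # Feature mapping through a lookup table.
    return lambda value: table.get(value, default)


def segment(separator: str) -> Operator:
    # Feature segmentation: split a string into parts.
    return lambda value: value.split(separator)


def discretize(boundaries: Sequence[float]) -> Operator:
    # Feature discretization: index of the bucket the value falls into.
    return lambda value: bisect_right(boundaries, value)


def normalize(low: float, high: float) -> Operator:
    # Feature normalization (min-max form, assumed here).
    return lambda value: (value - low) / (high - low)


def run_operators(value: Any, operators: List[Operator]) -> Any:
    # Apply the operators in order, feeding each result to the next.
    return reduce(lambda acc, op: op(acc), operators, value)


raw_age = "37"
age_bucket = run_operators(raw_age, [type_convert(float), discretize([18, 30, 45, 60])])
age_scaled = run_operators(raw_age, [type_convert(float), normalize(0.0, 100.0)])
tags = run_operators("sports,news", [segment(",")])
gender_id = run_operators("F", [mapping({"M": 0, "F": 1})])
```

The same raw value `raw_age` yields a bucket index through one operator chain and a scaled value through another, which illustrates producing the same feature in more than one way.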
In an optional implementation manner, the obtaining, by each second application node, the second feature from each second local repository according to the application invocation request of the inline mode service application cluster includes:
each second application node loads second feature metadata from a feature metadata repository through a feature proxy SDK according to the application call request of the embedded mode service application cluster;
determining, from the second feature metadata, whether the second feature in a target local repository is dependent on other features, the target local repository being one or more of a plurality of second local repositories;
when the second feature is determined to be dependent on other features, acquiring all features on which the second feature depends from the target local repository through the feature proxy SDK;
upon determining that the second feature is not dependent on other features, directly obtaining the second feature from the target local repository through a feature proxy SDK.
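The branch on dependencies in the steps above can be sketched as follows. The metadata layout (a `depends_on` list per feature) and the dictionary standing in for the target local repository are assumptions for illustration only.

```python
from typing import Any, Dict, List

def fetch_inputs(
    name: str,
    metadata: Dict[str, Dict[str, Any]],
    local_repo: Dict[str, Any],
) -> Dict[str, Any]:
    """Return the stored values needed to produce the feature `name`."""
    depends_on: List[str] = metadata[name].get("depends_on", [])
    if depends_on:
        # Dependent feature: read every basic feature it depends on.
        return {dep: local_repo[dep] for dep in depends_on}
    # Independent feature: read it directly from the repository.
    return {name: local_repo[name]}

# Hypothetical metadata and repository contents.
METADATA = {
    "ctr": {"depends_on": ["clicks", "views"]},
    "clicks": {},
    "views": {},
}
LOCAL_REPO = {"clicks": 5, "views": 50}
```

Here `fetch_inputs("ctr", METADATA, LOCAL_REPO)` reads `clicks` and `views`, while `fetch_inputs("clicks", METADATA, LOCAL_REPO)` reads only `clicks`. Note that `ctr` itself never appears in the repository.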
In an alternative embodiment, the second feature processing plug-in and the second feature operator execution module are loaded in respective second application nodes according to the second feature metadata.
In an optional implementation manner, the feature operators executed in the second feature operator execution module include one or more of a feature type conversion operator, a feature mapping operator, a feature segmentation operator, a feature discretization operator, and a feature normalization operator.
As with the remote-mode call described above, when a local (embedded-mode) call is performed, in some service scenarios certain features (for example, the second feature) depend on other basic features and must be derived from them. When the second feature data processing plug-in computes such a dependent feature, the dependent feature does not need to be stored; it is calculated directly at run time. When a dependent feature is requested, the second feature data processing plug-in is invoked to produce it from the basic features listed as dependencies in the second feature metadata. Because processing is packaged as plug-ins, features can be extended flexibly, the system suits the production of many kinds of features, and the efficiency of feature production is improved.
Correspondingly, the second feature operator execution module performs operator operations on features that, in some service application scenarios, require conventional operator processing, so as to obtain features that meet the algorithm requirements. When the second application node receives a feature acquisition request, it loads the feature operators that the feature depends on according to the feature metadata, executes them, and returns the result. The second feature operator execution module can execute one or more feature operators, such as a feature type conversion operator, a feature mapping operator, a feature segmentation operator, a feature discretization operator, and a feature normalization operator, so the same feature can be produced in multiple ways, which diversifies feature production.
In an alternative embodiment, the full features in each first local repository comprise subscription features and first change features persistently stored in the local repository of each feature proxy master node,
wherein the subscription features are loaded from the feature repository when each feature proxy master node is initialized, and the first change features are obtained by each feature proxy master node consuming messages from the feature message queue starting at the set message queue offset position.
In the invention, the first local repository obtains first change features from the feature message queue in a timely manner, which keeps the data in the first local repository complete. The feature proxy master node can therefore generate a full feature snapshot for the feature proxy SDK to download, which ensures the completeness of the data held by the feature proxy SDK.
In an optional implementation manner, when each second application node initializes features, the initialization further includes: setting a message queue offset position through the feature proxy SDK and consuming messages from the feature message queue, so as to acquire second change features from the feature message queue and synchronize them to the second local repository.
In the invention, the second local repository can likewise obtain second change features from the feature message queue in a timely manner. Features that change after the full feature snapshot has been downloaded and the full features have been constructed locally are therefore not missed, and the feature proxy SDK further ensures the completeness of the local data.
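A minimal sketch of this snapshot-then-replay synchronization follows. It assumes, for illustration only, that a snapshot carries the queue offset at which it was taken, that messages are `(key, value)` pairs, and that the feature message queue can be modelled as an in-memory list; none of these details are fixed by the text.

```python
from typing import Any, Dict, List, Tuple

Message = Tuple[str, Any]  # hypothetical message shape: (feature key, new value)

class LocalRepository:
    """Stand-in for a second local repository kept in sync by the SDK."""

    def __init__(self) -> None:
        self.features: Dict[str, Any] = {}
        self.offset = 0  # next queue position to consume

    def load_snapshot(self, snapshot: Dict[str, Any]) -> None:
        """Build the full features locally from a downloaded snapshot."""
        self.features = dict(snapshot["features"])
        self.offset = snapshot["offset"]

    def consume(self, queue: List[Message]) -> None:
        """Apply every message at or after the stored offset, then advance it."""
        for key, value in queue[self.offset:]:
            self.features[key] = value
        self.offset = len(queue)

# The snapshot was taken after two messages; a third arrived afterwards.
QUEUE: List[Message] = [("a", 1), ("b", 2), ("a", 3)]
SNAPSHOT = {"features": {"a": 1, "b": 2}, "offset": 2}

repo = LocalRepository()
repo.load_snapshot(SNAPSHOT)
repo.consume(QUEUE)  # picks up the change to "a" made after the snapshot
```

Starting consumption at the snapshot's offset is what prevents the change to `a` from being lost between the snapshot download and the start of consumption.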
Each feature proxy SDK in the invention comprises three core modules:
(1) Second local repository
It supports a memory cache mode and a persistent storage mode; the mode can be selected according to the requirements of the service application.
(2) Second feature data processing plug-in
A plug-in for processing features. In some service scenarios, certain features (for example, the second feature) depend on other basic features in the feature repository and must be derived from them; such a dependent feature does not need to be stored and is calculated directly at run time. When the feature is requested, the second feature data processing plug-in is invoked to produce it from the basic features listed as dependencies in the second feature metadata. The plug-in thus implements whatever processing the service requires for the feature being produced. Because processing is packaged as plug-ins, features can be extended flexibly, the system suits the production of many kinds of features, and the efficiency of feature production is improved.
(3) Second feature operator execution module
This module performs operator operations on features to meet the algorithm requirements of different scenarios. For example, in some service application scenarios, features must be processed by conventional operators before they meet the algorithm requirements. Conventional feature operators include, for example, a feature type conversion operator, a feature mapping operator, a feature segmentation operator, a feature discretization operator, and a feature normalization operator. When the second application node receives a feature acquisition request, it loads the feature operators that the feature depends on according to the feature metadata, executes them, and returns the result.
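The feature data processing plug-in among the modules listed above can be pictured as a small registry of functions keyed by name. The sketch below is illustrative only: the decorator-based registry, the `plugin` metadata field, and the `ratio` plug-in are hypothetical and not taken from the text.

```python
from typing import Any, Callable, Dict

Plugin = Callable[[Dict[str, Any]], Any]
PLUGINS: Dict[str, Plugin] = {}

def plugin(name: str) -> Callable[[Plugin], Plugin]:
    """Register a processing function under a name that metadata can refer to."""
    def register(func: Plugin) -> Plugin:
        PLUGINS[name] = func
        return func
    return register

@plugin("ratio")
def ratio(inputs: Dict[str, Any]) -> float:
    """Hypothetical plug-in: derive a rate from two stored basic features."""
    views = inputs["views"]
    return inputs["clicks"] / views if views else 0.0

def produce(feature: str, metadata: Dict[str, Dict[str, Any]], repo: Dict[str, Any]) -> Any:
    """Compute a dependent feature at run time; the result is not written back."""
    meta = metadata[feature]
    inputs = {dep: repo[dep] for dep in meta["depends_on"]}
    return PLUGINS[meta["plugin"]](inputs)

PLUGIN_METADATA = {"ctr": {"depends_on": ["clicks", "views"], "plugin": "ratio"}}
BASIC_FEATURES = {"clicks": 5, "views": 50}
ctr = produce("ctr", PLUGIN_METADATA, BASIC_FEATURES)
```

Adding a new kind of derived feature then means registering one more function and pointing the metadata at it, which is the flexibility the plug-in approach is meant to give.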
The embodiment of the invention provides a machine learning online feature production method, which comprises the following steps:
S1, initializing each first application node in the remote mode service application cluster, each feature service node in the feature service cluster, each feature proxy master node in the feature proxy master node cluster, and each second application node in the embedded mode service application cluster;
S2, determining one of the remote mode service application cluster and the embedded mode service application cluster as a target service application cluster according to the application call request of the caller;
S3, when the target service application cluster is the remote mode service application cluster, each feature service node in the feature service cluster acquires a first feature from a feature repository through a feature acquisition module, performs feature processing and feature operator operations on the first feature through a first feature processing plug-in and a first feature operator execution module, and outputs a processing result conforming to the feature metadata definition;
S4, when the target service application cluster is the embedded mode service application cluster, each second application node in the embedded mode service application cluster acquires a second feature from each second local repository through a feature proxy SDK, performs feature processing and feature operator operations on the second feature through a second feature processing plug-in and a second feature operator execution module, and outputs a processing result conforming to the feature metadata definition, where the second local repository is a local repository of the second application node.
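Steps S2 to S4 amount to a dispatch on the call mode. The sketch below assumes, purely for illustration, that the request carries a `mode` field with the values `"remote"` or `"embedded"`; the text does not say how the target cluster is derived from the request.

```python
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], Any]

def dispatch(request: Dict[str, Any], remote_handler: Handler, embedded_handler: Handler) -> Any:
    """S2: pick the target cluster, then run S3 or S4 accordingly."""
    mode = request.get("mode")  # hypothetical field name
    if mode == "remote":
        return remote_handler(request)    # S3: served by a feature service node
    if mode == "embedded":
        return embedded_handler(request)  # S4: served in-process via the feature proxy SDK
    raise ValueError(f"unknown call mode: {mode!r}")

# Stub handlers that only report which path was taken.
def serve_remote(request: Dict[str, Any]) -> str:
    return f"remote:{request['feature']}"

def serve_embedded(request: Dict[str, Any]) -> str:
    return f"embedded:{request['feature']}"
```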
In an optional embodiment, initializing each feature service node in the feature service cluster includes:
S11, creating connections between the feature service node and both the feature repository and the feature metadata repository;
S12, the feature service node loads feature metadata from the feature metadata repository;
S13, constructing a feature processing plug-in and a feature operator execution module in the feature service node, and caching the feature processing plug-in and the feature operator execution module in a memory of the feature service node;
S14, creating a feature metadata change monitoring task in the feature service node, monitoring change events and addition events of the feature metadata, and caching the changed and newly added feature metadata in the memory of the feature service node;
S15, starting a feature remote call service and providing it for the first application node to call;
S16, the initialization is completed, and the flow ends.
In an alternative embodiment, initializing each feature proxy master node in the feature proxy master node cluster includes:
S21, creating connections between the feature proxy master node and both the feature repository and the feature message queue;
S22, determining whether features exist in the first local repository of the feature proxy master node; if so, executing S25; if not, executing S23;
S23, setting the current consumption offset position of the feature message queue and pausing consumption of the feature message queue;
S24, loading the subscription features from the feature repository and persistently storing them in the first local repository until data loading is completed;
S25, starting to consume the feature message queue;
S26, starting a checkpoint task, wherein the checkpoint task is a scheduled task that generates a full feature snapshot from the features in the first local repository together with the offset position at which the feature proxy master node is consuming the feature message queue;
S27, starting a feature data download service interface to allow the second application node to download the full feature snapshot;
S28, the initialization is completed, and the flow ends.
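The master-node flow S22 to S26 can be sketched as below. This is a single-threaded illustration under stated assumptions: the feature repository and first local repository are plain dictionaries, the feature message queue is a list of `(key, value)` pairs, and the checkpoint is invoked once by hand rather than by a timer. The class and method names are hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple

Message = Tuple[str, Any]  # hypothetical message shape: (feature key, new value)

class FeatureProxyMaster:
    def __init__(self, local_repo: Dict[str, Any], offset: int = 0) -> None:
        self.local_repo = local_repo  # first local repository
        self.offset = offset          # next queue position to consume
        self.snapshot: Optional[Dict[str, Any]] = None

    def initialize(self, feature_repo: Dict[str, Any], queue: List[Message]) -> None:
        if not self.local_repo:                   # S22: nothing stored locally yet
            self.offset = len(queue)              # S23: fix the offset, hold consumption
            self.local_repo.update(feature_repo)  # S24: load the subscription features
        self.consume(queue)                       # S25: start consuming
        self.checkpoint()                         # S26: one run of the scheduled task

    def consume(self, queue: List[Message]) -> None:
        """Apply change messages from the stored offset onward."""
        for key, value in queue[self.offset:]:
            self.local_repo[key] = value
        self.offset = len(queue)

    def checkpoint(self) -> None:
        """Snapshot the full features together with the consumed offset."""
        self.snapshot = {"features": dict(self.local_repo), "offset": self.offset}

queue: List[Message] = [("a", 0)]  # an old message, already reflected in the repository
master = FeatureProxyMaster(local_repo={})
master.initialize(feature_repo={"a": 1, "b": 2}, queue=queue)
first_snapshot = master.snapshot

queue.append(("b", 5))  # a change arrives after initialization
master.consume(queue)
master.checkpoint()
```

Storing the offset inside the snapshot is a design assumption that lets a downloading node know where to resume consumption; in a real deployment the checkpoint would run on a schedule and the snapshot would be served through the download interface of S27.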
In an optional embodiment, initializing each second application node in the embedded mode service application cluster includes:
S31, creating connections between the second application node and the feature metadata repository, the feature proxy master node, and the feature message queue;
S32, the second application node loads feature metadata from the feature metadata repository;
S33, constructing a feature processing plug-in and a feature operator execution module in the second application node, and caching the feature processing plug-in and the feature operator execution module in a memory of the second application node;
S34, determining whether features exist in the second local repository of the second application node; if so, loading the features in the second local repository, starting to consume messages from the feature message queue, and ending the flow; if not, executing S35;
S35, downloading the full feature snapshot from the feature proxy master node and writing it into the second local repository, while verifying whether the loaded feature metadata is valid; if not, ending the flow; if so, executing S36;
S36, the second application node loads the current consumption offset position of the feature message queue and starts to consume the feature message queue;
S37, the initialization is completed, and the flow ends.
In an alternative embodiment, step S3 includes:
each feature service node monitors an application calling request of the remote mode service application cluster;
after receiving a feature acquisition request, loading first feature metadata from the feature metadata repository through the feature acquisition module;
determining whether a first feature in the feature repository depends on other features according to the first feature metadata; if so, acquiring all features on which the first feature depends from the feature repository through the feature acquisition module; otherwise, directly acquiring the first feature from the feature repository through the feature acquisition module;
loading the first feature processing plug-in and the first feature operator execution module corresponding to the first feature according to the first feature metadata;
sequentially executing feature processing and feature operator operation through the first feature processing plug-in and the first feature operator execution module;
and outputting the processing result conforming to the feature metadata definition, and ending the flow.
In an alternative embodiment, step S4 includes:
each second application node monitors an application calling request of the embedded mode service application cluster;
after receiving a feature acquisition request, loading second feature metadata from the feature metadata repository through the feature proxy SDK;
determining whether a second feature in a target local repository depends on other features according to the second feature metadata; if so, acquiring all features on which the second feature depends from the target local repository through the feature proxy SDK; otherwise, directly acquiring the second feature from the target local repository through the feature proxy SDK, wherein the target local repository is one or more of a plurality of second local repositories;
loading the second feature processing plug-in and the second feature operator execution module corresponding to the second feature according to the second feature metadata;
sequentially executing feature processing and feature operator operation through the second feature processing plug-in and the second feature operator execution module;
and outputting the processing result conforming to the feature metadata definition, and ending the flow.
The disclosure also relates to an electronic device, such as a server or a terminal. The electronic device includes: at least one processor; a memory communicatively coupled to the at least one processor; and a communication component communicatively coupled to the storage medium, the communication component receiving and transmitting data under control of the processor; wherein the memory stores instructions executable by the at least one processor to implement the method of the above embodiments.
In an alternative embodiment, the memory is used as a non-volatile computer-readable storage medium for storing non-volatile software programs, non-volatile computer-executable programs, and modules. The processor executes various functional applications of the device and data processing, i.e., implements the method, by executing nonvolatile software programs, instructions, and modules stored in the memory.
The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store a list of options, etc. Further, the memory may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and such remote memory may be connected to the external device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
One or more modules are stored in the memory and, when executed by the one or more processors, perform the methods of any of the method embodiments described above.
The product can execute the method provided by the embodiments of the application and has the functional modules and beneficial effects corresponding to that method. For technical details not described in detail in this embodiment, refer to the method provided by the embodiments of the application.
The present disclosure also relates to a computer-readable storage medium for storing a computer-readable program for causing a computer to perform some or all of the above-described method embodiments.
That is, as those skilled in the art will understand, all or part of the steps of the methods in the embodiments described above may be implemented by a program instructing related hardware, where the program is stored in a storage medium and includes several instructions that enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Furthermore, those of ordinary skill in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
It will be understood by those skilled in the art that while the present invention has been described with reference to exemplary embodiments, various changes may be made and equivalents may be substituted for elements thereof without departing from the scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed, but that the invention will include all embodiments falling within the scope of the appended claims.