Method, apparatus, and server for determining the placement address of a vending machine
Technical field
This specification belongs to the field of Internet technology, and in particular relates to a method, apparatus, and server for determining the placement address of a vending machine.
Background
With advances in Internet and artificial intelligence technology, the new retail industry has begun to develop and spread. For example, because unmanned vending machines are not constrained by factors such as time, labor, or location when selling goods, and are flexible and convenient, they have been widely promoted and used.
At present, a method is needed that can accurately select suitable addresses at which to place vending machines, so that selling goods through those vending machines yields relatively good income.
Summary of the invention
The purpose of this specification is to provide a method, apparatus, and server for determining the placement address of a vending machine, so as to guide the placement of vending machines efficiently and accurately at a relatively low implementation cost, and to improve the income obtained from the placed vending machines.
The method, apparatus, and server for determining the placement address of a vending machine provided in this specification are implemented as follows:
A method for determining the placement address of a vending machine, comprising: obtaining first feature information related to a target address and second feature information related to a target vending machine; determining, using a first model and according to the first feature information, an area grade corresponding to the target address, wherein the area grade indicates the degree of commercialization of the preset range region in which the target address is located; and determining, using a second model and according to the area grade and the second feature information, whether to place the target vending machine at the target address.
A method for determining the placement address of a vending machine, comprising: obtaining first feature information related to a target address and second feature information related to a target vending machine; and determining, using a third model and according to the first feature information and the second feature information, whether to place the target vending machine at the target address.
An apparatus for determining the placement address of a vending machine, comprising: an obtaining module, configured to obtain first feature information related to a target address and second feature information related to a target vending machine; a first determining module, configured to determine, using a first model and according to the first feature information, an area grade corresponding to the target address, wherein the area grade indicates the degree of commercialization of the preset range region in which the target address is located; and a second determining module, configured to determine, using a second model and according to the area grade and the second feature information, whether to place the target vending machine at the target address.
An apparatus for determining the placement address of a vending machine, comprising: an obtaining module, configured to obtain first feature information related to a target address and second feature information related to a target vending machine; and a determining module, configured to determine, using a third model and according to the first feature information and the second feature information, whether to place the target vending machine at the target address.
A server, comprising a processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements: obtaining first feature information related to a target address and second feature information related to a target vending machine; determining, using a first model and according to the first feature information, an area grade corresponding to the target address, wherein the area grade indicates the degree of commercialization of the preset range region in which the target address is located; and determining, using a second model and according to the area grade and the second feature information, whether to place the target vending machine at the target address.
A computer-readable storage medium storing computer instructions that, when executed, implement: obtaining first feature information related to a target address and second feature information related to a target vending machine; determining, using a first model and according to the first feature information, an area grade corresponding to the target address, wherein the area grade indicates the degree of commercialization of the preset range region in which the target address is located; and determining, using a second model and according to the area grade and the second feature information, whether to place the target vending machine at the target address.
The method, apparatus, and server provided in this specification first take the obtained first feature information related to the target address to be determined as the model input of a first model, which is trained in advance to evaluate the degree of commercialization of the preset range region in which an address is located, to obtain the area grade of the target address. The area grade of the target address is then combined with the second feature information and used as the model input of a second model, which is trained in advance to search for an optimal action policy so that the target vending machine placed at the target address obtains relatively high income. Whether to place the target vending machine at the target address is determined according to the output of the second model. In this way, vending machines can be placed efficiently and accurately at a relatively low implementation cost, and the income obtained from goods sold by the placed vending machines is improved. This solves two technical problems of existing methods: heavy reliance on on-site surveys makes implementation costly, and reliance on human experience to judge placement addresses makes the results less accurate, less reliable, and less objective.
Brief description of the drawings
To describe the technical solutions in the embodiments of this specification or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some of the embodiments recorded in this specification; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an embodiment of the structure of a system that applies the method for determining the placement address of a vending machine provided in the embodiments of this specification;
Fig. 2 is a schematic diagram of an embodiment of applying, in an example scenario, the method for determining the placement address of a vending machine provided in the embodiments of this specification;
Fig. 3 is a schematic diagram of an embodiment of applying, in an example scenario, the method for determining the placement address of a vending machine provided in the embodiments of this specification;
Fig. 4 is a schematic diagram of an embodiment of applying, in an example scenario, the method for determining the placement address of a vending machine provided in the embodiments of this specification;
Fig. 5 is a schematic diagram of an embodiment of the flow of the method for determining the placement address of a vending machine provided in the embodiments of this specification;
Fig. 6 is a schematic diagram of an embodiment of the flow of the method for determining the placement address of a vending machine provided in the embodiments of this specification;
Fig. 7 is a schematic diagram of an embodiment of the structure of the server provided in the embodiments of this specification;
Fig. 8 is a schematic diagram of an embodiment of the structure of the apparatus for determining the placement address of a vending machine provided in the embodiments of this specification.
Detailed description of embodiments
To help those skilled in the art better understand the technical solutions in this specification, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings in the embodiments of this specification. Evidently, the described embodiments are only some, rather than all, of the embodiments of this specification. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this specification without creative effort shall fall within the scope of protection of this specification.
Existing methods for determining the placement address of vending machines such as unmanned vending machines generally first expend considerable labor cost on on-site surveys of candidate addresses, and then make a decision by combining the survey results with previously accumulated human experience to determine whether to place a vending machine at the address. Because the early survey stage requires extensive on-site investigation of candidate addresses, the implementation cost is relatively high. Because the later decision stage relies on past experience to judge manually whether a candidate address is suitable for placing a vending machine, the decision process is easily influenced by subjective human factors, and the result is often not accurate or reliable enough. Moreover, selling goods through unmanned vending machines is representative of new retail, a new industry that merges online services, offline experience, and modern logistics, and in which people's knowledge and accumulated experience are relatively limited. Determining placement addresses based on human experience is therefore more error-prone here than in other scenarios, which in turn affects the income subsequently obtained from selling goods through the vending machines.
To address the root causes of the above problems, this specification considers first training a first model for evaluating the degree of commercialization of the preset range region in which an address to be determined is located, and a second model for intelligently determining an optimized policy for placing a vending machine at the address to be determined so as to obtain relatively high income. Two kinds of attribute data of different dimensions are then obtained: first feature information related to the target address to be determined, and second feature information related to the vending machine to be placed. The obtained first feature information is first input to the first model as model input to obtain the area grade of the target address. The area grade of the target address is then combined with the second feature information and input to the pre-trained second model as model input, to obtain a policy result capable of yielding relatively high income. Whether to place the target vending machine at the target address is finally determined according to this policy result, so that vending machine placement can be guided efficiently and accurately at a relatively low implementation cost, and the income obtained from goods sold by the placed vending machines is improved.
The embodiments of this specification provide a method for determining the placement address of a vending machine. The method can be applied in a system architecture that includes a server and a client, as shown in Fig. 1. The client and the server are connected by wire or wirelessly to exchange data.
In specific implementation, the client can collect the first feature information related to the target address to be determined and the second feature information related to the target vending machine to be placed, and send them to the server.
The server can obtain the first feature information related to the target address and the second feature information related to the target vending machine; determine, using the first model and according to the first feature information, the area grade corresponding to the target address, where the area grade indicates the degree of commercialization of the preset range region in which the target address is located; and determine, using the second model and according to the area grade and the second feature information, whether to place the target vending machine at the target address.
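The two-stage flow described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the field names, the action labels, and the two toy stand-in models are hypothetical and are not taken from this specification; a real first and second model would be trained as described later.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AddressFeatures:
    """First feature information (field names are hypothetical)."""
    foot_traffic: float      # people passing through the preset range per day
    poi_count: int           # malls, schools, etc. inside the preset range
    logistics_cost: float    # transport cost from warehouse to the address
    competitor_count: int    # vending machines / convenience stores nearby


@dataclass
class MachineFeatures:
    """Second feature information (field names are hypothetical)."""
    price: float
    item: str
    mode: str


def decide_placement(
    address: AddressFeatures,
    machine: MachineFeatures,
    first_model: Callable[[AddressFeatures], int],
    second_model: Callable[[int, MachineFeatures], str],
) -> bool:
    """Stage 1: address features -> area grade. Stage 2: grade + machine -> action."""
    area_grade = first_model(address)
    action = second_model(area_grade, machine)
    return action == "add"


# Toy stand-ins for the trained models, for illustration only.
def toy_first_model(a: AddressFeatures) -> int:
    return 3 if a.foot_traffic > 3000 else 1


def toy_second_model(grade: int, m: MachineFeatures) -> str:
    return "add" if grade >= 2 else "withdraw"


busy = AddressFeatures(5000, 12, 10.0, 8)
quiet = AddressFeatures(200, 1, 50.0, 0)
juice_machine = MachineFeatures(3.5, "fruit juice", "direct")
```

Passing the models in as callables keeps the decision flow independent of how either model is implemented.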
In this embodiment, the server may be a back-end business server deployed on the business data processing platform side that can implement functions such as data transmission and data processing. Specifically, the server may be an electronic device with data computation, storage, and network interaction functions, or a software program that runs on such an electronic device and supports data processing, storage, and network interaction. The number of servers is not specifically limited in this embodiment. The server may be one server, several servers, or a server cluster formed by several servers.
In this embodiment, the client may be a front-end device that can implement functions such as data collection and data transmission. Specifically, the client may be, for example, a desktop computer, a tablet computer, a laptop, a smartphone, a digital assistant, a smart wearable device, a shopping guide terminal, or a television set with network access. Alternatively, the client may be a software application that runs on such an electronic device, for example an app running on a mobile phone.
In an example scenario, referring to Fig. 2, the method for determining the placement address of a vending machine provided in the embodiments of this specification can be applied to select suitable addresses at which a beverage manufacturer places its unmanned self-service beverage vending machines.
A beverage manufacturer plans to place its unmanned beverage vending machines at multiple locations in city SZ and to sell the fruit juice beverages it has launched through these vending machines. So that selling fruit juice beverages through the placed vending machines yields relatively high income later, the manufacturer can use this method to find suitable addresses among multiple candidate placement addresses in city SZ (for example, target address 1, target address 2, target address 3, and target address 4) at which to place the vending machines and sell the fruit juice beverages.
Specifically, the beverage manufacturer can first use clients deployed near the target addresses to collect the first feature information related to the above 4 target addresses.
The first feature information can be understood as feature data that is related to the target address and reflects the degree of commercialization of the region in which the target address is located. Specifically, the first feature information may include: foot traffic data for the preset range region in which the target address is located (for example, the number of people passing through that region each day); POI (point of interest) data for the preset range region in which the target address is located (for example, the number of shopping malls or schools in that region, or the distance to the nearest shopping mall); logistics transport cost data for reaching the target address (for example, the cost of transporting fruit juice beverages from the beverage warehouse to the target address); and data on similar competing products in the preset range region in which the target address is located (for example, the number of vending machines or convenience stores in that region). The first feature information listed above is only illustrative. In specific implementation, other feature information related to the target address may also be introduced as first feature information according to the specific application scenario and processing requirements. This specification imposes no limitation in this regard.
Specifically, for example, the camera of the client deployed at each target address can first capture image data of the preset range region in which the target address is located (for example, a circular region with a radius of 500 meters centered on the target address), and the image data can be analyzed and processed to calculate the foot traffic data for that region. The client can also query a map to count the number of shopping malls, schools, and so on in the preset range region in which the target address is located, and to obtain data such as the distance from the target address to the nearest shopping mall or school. The client can also query the registered address information of shops online to count the number of convenience stores and the like in the preset range region in which the target address is located.
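The map-based counting described above can be sketched as follows, assuming the map query has already returned a list of points of interest with coordinates. The function names, the tuple layout, and the sample coordinates are hypothetical; the great-circle distance uses the standard haversine formula with a mean Earth radius.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def poi_summary(target, pois, radius_m=500.0):
    """Count POIs per category inside the preset range and track the nearest of each.

    target: (lat, lon) of the target address.
    pois:   iterable of (category, lat, lon) tuples (hypothetical layout).
    Returns (counts_within_radius, nearest_distance_by_category).
    """
    counts, nearest = {}, {}
    for category, lat, lon in pois:
        d = haversine_m(target[0], target[1], lat, lon)
        if d <= radius_m:
            counts[category] = counts.get(category, 0) + 1
        if category not in nearest or d < nearest[category]:
            nearest[category] = d
    return counts, nearest


# Made-up coordinates for illustration.
target_address = (22.54, 114.05)
sample_pois = [
    ("mall", 22.541, 114.05),    # roughly 111 m north of the target
    ("school", 22.55, 114.05),   # roughly 1.1 km north, outside the 500 m range
    ("mall", 22.60, 114.05),     # several kilometers away
]
counts, nearest = poi_summary(target_address, sample_pois)
```

The 500 m default mirrors the example radius given in the text; any other preset range can be passed as `radius_m`.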
In this way, the clients can collect the first feature information corresponding to each of the multiple candidate target addresses and send it over the network to the server responsible for the siting decision, so that the server obtains the first feature information of the multiple target addresses. Meanwhile, the server can also obtain the second feature information related to the vending machine to be placed (referred to as the target vending machine).
The second feature information can be understood as feature data that is related to the target vending machine planned for placement at the target address and reflects the income subsequently obtained from goods sold by the placed target vending machine. Specifically, the second feature information may include: the price of the goods sold by the target vending machine, the type of goods sold by the target vending machine (for example, fruit juice beverages, carbonated beverages, or tea beverages), the operation mode of the target vending machine (for example, franchise mode, direct-operation mode, or leasing mode), and so on. The second feature information listed above is only illustrative. In specific implementation, other feature information related to the target vending machine may also be introduced as second feature information according to the specific application scenario and processing requirements. This specification imposes no limitation in this regard.
Specifically, for example, the server can obtain the second feature information by querying data such as the operation plan for the beverage manufacturer's unmanned beverage vending machines.
After obtaining the first feature information and the second feature information, the server can first call the pre-trained first model and input the first feature information of the target address to it as model input; run the first model to obtain the corresponding model output; and determine, according to the output of the first model, the area grade of the preset range region in which the target address is located, that is, the area grade of the target address.
The area grade can be understood as an index parameter that measures the degree of commercialization of the preset range region in which the target address is located. In general, the higher the area grade of a target address, the higher the degree of commercialization of the preset range region in which it is located, and the busier and livelier that region is. Conversely, the lower the area grade of a target address, the lower the degree of commercialization of the preset range region in which it is located, and the quieter and more deserted that region is.
The first model can be understood as a classification prediction model, obtained in advance by model learning and training with the first feature information of sample addresses as first sample data, that can be used to evaluate the degree of commercialization of the preset range region in which the target address is located.
After obtaining the area grade of the target address through the first model, the server can combine the area grade of the target address with the second feature information and input the combination to the second model as model input; run the second model to obtain the corresponding model output, that is, the corresponding policy result; and determine, according to the output of the second model, whether to place the target vending machine at the target address.
The second model can be understood as a model, obtained in advance by reinforcement learning with the area grades of sample addresses and the second feature information of the vending machines placed at those sample addresses as second sample data, that can search, based on environment state data including the features of the placement address and the features of the placed vending machine, for the policy result that yields relatively high income.
The specific content of the policy result may include: adding a vending machine at the target address, withdrawing the vending machine at the target address, keeping the vending machine at the target address unchanged, adjusting the goods sold by the vending machine at the target address, and so on. The policy results listed above are only illustrative. In specific implementation, other action policies may also be introduced as the content of the policy result when training the second model, depending on the circumstances.
According to the policy results obtained by the second model for the multiple target addresses, the server can screen out from those target addresses the ones suitable for placing the target vending machine as the final placement addresses. The target vending machine is then placed at the final placement addresses, so that fruit juice beverages can subsequently be sold through the vending machines placed there to obtain relatively high income.
For example, in this scenario, given as input the area grade of target address 1 and the second feature information of the target vending machine to be placed at target address 1, if the policy result obtained by the second model is "withdraw the vending machine at the target address", the target vending machine previously placed at target address 1 can be removed. Given as input the area grade of target address 2 and the second feature information of the target vending machine to be placed at target address 2, if the policy result obtained by the second model is "add a vending machine at the target address", the target vending machine can be placed at target address 2. In addition, the policy result obtained for target address 3 is "add a vending machine at the target address", and the policy result obtained for target address 4 is "withdraw the vending machine at the target address". Based on these 4 policy results, the placement addresses capable of yielding relatively large income can be screened out from the 4 candidate target addresses: target address 2 and target address 3 are chosen as the placement addresses of the target vending machine. By referring to the policy results given by the second model, the placement of target vending machines at these two target addresses in city SZ is guided so as to obtain relatively large income.
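The screening step in this example can be sketched as follows. The enum names and the dictionary of policy results are hypothetical encodings of the four action policies and the four outcomes listed above; only addresses whose policy result is "add" are kept as final placement addresses.

```python
from enum import Enum


class Action(Enum):
    """Action policies named in the text (enum member names are hypothetical)."""
    ADD = "add a vending machine at the target address"
    WITHDRAW = "withdraw the vending machine at the target address"
    KEEP = "keep the vending machine at the target address unchanged"
    ADJUST_GOODS = "adjust the goods sold by the vending machine"


def select_placement_addresses(policy_results):
    """Return the addresses whose policy result is ADD, in input order."""
    return [addr for addr, action in policy_results.items() if action is Action.ADD]


# Policy results for the four candidate addresses in the example scenario.
policy_results = {
    "target address 1": Action.WITHDRAW,
    "target address 2": Action.ADD,
    "target address 3": Action.ADD,
    "target address 4": Action.WITHDRAW,
}
final_addresses = select_placement_addresses(policy_results)
```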
In another example scenario, the server can train and establish the corresponding first model and second model in advance.
Specifically, referring to Fig. 3, the server can first query historical records for the placement addresses, historical operations, and corresponding historical income data of the unmanned self-service beverage vending machines of this company or other companies. The placement addresses selected in the past are used as sample addresses, and the following are obtained for each sample address: its first feature information, its second feature information, the action policy data (or operation data) taken for the vending machine at the sample address (for example, adding a vending machine at the sample address, relocating a vending machine from the sample address, or adjusting the goods sold by the vending machine at the sample address), and the historical income data of the vending machine at the sample address.
Further, the server can use the first feature information of the sample addresses as first sample data and first establish the corresponding first model from the first sample data.
Specifically, the server can first determine the area grade of the preset range region in which each sample address is located, according to a preset rule for evaluating the degree of commercialization and the commercialization of that region, and then label the first feature information of the sample address with the corresponding area grade to obtain labeled sample data. The server can select, for example, a GBDT (Gradient Boosting Decision Tree) model as the initial model. The labeled sample data is then used to train the initial model and determine its model parameters, yielding the first model.
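The training step above can be illustrated with a very small gradient-boosting sketch written with only the standard library. It is a simplification under stated assumptions, not the model described in this specification: it boosts depth-1 trees (stumps) under squared loss and rounds the output to the nearest grade, whereas a production setup would more likely use a library GBDT classifier. The sample rows and grades are made up.

```python
def fit_stump(X, residuals):
    """Find the single-feature threshold split that best fits the residuals (least squares)."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, thr, lm, rm)
    return None if best is None else best[1:]


def fit_gbdt(X, y, n_rounds=100, learning_rate=0.5):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]  # negative gradient of squared loss
        stump = fit_stump(X, residuals)
        if stump is None:  # no feature has two distinct values
            break
        stumps.append(stump)
        j, thr, lm, rm = stump
        pred = [p + learning_rate * (lm if row[j] <= thr else rm) for p, row in zip(pred, X)]
    return base, learning_rate, stumps


def predict_grade(model, row):
    """Sum the base value and all stump contributions, then round to the nearest grade."""
    base, lr, stumps = model
    score = base + lr * sum(lm if row[j] <= thr else rm for j, thr, lm, rm in stumps)
    return round(score)


# Made-up labeled samples: [foot traffic, POI count, logistics cost, competitor count].
X_train = [
    [200, 1, 50, 0], [300, 2, 45, 1],
    [1500, 5, 30, 3], [1800, 6, 28, 4],
    [5000, 12, 10, 8], [6000, 15, 8, 9],
]
y_train = [1, 1, 2, 2, 3, 3]  # area grades assigned by a preset rule (hypothetical)
gbdt_model = fit_gbdt(X_train, y_train)
```

Treating the grade as a number and rounding is a shortcut that only makes sense when grades are ordered; a multi-class loss would be the more faithful choice for a classification prediction model.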
For example, the first model can be expressed in the following form:
where F_m denotes the area grade of the sample address; X_1, X_2, X_3, and X_4 denote, respectively, the following items in the first feature information of the sample address: the foot traffic data for the preset range region in which the sample address is located, the POI data for that region, the logistics transport cost data for reaching the sample address, and the data on similar competing products in that region; T denotes the relation function between the area grade of the sample address and the first feature information of the sample address; and θ_m is the model parameter numbered m in the relation function.
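The expression itself is not reproduced in this text. As an assumption only, the symbols defined above are consistent with the standard additive form of a gradient-boosted tree model, in which each round m adds one tree T with parameters θ_m to the previous estimate. The total number of rounds M is a symbol introduced here for illustration.

```latex
F_m(X) = F_{m-1}(X) + T(X;\theta_m),
\qquad X = (X_1, X_2, X_3, X_4),
\qquad F_M(X) = \sum_{m=1}^{M} T(X;\theta_m)
```

Under this reading, training determines each θ_m in turn so that the new tree best corrects the error of F_{m-1} on the labeled sample data.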
It should be noted that using GBDT as the initial model to establish the first model is only illustrative. In specific implementation, a model of another structure type that can perform classification prediction may also be used as the initial model to establish the first model, depending on the circumstances and processing requirements. This specification imposes no limitation in this regard.
When establishing the second model, the server takes into account that the second model must be able to correctly search for a relatively better action policy based on environment state data, such as the features of the placement address and the features of the placed vending machine, so that the placed vending machine obtains relatively high income when it later operates and sells goods. New retail based on vending machines is an emerging industry in which accumulated human experience is relatively limited. If, as with the first model, samples were first labeled by staff according to experience, errors would be likely because that experience is limited, and the finally trained model would be inaccurate. In view of this problem and the particular nature of training the second model, this embodiment proposes establishing a more accurate second model through reinforcement learning.
In specific implementation, referring to Fig. 4, the server can obtain the second feature information related to the vending machine at a sample address as second sample data, and obtain the area grade of the preset range region in which the corresponding sample address is located. The second sample data and the corresponding area grade are then combined as sample environment state data. Meanwhile, the operation data for the vending machine at the sample address is obtained as the corresponding sample action policy data. A corresponding reward function is established from the historical income data of the vending machine at the sample address, taking into account both the gain rate and the depreciation rate of the vending machine. Then, from the sample environment state data, the action policy data, and the reward function, a mapping function is established that maps environment state data and action policy data to reward data. Finally, the mapping function is trained with the sample environment state data and the sample action policy data to determine its undetermined coefficients, yielding the second model.
Specifically, for example, the vending machine can be represented as an Agent. The area grade of a sample address is combined with the second feature information of that sample address to form the corresponding sample environment state data (which can be denoted Environment), from which the state space (State Space) is constructed. The sample environment state data can be expressed in the following form: s = (price, mode, item, E). Here, s denotes one piece of sample environment state data; price, item, and mode denote, respectively, the price of the goods sold by the vending machine, the type of goods sold, and the operation mode of the vending machine in the second feature information; and E denotes the area grade of the sample address. The historical operation data for the vending machine at the sample address is obtained as the corresponding sample action policy data, from which the action space (Action Space) is constructed. Specifically, the action policy data can be expressed as A = [a_i], where each a_i corresponds to one specific operation on the vending machine.
Because the parameters in the sample environment state data characterize different features and therefore
have different scales, feature normalization can also be applied to them: each element of
s = (price, mode, item, E) is normalized into the interval [0, 1] to simplify subsequent data processing,
and the normalized environment state data is then used to train the second model.
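The normalization step might, for example, be carried out with min-max scaling, as in the sketch below.
This specification does not state which normalization is used, so min-max scaling, the function name
min_max_normalize, and the integer encoding of mode and item are all assumptions.

```python
def min_max_normalize(rows):
    """Scale each column of `rows` into [0, 1] using min-max normalization."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    normalized = []
    for row in rows:
        normalized.append(tuple(
            0.0 if hi == lo else (v - lo) / (hi - lo)  # constant column maps to 0
            for v, lo, hi in zip(row, lows, highs)
        ))
    return normalized


# Columns: price, mode code, item code, area grade E. Categorical values
# are assumed to have been integer-encoded beforehand.
states = [(2.0, 0, 1, 1), (4.0, 1, 3, 5), (3.0, 2, 2, 3)]
normalized = min_max_normalize(states)
```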
Meanwhile server can be according to the earning rate (Gain Rate) and allowance for depreciation that vending machine changes over time
(Discount Rate) establishes award of the reward function as model, with the pilot model direction search strategy knot high to income
Fruit.Wherein, reward function can specifically be expressed as rt=F (Gain Rate (t), Discount Rate (t)).Wherein, rtIt indicates
The income of t moment.
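A minimal sketch of such a reward function is given below. This specification does not define the form
of F; taking F as the plain difference between the two rates is an assumption made only to obtain a
runnable example, and the monthly figures are invented for illustration.

```python
def reward(gain_rate: float, depreciation_rate: float) -> float:
    """r_t = F(Gain Rate(t), Discount Rate(t)).

    Assumption: F is a plain difference. The specification does not define F.
    """
    return gain_rate - depreciation_rate


# Hypothetical monthly series for one vending machine.
gain_rates = [0.08, 0.10, 0.12]
depreciation_rates = [0.02, 0.02, 0.03]
rewards = [reward(g, d) for g, d in zip(gain_rates, depreciation_rates)]
```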
Further, the mapping function from environment state data and action policy data to reward data can be
established in the form S × A × S → R, where S denotes the sample environment state data, A denotes the
action policy data, and R denotes the reward function.
The mapping function is trained on the sample environment state data and the sample action policy data so
that the Agent, while interacting with the environment state data and the action policy data, finds an
optimal policy π* under which the maximum long-term cumulative income is obtained for any environment
state s and any time t, that is, the following functional expression is satisfied:
where π* denotes the optimal policy, evaluated in terms of its income reward; γ is a weight whose value
is greater than or equal to 0 and less than 1; and k is the time step.
Solving for the second model can then be recast as the problem of finding the optimal action policy that
satisfies the following functional expression:
where Q* is the maximum long-term cumulative income obtainable given environment state data s and action
policy data a.
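The two functional expressions referred to above are not reproduced in this text. Under the standard
reinforcement learning formulation, and using only the symbols defined above (π*, Q*, γ, k, r_t, s, a),
they would plausibly take the following form. This is a hedged reconstruction, not a quotation of the
original expressions; the expectation operator and the subscripted s_t and a_t are assumptions.

```latex
\pi^{*} = \arg\max_{\pi} \; \mathbb{E}\!\left[\, \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k} \;\middle|\; s_{t} = s,\ \pi \right],
\qquad 0 \le \gamma < 1

Q^{*}(s, a) = \max_{\pi} \; \mathbb{E}\!\left[\, \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k} \;\middle|\; s_{t} = s,\ a_{t} = a,\ \pi \right]
```

Here the sum over k accumulates the income r_{t+k} over successive time steps, and γ weights later income
less heavily than earlier income.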
When training the second model, the above functional expression can be used as the basis for continuous
iteration: the income obtained for each combination of environment state data and action policy data at
each time point over a long period is accumulated to compute the long-term cumulative income
corresponding to the environment state data and the action policy data. With maximization of the
long-term cumulative income as the guide, training then searches for suitable action policy data and
environment state data, and the second model is thereby established.
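The iterative training described above might be sketched as follows. This specification does not name a
specific reinforcement learning algorithm; tabular Q-learning over logged transitions is substituted here
purely for illustration. The function names, the learning rate alpha, the value of gamma, and the logged
data are all hypothetical.

```python
from collections import defaultdict


def q_learning(transitions, actions, gamma=0.9, alpha=0.5, epochs=200):
    """Fit Q(s, a) from logged (s, a, r, s_next) tuples by repeated sweeps."""
    Q = defaultdict(float)
    for _ in range(epochs):
        for s, a, r, s_next in transitions:
            best_next = max(Q[(s_next, b)] for b in actions)
            # Move Q(s, a) toward the reward plus the discounted best future value.
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q


def best_action(Q, s, actions):
    """Return the action with the highest estimated long-term income."""
    return max(actions, key=lambda a: Q[(s, a)])


actions = ["add", "withdraw", "keep"]
# Hypothetical log: each state is an (item, area grade) pair.
transitions = [
    (("soda", 5), "add", 1.0, ("soda", 5)),
    (("soda", 5), "withdraw", -0.5, ("soda", 5)),
    (("soda", 5), "keep", 0.2, ("soda", 5)),
    (("soda", 1), "add", -1.0, ("soda", 1)),
    (("soda", 1), "withdraw", 0.1, ("soda", 1)),
    (("soda", 1), "keep", -0.2, ("soda", 1)),
]
Q = q_learning(transitions, actions)
```

In this toy log, adding a machine pays off in the high-grade area and loses money in the low-grade area,
so the learned policy differs between the two states.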
Of course, it should be noted that the way of establishing the second model described above is only an
illustrative example. In a specific implementation, other suitable reinforcement learning approaches can
also be used to establish the second model, depending on the circumstances.
In this example scenario, the second model obtained in the above manner can be used to make a prediction
for an input destination address and determine the most suitable action policy (i.e., the Policy Result).
Based on that policy, it can be determined whether deploying the target vending machine at the
destination address would yield relatively good income. This realizes intelligent site selection for
vending machines and guides the search for suitable addresses at which to deploy them for better income.
Further, once it has been determined that the target vending machine will be deployed at the destination
address, the second model can also be used with the destination address held fixed while feature
attributes in the input second feature information of the target vending machine, such as the type and
price of the goods sold, are varied, yielding multiple policy results. From these policy results it can
be determined which goods the target vending machine at the destination address should sell, and at what
price, to obtain relatively higher income. This realizes intelligent product selection for the vending
machine and guides the adjustment and optimization of the type and price of the goods sold by the target
vending machine at the destination address, so as to obtain better income.
For example, second feature information f, in which the goods sold by the target vending machine are
fruit drinks, and second feature information g, in which the goods sold are sodas, can each be combined
with the area grade of the same destination address and input into the second model, yielding policy
result f and policy result g respectively. If the action policy determined from policy result f is to
keep the vending machines at the destination address unchanged, while the action policy determined from
policy result g is to add vending machines at the destination address, it can be concluded that, for the
same number of vending machines, selling sodas yields higher income than selling fruit drinks. In that
case, the goods sold by the existing vending machine at the destination address can be switched to sodas
to further improve the income obtained.
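The fruit drink versus soda comparison above might be sketched as follows. The function
toy_second_model is a hypothetical stand-in for the trained second model, and its rule and the feature
values are invented only so that the example runs.

```python
def compare_goods(second_model, area_grade, base_features, candidate_items):
    """Query the model once per candidate item with the address held fixed."""
    results = {}
    for item in candidate_items:
        features = dict(base_features, item=item)  # vary only the goods type
        results[item] = second_model(area_grade, features)
    return results


def toy_second_model(area_grade, features):
    """Stand-in for the trained second model (hypothetical rule)."""
    if features["item"] == "soda" and area_grade >= 3:
        return "add"
    return "keep"


results = compare_goods(
    toy_second_model,
    area_grade=4,
    base_features={"price": 3.0, "mode": "franchise", "item": "fruit drink"},
    candidate_items=["fruit drink", "soda"],
)
```

With these stand-in values, the result for soda is "add" and the result for fruit drink is "keep", which
corresponds to the case described above in which switching to sodas is indicated.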
As the above example scenario shows, in the method for determining the deployment address of a vending
machine provided in this specification, the acquired first feature information related to the destination
address under evaluation is first input into a pre-trained first model that evaluates the degree of
commercialization of the preset range region in which an address is located, yielding the area grade of
the destination address. The area grade of the destination address and the second feature information
are then input together into a pre-trained second model that searches for the optimal policy for
obtaining higher income from the target vending machine deployed at the destination address, so that
whether to deploy the target vending machine at the destination address can be determined from the output
of the second model. The deployment of vending machines can thus be guided efficiently and accurately at
low implementation cost, and the income from goods sold by the deployed vending machines is improved.
This solves the technical problems of existing methods, which incur high implementation cost because
they rely on extensive on-site surveys, and which are less accurate, reliable, and objective because they
rely on human experience to judge deployment addresses.
Referring to Fig. 5, an embodiment of this specification provides a method for determining the deployment
address of a vending machine, where the method is applied on the server side. In a specific
implementation, the method may include the following.
S51: Obtain first feature information related to a destination address, and second feature information
related to a target vending machine.
In this embodiment, the destination address can be understood as an address at which a vending machine
may be deployed, pending a decision, and the target vending machine can be understood as the vending
machine planned for deployment at that destination address.
In this embodiment, the first feature information can be understood as feature attribute data that is
related to the destination address and reflects the degree of commercialization of the preset range
region in which the destination address is located. The preset range region may be a circular geographic
area centered on the destination address with a radius of 500 meters; it may also be, for example, a
square geographic area with a side length of 500 meters that contains the destination address. This
specification does not limit the shape or size of the preset range region.
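For the circular case, deciding whether a point of interest falls inside the preset range region might
be done with a great-circle distance check, as sketched below. The use of the haversine formula, the
function names, and the sample coordinates are assumptions; only the 500 meter radius comes from this
specification.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def in_preset_range(address, point, radius_m=500.0):
    """True if `point` lies within the circular region centered on `address`."""
    return haversine_m(*address, *point) <= radius_m


address = (30.0, 120.0)   # hypothetical destination address
near = (30.003, 120.0)    # roughly 334 m to the north
far = (30.01, 120.0)      # roughly 1112 m to the north
```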
Specifically, the first feature information may include people-flow data for the preset range region in
which the destination address is located, for example the number of people passing through the region
each day. It may also include POI (point of interest) data for the region, such as the number of shopping
malls, stations, and schools in the region, or the distance from the destination address to the nearest
shopping mall, school, or station. It may also include the logistics transportation cost to the
destination address, for example the cost of regularly transporting replenishment goods from the supply
warehouse of the vending machine's goods to the destination address. It may further include data on
similar competing products in the region, for example the number of other vending machines or the number
of convenience stores in the region. Of course, it should be noted that the first feature information
listed above is only illustrative. In a specific implementation, other feature attribute information
related to the destination address may be introduced as first feature information according to the
specific application scenario and processing requirements. This specification does not limit this.
In this embodiment, the second feature information may be feature attribute data that is related to the
target vending machine planned for deployment at the destination address and that reflects the profit
subsequently obtained from goods sold by the deployed target vending machine.
Specifically, the second feature information may include the price of the goods sold by the target
vending machine. It may also include the type, brand, and so on of the goods sold. It may further include
the operating mode of the target vending machine, for example a franchise mode, a directly operated mode,
or a leasing mode. Of course, it should be noted that the second feature information listed above is only
illustrative. In a specific implementation, other feature information related to the target vending
machine may be introduced as second feature information according to the specific application scenario
and processing requirements. This specification does not limit this.
In this embodiment, obtaining the first feature information related to the destination address and the
second feature information related to the target vending machine may include the following: the server
collects the first feature information related to the destination address through a client deployed near
the destination address; and the server obtains the second feature information related to the target
vending machine to be deployed by querying the deployment plan for the target vending machine at the
destination address.
S53: Using the first model, determine the area grade corresponding to the destination address according
to the first feature information, where the area grade indicates the degree of commercialization of the
preset range region in which the destination address is located.
In this embodiment, the first model can be understood as a classification prediction model, obtained in
advance by model learning and training with the first feature information of sample addresses as first
sample data, that can be used to evaluate the degree of commercialization of the preset range region in
which the destination address is located.
In this embodiment, the area grade can be understood as an index parameter that measures the degree of
commercialization of the preset range region in which the destination address is located. Generally, the
higher the area grade of a destination address, the higher the degree of commercialization of the preset
range region in which it is located, and the busier and livelier the region. Conversely, the lower the
area grade of a destination address, the lower the degree of commercialization of that region, and the
quieter and more deserted the region.
In this embodiment, using the first model to determine the area grade corresponding to the destination
address according to the first feature information may include the following: the first feature
information of the destination address is input to the first model as its model input; the first model
is run to obtain the corresponding model output; and the area grade of the preset range region in which
the destination address is located, that is, the area grade of the destination address, is determined
from the model output.
S55: Using the second model, determine, according to the area grade and the second feature information,
whether to deploy the target vending machine at the destination address.
In this embodiment, the second model can be understood as a model obtained in advance through
reinforcement learning, using as second sample data the combination of the area grade of a sample address
and the second feature information of the vending machine deployed at that sample address. Based on
environment state data that includes features of the deployment address and features of the deployed
vending machine, the model can search out a policy result that corresponds to higher income.
The policy result can be obtained from the model output of the second model. Specifically, the policy
result may include action policies for the target vending machine at the destination address such as:
adding a vending machine at the destination address, withdrawing the vending machine from the destination
address, keeping the vending machine at the destination address unchanged, or adjusting the goods sold by
the vending machine at the destination address. Of course, it should be noted that the policy results
listed above are only illustrative. In a specific implementation, other action policies may be introduced
as policy results when training the second model, depending on the circumstances. This specification does
not limit this.
In this embodiment, using the second model to determine, according to the area grade and the second
feature information, whether to deploy the target vending machine at the destination address may include
the following. The area grade of the destination address is combined with the second feature information
of the target vending machine to be deployed there. The combination contains attribute features of two
different dimensions: the area grade, which is derived from the first feature information and reflects
the degree of commercialization of the preset range region in which the destination address is located,
and the second feature information, which reflects the expected profit of the target vending machine to
be deployed. The combination is input to the second model as its model input; the second model is run to
obtain the corresponding model output, that is, the corresponding policy result; and the action policy to
be taken for deploying the target vending machine at the destination address is determined from the
policy result so as to obtain relatively good income, for example, whether to deploy the target vending
machine at the destination address.
In this embodiment, if the policy result output by the second model is to add a vending machine at the
destination address, it can be judged that deploying the target vending machine at the destination
address would yield relatively good income, and it can therefore be determined that the corresponding
target vending machine is to be deployed there. If the policy result output by the second model is to
withdraw the target vending machine from the destination address, it can be judged that deploying the
target vending machine there would not yield relatively good income, that is, deployment may result in a
loss or another negative profit outcome, and it can therefore be determined that the target vending
machine is not to be deployed at the destination address.
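Steps S53 and S55, together with the decision rule just described, might be chained as in the following
sketch. Both toy_first_model and toy_second_model are hypothetical stand-ins for the trained models, and
the string labels "add" and "withdraw" and the feature name daily_people_flow are assumed encodings.

```python
def decide_deployment(first_model, second_model, first_features, second_features):
    """Run S53 and S55 for one destination address."""
    area_grade = first_model(first_features)                   # S53
    policy_result = second_model(area_grade, second_features)  # S55
    if policy_result == "add":
        return True, policy_result
    if policy_result == "withdraw":
        return False, policy_result
    # Other policy results do not settle the deployment decision by themselves.
    return None, policy_result


# Hypothetical stand-ins for the two trained models.
def toy_first_model(features):
    return 5 if features["daily_people_flow"] > 10_000 else 1


def toy_second_model(area_grade, features):
    return "add" if area_grade >= 3 else "withdraw"


busy = decide_deployment(toy_first_model, toy_second_model,
                         {"daily_people_flow": 20_000}, {"price": 3.0})
quiet = decide_deployment(toy_first_model, toy_second_model,
                          {"daily_people_flow": 500}, {"price": 3.0})
```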
In this embodiment, it should be noted that selling goods through unmanned vending machines is an
emerging new-retail business model. People have relatively little knowledge of and accumulated
experience with it, and technicians often cannot accurately predict later profit. If the second model
were trained on samples labeled by technicians according to their experience, errors could easily be
introduced because of that limited experience, resulting in a less accurate second model. With this in
mind, the method provided in this embodiment introduces a reinforcement learning algorithm to train and
establish the second model, so that the sample data used for training does not need to be labeled. This
improves the accuracy of the resulting second model, and in turn the accuracy of subsequently using the
second model to determine whether to deploy the target vending machine at the destination address.
It should also be noted that, in this embodiment, the trained first model is first used to determine the
area grade of the destination address from its first feature information, which reduces the originally
complex first feature information to a relatively simple, discrete grading index that characterizes the
degree of commercialization of the preset range region in which the destination address is located,
namely the area grade. The area grade is then combined with the second feature information, and the
second model determines from the combined information whether to deploy the target vending machine at the
destination address. This reduces the complexity of the second model and the amount of data it must
process, and makes it easier to train a relatively more accurate second model.
In one embodiment, multiple destination addresses may be obtained, together with the first feature
information of each destination address and the second feature information of the target vending
machine. The first model and the second model are then used in the manner described above to process the
first feature information of each destination address and the second feature information of the target
vending machine, yielding a policy result for each destination address. According to these policy
results, the destination addresses whose policy result is to add a vending machine are selected from the
multiple destination addresses as the finally determined deployment addresses, and the corresponding
target vending machines are deployed at those addresses to obtain better income.
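Screening several candidate addresses in this way might look like the sketch below. The two toy models
are hypothetical stand-ins for the trained first and second models, and the candidate addresses and the
shop_count feature are invented for illustration.

```python
def screen_addresses(candidates, first_model, second_model, second_features):
    """Return the addresses whose policy result is "add", in input order."""
    selected = []
    for address, first_features in candidates:
        grade = first_model(first_features)
        if second_model(grade, second_features) == "add":
            selected.append(address)
    return selected


# Hypothetical stand-ins for the two trained models.
def toy_first_model(features):
    return min(5, 1 + features["shop_count"] // 10)


def toy_second_model(area_grade, features):
    return "add" if area_grade >= 3 else "withdraw"


candidates = [
    ("Address A", {"shop_count": 45}),
    ("Address B", {"shop_count": 5}),
    ("Address C", {"shop_count": 22}),
]
chosen = screen_addresses(candidates, toy_first_model, toy_second_model,
                          {"item": "soda"})
```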
In one embodiment, in addition to determining in the above manner whether a destination address can be
used for deploying the target vending machine, thereby realizing intelligent site selection, the second
model can also be used to determine which type of goods the target vending machine deployed at the
destination address should sell to obtain relatively better income, thereby realizing intelligent
product selection.
Specifically, when it has been determined from the policy result output by the second model that the
target vending machine is to be deployed at the destination address, the server can further change the
type of goods sold in the second feature information of the target vending machine to be deployed,
combine each version of the second feature information, containing a different type of goods, with the
area grade of the same destination address, and input the combination to the second model to obtain the
policy result for each type of goods. By comparing these policy results, it can be determined which type
of goods yields relatively better income when sold by the target vending machine deployed at the
destination address, and the goods sold by that vending machine can be adjusted accordingly to obtain
relatively better income.
Similarly, the price of the goods sold can be changed in the second feature information. Each version of
the second feature information, containing a different price, is combined with the area grade of the
same destination address and input to the second model in the manner described above, yielding the
policy result for each price. By comparing these policy results, it can be determined which price for
the goods sold by the target vending machine deployed at the destination address yields relatively better
income, and the price of the goods sold by that vending machine can be adjusted accordingly.
It can be seen that, in the method for determining the deployment address of a vending machine provided
in this embodiment, the acquired first feature information related to the destination address under
evaluation is first input into a pre-trained first model that evaluates the degree of commercialization
of the preset range region in which an address is located, yielding the area grade of the destination
address. The area grade of the destination address and the second feature information are then input
together into a pre-trained second model that searches for the optimal policy for obtaining higher income
from the target vending machine deployed at the destination address, so that whether to deploy the
target vending machine at the destination address can be determined from the output of the second model.
The deployment of vending machines can thus be guided efficiently and accurately at low implementation
cost, and the income from goods sold by the deployed vending machines is improved. This solves the
technical problems of existing methods, which incur high implementation cost because they rely on
extensive on-site surveys, and which are less accurate, reliable, and objective because they rely on
human experience to judge deployment addresses.
In one embodiment, the first feature information may include at least one of the following: people-flow
data for the preset range region in which the destination address is located, POI data for that region,
logistics transportation cost data for the destination address, data on similar competing products in
that region, and so on. Of course, it should be noted that the first feature information listed above is
only illustrative. In a specific implementation, other feature attributes related to the destination
address may be introduced as first feature information according to the specific application scenario
and processing requirements. This specification does not limit this.
In one embodiment, the second feature information may include at least one of the following: the price
of the goods sold by the target vending machine, the type of goods sold by the target vending machine,
the operating mode of the target vending machine, and so on. Of course, it should be noted that the
second feature information listed above is only illustrative. In a specific implementation, other
feature attributes related to the target vending machine may be introduced as second feature information
according to the specific application scenario and processing requirements. This specification does not
limit this.
In one embodiment, using the second model to determine, according to the area grade and the second
feature information, whether to deploy the target vending machine at the destination address may include
the following: using the second model to obtain, based on the area grade and the second feature
information, a policy result for deploying the target vending machine at the destination address; and
determining, according to the policy result, whether to deploy the target vending machine at the
destination address.
In this embodiment, the policy result may include action policies such as adding a vending machine at the
destination address or withdrawing the vending machine from the destination address. If the policy
result is to add a vending machine at the destination address, it can be determined according to the
policy result that the target vending machine is to be deployed there. If the policy result is to
withdraw the vending machine from the destination address, it can be determined according to the policy
result that the target vending machine is not to be deployed there. Better income can thereby be
obtained.
In this embodiment, the policy result may also include action policies such as keeping the vending
machine at the destination address unchanged, adjusting the goods sold by the vending machine at the
destination address, or adjusting the price of the goods sold by the target vending machine.
Correspondingly, the target vending machine deployed at the destination address can be further adjusted
according to the policy result, for example by adjusting the type of goods it sells or the price of those
goods, so as to obtain relatively better income.
In one embodiment, the first model may be established as follows: obtaining the first feature
information of sample addresses as first sample data; determining and labeling, according to a preset
evaluation rule for the degree of commercialization, the area grade of the preset range region in which
the sample address corresponding to each item of first sample data is located, to obtain labeled sample
data; and performing model training with the labeled sample data to obtain the first model.
In this embodiment, addresses at which vending machines were deployed in the past can be obtained as
sample addresses by querying historical record data, and the first feature information of each sample
address can be obtained and used as the first sample data for training the first model.
In this embodiment, the business conditions of the preset range region in which a sample address is
located can be obtained by querying historical record data, and the degree of commercialization of that
region can be graded according to the preset evaluation rule for the degree of commercialization, so as
to determine the area grade of the preset range region in which the sample address is located, which
serves as the area grade of the sample address. The area grade can then be labeled on the first sample
data corresponding to the sample address to obtain the labeled sample data.
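The labeling step might be sketched as follows. This specification does not disclose the content of the
preset evaluation rule, so the thresholds, the three-level grade scale, and the field names monthly_sales
and shop_count are all hypothetical.

```python
def label_area_grade(monthly_sales, shop_count):
    """Hypothetical evaluation rule mapping business conditions to a grade 1-3."""
    score = 0
    if monthly_sales >= 1_000_000:  # assumed sales threshold
        score += 1
    if shop_count >= 50:            # assumed shop-count threshold
        score += 1
    return 1 + score


def label_samples(samples):
    """Attach an area grade to each item of first sample data."""
    return [
        dict(sample, area_grade=label_area_grade(sample["monthly_sales"],
                                                 sample["shop_count"]))
        for sample in samples
    ]


first_sample_data = [
    {"address": "Sample 1", "monthly_sales": 2_000_000, "shop_count": 80},
    {"address": "Sample 2", "monthly_sales": 2_000_000, "shop_count": 10},
    {"address": "Sample 3", "monthly_sales": 100, "shop_count": 1},
]
labeled = label_samples(first_sample_data)
```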
In this embodiment, an initial model suitable for classification prediction can be selected and trained
with the labeled sample data to determine the model parameters, yielding the corresponding first model.
In this embodiment, the initial model may be a GBDT (Gradient Boosting Decision Tree) model. Of course,
it should be noted that the initial model mentioned above is only illustrative. In a specific
implementation, other types of models suitable for classification prediction may also be used as the
initial model, depending on the circumstances. This specification does not limit this.
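To illustrate the gradient boosting idea behind a GBDT model, the sketch below fits depth-1 trees
(stumps) to the residuals of the running prediction. It is a deliberate simplification and rests on
several assumptions: the area grade is treated as an ordinal value fitted with squared loss and then
rounded, rather than with a true classification loss; the trees have a single split; and the feature
columns (daily people flow, number of shopping malls) and grades are invented. A production
implementation would normally rely on an established library.

```python
def fit_stump(xs, residuals):
    """Find the single-feature threshold split that best fits the residuals."""
    best = None
    for j in range(len(xs[0])):
        for t in sorted({x[j] for x in xs}):
            left = [r for x, r in zip(xs, residuals) if x[j] <= t]
            right = [r for x, r in zip(xs, residuals) if x[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:]


def fit_gbdt(xs, ys, n_trees=100, lr=0.3):
    """Boost stumps on squared-loss residuals (the negative gradient)."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        j, t, lm, rm = fit_stump(xs, residuals)
        stumps.append((j, t, lm, rm))
        preds = [p + lr * (lm if x[j] <= t else rm) for x, p in zip(xs, preds)]
    return base, lr, stumps


def predict_grade(model, x):
    """Sum the boosted stump outputs and round to the nearest grade."""
    base, lr, stumps = model
    score = base + sum(lr * (lm if x[j] <= t else rm) for j, t, lm, rm in stumps)
    return round(score)


# Hypothetical labeled samples: (daily people flow, shopping malls) -> area grade.
xs = [(200, 0), (800, 1), (5000, 3), (12000, 6), (30000, 10)]
ys = [1, 1, 2, 3, 3]
model = fit_gbdt(xs, ys)
```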
In one embodiment, the second model may be established as follows: obtaining second feature information
related to the vending machine at a sample address as second sample data; combining the area grade of
the preset range region corresponding to the second sample data with the second sample data to form
sample environment state data; obtaining operation data for the vending machine at the sample address as
sample action policy data; establishing a mapping function that maps environment state data and action
policy data to reward data; and training the mapping function according to the sample environment state
data and the sample action policy data to obtain the second model.
In this embodiment, the second feature information of the vending machine deployed at a sample address,
that is, the second feature information of the sample address, can be obtained by querying historical
record data and used as the second sample data.
In this embodiment, in order to establish a more accurate second model, feature information of two
different dimensions can be combined: the area grade, which reflects the degree of commercialization of
the geographic area in which the sample address is located, and the second feature information, which
reflects the profit of the vending machine deployed at the sample address. The combined feature
information is then used for model training through reinforcement learning to establish a more accurate
second model.
In this embodiment, the area grade of the preset range region corresponding to the second sample data
can be combined with the second sample data (i.e., the second feature information) to form sample
environment state data. At the same time, the historical operation data for the vending machine at the
sample address can be obtained, for example by querying historical record data, as the sample action
policy data. The operation data for the vending machine at the sample address may be, for example,
withdrawing the vending machine from the sample address, adding a vending machine at the sample address,
or adjusting the type of goods sold by the vending machine at the sample address.
In this embodiment, the reward function (i.e., the reward data) used in reinforcement learning can be
established according to the rate of return and the depreciation rate of the vending machine at the
sample address, so that during subsequent reinforcement learning the reward function guides training
toward better income and the model finds policy results that yield better income.
In this embodiment, because both the environment state data and the action policy data affect the profitability of the vending machine at the sample address, a mapping function that maps environment state data and action policy data to reward data may be established first. The mapping function is then trained on the sample environment state data and the sample action policy data; training determines the model parameters and yields the second model.
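One possible reading of the mapping function described above can be sketched in Python as follows. The sketch is only illustrative: it replaces reinforcement-learning training with a simple table of mean observed rewards, and the names `fit_mapping` and `choose_action`, the tuple encoding of the state, and the action strings are all assumptions rather than part of this specification.

```python
from collections import defaultdict

def fit_mapping(samples):
    """Estimate a (state, action) -> reward mapping from logged samples.

    `samples` is an iterable of (state, action, reward) tuples, where
    `state` is any hashable encoding of the environment state data
    (hypothetically: area grade plus second feature information).
    The estimate is the mean observed reward, a stand-in for training.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for state, action, reward in samples:
        totals[(state, action)] += reward
        counts[(state, action)] += 1
    return {key: totals[key] / counts[key] for key in totals}

def choose_action(mapping, state, actions):
    """Return the known action with the highest estimated reward, or None."""
    known = [a for a in actions if (state, a) in mapping]
    if not known:
        return None
    return max(known, key=lambda a: mapping[(state, a)])
```

A learned model would generalize to unseen states, which this lookup table does not; it only shows how state data and action data jointly index the reward.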
In this embodiment, during training, the sample environment state data and the sample action policy data may be substituted into the mapping function, and the rewards at multiple time points obtained by iterating over time steps. The rewards at these time points are then summed to obtain the long-term cumulative reward for the sample address. Through reinforcement learning on the long-term cumulative rewards of multiple sample addresses, the policy direction that achieves the maximum long-term cumulative reward is found. The result is a model that, given environment state data, outputs the action policy with the best long-term cumulative reward (i.e., the maximum return), namely the second model.
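The accumulation of rewards over time steps can be illustrated with a short Python sketch. The function names and the optional `discount` parameter are assumptions introduced for illustration; with the default `discount=1.0` the result is the plain sum described in the text.

```python
from typing import Dict, Iterable, List

def long_term_reward(step_rewards: Iterable[float], discount: float = 1.0) -> float:
    """Accumulate the rewards obtained at successive time steps.

    discount=1.0 gives the plain sum; a value below 1.0 (an assumption,
    not stated in the text) weights later rewards less.
    """
    total = 0.0
    weight = 1.0
    for reward in step_rewards:
        total += weight * reward
        weight *= discount
    return total

def best_policy(rewards_by_policy: Dict[str, List[float]]) -> str:
    """Return the policy whose reward sequence has the largest cumulative reward."""
    return max(rewards_by_policy, key=lambda p: long_term_reward(rewards_by_policy[p]))
```

For example, a policy with step rewards 0.0, 2.0, 2.5 accumulates 4.5 and is preferred over one with 1.0, 1.0, 1.0, even though its first-step reward is lower.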
In one embodiment, the reward function may specifically be determined according to the rate of return and the depreciation rate of the vending machine. In a specific implementation, so that the second model can find policy results with better returns, a corresponding reward function may be established from the rate of return and the depreciation rate of the goods sold by the vending machine. When the second model is trained by reinforcement learning, this reward function guides the model to search automatically for action policies in the direction of better returns.
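The text does not give a formula for the reward function. As a minimal sketch, assuming a linear combination in which depreciation is subtracted from the rate of return, it might look like the following; the function name and the `depreciation_weight` parameter are hypothetical.

```python
def reward(return_rate: float, depreciation_rate: float,
           depreciation_weight: float = 1.0) -> float:
    """Hypothetical reward: rate of return penalized by depreciation.

    The linear form and the weight are assumptions for illustration;
    the specification only says that both rates determine the reward.
    """
    return return_rate - depreciation_weight * depreciation_rate
```

Any form that increases with the rate of return and decreases with the depreciation rate would serve the same guiding purpose during training.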
In one embodiment, the operation data for the vending machine at the sample address may specifically include at least one of the following: adding a vending machine at the sample address, withdrawing the vending machine from the sample address, keeping the vending machine at the sample address unchanged, adjusting the goods sold by the vending machine at the sample address, and so on.
Of course, it should be noted that the operation data listed above is only illustrative. In a specific implementation, other types of operation data may also be introduced as the operation data for the vending machine, depending on the application scenario and processing needs. This specification does not limit this.
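The operation types listed above form a small closed set, which could be encoded as an enumeration. The sketch below is one hypothetical encoding; the class name, member names, and string values are assumptions, and further members could be added for other operation types.

```python
from enum import Enum

class MachineOperation(Enum):
    """Operations on the vending machine at a sample address."""
    ADD = "add"                    # add a vending machine
    WITHDRAW = "withdraw"          # withdraw the vending machine
    KEEP = "keep"                  # keep the vending machine unchanged
    ADJUST_GOODS = "adjust_goods"  # adjust the goods sold

def parse_operation(record: str) -> MachineOperation:
    """Convert a historical-record string into an operation (ValueError if unknown)."""
    return MachineOperation(record.strip().lower())
```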
In one embodiment, in addition to using the second model to determine the placement address of a vending machine (i.e., site selection), the second model may also be used to select the goods that are suitable for sale.
In a specific implementation, first feature information related to the destination address and second feature information related to the target vending machine may be obtained first, where the second feature information includes multiple test commodities. Using the first model, the area grade corresponding to the destination address is determined according to the first feature information, where the area grade indicates the degree of commercialization of the preset range region where the destination address is located. Using the second model, whether to place the target vending machine at the destination address is determined according to the area grade and the second feature information. If it is determined that the target vending machine is to be placed at the destination address, the target commodity to be sold by the target vending machine is determined from the multiple test commodities.
In this embodiment, the area grade of the destination address may be held fixed while the test commodity in the second feature information is varied, thereby changing the second feature information. The second model then yields a policy result for each variant of the second feature information, i.e., for each test commodity. By comparing the policy results for the different test commodities, the test commodity with the best return can be selected from the multiple test commodities and placed in the target vending machine for sale, so as to obtain a relatively better return.
For example, suppose the policy result for test commodity A is to keep the vending machine at the destination address unchanged, while the policy result for test commodity B is to add a vending machine at the destination address. Comparing the two policy results shows that a vending machine at the destination address selling test commodity B can obtain a relatively better return than one selling test commodity A. The server can therefore determine test commodity B as the target commodity and adjust the goods sold by the target vending machine at the destination address to test commodity B, so that selling test commodity B through the vending machine yields a relatively better long-term return.
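The commodity-selection procedure (fixed area grade, varying test commodity) can be sketched as follows. This is an illustration under stated assumptions: the second model is represented by a hypothetical callable `estimate_return(area_grade, features)` that returns a comparable numeric score, whereas the text only says the model produces a policy result for each test commodity.

```python
def select_commodity(estimate_return, area_grade, base_features, test_commodities):
    """Pick the test commodity with the highest estimated return.

    `estimate_return` is a hypothetical stand-in for the second model.
    The area grade is held fixed; only the commodity in the second
    feature information changes between evaluations.
    """
    if not test_commodities:
        raise ValueError("at least one test commodity is required")

    def score(commodity):
        # Copy the shared features and swap in the commodity under test.
        features = dict(base_features, commodity=commodity)
        return estimate_return(area_grade, features)

    return max(test_commodities, key=score)
```

The same loop could vary a price instead of a commodity by changing which feature is substituted.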
In this embodiment, first feature information related to the destination address and second feature information related to the target vending machine may also be obtained, where the second feature information includes multiple test prices. Using the first model, the area grade corresponding to the destination address is determined according to the first feature information, where the area grade indicates the degree of commercialization of the preset range region where the destination address is located. Using the second model, whether to place the target vending machine at the destination address is determined according to the area grade and the second feature information. If it is determined that the target vending machine is to be placed at the destination address, the target price of the goods sold by the target vending machine is determined from the multiple test prices, and the price of the goods sold by the target vending machine is adjusted to the target price, so that a better long-term return is obtained by adjusting the selling price of the target commodity.
As can be seen from the above, in the method for determining the placement address of a vending machine provided by the embodiments of this specification, the acquired first feature information related to the destination address to be evaluated is first fed as model input into a pre-trained first model for evaluating the degree of commercialization of the preset range region where an address is located, which yields the area grade of the destination address. The area grade of the destination address and the second feature information are then fed together as model input into a pre-trained second model that searches for the optimal policy so that the target vending machine placed at the destination address obtains a higher return, and whether to place the target vending machine at the destination address is determined from the output of the second model. The placement of vending machines can thus be guided efficiently and accurately at a low implementation cost, and the return obtained from goods sold by the placed vending machines is improved. This solves the technical problems of existing methods, which rely on extensive on-site surveys and therefore have a high implementation cost, and which rely on human experience to judge placement addresses and are therefore less accurate, less reliable, and less objective. In addition, because the second model used to determine the placement address of the target vending machine is trained by reinforcement learning, errors that manual labeling might introduce are avoided, making the determined placement address more accurate.
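The two-stage flow summarized above can be sketched in Python. Both models are represented by hypothetical callables, and the set of policy results that count as a decision to place the machine (`placing_results`) is an assumption; the specification does not fix this mapping.

```python
def decide_placement(first_model, second_model, first_features, second_features,
                     placing_results=frozenset({"add"})):
    """Two-stage decision: area grade first, then policy result.

    `first_model(first_features)` returns an area grade;
    `second_model(area_grade, second_features)` returns a policy result.
    Both callables and `placing_results` are illustrative assumptions.
    """
    area_grade = first_model(first_features)
    policy_result = second_model(area_grade, second_features)
    return policy_result in placing_results
```

Keeping the two models as separate arguments mirrors the text: the first model can be retrained or replaced without touching the policy model, as long as it still outputs an area grade.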
Referring to FIG. 6, the embodiments of this specification further provide another method for determining the placement address of a vending machine. In a specific implementation, the method may include the following:
S61: obtain first feature information related to the destination address, and second feature information related to the target vending machine;
S63: using a third model, determine whether to place the target vending machine at the destination address according to the first feature information and the second feature information.
In this embodiment, the third model may be understood as a model obtained in advance by taking the first feature information of a sample address and the second feature information of the vending machine at the sample address (also called the second feature information of the sample address) as third sample data and performing corresponding model training by reinforcement learning. Based on environment state data that includes the features of the placement address and the features of the placed vending machine, the model can search for the policy result that corresponds to a higher return.
In one embodiment, the third model may be established as follows: obtain the first feature information and second feature information of a sample address as sample environment state data; obtain operation data for the vending machine at the sample address as sample action policy data; establish a mapping function that maps environment state data and action policy data to reward data; and train the mapping function according to the sample environment state data and the sample action policy data to obtain the third model.
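Unlike the second model, the third model takes both kinds of feature information directly as its environment state, with no intermediate area grade. A minimal sketch of that state construction follows; the key prefixes, function names, and the `placing_results` default are all assumptions for illustration.

```python
def build_third_state(first_features, second_features):
    """Merge address features and machine features into one state dict.

    Prefixes keep keys from the two sources from colliding; the prefix
    names are hypothetical.
    """
    state = {f"address.{key}": value for key, value in first_features.items()}
    state.update({f"machine.{key}": value for key, value in second_features.items()})
    return state

def decide_with_third_model(third_model, first_features, second_features,
                            placing_results=frozenset({"add"})):
    """Single-stage decision: the third model sees the merged state directly."""
    state = build_third_state(first_features, second_features)
    return third_model(state) in placing_results
```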
The embodiments of this specification further provide a server including a processor and a memory for storing processor-executable instructions. In a specific implementation, the processor may perform the following steps according to the instructions: obtain first feature information related to the destination address, and second feature information related to the target vending machine; using the first model, determine the area grade corresponding to the destination address according to the first feature information, where the area grade indicates the degree of commercialization of the preset range region where the destination address is located; and using the second model, determine whether to place the target vending machine at the destination address according to the area grade and the second feature information.
To carry out the above instructions more accurately, referring to FIG. 7, the embodiments of this specification further provide another specific server. The server includes a network communication port 701, a processor 702, and a memory 703, which are connected by internal cables so that they can exchange data with one another.
The network communication port 701 may specifically be used to obtain first feature information related to the destination address, and second feature information related to the target vending machine.
The processor 702 may specifically be used to determine, using the first model, the area grade corresponding to the destination address according to the first feature information, where the area grade indicates the degree of commercialization of the preset range region where the destination address is located; and to determine, using the second model, whether to place the target vending machine at the destination address according to the area grade and the second feature information.
The memory 703 may specifically be used to store the instruction program on which the processor 702 operates.
In this embodiment, the network communication port 701 may be a virtual port that is bound to different communication protocols so that it can send or receive different data. For example, the network communication port may be port 80 for web data communication, port 21 for FTP data communication, or port 25 for mail data communication. The network communication port may also be a physical communication interface or communication chip. For example, it may be a wireless mobile network communication chip, such as GSM or CDMA; it may also be a Wi-Fi chip or a Bluetooth chip.
In this embodiment, the processor 702 may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, an embedded microcontroller, and so on. This specification does not limit this.
In this embodiment, the memory 703 may exist at several levels. In a digital system, anything that can store binary data may be a memory; in an integrated circuit, a circuit that has a storage function but no physical form of its own is also called a memory, such as a RAM or a FIFO; in a system, a storage device that has a physical form is also called a memory, such as a memory module or a TF card.
The embodiments of this specification further provide a computer storage medium based on the above method for determining the placement address of a vending machine. The computer storage medium stores computer program instructions which, when executed, implement the following: obtain first feature information related to the destination address, and second feature information related to the target vending machine; using the first model, determine the area grade corresponding to the destination address according to the first feature information, where the area grade indicates the degree of commercialization of the preset range region where the destination address is located; and using the second model, determine whether to place the target vending machine at the destination address according to the area grade and the second feature information.
In this embodiment, the storage medium includes but is not limited to a random access memory (Random Access Memory, RAM), a read-only memory (Read-Only Memory, ROM), a cache (Cache), a hard disk (Hard Disk Drive, HDD), or a memory card (Memory Card). The memory may be used to store computer program instructions. The network communication unit may be an interface that is set up according to the standard specified by a communication protocol and is used for network connection and communication.
In this embodiment, the functions and effects achieved by the program instructions stored in the computer storage medium can be understood by reference to the other embodiments and are not repeated here.
Referring to FIG. 8, at the software level, the embodiments of this specification further provide an apparatus for determining the placement address of a vending machine. The apparatus may specifically include the following structural modules:
an acquisition module 801, which may specifically be used to obtain first feature information related to the destination address, and second feature information related to the target vending machine;
a first determining module 802, which may specifically be used to determine, using the first model, the area grade corresponding to the destination address according to the first feature information, where the area grade indicates the degree of commercialization of the preset range region where the destination address is located;
a second determining module 803, which may specifically be used to determine, using the second model, whether to place the target vending machine at the destination address according to the area grade and the second feature information.
In one embodiment, the first feature information may specifically include at least one of the following: pedestrian flow data for the preset range region where the destination address is located, POI (point of interest) data for the preset range region where the destination address is located, logistics and transportation cost data for reaching the destination address, data on similar competing products in the preset range region where the destination address is located, and so on.
In one embodiment, the second feature information may specifically include at least one of the following: the price of the goods sold by the target vending machine, the type of goods sold by the target vending machine, the operation mode of the target vending machine, and so on.
In one embodiment, the second determining module 803 may specifically include the following structural units:
a processing unit, which may specifically be used to obtain, using the second model and based on the area grade and the second feature information, a policy result for placing the target vending machine at the destination address;
a determining unit, which may specifically be used to determine, according to the policy result, whether to place the target vending machine at the destination address.
In one embodiment, the apparatus may further include a first establishing module, which is used to establish the first model. In a specific implementation, the first establishing module may specifically be used to: obtain the first feature information of sample addresses as first sample data; determine and label, according to a preset evaluation rule for the degree of commercialization, the area grade of the preset range region where the sample address corresponding to each item of first sample data is located, to obtain labeled sample data; and perform model training using the labeled sample data to obtain the first model.
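The label-then-train procedure for the first model can be sketched as follows. The sketch is illustrative only: the evaluation rule is passed in as a hypothetical callable, feature vectors are assumed to be numeric tuples, and a one-nearest-neighbor classifier stands in for whatever training method is actually used.

```python
import math

def label_samples(samples, grade_rule):
    """Attach an area grade to each feature vector using a preset rule."""
    return [(features, grade_rule(features)) for features in samples]

def train_nearest_neighbor(labelled):
    """Return a predictor that copies the grade of the closest labelled sample.

    One-nearest-neighbor is a stand-in chosen for brevity; it is not
    claimed to be the classifier used in the specification.
    """
    if not labelled:
        raise ValueError("at least one labelled sample is required")

    def predict(features):
        nearest = min(labelled, key=lambda item: math.dist(item[0], features))
        return nearest[1]

    return predict
```

In practice the features would need scaling before a distance-based method is meaningful, since quantities such as pedestrian flow and POI counts have very different ranges.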
In one embodiment, the apparatus may further include a second establishing module, which is used to establish the second model. In a specific implementation, the second establishing module may specifically be used to: obtain the second feature information of a sample address, related to the vending machine, as second sample data; combine the area grade of the preset range region corresponding to the second sample data with the second sample data to form sample environment state data; obtain operation data for the vending machine at the sample address as sample action policy data; establish a mapping function that maps environment state data and action policy data to reward data; and train the mapping function according to the sample environment state data and the sample action policy data to obtain the second model.
In one embodiment, the reward function may specifically be determined according to the rate of return and the depreciation rate of the vending machine.
In one embodiment, the operation data for the vending machine at the sample address may specifically include at least one of the following: adding a vending machine at the sample address, withdrawing the vending machine from the sample address, keeping the vending machine at the sample address unchanged, adjusting the goods sold by the vending machine at the sample address, and so on.
In one embodiment, in a specific implementation, the acquisition module may further be used to obtain first feature information related to the destination address, and second feature information related to the target vending machine, where the second feature information includes multiple test commodities;
the first determining module may further be used to determine, using the first model, the area grade corresponding to the destination address according to the first feature information, where the area grade indicates the degree of commercialization of the preset range region where the destination address is located;
the second determining module may further be used to determine, using the second model, whether to place the target vending machine at the destination address according to the area grade and the second feature information; and, if it is determined that the target vending machine is to be placed at the destination address, to determine the target commodity to be sold by the target vending machine from the multiple test commodities.
It should be noted that the units, apparatuses, or modules described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. For convenience of description, the above apparatus is described in terms of separate modules divided by function. Of course, when implementing this specification, the functions of the modules may be implemented in one or more pieces of software and/or hardware, and a module that implements a given function may also be implemented by a combination of multiple sub-modules or sub-units. The apparatus embodiments described above are only illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or in other forms.
As can be seen from the above, in the apparatus for determining the placement address of a vending machine provided by the embodiments of this specification, the first determining module first feeds the acquired first feature information related to the destination address to be evaluated, as model input, into a pre-trained first model for evaluating the degree of commercialization of the preset range region where an address is located, which yields the area grade of the destination address. The second determining module then feeds the area grade of the destination address and the second feature information together, as model input, into a pre-trained second model that searches for the optimal policy so that the target vending machine placed at the destination address obtains a higher return, and determines from the output of the second model whether to place the target vending machine at the destination address. The placement of vending machines can thus be guided efficiently and accurately at a low implementation cost, and the return obtained from goods sold by the placed vending machines is improved. This solves the technical problems of existing methods, which rely on extensive on-site surveys and therefore have a high implementation cost, and which rely on human experience to judge placement addresses and are therefore less accurate, less reliable, and less objective.
The embodiments of this specification further provide an apparatus for determining the placement address of a vending machine. The apparatus may specifically include the following structural modules:
an acquisition module, which may specifically be used to obtain first feature information related to the destination address, and second feature information related to the target vending machine;
a determining module, which may specifically be used to determine, using a third model, whether to place the target vending machine at the destination address according to the first feature information and the second feature information.
In one embodiment, the apparatus may further include an establishing module for establishing the third model. In a specific implementation, the establishing module may be used to: obtain the first feature information and second feature information of a sample address as sample environment state data; obtain operation data for the vending machine at the sample address as sample action policy data; establish a mapping function that maps environment state data and action policy data to reward data; and train the mapping function according to the sample environment state data and the sample action policy data to obtain the third model.
Although this specification provides the method operation steps described in the embodiments or flowcharts, more or fewer operation steps may be included based on conventional or non-inventive means. The order of steps listed in the embodiments is only one of many possible execution orders and does not represent the only execution order. When an actual apparatus or client product executes, the steps may be executed sequentially or in parallel according to the embodiments or the methods shown in the drawings (for example, in a parallel-processor or multithreaded environment, or even a distributed data processing environment). The terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, product, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such a process, method, product, or device. Without further limitation, the presence of additional identical or equivalent elements in a process, method, product, or device that includes the stated elements is not excluded. Words such as "first" and "second" are used to indicate names and do not indicate any particular order.
Those skilled in the art also know that, in addition to implementing a controller purely as computer-readable program code, it is entirely possible to logically program the method steps so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for implementing various functions may also be regarded as structures within the hardware component. Alternatively, the means for implementing various functions may be regarded both as software modules that implement the method and as structures within the hardware component.
This specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, classes, and the like that perform specific tasks or implement specific abstract data types. This specification may also be practiced in distributed computing environments, in which tasks are executed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
From the description of the above embodiments, those skilled in the art can clearly understand that this specification can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of this specification, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a mobile terminal, a server, a network device, or the like) to execute the methods described in the embodiments of this specification or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner. For the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. This specification can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and so on.
Although this specification has been described by way of embodiments, those of ordinary skill in the art will appreciate that many variations and changes can be made to this specification without departing from its spirit, and it is intended that the appended claims cover these variations and changes without departing from the spirit of this specification.