WO2017068228A1 - Method and apparatus for optimization - Google Patents


Publication number
WO2017068228A1
Authority
WO
WIPO (PCT)
Application number
PCT/FI2015/050705
Other languages
French (fr)
Inventor
Troels Frimodt RØNNOW
Original Assignee
Nokia Technologies Oy
Application filed by Nokia Technologies Oy
Priority to PCT/FI2015/050705
Publication of WO2017068228A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models

Definitions

  • The first unit 300 may select a percentile value to be used to define which edges will be included and which will be left out when forming clusters.
  • For example, the first unit may select one or more of the following alternative percentiles: 12.5; 25; 37.5; 50; 62.5; 75; 87.5 and 100.
  • These percentiles are non-limiting examples; the first unit 300 may also select other percentile values.
  • The edges of the outcome of the function E' may be treated as a list of edges sorted according to their weights.
  • The selected percentile may then define a so-called break point in the list, below which edges will be left out. In other words, edges having a smaller weight than the edge at the break point will be left out, and the remaining edges will be used in the cluster formation.
  • Figures 1c—1j illustrate new graphs defined from various percentiles of the included edges (12.5; 25; 37.5; 50; 62.5; 75; 87.5 and 100, respectively).
  • In the 12.5 and 25 percentile cases, illustrated in Figures 1c and 1d respectively, there is only one cluster.
  • In the 37.5 percentile case, illustrated in Figure 1e, there are two clusters.
  • In the 50, 62.5 and 75 percentile cases, illustrated in Figures 1f, 1g and 1h respectively, there are three clusters.
  • In the 87.5 percentile case, illustrated in Figure 1i, there are five clusters, and in the 100 percentile case, illustrated in Figure 1j, there are seven clusters.
  • the clusters obtained may be provided to the second unit 310 which may realize e.g. the Markov chain or another method for solving the minimization problem.
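The percentile-based cluster formation described in the bullets above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function name, the dictionary-of-edges layout and the use of union-find for detecting the clusters are assumptions made for the example; the patent only requires that the clusters be detectable in polynomial time.

```python
from collections import defaultdict


def clusters_from_percentile(vertices, edges, percentile, f=abs):
    """Transform edge weights with f (E' = f(E)), sort them, drop the
    edges below the percentile break point, and return the connected
    components (clusters) of the remaining graph."""
    # Sort edges by their transformed weight, smallest first.
    ranked = sorted((f(w), u, v) for (u, v), w in edges.items())
    # Break point: edges below it are left out of the cluster formation.
    cut = int(len(ranked) * percentile / 100.0)
    kept = ranked[cut:]

    # Union-find: clusters are detected in near-linear (polynomial) time.
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for _, u, v in kept:
        parent[find(u)] = find(v)

    clusters = defaultdict(set)
    for v in vertices:
        clusters[find(v)].add(v)
    return list(clusters.values())
```

With the 100 percentile every edge is removed and every vertex becomes its own cluster, matching the single-spin-update case of Figure 1j.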
  • The problem of an Ising spin glass with Chimera connectivity is considered, where two nearest-neighbour cells are connected by a single bond and where the bonds in a single unit cell strongly couple the spins within the cell.
  • An example of this is depicted in Figures 2a and 2b.
  • The function f(V,E) may be chosen to be such that every edge that belongs to the same unit cell gets a high value and those that connect two different unit cells get a lower value. This may cause the unit cells to be identified as clusters and may help overcome the barriers in the problem.
  • the speedup from applying this technique may be, for example, three or four orders of magnitude.
  • this method may change the asymptotic scaling of the characteristic problem.
  • the couplings illustrated with solid lines are ferromagnetic and the couplings illustrated with dashed lines are anti-ferromagnetic.
  • the second example may also be realised in a version that does not require knowledge of the graph structure.
  • A function f(V,E) may be defined which may be constructed as follows: to assign a new value to the edge e ∈ E connecting v1 with v2, the edge is removed and the shortest path between the two vertices is computed.
  • The new edge value is then set to the reciprocal of the total number of edges which need to be traversed in order to go from the first vertex v1 to the second vertex v2.
  • The result is that an edge connecting two unit cells will be attributed the value 7⁻¹ (i.e. the shortest path after removing the edge goes via 7 edges) and all other edges will get the value 3⁻¹, again providing a distinction to improve performance of the annealing algorithm. In this case, no knowledge of the graph structure has been used.
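The shortest-path reweighting just described can be sketched with a breadth-first search. This is an illustrative reading of the bullets above, not the patent's implementation: the function names and the convention of assigning 0 to an edge with no alternative path are assumptions.

```python
from collections import deque


def reweight_edges(vertices, edges):
    """For each edge (v1, v2): remove it, compute the shortest remaining
    path from v1 to v2 by breadth-first search, and set the new edge value
    to the reciprocal of that path's length in edges."""
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    def shortest_without(src, dst, banned):
        # BFS distance from src to dst, never crossing the banned edge.
        seen, queue = {src}, deque([(src, 0)])
        while queue:
            node, dist = queue.popleft()
            if node == dst:
                return dist
            for nxt in adjacency[node]:
                if {node, nxt} == banned or nxt in seen:
                    continue
                seen.add(nxt)
                queue.append((nxt, dist + 1))
        return None  # no alternative path

    new_values = {}
    for u, v in edges:
        dist = shortest_without(u, v, banned={u, v})
        # Assumed convention: edges with no alternative path get value 0.
        new_values[(u, v)] = 0.0 if dist is None else 1.0 / dist
    return new_values
```

In a Chimera-style graph this assigns inter-cell edges a lower value (e.g. 7⁻¹) than intra-cell edges (3⁻¹), without using any explicit knowledge of the graph structure.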
  • the first unit 300 may also be realised by choosing other methods for identifying structures by, for instance, using graph theoretical methods, heuristic algorithms or other approaches to assign scores to edges.
  • The first unit 300 may or may not include knowledge about the graph structure, and it may or may not make use of the edge values Jij as well as the local fields hi.
  • the first unit 300 and the second unit 310 may operate interactively and iteratively, or the first unit 300 may first form one or more propositions (sets of clusters) to be used by the second unit 310. If they operate interactively and iteratively, the first unit 300 may define one set of clusters by using e.g. one of the above described methods and provide the set of clusters to the second unit 310. As an example, the first unit 300 may first use a first percentile such as 12.5, construct the clusters accordingly and provide the result to the second unit 310. The second unit 310 may try to find an optimum solution and if the second unit 310 determines that the optimum solution has been found, the operation may be stopped and the result may be provided for further processing, e.g. for displaying.
  • Otherwise, the second unit 310 may request the first unit 300 to provide a new proposal for optimization.
  • The first unit 300 may then, for example, select a second percentile value such as 25, construct the clusters accordingly and provide the new result to the second unit 310. This procedure may be continued until the second unit 310 has found a result which fulfils the optimization criteria, or until another termination condition has been fulfilled.
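The iterative interplay between the first unit 300 and the second unit 310, stepping through percentiles until the optimization criteria are met, can be sketched as a simple control loop. The callables `propose_clusters` and `try_solve` stand in for the two units; their names and signatures are invented for this sketch.

```python
def optimise(vertices, edges, propose_clusters, try_solve,
             percentiles=(12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100)):
    """Step through percentile proposals from the first unit until the
    second unit reports that its optimization criteria are fulfilled."""
    best = None
    for percentile in percentiles:
        clusters = propose_clusters(vertices, edges, percentile)  # first unit
        result, done = try_solve(clusters)                        # second unit
        if best is None or result < best:
            best = result
        if done:  # optimum found: stop and return for further processing
            break
    return best
```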
  • FIG. 3 illustrates an example of a computing device 100 in which the computing circuitry 200 may be utilized.
  • The computing device 100 may comprise the computing circuitry 200 implemented e.g. in an FPGA circuit or other programmable circuitry.
  • the computing circuitry 200 may comprise the first unit 300 and the second unit 310, or there may be separate computing units for the first unit 300 and the second unit 310.
  • the inputs and outputs of the computing circuitry 200 may be connected to an interface circuitry 104 which comprises means for providing information to the computing circuitry 200, e.g. to initialize some parameters and initial values for the couplings Jij e.g. by setting local floating point memory into appropriate values, and for obtaining information from the computing circuitry 200.
  • Information obtained from the computing circuitry 200 may comprise e.g. computation results.
  • The computing device 100 may also comprise a display 110 for displaying information to the user, and a keyboard 112 and/or another input device so that the user may control the operation of the computing device 100 and input parameters, variables etc. to be used by the computing circuitry 102.
  • The processor 116 may be implemented in the same chip 210 as the simulated annealing units 200, as is depicted in Figure 3a, wherein the interface 104, the memory 118 or a part of it and/or some other elements of the computing device may also be part of the chip 210; or the processor 116 and possibly also the interface 104 may be separate from the chip, as is depicted in Figure 3b.
  • The computing unit 200 may comprise one or more controllers in addition to the processor 116.
  • Non-volatile media include, for example, optical or magnetic disks, such as a storage device.
  • Volatile media include, for example, dynamic memory 118.
  • Transmission media include, for example, coaxial cables, copper wire, fibre optic cables, and carrier waves that travel through space without wires or cables, such as acoustic waves and electromagnetic waves, including radio, optical and infrared waves. Signals include man-made transient variations in amplitude, frequency, phase, polarization or other physical properties transmitted through the transmission media.
  • Computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, CDRW, DVD, any other optical medium, punch cards, paper tape, optical mark sheets, any other physical medium with patterns of holes or other optically recognizable indicia, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave, or any other medium from which a computer can read.
  • Embodiments of the present invention may be implemented in software, hardware, application logic or a combination of software, hardware and application logic.
  • the application logic, software or an instruction set is maintained on any one of various conventional computer-readable media.
  • a "computer-readable medium" may be any media or means that can contain, store, communicate, propagate or transport the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer, with one example of a computer described and depicted in Figures 17 and 18.
  • a computer-readable medium may comprise a computer-readable storage medium that may be any media or means that can contain or store the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer.
  • the various embodiments of the invention may be implemented in hardware or special purpose circuits or any combination thereof. While various aspects of the invention may be illustrated and described as block diagrams or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
  • Embodiments of the inventions may be practiced in various components such as integrated circuit modules.
  • the design of integrated circuits is by and large a highly automated process.
  • Complex and powerful software tools are available for converting a logic level design into a semiconductor circuit design ready to be etched and formed on a semiconductor substrate.

Abstract

An approach is provided for solving optimization problems. The present invention also relates to a method for obtaining clusters from a graph. The method comprises receiving information on vertices and edges coupled between the vertices relating to an optimization problem, receiving a selection criteria, applying a function on the vertices and edges to obtain a modified graph, sorting edges of the modified graph according to one or more sorting criteria, and removing edges on the basis of the selection criteria. The vertices and edges remaining in the modified graph form one or more clusters. The method further comprises providing the clusters for solving the optimization problem. There are also disclosed apparatuses for implementing the method and a computer readable storage medium stored with code thereon for implementing the method.

Description

METHOD AND APPARATUS FOR OPTIMIZATION
TECHNOLOGICAL FIELD
The present invention relates generally to solving optimization problems. More particularly, the present invention relates to a method for finding a solution to binary optimization problems. The present invention also relates to apparatuses and computer program products for implementing the method and circuitry relating to the binary optimization.
BACKGROUND
This section is intended to provide a background or context to the invention that is recited in the claims. The description herein may include concepts that could be pursued, but are not necessarily ones that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, what is described in this section is not prior art to the description and claims in this application and is not admitted to be prior art by inclusion in this section.
Quadratic binary optimization (QUBO) is a particular type of problem. Such problems may be both extremely difficult for digital computers and practically important. Much cutting-edge artificial intelligence (AI) work involves solving such problems. A Boltzmann machine may solve binary optimization problems and may accelerate sampling from a Boltzmann distribution.
An aim of quadratic binary optimization is to minimize a quadratic function in which the decision variables may only take certain discrete values, such as +1 and -1. The idea of quadratic binary optimization may be adapted to different kinds of programmable circuits. Quadratic binary optimization problems may arise in operational research (such as planning, scheduling and routing), finance (such as portfolio optimization), physics (such as spin glasses), machine learning and many other fields.
SOME EXEMPLARY EMBODIMENTS
An aim is to obtain an apparatus and method for solving optimization problems. According to one embodiment, an apparatus comprises a first input for receiving information on vertices and edges coupled between the vertices relating to an optimization problem;
a second input for inputting a selection criteria;
a first element for applying a function on the vertices and edges to obtain a modified graph; a second element for sorting edges of the modified graph according to one or more sorting criteria; and
a third element for removing edges on the basis of the selection criteria, wherein vertices and edges remaining in the modified graph form one or more clusters,
wherein the apparatus is adapted to provide the clusters for solving the optimization problem.
According to one embodiment, a method comprises
receiving information on vertices and edges coupled between the vertices relating to an optimization problem;
receiving a selection criteria;
applying a function on the vertices and edges to obtain a modified graph;
sorting edges of the modified graph according to one or more sorting criteria;
removing edges on the basis of the selection criteria, wherein vertices and edges remaining in the modified graph form one or more clusters; and
providing the clusters for solving the optimization problem.
According to one embodiment, an apparatus comprises at least one processor; and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform at least the following:
receive information on vertices and edges coupled between the vertices relating to an optimization problem;
receive a selection criteria;
apply a function on the vertices and edges to obtain a modified graph;
sort edges of the modified graph according to one or more sorting criteria;
remove edges on the basis of the selection criteria, wherein vertices and edges remaining in the modified graph form one or more clusters; and
provide the clusters for solving the optimization problem.
According to one embodiment there is provided a computer readable storage medium stored with code thereon for use by an apparatus, which when executed by a processor, causes the apparatus to perform: receive information on vertices and edges coupled between the vertices relating to an optimization problem;
receive a selection criteria;
apply a function on the vertices and edges to obtain a modified graph;
sort edges of the modified graph according to one or more sorting criteria;
remove edges on the basis of the selection criteria, wherein vertices and edges remaining in the modified graph form one or more clusters; and
provide the clusters for solving the optimization problem.
According to one embodiment, an apparatus comprises:
means for receiving information on vertices and edges coupled between the vertices relating to an optimization problem;
means for receiving a selection criteria;
means for applying a function on the vertices and edges to obtain a modified graph;
means for sorting edges of the modified graph according to one or more sorting criteria;
means for removing edges on the basis of the selection criteria, wherein vertices and edges remaining in the modified graph form one or more clusters; and
means for providing the clusters for solving the optimization problem.
Still other aspects, features, and advantages of the invention are readily apparent from the following detailed description, simply by illustrating a number of particular embodiments and implementations.
The invention is also capable of other and different embodiments, and its several details can be modified in various obvious respects, all without departing from the spirit and scope of the invention. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.
There are provided examples of architectures for quantum annealing at finite temperature.
BRIEF DESCRIPTION OF THE DRAWINGS
The embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings:
FIG. 1a illustrates a fully connected graph for a single simulated annealing slice;
FIG. 1b illustrates the fully connected graph of FIG. 1a after applying a function; FIGS. 1c—1j illustrate new graphs defined from various percentiles of the included edges of the modified graph of FIG. 1b;
FIGS. 2a and 2b depict graphs regarding a problem of an Ising spin glass with Chimera connectivity, in accordance with an embodiment;
FIG. 3a is a diagram of some components of a computing apparatus comprising the computational architecture for binary optimization according to an exemplary embodiment; FIG. 3b is a diagram of some components of a computing apparatus comprising the computational architecture for binary optimization according to another exemplary embodiment; and
FIG. 4 is a flow diagram illustrating an example method, in accordance with an embodiment.
DESCRIPTION OF EXAMPLE EMBODIMENTS
In the following description, for the purposes of explanation, some specific details are set forth in order to provide a thorough understanding of the embodiments of the invention. It is apparent, however, to one skilled in the art that the embodiments may be practiced without these specific details or with an equivalent arrangement. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the embodiments of the invention.
Two example methods to perform optimisation are thermal annealing and quantum annealing. In these methods, a physical system may be used to solve hard problems. An example type of annealing may be carried out on Ising spin glasses. In this type of problem, configurations of a discrete problem may be sampled from a Gibbs distribution. The samples may be used to train machine learning algorithms or may be used to solve NP-hard problems. While one may be able to build a physical system, some aspects of the physical system may also be efficiently simulated. In the type of problems solved by annealing machines, one may be interested in sampling states s from a Gibbs distribution. Embodiments are provided to show how to implement quadratic binary optimization in hardware.
Binary optimization problems may occur in many fields of science and technology. Many scheduling and routing and planning tasks are binary optimization problems. Also problems in the field of machine learning can be formulated as binary optimization problems. One type of a problem encountered in binary optimization and machine learning is that of minimizing binary quadratic functionals
E = \sum_{j} \sum_{i<j} w_{ij} b_i b_j + \sum_{i} a_i b_i    (1)

where the b_i are binary variables and {w_ij} and {a_i} define a problem of interest. The minimization problem of Equation (1) is equivalent to finding the optimal configurations of an Ising spin glass with an energy functional given by

H = \sum_{j} \sum_{i<j} J_{ij} s_i s_j + \sum_{i} h_i s_i    (2)

where J_ij and h_i are floating-point values (possibly negative) and the s_i represent physical spins that can be either up or down. The variables J_ij represent interactions between spins s_i and s_j, and h_i represents an external field affecting the spin s_i. In other words, the s_i are variables with values {+1; -1}, and {J_ij} and {h_i} define a problem of interest. The energy functional, and hence {J_ij} and {h_i}, defines the particular distribution one may be interested in.
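The energy functional of Equation (2) can be evaluated directly. The sketch below assumes a dense upper-triangular matrix J (J[i][j] defined for i < j) and plain Python lists; these representations are choices made for the example, not part of the patent.

```python
def ising_energy(J, h, s):
    """H(s) = sum over i<j of J[i][j]*s[i]*s[j] + sum over i of h[i]*s[i],
    with spins s[i] in {+1, -1} and J stored as an upper-triangular matrix."""
    n = len(s)
    energy = sum(h[i] * s[i] for i in range(n))
    for j in range(n):
        for i in range(j):
            energy += J[i][j] * s[i] * s[j]
    return energy
```

With this sign convention, a positive J[i][j] penalizes aligned spins, i.e. the coupling is anti-ferromagnetic.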
Every configuration s yields a corresponding energy E_s ≡ H(s), which is minimal if s is an optimal solution. Often the optimal solution is not found, and it is therefore instructive to use the residual energy E_res = E_s - E_opt as a measure of how a given algorithm performs.
Many different techniques may be used to obtain good solutions for Equation (1). For example, standard programming techniques such as divide-and-conquer, backtracking, heuristic searches, quantum optimization hardware and in particular quantum annealing or thermal annealing may be used. In case of the latter, these methods may be applied by simulation or by building a system where the dynamics of the physics corresponds to the desired optimisation method.
Gibbs distributions can be sampled using Monte Carlo techniques. In Monte Carlo, one samples the distribution by suggesting a new configuration s' and accepting the move with the probability min(1, e^{-\beta (E_{s'} - E_s)}), where \beta is the inverse temperature. In many cases one suggests single-spin moves. This may lead to the problem that one does not get good samples from the distribution, but often ends up sampling from a subset of the distribution. This may lead to a slow reduction of the residual energy. While it is in general NP-hard to sample from a Gibbs distribution, Monte Carlo techniques may be efficient in some cases, and the above-mentioned problem can be addressed by suggesting multi-spin moves.
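A single-spin Metropolis update of the kind described above may be sketched as follows, assuming the Ising energy of Equation (2). The local-field shortcut for the energy change and the seeded random generator are implementation choices for this example, not details from the patent.

```python
import math
import random


def metropolis_sweep(J, h, s, beta, rng=None):
    """One sweep of single-spin Metropolis moves: propose flipping each spin
    in turn and accept with probability min(1, exp(-beta * delta_E))."""
    rng = rng or random.Random(0)  # seeded for reproducibility of the sketch
    n = len(s)
    for k in range(n):
        # Local field acting on spin k (J is upper triangular: J[i][j], i < j).
        local = h[k] + sum(J[min(i, k)][max(i, k)] * s[i]
                           for i in range(n) if i != k)
        delta_e = -2.0 * s[k] * local  # energy change of flipping s[k]
        if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
            s[k] = -s[k]
    return s
```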
Performing simulated thermal or quantum annealing to sample from a Gibbs distribution is in general a hard problem. Using Monte Carlo techniques may be efficient, but a single Markov chain may often be trapped in a local area of the state space due to the structure of a given problem. If one can locate the structure in the problem, the auto-correlation time in the sampling process may be lowered and, consequently, the efficiency of each Markov chain may improve.
In the physics literature, a cluster update is a method used for proposing updates in a Markov chain for Ising spin glasses.
In one example method usable in physics for identifying clusters, the cluster is predefined; in another example method, a heuristic algorithm is used to expand a given cluster with additional spins following a given rule set. In the following, an example embodiment of forming clusters is described in more detail.
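One possible heuristic rule set of the kind just mentioned (our illustration, not the patent's specific algorithm) is to grow a cluster from a seed spin by repeatedly absorbing neighbours whose coupling magnitude exceeds a threshold:

```python
# Hypothetical cluster-growing heuristic: expand from a seed spin by
# absorbing neighbours coupled more strongly than `threshold`.

def grow_cluster(seed, adj, threshold):
    """adj maps a vertex to a list of (neighbour, coupling) pairs."""
    cluster = {seed}
    frontier = [seed]
    while frontier:
        v = frontier.pop()
        for u, w in adj[v]:
            if u not in cluster and abs(w) > threshold:
                cluster.add(u)
                frontier.append(u)
    return cluster
```

Strongly coupled spins end up flipped together, while weakly coupled ones stay outside the cluster.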
In the following an example technique to identify a structure in Ising spin glasses will be described in more detail. The identified structure can then be used to sample states from the Gibbs distribution.
The implementation may consist of two units: a first unit 300 which analyses the graph and makes suggestions on how to update it, and a second unit 310 which may realise the Markov chain or another method for optimization. The first unit 300 may work as follows, as also illustrated in the flow diagram of Figure 4: given an Ising spin glass problem described by vertices V and edge weights E (block 402 in Figure 4), the edge weights of the graph may be transformed by applying a function E' = f(V,E) (block 404). The function f may be the identity. The edge weights E' are then sorted according to their magnitudes and possibly signs (block 406). Removing edges according to a given percentile or another removal criterion (block 408) will then define one or more clusters, which can be detected in polynomial time (block 410). These clusters may be provided to the second unit 310 (block 412), which may perform e.g. Monte Carlo moves. To illustrate this idea, some examples are given on how the first unit may work. First, a problem with full connectivity may be considered. Here the function f(V,E) is chosen to modify the edges simply by taking the absolute value. In this case, the highest percentile will just amount to single-spin updates (Figure 1j). The lower percentiles will form clusters depending on how strongly two spins are connected. As shown in Figures 1c-1j, some percentiles will in this example yield the same clusters; here those are the clusters relating to the 50 percentile (Figure 1f), the 62.5 percentile (Figure 1g) and the 75 percentile (Figure 1h). Similar clusters may be discarded. Figures 1a-1j summarise the cluster generation, in accordance with an embodiment. The thickness of the lines (edges between vertices) illustrates the magnitude of the edge weights: the thicker the line, the greater the magnitude. Figure 1a illustrates the original graph. In Figure 1b the outcome of the function E' is depicted.
The first unit 300 may select a percentile value to be used to define which edges will be included and which will be left out when forming clusters. For example, the first unit may select one or more of the following alternative percentiles: 12.5; 25; 37.5; 50; 62.5; 75; 87.5 and 100. However, these percentiles are just non-limiting examples, and the first unit 300 may also select other percentile values. The edges of the outcome of the function E' may be treated as a list of edges sorted according to their weights. The selected percentile may then define a so-called break point in the list below which edges will be left out. In other words, edges having a smaller weight than the edge at the break point will be left out, and the remaining edges will be used in the cluster formation. Figures 1c-1j illustrate new graphs defined from the various percentiles of included edges (12.5; 25; 37.5; 50; 62.5; 75; 87.5 and 100, respectively). In the 12.5 and 25 percentile cases, illustrated in Figures 1c and 1d respectively, there is only one cluster. In the 37.5 percentile case, illustrated in Figure 1e, there are two clusters. In the 50, 62.5 and 75 percentile cases, illustrated in Figures 1f, 1g and 1h respectively, there are three clusters. In the 87.5 percentile case, illustrated in Figure 1i, there are five clusters, and in the 100 percentile case, illustrated in Figure 1j, there are seven clusters. In the examples of Figures 1a-1j the number of vertices is 7 (N = 7).
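The percentile-based steps (blocks 404 to 410) can be sketched as follows. This is a sketch under our assumptions: the percentile is taken as the fraction of the sorted edge list that falls below the break point and is removed, clusters are the connected components of the remaining graph (detected with a union-find, hence in polynomial time), and the function names are ours.

```python
# Percentile-based cluster detection: transform edge weights with f,
# sort them, drop the edges below the break point, and return the
# connected components of what remains.

def percentile_clusters(n, edges, percentile, f=abs):
    """edges is a list of (u, v, weight); vertices are 0..n-1."""
    scored = sorted((f(w), u, v) for u, v, w in edges)   # ascending
    cut = round(len(scored) * percentile / 100.0)
    keep = scored[cut:]        # edges below the break point are left out

    parent = list(range(n))    # union-find over the retained edges
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for _, u, v in keep:
        parent[find(u)] = find(v)

    clusters = {}
    for v in range(n):
        clusters.setdefault(find(v), set()).add(v)
    return list(clusters.values())
```

With percentile 100 every edge is removed and each vertex becomes its own cluster, matching the single-spin-update case of Figure 1j; with percentile 0 no edge is removed and the whole graph is one cluster.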
The clusters obtained may be provided to the second unit 310, which may realize e.g. the Markov chain or another method for solving the minimization problem. As a second example, the problem of an Ising spin glass with Chimera connectivity is considered, where two nearest-neighbour cells are connected by a single bond and where the bonds in a single unit cell strongly couple the spins within the cell. An example of this is depicted in Figures 2a and 2b. Here, the function f(V,E) may be chosen such that every edge that belongs to the same unit cell gets a high value, and those that connect two different unit cells get a lower value. This may cause the unit cells to be identified as clusters and help overcome the barriers in the problem. In this case, the speedup from applying this technique may be, for example, three or four orders of magnitude. In addition, this method may change the asymptotic scaling of the characteristic problem. In the example graphs of Figures 2a and 2b the couplings illustrated with solid lines are ferromagnetic and the couplings illustrated with dashed lines are anti-ferromagnetic. The second example may also be realised in a version that does not require knowledge of the graph structure. For instance, a function f(V,E) may be constructed as follows: to assign a new value to the edge e ∈ E connecting v1 with v2, the edge is removed and the shortest path between the two vertices is computed. The new edge value is then set to the reciprocal of the total number of edges which need to be traversed in order to go from the first vertex v1 to the second vertex v2. In the case of the Chimera graph of Figures 2a and 2b, the result is that an edge connecting two unit cells will be attributed the value 1/7 (i.e. the shortest path after removing the edge goes via 7 edges) and all other edges will get the value 1/3, again providing a distinction that improves the performance of the annealing algorithm.
In this case, no knowledge of the graph structure has been used.
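The shortest-path construction for f(V,E) described above can be sketched as follows, assuming an unweighted breadth-first search is used for the path length in edge count; the function names are ours.

```python
import math
from collections import deque

# For each edge (u, v): remove the edge, find the shortest alternative
# path between u and v in edge count (BFS), and assign the edge the
# reciprocal of that detour length.

def reweight_by_detour(n, edges):
    adj = {v: set() for v in range(n)}
    for u, v, _ in edges:
        adj[u].add(v)
        adj[v].add(u)

    def detour(src, dst):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            x = queue.popleft()
            if x == dst:
                return dist[x]
            for y in adj[x]:
                # skip the removed edge itself
                if {x, y} == {src, dst} or y in dist:
                    continue
                dist[y] = dist[x] + 1
                queue.append(y)
        return math.inf            # no alternative path exists

    return [(u, v, 1.0 / detour(u, v)) for u, v, _ in edges]
```

On a four-vertex cycle, removing any edge leaves a three-edge detour, so every edge is assigned the value 1/3; on the Chimera example of Figures 2a and 2b, inter-cell edges would get 1/7 and intra-cell edges 1/3, as described above.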
The first unit 300 may also be realised by choosing other methods for identifying structures, for instance using graph-theoretical methods, heuristic algorithms or other approaches to assign scores to edges. The first unit 300 may or may not include knowledge about the graph structure, and it may or may not make use of the edge values Jij as well as the local fields hi.
The first unit 300 and the second unit 310 may operate interactively and iteratively, or the first unit 300 may first form one or more propositions (sets of clusters) to be used by the second unit 310. If they operate interactively and iteratively, the first unit 300 may define one set of clusters by using e.g. one of the above described methods and provide the set of clusters to the second unit 310. As an example, the first unit 300 may first use a first percentile such as 12.5, construct the clusters accordingly and provide the result to the second unit 310. The second unit 310 may try to find an optimum solution; if the second unit 310 determines that the optimum solution has been found, the operation may be stopped and the result may be provided for further processing, e.g. for displaying. However, if the second unit 310 determines that the optimum solution has not been found (e.g. by determining that the calculations do not converge or converge too slowly), the second unit 310 may request the first unit 300 to provide a new proposal for optimization. Hence, the first unit 300 may, for example, select a second percentile value such as 25, construct the clusters accordingly and provide the new result to the second unit 310. This procedure may be continued until the second unit 310 has found a result which fulfils the optimization criteria, or until another termination condition has been fulfilled.
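The iterative interaction just described might be sketched as a simple driver loop. The callback protocol, in which the first unit returns clusters for a given percentile and the second unit returns its current best result together with a convergence flag, is our assumption rather than the patent's interface.

```python
# Iterate over candidate percentiles until the second unit reports that
# its optimisation criteria are fulfilled (hypothetical protocol).

def iterate_until_converged(first_unit, second_unit,
                            percentiles=(12.5, 25, 37.5, 50,
                                         62.5, 75, 87.5, 100)):
    best = None
    for p in percentiles:
        clusters = first_unit(p)               # new cluster proposal
        best, converged = second_unit(clusters, best)
        if converged:                          # optimum (or good enough) found
            return best, p
    return best, percentiles[-1]               # termination condition reached
```

The loop stops at the first percentile for which the second unit signals convergence, mirroring the 12.5-then-25 progression described above.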
Figure 3 illustrates an example of a computing device 100 in which the computing circuitry 200 may be utilized. The computing device 100 may comprise the computing circuitry 200 implemented e.g. in an FPGA circuit or other programmable circuitry. The computing circuitry 200 may comprise the first unit 300 and the second unit 310, or there may be separate computing units for the first unit 300 and the second unit 310. The inputs and outputs of the computing circuitry 200 may be connected to an interface circuitry 104 which comprises means for providing information to the computing circuitry 200, e.g. to initialize some parameters and initial values for the couplings Jij, e.g. by setting local floating-point memory to appropriate values, and for obtaining information from the computing circuitry 200. Information obtained from the computing circuitry 200 may comprise e.g. computation results.
The computing device 100 may also comprise a display 110 for displaying information to the user, and a keyboard 112 and/or another input device, so that the user may control the operation of the computing device 100 and input parameters, variables etc. to be used by the computing circuitry 102. There may also be communication means 114 for communicating with a communication network such as the internet, a mobile communication network and/or another wireless or wired network. There may also be provided a processor 116 for controlling the operation of the computing device and the elements of the computing device. In accordance with an embodiment, the processor 116 may be implemented in the same chip 210 as the simulated annealing units 200, as is depicted in Figure 3a, wherein the interface 104, the memory 118 or a part of it and/or some other elements of the computing device may also be part of the chip 210; alternatively, the processor 116 and possibly also the interface 104 may be separate from the chip, as is depicted in Figure 3b. In accordance with an embodiment, the computing unit 200 may comprise one or more controllers in addition to the processor 116.
The term computer-readable medium is used herein to refer to any medium that participates in providing information to the processor 116, including instructions for execution. Such a medium may take many forms, including, but not limited to, non-volatile media, volatile media and transmission media. Non-volatile media include, for example, optical or magnetic disks, such as a storage device. Volatile media include, for example, dynamic memory 118. Transmission media include, for example, coaxial cables, copper wire, fibre optic cables, and carrier waves that travel through space without wires or cables, such as acoustic waves and electromagnetic waves, including radio, optical and infrared waves. Signals include man-made transient variations in amplitude, frequency, phase, polarization or other physical properties transmitted through the transmission media. Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, any other magnetic medium, a CD-ROM, CDRW, DVD, any other optical medium, punch cards, paper tape, optical mark sheets, any other physical medium with patterns of holes or other optically recognizable indicia, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave, or any other medium from which a computer can read.
Embodiments of the present invention may be implemented in software, hardware, application logic or a combination of software, hardware and application logic. In an example embodiment, the application logic, software or an instruction set is maintained on any one of various conventional computer-readable media. In the context of this document, a "computer-readable medium" may be any media or means that can contain, store, communicate, propagate or transport the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer, with one example of a computer described and depicted in Figures 17 and 18. A computer-readable medium may comprise a computer-readable storage medium that may be any media or means that can contain or store the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer. In general, the various embodiments of the invention may be implemented in hardware or special purpose circuits or any combination thereof. While various aspects of the invention may be illustrated and described as block diagrams or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
Embodiments of the inventions may be practiced in various components such as integrated circuit modules. The design of integrated circuits is by and large a highly automated process. Complex and powerful software tools are available for converting a logic level design into a semiconductor circuit design ready to be etched and formed on a semiconductor substrate.
Programs, such as those provided by Synopsys, Inc. of Mountain View, California and Cadence Design of San Jose, California, automatically route conductors and locate components on a semiconductor chip using well-established rules of design as well as libraries of pre-stored design modules. Once the design for a semiconductor circuit has been completed, the resultant design, in a standardized electronic format (e.g., Opus, GDSII, or the like) may be transmitted to a semiconductor fabrication facility or "fab" for fabrication. While the invention has been described in connection with a number of embodiments and implementations, the invention is not so limited but covers various obvious modifications and equivalent arrangements, which fall within the purview of the appended claims. Although features of the invention are expressed in certain combinations among the claims, it is contemplated that these features can be arranged in any combination and order.

Claims

1. An apparatus comprising:
a first input for receiving information on vertices and edges coupled between the vertices relating to an optimization problem, said edges having a weight;
a second input for inputting a selection criteria;
a first element for applying a function on the vertices and edges to obtain a modified graph;
a second element for sorting edges of the modified graph according to one or more sorting criteria; and
a third element for removing edges on the basis of the selection criteria, wherein vertices and edges remaining in the modified graph form one or more clusters, wherein the apparatus is adapted to provide the clusters for solving the optimization problem.
2. The apparatus according to claim 1 comprising:
a sorter adapted to sort edges according to their weights;
a selector adapted to select one or more percentile values as said selection criteria, which defines a break point in the list; and
a constructor adapted to form one or more clusters of vertices by removing edges which are at a first side of the break point and using edges which are on a second side of the break point.
3. The apparatus according to claim 2, wherein edges at the first side of the break point have smaller weight than edges on the second side.
4. The apparatus according to claim 1, 2 or 3, wherein the function is at least one of the
following:
an identity function;
an absolute value.
5. The apparatus according to claim 1, 2 or 3, wherein the graph comprises:
two or more unit cells having vertices and at least one bond between one or more vertices of the unit cell; and
at least one bond between two unit cells; wherein the apparatus is further adapted to use a function which assigns higher weight for bonds within a unit cell than bonds between two unit cells.
6. The apparatus according to claim 1, 2 or 3, wherein the apparatus is adapted to use a
function which assigns weights to edges as follows:
selecting an edge connecting two vertices;
finding a shortest path other than said edge between said two vertices;
setting a new weight on the basis of the number of edges of the shortest path; and assigning the new weight to said edge.
7. A method comprising:
receiving information on vertices and edges coupled between the vertices relating to an optimization problem;
receiving a selection criteria;
applying a function on the vertices and edges to obtain a modified graph;
sorting edges of the modified graph according to one or more sorting criteria;
removing edges on the basis of the selection criteria, wherein vertices and edges remaining in the modified graph form one or more clusters; and
providing the clusters for solving the optimization problem.
8. The method according to claim 7 comprising:
sorting edges according to their weights;
selecting one or more percentile values as said selection criteria, which defines a break point in the list; and
forming one or more clusters of vertices by removing edges which are at a first side of the break point and using edges which are on a second side of the break point.
9. The method according to claim 8, wherein edges at the first side of the breakpoint have smaller weight than edges on the second side.
10. The method according to claim 7, 8 or 9 wherein the function is at least one of the
following:
an identity function;
an absolute value.
11. The method according to claim 7, 8 or 9 wherein the graph comprises:
two or more unit cells having vertices and at least one bond between one or more vertices of the unit cell; and
at least one bond between two unit cells;
wherein the method further comprises using a function which assigns higher weight for bonds within a unit cell than bonds between two unit cells.
12. The method according to claim 7, 8 or 9, wherein in the method a function is used which assigns weights to edges as follows:
selecting an edge connecting two vertices;
finding a shortest path other than said edge between said two vertices;
setting a new weight on the basis of the number of edges of the shortest path; and assigning the new weight to said edge.
13. An apparatus comprises at least one processor; and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform at least the following:
receive information on vertices and edges coupled between the vertices relating to an optimization problem;
receive a selection criteria;
apply a function on the vertices and edges to obtain a modified graph;
sort edges of the modified graph according to one or more sorting criteria;
remove edges on the basis of the selection criteria, wherein vertices and edges remaining in the modified graph form one or more clusters; and
provide the clusters for solving the optimization problem.
14. The apparatus according to claim 13, said at least one memory including computer
program code configured to, with the at least one processor, cause the apparatus to perform at least the following:
sort edges according to their weights;
select one or more percentile values as said selection criteria, which defines a break point in the list; and
form one or more clusters of vertices by removing edges which are at a first side of the break point and using edges which are on a second side of the break point.
15. The apparatus according to claim 14, wherein edges at the first side of the break point have smaller weight than edges on the second side.
16. The apparatus according to claim 13, 14 or 15, wherein the function is at least one of the following:
an identity function;
an absolute value.
17. The apparatus according to claim 13, 14 or 15, wherein the graph comprises:
two or more unit cells having vertices and at least one bond between one or more vertices of the unit cell; and
at least one bond between two unit cells;
wherein the apparatus is further adapted to use a function which assigns higher weight for bonds within a unit cell than bonds between two unit cells.
18. The apparatus according to claim 13, 14 or 15, said at least one memory including
computer program code configured to, with the at least one processor, cause the apparatus to use a function which assigns weights to edges as follows:
selecting an edge connecting two vertices;
finding a shortest path other than said edge between said two vertices;
setting a new weight on the basis of the number of edges of the shortest path; and assigning the new weight to said edge.
19. A computer readable storage medium stored with code thereon for use by an apparatus, which when executed by a processor, causes the apparatus to perform:
receive information on vertices and edges coupled between the vertices relating to an optimization problem;
receive a selection criteria;
apply a function on the vertices and edges to obtain a modified graph;
sort edges of the modified graph according to one or more sorting criteria;
remove edges on the basis of the selection criteria, wherein vertices and edges remaining in the modified graph form one or more clusters; and
provide the clusters for solving the optimization problem.
20. An apparatus comprising:
means for receiving information on vertices and edges coupled between the vertices relating to an optimization problem;
means for receiving a selection criteria;
means for applying a function on the vertices and edges to obtain a modified graph; means for sorting edges of the modified graph according to one or more sorting criteria; means for removing edges on the basis of the selection criteria, wherein vertices and edges remaining in the modified graph form one or more clusters; and
means for providing the clusters for solving the optimization problem.
21. The apparatus according to claim 20 comprising:
means for sorting edges according to their weights;
means for selecting one or more percentile values as said selection criteria, which defines a break point in the list; and
means for forming one or more clusters of vertices by removing edges which are at a first side of the break point and using edges which are on a second side of the break point.
22. The apparatus according to claim 21, wherein edges at the first side of the break point have smaller weight than edges on the second side.
23. The apparatus according to claim 20, 21 or 22, wherein the function is at least one of the following:
an identity function;
an absolute value.
24. The apparatus according to claim 20, 21 or 22, wherein the graph comprises:
two or more unit cells having vertices and at least one bond between one or more vertices of the unit cell; and
at least one bond between two unit cells;
wherein the apparatus is further adapted to use a function which assigns higher weight for bonds within a unit cell than bonds between two unit cells.
25. The apparatus according to claim 20, 21 or 22, wherein the apparatus comprises means for using a function which assigns weights to edges as follows:
selecting an edge connecting two vertices; finding a shortest path other than said edge between said two vertices;
setting a new weight on the basis of the number of edges of the shortest path; and assigning the new weight to said edge.
PCT/FI2015/050705 2015-10-19 2015-10-19 Method and apparatus for optimization WO2017068228A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/FI2015/050705 WO2017068228A1 (en) 2015-10-19 2015-10-19 Method and apparatus for optimization

Publications (1)

Publication Number Publication Date
WO2017068228A1 true WO2017068228A1 (en) 2017-04-27

Family

ID=54548205

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/FI2015/050705 WO2017068228A1 (en) 2015-10-19 2015-10-19 Method and apparatus for optimization

Country Status (1)

Country Link
WO (1) WO2017068228A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019244105A1 (en) * 2018-06-22 2019-12-26 1Qb Information Technologies Inc. Method and system for identifying at least one community in a dataset comprising a plurality of elements
US11514134B2 (en) 2015-02-03 2022-11-29 1Qb Information Technologies Inc. Method and system for solving the Lagrangian dual of a constrained binary quadratic programming problem using a quantum annealer
US11797641B2 (en) 2015-02-03 2023-10-24 1Qb Information Technologies Inc. Method and system for solving the lagrangian dual of a constrained binary quadratic programming problem using a quantum annealer
US11947506B2 (en) 2019-06-19 2024-04-02 1Qb Information Technologies, Inc. Method and system for mapping a dataset from a Hilbert space of a given dimension to a Hilbert space of a different dimension

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110234594A1 (en) * 2010-03-26 2011-09-29 Microsoft Corporation Graph clustering
US20130222388A1 (en) * 2012-02-24 2013-08-29 Callum David McDonald Method of graph processing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SURENDER BASWANA ET AL: "A simple and linear time randomized algorithm for computing sparse spanners in weighted graphs", RANDOM STRUCTURES AND ALGORITHMS., vol. 30, no. 4, 31 July 2007 (2007-07-31), US, pages 532 - 563, XP055283526, ISSN: 1042-9832, DOI: 10.1002/rsa.20130 *


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15795210

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15795210

Country of ref document: EP

Kind code of ref document: A1