CN108008999B - Index evaluation method and device - Google Patents
- Publication number
- CN108008999B (application CN201610944177.6A)
- Authority
- CN
- China
- Prior art keywords
- coefficient
- product
- norm
- optimization
- instruction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45504—Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
- G06F9/45508—Runtime interpretation or emulation, e g. emulator loops, bytecode interpretation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
- G06F11/3672—Test management
- G06F11/3688—Test management for test execution, e.g. scheduling of test suites
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Devices For Executing Special Programs (AREA)
- Advance Control (AREA)
Abstract
The embodiment of the application provides an index evaluation method and device, and the method comprises the following steps: acquiring an instruction stream of a test program and dividing the instruction stream into N instruction segments with equal length; determining a feature matrix A, a mean column vector B and a weight vector W corresponding to the N instruction segments; determining an object optimization model of the test program according to the feature matrix A, the column vector B, the weight vector W and a plurality of optimization objects, and solving an optimization column vector of the object optimization model; determining a number vector and a coefficient vector of an instruction segment corresponding to the optimized column vector; simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain characteristic index vectors of the instruction segments corresponding to the number vectors; and determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
Description
Technical Field
The embodiment of the application relates to the field of computers, in particular to an index evaluation method and device.
Background
Engineers who design and develop processor architectures often run test programs in a simulator of a given architecture and collect performance-related indicators, such as instructions per clock (IPC), the hit rate of the second-level cache (L2 Cache), and energy consumption, to find the bottleneck of the current processor architecture. After the architecture is improved, the new design is deployed in the simulator, the test program is run again, data are collected, and the performance of the same test program under the old and new architectures is compared to find the next bottleneck. A great deal of design and test work is therefore done in a software simulator before a design is deployed in hardware.
However, a major disadvantage of a software simulation platform is that the same test program takes far longer to run than on a hardware platform. For large, comprehensive test suites such as SPEC CPU 2006, it often takes weeks or even months to obtain the required data. Moreover, every time the architecture is changed, the test program must be run again to collect data under the new architecture. Such repeated runs and waits seriously reduce development efficiency.
Research shows that the execution of an application program exhibits distinct phase behavior. Developers can therefore shorten the simulation time by simplifying the test program. The core idea is as follows: by running the original test program only once, a user obtains a group of artificial programs, each of which is very small and can be simulated quickly, so that simulating the whole group takes far less time than simulating the original test program; the user then multiplies the data obtained from simulating each artificial program by a specific weight, and the resulting weighted average is very close to the data obtained by running the original test program.
The existing SimPoint technique proceeds as follows: capture the instruction stream of a test program component, cut the instruction stream into segments (intervals) of equal length, and compute a basic block vector (BBV) for each interval; perform K-means clustering on all BBVs and select the interval closest to the center of each cluster as the representative interval of that cluster; simulate the selected representative intervals to measure their indexes, multiply the index measured for each representative interval by the proportion of the intervals in its cluster to the total number of intervals, and sum the results to obtain a weighted average that represents the index of the original test program component. In this prior art, too many segments are selected for simulation, the simulation time is long, and the error is uncontrollable.
How to reduce the number of instruction segments selected for simulation, so as to shorten the simulation time and/or reduce the error of the simulated index, is the technical problem to be solved by the embodiment of the present application.
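The final weighting step of the SimPoint-style procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the cluster labels are taken as given (the K-means step itself is not shown), and `simpoint_estimate`, `bbvs`, `labels` and `metric_of` are hypothetical names that do not come from the source.

```python
import numpy as np

def simpoint_estimate(bbvs, labels, metric_of):
    """Pick one representative interval per cluster and form the weighted average.

    bbvs      : (n_intervals, n_basic_blocks) array, one BBV per interval
    labels    : cluster id of each interval (assumed to come from K-means)
    metric_of : callable returning the simulated index of an interval number
    """
    bbvs = np.asarray(bbvs, dtype=float)
    labels = np.asarray(labels)
    representatives, estimate = [], 0.0
    for k in np.unique(labels):
        members = np.flatnonzero(labels == k)
        centroid = bbvs[members].mean(axis=0)
        # Representative interval: the member closest to the cluster center.
        dists = np.linalg.norm(bbvs[members] - centroid, axis=1)
        rep = int(members[np.argmin(dists)])
        # Weight: share of this cluster's intervals in the total.
        weight = members.size / labels.size
        estimate += weight * metric_of(rep)
        representatives.append(rep)
    return representatives, estimate
```

Only the representative intervals are passed to `metric_of`, which is where the saving in simulation time comes from.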
Disclosure of Invention
The embodiment of the application provides an index evaluation method and device, which can reduce the number of instruction segments selected for simulation, so as to shorten the simulation time, and can reduce the error of the simulated index, so that the simulation efficiency or precision of a processor can be improved.
In a first aspect, an index evaluation method is provided, including:
acquiring an instruction stream of a test program and dividing the instruction stream into N instruction segments with equal length;
determining a feature matrix A, a mean column vector B and a weight vector W corresponding to the N instruction segments, wherein the feature matrix A is a matrix with M rows and N columns and is used for describing feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean value of the feature index vectors of each row of the feature matrix A, and the weight vector W is a column vector formed by the weight occupied by the feature index of the basic block BB type corresponding to each row in the feature matrix A in a test program;
determining an optimized column vector according to the feature matrix A, the column vector B, the weight vector W and an optimization target, wherein the optimization target comprises reducing the number of instruction fragments for simulation and/or reducing an error between a feature index for evaluating the test program and the feature index of the test program, and the optimized column vector is used for representing the instruction fragments for simulation in the N instruction fragments and the weight corresponding to each instruction fragment;
determining a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector, wherein the number vector is used for indicating the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used for indicating the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector;
simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain characteristic index vectors of the instruction segments corresponding to the number vectors, wherein the characteristic index vectors are used for expressing characteristic indexes of the instruction segments corresponding to the number vectors;
and determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
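The construction of A, B and W in the steps above can be sketched as follows. This is a minimal sketch under assumptions that the source does not confirm: the instruction stream is represented as a trace of basic-block identifiers, the feature of a segment is its basic-block execution count, and W is taken as each basic-block type's share of the whole trace. The name `build_features` is hypothetical.

```python
import numpy as np

def build_features(bb_trace, n_segments):
    """Return (A, B, W) for a trace of basic-block ids cut into equal segments."""
    trace = np.asarray(bb_trace)
    seg_len = trace.size // n_segments
    trace = trace[: seg_len * n_segments]        # drop the tail so segments are equal
    bb_types, codes = np.unique(trace, return_inverse=True)
    m = bb_types.size                            # M: number of basic-block types
    A = np.zeros((m, n_segments))                # M rows, N columns
    for j, seg in enumerate(codes.reshape(n_segments, seg_len)):
        A[:, j] = np.bincount(seg, minlength=m)  # column j describes segment j
    B = A.mean(axis=1)                           # mean of each row of A
    W = A.sum(axis=1) / A.sum()                  # assumed weight: share of the trace
    return A, B, W
```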
In the embodiment of the application, an optimized column vector is determined based on the feature matrix, the column vector and the weight vector of the instruction segments of a test program and on an optimization target; the instruction segments and the coefficient vector corresponding to the non-zero positions are determined according to the optimized column vector; those instruction segments are simulated in the simulation device to obtain their feature index vector; and a feature index for evaluating the test program is obtained from the feature index vector and the coefficient vector. The number of instruction segments required for the simulation work of the test program can thereby be reduced, or the simulation index error of the test program can be reduced.
With reference to the first aspect, in a first possible implementation manner, the optimization objective is: minimizing a third norm of X when the first norm of W·AX - W·B or the second norm of AX - B is less than or equal to a first threshold, where X is an optimization variable used to represent the instruction segments used for simulation among the N instruction segments and the weight corresponding to each instruction segment, and the optimized column vector is the value of X that satisfies the optimization objective.
The method of the embodiment of the application can reduce the number of instruction segments required for the simulation work of the test program while keeping the simulation index error within the required precision.
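The last three steps of the method (number vector, coefficient vector, inner product) can be sketched as follows. The function names and the `simulate` callback are hypothetical; in practice the callback would invoke the simulation device for one instruction segment and return its characteristic index.

```python
import numpy as np

def select_segments(x_opt, tol=1e-12):
    """Number vector and coefficient vector of the non-zero positions of x_opt."""
    x_opt = np.asarray(x_opt, dtype=float)
    numbers = np.flatnonzero(np.abs(x_opt) > tol)   # segment numbers to simulate
    coeffs = x_opt[numbers]                         # weight of each selected segment
    return numbers, coeffs

def evaluate_index(numbers, coeffs, simulate):
    """Simulate only the selected segments and return the inner product."""
    metrics = np.array([simulate(int(i)) for i in numbers])
    return float(metrics @ coeffs)
```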
With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, determining an optimized column vector according to the feature matrix A, the column vector B, the weight vector W, and an optimization goal is specifically implemented by determining the optimized column vector through the following goal optimization formula:
$$\min_X \ \|X\|_{t3} \quad \text{s.t.} \quad \|W \cdot AX - W \cdot B\|_{t1} \le c \ \ \text{or} \ \ \|AX - B\|_{t2} \le c$$
where $\|W \cdot AX - W \cdot B\|_{t1}$ represents the first norm of W·AX - W·B, $\|AX - B\|_{t2}$ represents the second norm of AX - B, $\|X\|_{t3}$ represents the third norm of X, and c represents the first threshold.
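To make the roles of the three norms and the threshold c concrete, the following sketch evaluates the objective and the constraint for a candidate X. It is an illustration under assumptions: W·AX is read as element-wise weighting of the rows, and the norm orders t1, t2 and t3 are free parameters (the defaults are arbitrary choices). The name `sparsity_objective` is hypothetical.

```python
import numpy as np

def sparsity_objective(A, B, W, X, c, t1=2, t2=2, t3=1, weighted=True):
    """Return (third norm of X, whether the error constraint holds)."""
    A, B, W, X = (np.asarray(v, dtype=float) for v in (A, B, W, X))
    residual = A @ X - B
    if weighted:
        # Assumed reading of W.AX - W.B: each row is scaled by its weight.
        err = np.linalg.norm(W * residual, ord=t1)
    else:
        err = np.linalg.norm(residual, ord=t2)
    return float(np.linalg.norm(X, ord=t3)), bool(err <= c)
```

A solver for this formulation would search for the X with the smallest first return value among those for which the second return value is true.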
With reference to the first aspect, in a third possible implementation manner of the first aspect, the optimization objective is: when a first norm of X is less than or equal to a second threshold, minimizing a sum of a first product and a second product, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number, where X is an optimization variable used to represent instruction fragments for performing simulation among the N instruction fragments and a weight corresponding to each instruction fragment, and the optimization column vector is a value of X that satisfies the optimization goal.
The method of the embodiment of the application can reduce the simulation index error on the premise of controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the third possible implementation manner of the first aspect, in a fourth possible implementation manner of the first aspect, determining an optimized column vector according to the feature matrix A, the column vector B, the weight vector W, and an optimization goal is specifically implemented by determining the optimized column vector through the following goal optimization formula:
$$\min_X \ \lambda \|W \cdot AX - W \cdot B\|_{t2} + \delta \|X\|_{t3} \quad \text{s.t.} \quad \|X\|_{t1} \le d$$
where d represents the second threshold, $\|X\|_{t1}$ represents the first norm of X, $\|W \cdot AX - W \cdot B\|_{t2}$ represents the second norm of W·AX - W·B, $\|X\|_{t3}$ represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, and λ ≠ 0.
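For very small N, the constrained problem above can be solved by exhaustive search, which makes the trade-off between the segment budget d and the error visible. The sketch below is not the solver of the embodiment; it assumes that the first norm counts non-zero elements, that the second norm is Euclidean, that δ = 0, and that W·AX means element-wise row weighting. The name `best_sparse_weights` is hypothetical.

```python
from itertools import combinations
import numpy as np

def best_sparse_weights(A, B, W, d):
    """Try every support of size <= d; return (error, support, X) of the best fit."""
    A, B, W = (np.asarray(v, dtype=float) for v in (A, B, W))
    n = A.shape[1]
    WA, WB = W[:, None] * A, W * B            # assumed element-wise weighting
    best = (np.inf, (), np.zeros(n))
    for size in range(1, d + 1):
        for support in combinations(range(n), size):
            cols = list(support)
            # Least-squares weights for the segments in this support.
            coef, *_ = np.linalg.lstsq(WA[:, cols], WB, rcond=None)
            err = np.linalg.norm(WA[:, cols] @ coef - WB)
            if err < best[0] - 1e-12:
                X = np.zeros(n)
                X[cols] = coef
                best = (err, support, X)
    return best
```

The cost grows combinatorially with N and d, which is why an iterative method is needed for realistic instruction streams.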
With reference to the first aspect, in a fifth possible implementation manner of the first aspect, the optimization objective is: minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [expression missing], the third product is the product of a third coefficient and a third norm of [expression missing], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S; A = L + S, where L is a low-rank matrix and S is a sparse matrix; [symbol missing] represents the column vector formed by the average values of the rows of the low-rank matrix L, and [symbol missing] represents the column vector formed by the average values of the rows of the sparse matrix S; the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero; X is an optimization variable used to represent the instruction segments used for simulation among the N instruction segments and the weight corresponding to each instruction segment, and the optimized column vector is the value of X that satisfies the optimization objective.
The method of the embodiment of the application can achieve certain balance between reduction of the number of the instruction fragments required by simulation work of the test program and reduction of simulation index errors.
With reference to the first aspect, in a sixth possible implementation manner of the first aspect, the optimization objective is: when a sixth norm of W·AX - W·B or a seventh norm of AX - B is less than or equal to a first threshold, and/or an eighth norm of X is less than or equal to a second threshold, minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [expression missing], the third product is the product of a third coefficient and a third norm of [expression missing], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S; A = L + S, where L is a low-rank matrix and S is a sparse matrix; [symbol missing] represents the column vector formed by the average values of the rows of the low-rank matrix L, and [symbol missing] represents the column vector formed by the average values of the rows of the sparse matrix S; the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero.
The method of the embodiment of the application can balance reduction of the number of the instruction segments required by simulation work of the test program and reduction of simulation index errors on the premise of ensuring precision of the simulation index errors and/or controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the first aspect and the foregoing implementation manners, in a seventh possible implementation manner of the first aspect, determining an optimized column vector according to the feature matrix A, the column vector B, the weight vector W, and an optimization goal is specifically implemented as:
step 1: establishing a target optimization model of the optimization target according to the feature matrix A, the column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f(X, Y, Z, μ) of the target optimization model, where X is an optimization variable used to represent the instruction segments used for simulation among the N instruction segments and the weight corresponding to each instruction segment, μ is a penalty parameter, Z is a Lagrange multiplier, and Y is a relaxation variable;
step 2: setting initial values of X, Y, Z and μ, and setting the optimization target as the convergence condition for completing the optimization;
step 3: calculating the optimal value of Y by using a least squares method;
step 4: calculating the optimal value of X by using a soft threshold method;
step 5: when the value of X does not satisfy the constraint condition of the target optimization model, removing one non-zero element from X and replacing X, until X satisfies the constraint condition of the target optimization model;
step 6: updating the Lagrange multiplier Z according to the residual of the equation X = Y, and increasing the penalty parameter μ by a fixed multiple ρ, where ρ is greater than 1;
step 7: judging whether the convergence condition is met; if so, determining that X is solved; if not, repeating steps 3 to 6.
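Steps 1 to 7 follow the pattern of an augmented-Lagrangian (ADMM-style) iteration with variable splitting X = Y. The sketch below is one possible instance, not the implementation of the embodiment. It assumes the model min ½‖W·AY - W·B‖₂² + δ‖X‖₁ subject to X = Y and at most d non-zero elements in X, reads W·AX as element-wise row weighting, stops when ‖X - Y‖ is small, and in step 5 simply drops the smallest-magnitude element. All names and default values are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-threshold operator (proximal map of the L1 norm)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def solve_sparse_weights(A, B, W, delta=0.01, d=None, mu=1.0, rho=1.5,
                         tol=1e-6, max_iter=200):
    A, B, W = (np.asarray(v, dtype=float) for v in (A, B, W))
    C, b = W[:, None] * A, W * B                 # assumed element-wise weighting
    n = A.shape[1]
    X, Y, Z = np.zeros(n), np.zeros(n), np.zeros(n)   # step 2: initial values
    CtC, Ctb = C.T @ C, C.T @ b
    for _ in range(max_iter):
        # Step 3: Y minimizes the quadratic terms (a regularized least-squares solve).
        Y = np.linalg.solve(CtC + mu * np.eye(n), Ctb + Z + mu * X)
        # Step 4: X minimizes the L1 term plus the coupling terms.
        X = soft_threshold(Y - Z / mu, delta / mu)
        # Step 5: enforce the segment budget (simplified: drop the smallest element).
        if d is not None:
            while np.count_nonzero(X) > d:
                nz = np.flatnonzero(X)
                X[nz[np.argmin(np.abs(X[nz]))]] = 0.0
        # Step 6: multiplier update from the residual of X = Y, then grow mu.
        Z = Z + mu * (X - Y)
        mu *= rho
        # Step 7: assumed convergence test on the splitting residual.
        if np.linalg.norm(X - Y) < tol:
            break
    return X
```

Growing μ by ρ > 1 forces X and Y to agree quickly, at the price of possibly stopping slightly short of the exact minimizer of the assumed model.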
With reference to the first aspect and the foregoing implementation manners of the first aspect, in an eighth possible implementation manner of the first aspect, removing one non-zero element from X and replacing X is specifically implemented as: among all the vectors that can be formed by removing exactly one non-zero element from X, if the vector formed by removing a first non-zero element minimizes the function value of the target optimization model, removing the first non-zero element and replacing X with the resulting vector.
With reference to the first aspect and the foregoing implementation manners of the first aspect, in a ninth possible implementation manner of the first aspect, the characteristic index includes at least one of the following: instructions per clock (IPC), branch prediction success rate, branch prediction failure rate, L2 Cache hit rate, and energy consumption.
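The selection rule of this implementation manner can be sketched as a single greedy step. The sketch assumes that removing an element means setting it to zero and that the function value of the target optimization model is supplied as a callable; `drop_one_nonzero` and `objective` are hypothetical names.

```python
import numpy as np

def drop_one_nonzero(X, objective):
    """Zero the one non-zero element whose removal gives the smallest objective."""
    X = np.asarray(X, dtype=float)
    best_val, best_X = np.inf, X
    for i in np.flatnonzero(X):
        candidate = X.copy()
        candidate[i] = 0.0                 # assumed meaning of removing an element
        val = objective(candidate)
        if val < best_val:
            best_val, best_X = val, candidate
    return best_X
```

Calling this function repeatedly until the constraint on X holds corresponds to step 5 of the preceding implementation manner.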
In a second aspect, an index evaluation apparatus is provided, including:
an acquisition unit, configured to acquire an instruction stream of a test program and divide the instruction stream into N instruction segments of equal length;
a first determining unit, configured to determine a feature matrix A, a mean column vector B, and a weight vector W, where the feature matrix A is a matrix with M rows and N columns and is used to describe feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean value of the feature index vectors of each row of the feature matrix A, and the weight vector W is a column vector formed by the weight occupied, in the test program, by the feature index of the basic block BB type corresponding to each row of the feature matrix A;
a second determining unit, configured to determine an optimized column vector according to the feature matrix A, the column vector B, the weight vector W, and an optimization target, where the optimization target includes reducing the number of instruction segments used for simulation and/or reducing an error between a feature index for evaluating the test program and the feature index of the test program, and the optimized column vector is used to represent the instruction segments used for simulation among the N instruction segments and the weight corresponding to each instruction segment;
a third determining unit, configured to determine a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector, where the number vector is used to indicate a number of the instruction segment corresponding to a non-zero position in the optimized column vector, and the coefficient vector is used to indicate a weight of a feature indicator of the instruction segment corresponding to the non-zero position in the optimized column vector;
a simulation unit, configured to simulate the instruction segments corresponding to the number vector in a simulation device to obtain the characteristic index vector of the instruction segments corresponding to the number vector, where the characteristic index vector is used to express the characteristic indexes of the instruction segments corresponding to the number vector;
and a fourth determining unit, configured to determine the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
In the embodiment of the application, an optimized column vector is determined based on the feature matrix, the column vector and the weight vector of the instruction segments of a test program and on an optimization target; the instruction segments and the coefficient vector corresponding to the non-zero positions are determined according to the optimized column vector; those instruction segments are simulated in the simulation device to obtain their feature index vector; and a feature index for evaluating the test program is obtained from the feature index vector and the coefficient vector. The number of instruction segments required for the simulation work of the test program can thereby be reduced, or the simulation index error of the test program can be reduced.
With reference to the second aspect, in a first possible implementation manner, the optimization goal is: minimizing a third norm of X when the first norm of W·AX - W·B or the second norm of AX - B is less than or equal to a first threshold, where X is an optimization variable used to represent the instruction segments used for simulation among the N instruction segments and the weight corresponding to each instruction segment, and the optimized column vector is the value of X that satisfies the optimization goal.
The device of the embodiment of the application can reduce the number of instruction segments required for the simulation work of the test program while keeping the simulation index error within the required precision.
With reference to the first possible implementation manner of the second aspect, in a second possible implementation manner of the second aspect, the second determining unit specifically determines the optimized column vector by using the following target optimization formula:
$$\min_X \ \|X\|_{t3} \quad \text{s.t.} \quad \|W \cdot AX - W \cdot B\|_{t1} \le c \ \ \text{or} \ \ \|AX - B\|_{t2} \le c$$
where $\|W \cdot AX - W \cdot B\|_{t1}$ represents the first norm of W·AX - W·B, $\|AX - B\|_{t2}$ represents the second norm of AX - B, $\|X\|_{t3}$ represents the third norm of X, and c represents the first threshold.
With reference to the second aspect, in a third possible implementation manner of the second aspect, the optimization goal is: when a first norm of X is less than or equal to a second threshold, minimizing a sum of a first product and a second product, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number, where X is an optimization variable used to represent instruction fragments for performing simulation among the N instruction fragments and a weight corresponding to each instruction fragment, and the optimization column vector is a value of X that satisfies the optimization goal.
The method of the embodiment of the application can reduce the simulation index error on the premise of controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the third possible implementation manner of the second aspect, in a fourth possible implementation manner of the second aspect, the second determining unit specifically determines the optimized column vector through the following target optimization formula:
$$\min_X \ \lambda \|W \cdot AX - W \cdot B\|_{t2} + \delta \|X\|_{t3} \quad \text{s.t.} \quad \|X\|_{t1} \le d$$
where d represents the second threshold, $\|X\|_{t1}$ represents the first norm of X, $\|W \cdot AX - W \cdot B\|_{t2}$ represents the second norm of W·AX - W·B, $\|X\|_{t3}$ represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, and λ ≠ 0.
With reference to the second aspect, in a fifth possible implementation manner of the second aspect, the optimization goal is: minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [expression missing], the third product is the product of a third coefficient and a third norm of [expression missing], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S; A = L + S, where L is a low-rank matrix and S is a sparse matrix; [symbol missing] represents the column vector formed by the average values of the rows of the low-rank matrix L, and [symbol missing] represents the column vector formed by the average values of the rows of the sparse matrix S; the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero; X is an optimization variable used to represent the instruction segments used for simulation among the N instruction segments and the weight corresponding to each instruction segment, and the optimized column vector is the value of X that satisfies the optimization goal.
The method of the embodiment of the application can achieve certain balance between reduction of the number of the instruction fragments required by simulation work of the test program and reduction of simulation index errors.
With reference to the second aspect, in a sixth possible implementation manner of the second aspect, the optimization goal is: when a sixth norm of W·AX - W·B or a seventh norm of AX - B is less than or equal to a first threshold, and/or an eighth norm of X is less than or equal to a second threshold, minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [expression missing], the third product is the product of a third coefficient and a third norm of [expression missing], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S; A = L + S, where L is a low-rank matrix and S is a sparse matrix; [symbol missing] represents the column vector formed by the average values of the rows of the low-rank matrix L, and [symbol missing] represents the column vector formed by the average values of the rows of the sparse matrix S; the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero.
The method of the embodiment of the application can balance reduction of the number of the instruction segments required by simulation work of the test program and reduction of simulation index errors on the premise of ensuring precision of the simulation index errors and/or controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the second aspect and the foregoing implementation manner of the second aspect, in a seventh possible implementation manner of the second aspect, the second determining unit is specifically implemented as:
step 1: establishing a target optimization model of the optimization target according to the characteristic matrix A, the column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f (X, Y, Z, mu) of the target optimization model, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, mu is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;
step 2: setting initial values of X, Y, Z and μ, and setting, according to the optimization target, a convergence condition for completion of the optimization;
step 3: calculating an optimal value of Y by using a least square method;
step 4: calculating the optimal value of X by using a soft threshold method;
step 5: when the value of X does not satisfy the constraint condition of the target optimization model, setting one non-zero element of X to zero and replacing X with the result, and repeating this until X satisfies the constraint condition of the target optimization model;
step 6: updating the Lagrange multiplier Z according to the residual of the equation X = Y, and increasing the penalty parameter μ by a fixed multiple ρ, wherein ρ is greater than 1;
step 7: judging whether the convergence condition is met; if so, determining that X is solved; if not, repeatedly executing steps 3 to 6.
With reference to the second aspect and the foregoing implementation manner of the second aspect, in an eighth possible implementation manner of the second aspect, setting one non-zero element of X to zero and replacing X is specifically implemented as: among the candidates obtained by setting exactly one non-zero element of X to zero, when the candidate formed by setting a first non-zero element of X to zero minimizes the function value of the target optimization model, setting the first non-zero element to zero and replacing X with that candidate.
With reference to the second aspect and the foregoing implementation manner of the second aspect, in a ninth possible implementation manner of the second aspect, the characteristic index includes at least one of the following: instructions per clock (IPC), branch prediction success rate, branch prediction failure rate, L2 Cache hit rate, and energy consumption.
In a third aspect, there is provided another index evaluation apparatus, including a processor and a channel interface, where the processor is configured to execute the method in the first aspect or any possible implementation manner of the first aspect through the channel interface.
In a fourth aspect, a computer-readable storage medium is presented for storing a computer program comprising instructions for performing the method of the first aspect or any possible implementation manner of the first aspect.
Based on the above technical solutions, the index evaluation method and apparatus in the embodiments of the present application construct a target optimization model of a test program based on a feature matrix, a column vector, and a weight vector of an instruction segment of the test program, determine an optimized column vector based on the target optimization model and an optimization target, determine an instruction segment and a coefficient vector corresponding to a non-zero position according to the optimized column vector, simulate the instruction segment corresponding to the optimized column vector in the simulation apparatus to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and obtain a feature index for evaluating the test program according to the feature index vector and the coefficient vector, thereby reducing the number of instruction segments required for simulation work of the test program or reducing a simulation index error of the test program.
Drawings
FIG. 1 is a schematic diagram of instruction code according to an embodiment of the present application.
FIG. 2 is a flowchart of an index evaluation method according to an embodiment of the present application.
FIG. 3 is an interaction flow diagram of an index evaluation method according to an embodiment of the present application.
FIG. 4 is a schematic diagram of a target optimization model of an embodiment of the present application.
FIG. 5 is a flow diagram of a method of determining an optimized column vector according to one embodiment of the present application.
Fig. 6 is a schematic structural diagram of an index evaluation device according to an embodiment of the present application.
Fig. 7 is a schematic structural diagram of an index evaluation device according to another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings.
To facilitate understanding of the embodiments of the present application, several elements that will be introduced in the description of the embodiments of the present application are first introduced herein.
Instruction stream file: a file that records instruction stream information is called an instruction stream file. Each line of the file represents information about one executed instruction and conforms to a uniform format. Typically, the size of an instruction stream file is fixed. For example, an instruction stream file generally contains 100 million instructions, and a complete test program can be regarded as an instruction stream composed of a plurality of groups of 100 million instructions, each group being referred to as an interval (Interval); in other words, a complete test program is composed of a plurality of fragments. The full set of the test program is all of its fragments, and a subset of the test program is a part of its fragments. The purpose of simplifying the test program is to select representative fragments from the full set, such that the number of selected fragments is as small as possible and the results of running them are as similar as possible to the results of running the original test program. An instruction may include the following information: program pointer, assembly instruction, operation type, and memory address. The memory address is optional.
Program pointer: the program pointer of each instruction line is the memory address of that line's assembly instruction, written as a hexadecimal number beginning with "0x".
Assembly instruction: the binary instruction code of the instruction, which needs to meet assembly syntax requirements.
Operation type: all assembly instructions can be divided into three categories: arithmetic logic unit operations, memory reads and writes, and control instructions.
Memory address: if the instruction is an arithmetic logic unit operation, no memory address information is needed; if the instruction is a memory read-write operation, a memory address is required.
Basic Block (BB): a piece of sequentially executed instructions. In general, an instruction stream may be divided into a plurality of BBs with a control instruction as a boundary. Each segment of the test program is composed of BB.
Basic Block feature indicator Vector (BBV): according to different control instructions, the execution times of BB of different types in each segment are counted, and a vector constructed based on the types and the execution times of the BB is called BBV. FIG. 1 is a schematic diagram of instruction code according to an embodiment of the present application. As shown in the instruction code fragment of FIG. 1, if the type ID of BB is {1,2,3,4,5}, and the corresponding execution times is {1,20,0,5,0}, then BBV of the fragment can be recorded as {1:1,2:20,3:0,4:5,5:0 }.
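As a non-limiting illustration, the BBV of the fragment in the example above could be computed as in the following Python sketch. The function name `basic_block_vector` and the representation of a fragment as a list of executed BB type IDs are assumptions made for this example only; they are not part of the embodiment.

```python
from collections import Counter

def basic_block_vector(executed_bb_ids, all_bb_ids):
    """Count how often each known BB type was executed in one fragment."""
    counts = Counter(executed_bb_ids)
    # Types that never ran in this fragment are kept with a count of 0.
    return {bb_id: counts.get(bb_id, 0) for bb_id in all_bb_ids}

# Hypothetical trace matching the example: type 1 once, type 2 twenty times,
# type 4 five times, types 3 and 5 never.
trace = [1] + [2] * 20 + [4] * 5
bbv = basic_block_vector(trace, [1, 2, 3, 4, 5])
```

Keeping zero counts for unexecuted types gives every fragment a vector of the same length, which is what allows the BBVs to be stacked into a matrix later.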
FIG. 2 is a flowchart of an index evaluation method according to an embodiment of the present application. The method of FIG. 2 is performed by an index evaluation device of a processor. Specifically, the index evaluation device may be a host apparatus or the like.
201, obtaining an instruction stream of a test program and dividing it into N instruction fragments of equal length.
202, determining a feature matrix a, a mean column vector B, and a weight vector W corresponding to the N instruction fragments.
The feature matrix A is a matrix with M rows and N columns that describes the feature index vectors of the N instruction fragments, where M is the number of types of basic blocks BB in the N instruction fragments. The mean column vector B is a column vector formed by the mean value of each row of the feature matrix A. The weight vector W is a column vector formed by the weights that the types of basic block BB corresponding to the rows of the feature matrix A carry in the feature index of the test program.
Specifically, in the embodiment of the present application, the characteristic index may be any one of the following: instructions per clock (IPC), branch prediction success/failure rate, L2 Cache hit rate, energy consumption, etc.
203, determining an optimized column vector according to the feature matrix a, the column vector B, the weight vector W and an optimization goal.
The optimization target comprises reducing the number of instruction segments used for simulation and/or reducing errors between characteristic indexes of the evaluation test program and characteristic indexes of the test program, and the optimization column vector is used for representing instruction segments used for simulation in the N instruction segments and weights corresponding to the instruction segments used for simulation.
Specifically, for example, an objective optimization model may be established according to the feature matrix a, the column vector B, the weight vector W and an optimization objective, the objective optimization model is used to represent the optimization objective through a relationship between A, B, W and an optimization variable X of the objective optimization model, and the optimization column vector is a value of the optimization variable X satisfying the constraint relationship.
204, determining a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector.
The number vector is used for representing the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used for representing the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector.
205, the instruction segments corresponding to the number vectors are simulated in the simulation device to obtain the feature index vectors of the instruction segments corresponding to the number vectors.
The feature indicator vector is used to indicate the feature indicator of the instruction segment corresponding to the number vector.
206, determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
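As a non-limiting illustration, steps 204 to 206 could be sketched in Python as follows. The function names, the zero-based fragment numbering, and the numeric values (including the simulated index values) are hypothetical and serve only to show the data flow from the optimized column vector to the estimated characteristic index.

```python
def select_fragments(x_opt):
    """Step 204: number vector and coefficient vector from the non-zero positions."""
    numbers = [j for j, v in enumerate(x_opt) if v != 0]
    coeffs = [x_opt[j] for j in numbers]
    return numbers, coeffs

def estimate_index(feature_values, coeffs):
    """Step 206: inner product of the feature index vector and the coefficient vector."""
    return sum(f * c for f, c in zip(feature_values, coeffs))

# Hypothetical optimized column vector for N = 4 fragments.
x_opt = [0.0, 0.25, 0.0, 0.75]
numbers, coeffs = select_fragments(x_opt)

# Step 205 would simulate only fragments 1 and 3; these values stand in
# for the simulated characteristic index (for example, IPC) of each.
simulated = [1.0, 2.0]
estimate = estimate_index(simulated, coeffs)
```

Only the fragments listed in `numbers` need to be simulated, which is where the reduction in simulation work comes from.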
In the embodiment of the application, an optimized column vector is determined based on a feature matrix, a column vector, a weight vector and an optimized target of an instruction segment of a test program, the instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in the simulation device to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program are reduced.
Optionally, as an embodiment, the optimization objective is: when the first norm of W·AX-W·B or the second norm of AX-B is less than or equal to a first threshold, minimizing a third norm of X, wherein X is an optimization variable used for representing the instruction fragments used for simulation among the N instruction fragments and the weight corresponding to each of those instruction fragments, and the optimized column vector is the value of X that meets the optimization objective.
It should be understood that in the embodiments of the present application, the first norm, the second norm, the third norm, etc. may each be any one of the following norms: 0 norm, 1 norm, […] norm, p-norm or nuclear norm, etc., where p is a positive integer. The first to eighth norms mentioned elsewhere in this application may likewise be any one of these norms. Taking X as an example, these are the 0 norm $\|X\|_0$ of X, the 1 norm $\|X\|_1$ of X, the nuclear norm $\|X\|_*$ of X, the […] norm of X, and the p-norm $\|X\|_p$ of X, where p is a positive integer.
Specifically, step 203 may determine the optimized column vector by equation 1:
$$\min_{X}\ \|X\|_{t3}\quad \text{s.t.}\quad \|W\cdot AX-W\cdot B\|_{t1}\le c\ \ \text{or}\ \ \|AX-B\|_{t2}\le c \qquad (1)$$
wherein $\|W\cdot AX-W\cdot B\|_{t1}$ represents the first norm of W·AX-W·B, $\|AX-B\|_{t2}$ represents the second norm of AX-B, $\|X\|_{t3}$ represents the third norm of X, and c represents the first threshold. It is understood that the operator symbol · represents an inner product, e.g., $[x_1,x_2,x_3]\cdot[y_1,y_2,y_3]=x_1y_1+x_2y_2+x_3y_3$.
Without loss of generality, assume that concrete norms, such as the 0 norm $\|\cdot\|_0$ and the 1 norm $\|\cdot\|_1$, are chosen for the first norm of W·AX-W·B and the third norm of X in equation 1; equation 1 can then be written as shown in equation 2:
The method of the embodiment of the application can reduce the number of instruction fragments required by simulation work of the test program on the premise of ensuring the error precision of the simulation index.
Optionally, as another embodiment, the optimization objective is: when a first norm of X is less than or equal to a second threshold, minimizing a sum of a first product and a second product, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number, where X is an optimization variable used to represent instruction fragments for performing simulation among the N instruction fragments and a weight corresponding to each instruction fragment, and the optimization column vector is a value of X that satisfies the optimization goal.
Specifically, step 203 may determine the optimized column vector by equation 3:
$$\min_{X}\ \lambda\|W\cdot AX-W\cdot B\|_{t2}+\delta\|X\|_{t3}\quad \text{s.t.}\quad \|X\|_{t1}\le d \qquad (3)$$
wherein d represents the second threshold, $\|X\|_{t1}$ represents the first norm of X, $\|W\cdot AX-W\cdot B\|_{t2}$ represents the second norm of W·AX-W·B, $\|X\|_{t3}$ represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, and λ ≠ 0.
It should be understood that, in the embodiment of the present application, the first norm of X and the third norm of X may each be any one of the following norms: the 0 norm $\|X\|_0$ of X, the 1 norm $\|X\|_1$ of X, the […] norm of X, or the p-norm $\|X\|_p$ of X, where p is a positive integer. The first norm of X and the third norm of X may be the same or different.
It should be understood that in the embodiment of the present application, the second norm of W·AX-W·B may be any one of the following norms: the 0 norm $\|W\cdot AX-W\cdot B\|_0$ of W·AX-W·B, the 1 norm $\|W\cdot AX-W\cdot B\|_1$ of W·AX-W·B, the […] norm of W·AX-W·B, or the p-norm $\|W\cdot AX-W\cdot B\|_p$ of W·AX-W·B, where p is a positive integer.
It should be understood that, in the first coefficient and the second coefficient of the embodiments of the present application, at least 1 term is not 0. For example, λ ≠ 0.
Without loss of generality, assume that in equation 3 the first norm $\|X\|_{t1}$ of X is the 0 norm $\|X\|_0$ of X, the third norm $\|X\|_{t3}$ of X is the 1 norm $\|X\|_1$ of X, the second norm of W·AX-W·B is the 2 norm $\|W\cdot AX-W\cdot B\|_2$, and the values of the first coefficient and the second coefficient are both 1; equation 3 can then be written as shown in equation 4:
$$\min_{X}\ \|W\cdot AX-W\cdot B\|_{2}+\|X\|_{1}\quad \text{s.t.}\quad \|X\|_{0}\le d \qquad (4)$$
It should be understood that, in the embodiment of the present application, the values of the first coefficient and the second coefficient are determined by the optimization objective. The values of the first coefficient and the second coefficient determine whether the optimization favors fewer instruction fragments for simulation or a smaller simulation error.
Optionally, as another embodiment, the optimization objective is: minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of $LX-\bar{L}$, the third product is the product of a third coefficient and a third norm of $SX-\bar{S}$, the fourth product is the product of a fourth coefficient and a fourth norm of L, the fifth product is the product of a fifth coefficient and a fifth norm of S, A = L + S, L is a low-rank matrix, S is a sparse matrix, $\bar{L}$ represents a column vector formed by the average value of each row of the low-rank matrix L, and $\bar{S}$ represents a column vector formed by the average value of each row of the sparse matrix S, wherein X is an optimization variable used for representing the instruction segments used for simulation among the N instruction segments and the weight corresponding to each of those instruction segments, and the optimized column vector is the value of X that satisfies the optimization objective.
Specifically, step 203 may determine the optimized column vector by equation 5:
$$\min\ \lambda\|X\|_{t1}+\delta\|LX-\bar{L}\|_{t2}+\alpha\|SX-\bar{S}\|_{t3}+\beta\|L\|_{t4}+\gamma\|S\|_{t5}\quad \text{s.t.}\quad A=L+S \qquad (5)$$
wherein $\|X\|_{t1}$ represents the first norm of X, $\|LX-\bar{L}\|_{t2}$ represents the second norm of $LX-\bar{L}$, $\|SX-\bar{S}\|_{t3}$ represents the third norm of $SX-\bar{S}$, $\|L\|_{t4}$ represents the fourth norm of L, $\|S\|_{t5}$ represents the fifth norm of S, λ represents the first coefficient, δ represents the second coefficient, α represents the third coefficient, β represents the fourth coefficient, and γ represents the fifth coefficient.
It should be understood that at least one of the first coefficient, the second coefficient, the third coefficient, the fourth coefficient, and the fifth coefficient in the embodiments of the present application is not 0.
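Specifically, when the first norm of X is the 1 norm, the second norm of $LX-\bar{L}$ is the 2 norm, the third norm of $SX-\bar{S}$ is the 2 norm, the fourth norm of L is the 1 norm, and the fifth norm of S is the 1 norm, equation 5 can be written as shown in equation 6:
$$\min\ \lambda\|X\|_{1}+\delta\|LX-\bar{L}\|_{2}+\alpha\|SX-\bar{S}\|_{2}+\beta\|L\|_{1}+\gamma\|S\|_{1}\quad \text{s.t.}\quad A=L+S \qquad (6)$$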
Optionally, as an embodiment, the optimization objective is: when the sixth norm of W·AX-W·B or the seventh norm of AX-B is less than or equal to a first threshold, and/or the eighth norm of X is less than or equal to a second threshold, minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is the product of a first coefficient and the first norm of X, the second product is the product of a second coefficient and a second norm of $LX-\bar{L}$, the third product is the product of a third coefficient and a third norm of $SX-\bar{S}$, the fourth product is the product of a fourth coefficient and a fourth norm of L, the fifth product is the product of a fifth coefficient and a fifth norm of S, A = L + S, L is a low-rank matrix, S is a sparse matrix, $\bar{L}$ represents a column vector formed by the average values of the rows of the low-rank matrix L, and $\bar{S}$ represents a column vector formed by the average values of the rows of the sparse matrix S, wherein X is an optimization variable used for representing the instruction segments used for simulation among the N instruction segments and the weight corresponding to each of those instruction segments, and the optimized column vector is the value of X that meets the optimization target.
When the constraint of the optimization objective is that the sixth norm of W·AX-W·B or the seventh norm of AX-B is less than or equal to the first threshold, step 203 may determine the optimized column vector by equation 7:
$$\min\ \lambda\|X\|_{t1}+\delta\|LX-\bar{L}\|_{t2}+\alpha\|SX-\bar{S}\|_{t3}+\beta\|L\|_{t4}+\gamma\|S\|_{t5}\quad \text{s.t.}\quad A=L+S,\ \ \|W\cdot AX-W\cdot B\|_{t6}\le c\ \ \text{or}\ \ \|AX-B\|_{t7}\le c \qquad (7)$$
wherein c represents the first threshold, $\|W\cdot AX-W\cdot B\|_{t6}$ represents the sixth norm of W·AX-W·B, $\|AX-B\|_{t7}$ represents the seventh norm of AX-B, and the remaining parameters have the same meanings as the corresponding parameters in equation 5.
When the constraint condition of the optimization objective is that the sixth norm of X is less than or equal to the second threshold, step 203 may determine the optimized column vector by equation 8:
$$\min\ \lambda\|X\|_{t1}+\delta\|LX-\bar{L}\|_{t2}+\alpha\|SX-\bar{S}\|_{t3}+\beta\|L\|_{t4}+\gamma\|S\|_{t5}\quad \text{s.t.}\quad A=L+S,\ \ \|X\|_{t6}\le d \qquad (8)$$
wherein d represents the second threshold, $\|X\|_{t6}$ represents the sixth norm of X, and the remaining parameters have the same meanings as the corresponding parameters in equation 5.
When the constraint of the optimization objective is that the sixth norm of W·AX-W·B or the seventh norm of AX-B is less than or equal to the first threshold and the eighth norm of X is less than or equal to the second threshold, step 203 may determine the optimized column vector by equation 9:
$$\min\ \lambda\|X\|_{t1}+\delta\|LX-\bar{L}\|_{t2}+\alpha\|SX-\bar{S}\|_{t3}+\beta\|L\|_{t4}+\gamma\|S\|_{t5}\quad \text{s.t.}\quad A=L+S,\ \ \|W\cdot AX-W\cdot B\|_{t6}\le c\ \ \text{or}\ \ \|AX-B\|_{t7}\le c,\ \ \|X\|_{t8}\le d \qquad (9)$$
wherein c represents the first threshold, d represents the second threshold, $\|X\|_{t8}$ represents the eighth norm of X, and the remaining parameters have the same meanings as the corresponding parameters in equations 5 and 7.
Specifically, the detailed process of step 203 may be as follows:
step 1: establishing a target optimization model of the optimization target according to the characteristic matrix A, the column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f (X, Y, Z, mu) of the target optimization model, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, mu is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;
step 2: setting initial values of X, Y, Z and μ, and setting, according to the optimization target, a convergence condition for completion of the optimization;
step 3: calculating an optimal value of Y by using a least square method;
step 4: calculating the optimal value of X by using a soft threshold method;
step 5: when the value of X does not satisfy the constraint condition of the target optimization model, setting one non-zero element of X to zero and replacing X with the result, and repeating this until X satisfies the constraint condition of the target optimization model;
step 6: updating the Lagrange multiplier Z according to the residual of the equation X = Y, and increasing the penalty parameter μ by a fixed multiple ρ, wherein ρ is greater than 1;
step 7: judging whether the convergence condition is met; if so, determining that X is solved; if not, repeatedly executing steps 3 to 7.
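As a non-limiting illustration, the seven steps above could be sketched in Python as follows. This is a simplified sketch under several assumptions that are not stated in the embodiment: the error term is taken as the squared residual $\tfrac{1}{2}(W\cdot AX-W\cdot B)^2$ so that step 3 has a closed-form least-squares solution; the vector `a` stands for $A^{T}W$ and the scalar `b` for W·B; the penalty parameter is capped at a hypothetical `mu_max`; and step 5 zeroes the non-zero element of smallest magnitude. All function and parameter names are hypothetical.

```python
def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def soft_threshold(v, t):
    """Shrink every entry of v toward zero by t (step 4)."""
    return [max(abs(e) - t, 0.0) * (1.0 if e > 0 else -1.0) for e in v]

def objective(a, b, x, lam):
    """0.5 * (a.x - b)^2 + lam * ||x||_1, the assumed target function."""
    return 0.5 * (dot(a, x) - b) ** 2 + lam * sum(abs(e) for e in x)

def solve(a, b, lam=0.1, d=None, mu=1.0, rho=1.05, mu_max=10.0,
          tol=1e-9, max_iter=500):
    n = len(a)
    d = n if d is None else d                      # bound on non-zeros in X
    x, y, z = [0.0] * n, [0.0] * n, [0.0] * n      # step 2: initial values
    prev = objective(a, b, x, lam)
    for _ in range(max_iter):
        # Step 3: least squares for Y, i.e. (a a^T + mu I) Y = a b + Z + mu X,
        # solved in closed form for this rank-one system.
        v = [a[j] * b + z[j] + mu * x[j] for j in range(n)]
        s = dot(a, v) / (mu + dot(a, a))
        y = [(v[j] - a[j] * s) / mu for j in range(n)]
        # Step 4: soft threshold for X.
        x = soft_threshold([y[j] - z[j] / mu for j in range(n)], lam / mu)
        # Step 5: enforce the constraint on the number of non-zeros.
        while sum(1 for e in x if e != 0.0) > d:
            k = min((j for j in range(n) if x[j] != 0.0),
                    key=lambda j: abs(x[j]))
            x[k] = 0.0
        # Step 6: multiplier update from the residual X - Y, then grow mu.
        z = [z[j] + mu * (x[j] - y[j]) for j in range(n)]
        mu = min(mu * rho, mu_max)
        # Step 7: stop once the objective changes by less than tol.
        cur = objective(a, b, x, lam)
        if abs(prev - cur) < tol:
            break
        prev = cur
    return x

# Hypothetical data: three fragments, a = A^T W and b = W.B.
x_demo = solve([1.0, 2.0, 4.0], 4.0)
```

Splitting the variable into X and Y lets the smooth error term and the sparsity term be handled by separate, cheap updates, which is why steps 3 and 4 use different methods.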
In a specific implementation manner, setting one non-zero element of X to zero and replacing X in step 5 can be specifically implemented as: among the candidates obtained by setting exactly one non-zero element of X to zero, selecting the candidate that minimizes the function value of the target optimization model, and replacing X with that candidate. Of course, it should be understood that there may be other specific implementation manners of step 5, for example, setting the non-zero elements of X to zero one by one, in ascending or descending order of magnitude, until X satisfies the constraint condition of the target optimization model, and so on.
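As a non-limiting illustration, the selection of which non-zero element to set to zero in step 5 could be sketched in Python as follows. The function names, the demo objective and the numeric values are hypothetical; the objective is passed in as a callable so that any target optimization model can be plugged in.

```python
def prune_one(x, objective):
    """Zero the single non-zero entry whose removal gives the lowest objective."""
    best = None
    for j, v in enumerate(x):
        if v == 0:
            continue
        candidate = x[:j] + [0.0] + x[j + 1:]
        score = objective(candidate)
        if best is None or score < best[0]:
            best = (score, candidate)
    return x if best is None else best[1]

def prune_to(x, objective, d):
    """Repeat the pruning until X has at most d non-zero entries."""
    while sum(1 for v in x if v != 0) > d:
        x = prune_one(x, objective)
    return x

# Hypothetical model: |a.x - b| + ||x||_1 with a = [1, 2, 4] and b = 4.
a, b = [1.0, 2.0, 4.0], 4.0

def demo_objective(x):
    return abs(sum(p * q for p, q in zip(a, x)) - b) + sum(abs(v) for v in x)

pruned = prune_to([0.5, 0.5, 0.75], demo_objective, 2)
```

Each pruning round evaluates the objective once per remaining non-zero element, so the cost grows with the number of non-zeros; the ordered removal mentioned as an alternative avoids these evaluations.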
In order to facilitate understanding of the method of the embodiments of the present application, the method of the embodiments of the present application will be further described with reference to specific embodiments.
FIG. 3 is a flowchart illustrating evaluation of metrics by a processor according to an embodiment of the present disclosure. It should be understood that in the embodiment of the present application, BBV is used as the feature index vector of the test program. The method of the embodiment of the present application is executed by an index evaluation apparatus of a processor disposed on a host device, where the apparatus may be an analog system or an execution chip, and the like, and the embodiment of the present application is not limited herein.
301, an instruction stream is fetched.
In one particular example, the host device may use an instruction stream fetching software tool to fetch an instruction stream of the test program, the instruction stream being composed of instructions. The host device can segment the instruction stream into BBs according to the jump instruction, and thus the instruction stream can be seen as being composed of BB sequences.
302, the instruction stream is partitioned into instruction fragments, and the BBV of each instruction fragment is calculated.
The host device may divide the instruction stream acquired in step 301 into a plurality of instruction fragments (intervals). For example, the instruction stream may be divided into instruction fragments of equal length, and the number and entry address of each instruction fragment may be recorded. Specifically, for example, each instruction fragment contains 100 million BBs, and so on. Of course, the division may be performed according to other lengths, for example, 200 million BBs, 1 billion BBs, etc., and the embodiment of the present application is not limited herein.
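As a non-limiting illustration, the division into equal-length fragments could be sketched in Python as follows. The function name is hypothetical, the entry address is modelled simply as an offset into the stream, and dropping a trailing remainder shorter than one fragment is an assumption of this sketch rather than something the embodiment specifies.

```python
def split_into_fragments(stream, fragment_len):
    """Return (number, entry_offset, items) for every full-length fragment."""
    count = len(stream) // fragment_len
    return [(k, k * fragment_len,
             stream[k * fragment_len:(k + 1) * fragment_len])
            for k in range(count)]

# Toy stream of 7 items split into fragments of length 3.
fragments = split_into_fragments(list("abcdefg"), 3)
```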
Without loss of generality, assume that the number of divided instruction fragments is N.
303, constructing the matrices A, B and W.
The host device may construct the feature matrix A according to the BBV of each instruction segment after the instruction stream is split, where each column represents the BBV of one instruction fragment.
The host device may calculate the column vector B from the average value of each row of the feature matrix A. Specifically, the value of each row of B can be expressed by the following equation 10:
$$B_i=\frac{1}{N}\sum_{j=1}^{N}A_{i,j} \qquad (10)$$
wherein 1 ≤ i ≤ M, N represents the number of instruction fragments, and M represents the number of BB types in the instruction fragments.
In addition, a weight vector W may be set according to the differing weights that the types of BB carry in the characteristic index of the test program. W is a column vector, and each value of W represents the weight of the corresponding type of BB in the characteristic index of the test program.
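As a non-limiting illustration, the construction of the feature matrix A and the mean column vector B in step 303 could be sketched in Python as follows. The function names and the two example BBVs are hypothetical; each BBV is represented as a mapping from BB type ID to execution count.

```python
def build_feature_matrix(bbvs, bb_ids):
    """Rows are BB types, columns are fragments: A[i][j] is the count of
    BB type bb_ids[i] in fragment j."""
    return [[bbv.get(bb_id, 0) for bbv in bbvs] for bb_id in bb_ids]

def row_means(a):
    """B[i] is the mean of row i of A over all N fragments."""
    return [sum(row) / len(row) for row in a]

# Two hypothetical fragments over five BB types.
bbvs = [{1: 1, 2: 20, 4: 5}, {1: 3, 2: 10, 3: 2}]
A = build_feature_matrix(bbvs, [1, 2, 3, 4, 5])
B = row_means(A)
```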
FIG. 4 is a schematic diagram of a target optimization model according to an embodiment of the present application. As shown in FIG. 4, the rows of the feature matrix A correspond to the types of BB and the columns correspond to the divided instruction fragments; A is a matrix of M rows × N columns, and the element $A_{i,j}$ indicates the characteristic index value of the i-th BB type in the j-th instruction fragment. The sparse solution matrix X is the optimization variable X in the embodiment shown in FIG. 2; it is a matrix of N rows × 1 column, in which the position of each non-zero coefficient indicates the number of a selected instruction fragment, and the value of the non-zero coefficient indicates the weight that the characteristic index of the corresponding instruction fragment carries in the finally calculated characteristic index. The mean matrix B of A is a column vector formed by the mean of each row of A; it is a matrix of M rows × 1 column, and its calculation method is shown in equation 10. The weight matrix W is a matrix of M rows × 1 column, and its i-th element $W_i$ represents the weight of the i-th BB type in the finally calculated characteristic index. As shown in FIG. 4, X is a value satisfying AX·W = B·W. It is understood that AX·W and W·AX are equivalent, and that B·W and W·B are equivalent.
304, determining an optimized column vector based on A, B, W and the optimization objective.
Specifically, an objective optimization model may be constructed from A, B, W and the optimization objectives, and the optimization column vectors of the objective optimization model may be solved.
An optimization variable X can be introduced, wherein a non-zero element in X represents a selected instruction segment, and the value of the non-zero element in X represents the value of the weight corresponding to the instruction segment. That is, X is used to represent the instruction segment for simulation among the N instruction segments of the test program and the weight corresponding to each instruction segment.
It should be understood that in practical applications, W · AX and W · B will have a certain error, and the smaller the error, the smaller the simulation error; further, the smaller the number of non-zero coefficients in X, the fewer instruction fragments to be selected for the simulation test, and the shorter the time required for the simulation test. Thus, an objective optimization model may be constructed based on A, B, W according to the optimization objectives.
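As a non-limiting illustration, the error between W·AX and W·B could be evaluated as in the following Python sketch. The function names and the numeric values are hypothetical. With equal weights 1/N on all fragments, AX reproduces the row means B exactly, so the error is zero; selecting a single fragment generally leaves a non-zero error.

```python
def matvec(a, x):
    """Matrix-vector product AX for a matrix given as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def weighted_error(a, b, w, x):
    """W.AX - W.B, with '.' taken as the inner product."""
    return dot(w, matvec(a, x)) - dot(w, b)

A = [[1.0, 3.0], [20.0, 10.0]]   # M = 2 BB types, N = 2 fragments
B = [2.0, 15.0]                  # row means of A
W = [0.5, 0.5]

err_all = weighted_error(A, B, W, [0.5, 0.5])   # every fragment, weight 1/N
err_one = weighted_error(A, B, W, [1.0, 0.0])   # only fragment 0
```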
It should be appreciated that if the cost of the number of instruction fragments is high, then a higher weight may be placed on the optimization objective that affects the number of X non-zero coefficients. At this point, the optimization objective may include limiting the number of non-zero coefficients in X.
Specifically, for example, the number of non-zero coefficients in X is minimized. This can be achieved by minimizing a norm of X. The norm of X may include the 0 norm $\|X\|_0$ of X, the 1 norm $\|X\|_1$ of X, and the like.
Or, for example, the 1 norm of X is kept no greater than d. In this case, the constraint $\|X\|_1\le d$, $d\in R^{+}$, may be imposed. Constraining the 1 norm of X to be no greater than a given value reduces the number of non-zero elements in X.
Of course, other constraint formulas may exist, and the embodiments of the present application are not listed here.
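As a non-limiting illustration, the 0 norm and the 1 norm of X referred to above could be computed as in the following Python sketch; the function names and the example vectors are hypothetical.

```python
def l0_norm(x):
    """Number of non-zero entries, i.e. the number of selected fragments."""
    return sum(1 for v in x if v != 0)

def l1_norm(x):
    """Sum of the absolute values of the entries."""
    return sum(abs(v) for v in x)

x_dense = [0.25, 0.25, 0.25, 0.25]   # all four fragments selected
x_sparse = [0.0, 0.5, 0.0, 0.5]      # only two fragments selected
```

The 0 norm counts the selected fragments directly, whereas the 1 norm is a commonly used surrogate that is easier to optimize.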
It should be appreciated that if certain simulation error accuracy is required, higher weights may be placed on the optimization objectives affecting W.AX-W.B.
Specifically, for example, the value of W · AX-W · B is minimized, or W · AX-W · B is smaller than d, or the norm of W · AX-W · B is smaller than d, or the like.
Based on the difference of the optimization objectives, different objective optimization models can be set.
For example, assume that the optimization goal of the target optimization model is to minimize the third norm of X when either the first norm of W.AX-W.B or the second norm of AX-B is less than or equal to the first threshold. At this time, the objective optimization model may be expressed by equation 1 above.
For another example, assume that the optimization goal of the objective optimization model is to minimize the sum of a first product and a second product when a first norm of X is smaller than or equal to a second threshold, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number. At this time, the target optimization model may be expressed by equation 3 above.
As another example, assume that the optimization objective of the objective optimization model is to minimize the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of $LX-\bar{L}$, the third product is the product of a third coefficient and a third norm of $SX-\bar{S}$, the fourth product is the product of a fourth coefficient and a fourth norm of L, the fifth product is the product of a fifth coefficient and a fifth norm of S, A = L + S, L is a low-rank matrix, S is a sparse matrix, $\bar{L}$ represents a column vector formed by the average value of each row of the low-rank matrix L, and $\bar{S}$ represents a column vector formed by the average value of each row of the sparse matrix S. At this time, the target optimization model can be expressed by the above equation 5.
For another example, assume that the optimization goal of the objective optimization model is to minimize the sum of a first product, a second product, a third product, a fourth product and a fifth product when the sixth norm of W·AX-W·B or the seventh norm of AX-B is less than or equal to a first threshold and/or the eighth norm of X is less than or equal to a second threshold, wherein the first product is the product of a first coefficient and the first norm of X, the second product is the product of a second coefficient and a second norm of $LX-\bar{L}$, the third product is the product of a third coefficient and a third norm of $SX-\bar{S}$, the fourth product is the product of a fourth coefficient and a fourth norm of L, the fifth product is the product of a fifth coefficient and a fifth norm of S, A = L + S, L is a low-rank matrix, S is a sparse matrix, $\bar{L}$ represents a column vector formed by the average values of the rows of the low-rank matrix L, and $\bar{S}$ represents a column vector formed by the average values of the rows of the sparse matrix S. At this time, the target optimization model can be expressed by the above equations 7, 8, 9, and the like.
Without loss of generality, assume that the target optimization model of the embodiment of the present application is given by formula 4. A method for establishing the target optimization model and solving for the optimized column vector is shown in fig. 5. Fig. 5 is a flowchart of a method for determining an optimized column vector according to an embodiment of the present application.
S501, establishing a target optimization model, and determining a Lagrangian function f(X, Y, Z, μ) according to the target optimization model.
Specifically, a target optimization model may be established based on A, B, W and an optimization objective, where the target optimization model represents the relationship between A, B, W and an optimization variable X under the optimization objective, and the value of X that satisfies the target optimization model is the optimized column vector.
To solve for the optimized column vector, the Lagrangian function f(X, Y, Z, μ) may be determined. Specifically, a relaxation variable Y is introduced and, using the Lagrange multiplier method, a Lagrange multiplier Z is introduced; μ is a penalty parameter whose value is a positive real number. The variable X of the Lagrangian function is the optimization variable X of the target optimization model.
S502, setting initial values of the variables X, Y, Z and μ in the Lagrangian function, and setting a convergence condition for completion of the optimization.
For example, the convergence condition may be that the optimization is complete when the gain from optimization falls below a predetermined threshold. Specifically, taking the target optimization model shown in formula 4 as an example, if the decrease of the newly obtained value of ||W·AX-W·B||_2 + ||X||_1 relative to its value in the previous iteration is less than a predetermined threshold V, the optimization is deemed complete.
For another example, a maximum number of loop iterations may be set, and the optimization is deemed complete when the loop reaches that number.
Of course, it should be understood that a plurality of determination conditions may be set, with the optimization deemed complete when any one of them is satisfied. For example, the convergence condition may be that the decrease of the value of ||W·AX-W·B||_2 + ||X||_1 relative to the previous iteration is less than the predetermined threshold V, or that the number of iterations reaches the maximum number, and so on.
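The combined stopping rule described above can be sketched as a small predicate. The function and parameter names are hypothetical, and reading "magnitude of the decrease" as an absolute difference between successive objective values is an assumption:

```python
def optimization_converged(prev_obj, curr_obj, iteration, threshold_v, max_iterations):
    """Return True when either stopping rule is met.

    prev_obj / curr_obj: objective values from the previous and current loop pass.
    threshold_v: the predetermined threshold V.
    max_iterations: the maximum number of loop executions.
    """
    # Rule 1 (assumed reading): the objective changed by less than V.
    small_gain = abs(prev_obj - curr_obj) < threshold_v
    # Rule 2: the iteration budget has been used up.
    out_of_budget = iteration >= max_iterations
    return small_gain or out_of_budget
```

Either rule alone can be obtained by passing an infinite `max_iterations` or a zero `threshold_v`.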
S503, an optimal value of Y is calculated using the least square method.
For a specific implementation of calculating Y by using the least square method, reference may be made to the prior art, and details of the embodiment of the present application are not repeated herein.
S504, the optimal value of X is calculated by using a soft threshold method.
For a specific implementation of calculating X by using the soft threshold method, reference may be made to the prior art, and details of the embodiment of the present application are not described herein again.
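The text defers to the prior art here. As a point of reference, the element-wise soft-threshold operator commonly used for 1-norm terms is sketched below; treating it as the operator meant in S504 is an assumption, and the name `soft_threshold` is hypothetical:

```python
import numpy as np

def soft_threshold(v, tau):
    """Shrink every entry of v toward zero by tau; entries within [-tau, tau] become 0."""
    v = np.asarray(v, dtype=float)
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)
```

Entries whose magnitude does not exceed `tau` are set to zero, which is what makes the resulting X sparse.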
S505, when the value of X does not satisfy the constraint condition of the target optimization model, eliminating one non-zero element of X and replacing X with the result, until X satisfies the constraint condition of the target optimization model.
It should be understood that step S505 has different implementations according to different target optimization models.
For example, assuming that the target optimization model is the expression shown in formula 4, in step S505 the number of non-zero elements of X may be reduced to no more than d.
There are many ways to reduce non-zero elements in X.
Preferably, among the candidates obtained by eliminating exactly one non-zero element of X, if eliminating a first non-zero element yields the X that minimizes the function value of the target optimization model, the first non-zero element is eliminated and X is replaced with the result.
For example, for the target optimization model of formula 4, to reduce the number of non-zero elements of X to no more than d while keeping the value of ||W·AX-W·B||_2 + ||X||_1 as small as possible, each time one non-zero element is to be eliminated, the values of ||W·AX-W·B||_2 + ||X||_1 obtained by eliminating each candidate non-zero element can be compared, and the element whose elimination gives the minimum value is eliminated; this is repeated until the number of non-zero elements does not exceed d.
Of course, the non-zero elements may instead be eliminated in order of their values in X, from large to small or from small to large. This approach is simple to compute, because it does not need the repeated comparisons that are used to find the elimination giving the smaller function value of the target optimization model.
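The two reduction strategies above can be sketched as follows. This is an illustration under stated assumptions: "reducing" a non-zero element is read as setting it to zero, `M` and `b` are hypothetical names for the weighted matrix W·A and the weighted vector W·B, and the magnitude-based variant drops the smallest entries first (one of the two orders the text allows):

```python
import numpy as np

def objective(M, b, x):
    # Formula-4-style value: 2-norm of the weighted residual plus 1-norm of x.
    return np.linalg.norm(M @ x - b, 2) + np.linalg.norm(x, 1)

def prune_greedy(M, b, x, d):
    """Zero one non-zero entry per pass, each time the entry whose removal
    gives the smallest objective, until at most d non-zeros remain."""
    x = np.asarray(x, dtype=float).copy()
    while np.count_nonzero(x) > d:
        best_idx, best_val = None, np.inf
        for i in np.flatnonzero(x):
            trial = x.copy()
            trial[i] = 0.0
            val = objective(M, b, trial)
            if val < best_val:
                best_idx, best_val = i, val
        x[best_idx] = 0.0
    return x

def prune_by_magnitude(x, d):
    """Cheaper alternative: keep the d largest-magnitude entries, zero the rest."""
    x = np.asarray(x, dtype=float).copy()
    nonzero = np.flatnonzero(x)
    if nonzero.size > d:
        order = nonzero[np.argsort(np.abs(x[nonzero]))]  # smallest magnitude first
        x[order[: nonzero.size - d]] = 0.0
    return x
```

The greedy variant evaluates the objective once per remaining non-zero element on every pass, while the magnitude-based variant needs only a single sort.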
S506, the Lagrange multiplier Z is updated according to the residual of the equation X = Y, and the penalty coefficient μ is increased by a fixed multiple ρ, where ρ > 1.
S507, judging whether the convergence condition of the Lagrangian function f(X, Y, Z, μ) is met.
If the convergence condition of the Lagrangian function f(X, Y, Z, μ) is met, outputting the current value of X, and ending the execution;
if the convergence condition of the Lagrangian function f(X, Y, Z, μ) is not satisfied, step S503 is performed.
Steps S503 to S507 are executed in a loop until the convergence condition of the Lagrangian function f(X, Y, Z, μ) is satisfied.
Of course, it should be understood that there may be other methods of solving the objective optimization model, and the embodiments of the present application are not limited thereto.
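As one possible reading of steps S501 to S507, the loop can be sketched with a generic augmented-Lagrangian (ADMM-style) scheme. Several points are assumptions rather than statements of the embodiment: formula 4 is taken to be the minimization of ||W·AX-W·B||_2 + ||X||_1 with at most d non-zero elements in X; W·A is read as scaling row i of A by W[i]; the Y update uses the squared 2-norm so that it reduces to a linear least-squares solve; the S505 reduction uses the magnitude-based order; and the function name and default parameter values are hypothetical.

```python
import numpy as np

def solve_optimized_column_vector(A, B, W, d, mu=1.0, rho=1.5,
                                  threshold_v=1e-6, max_iterations=200):
    """Sketch of S501-S507 for a formula-4-style model (assumptions in the lead-in)."""
    A, B, W = (np.asarray(v, dtype=float) for v in (A, B, W))
    M = W[:, None] * A          # assumed reading of W·A: row i of A scaled by W[i]
    b = W * B                   # assumed reading of W·B: element-wise product
    n = A.shape[1]
    X, Y, Z = np.zeros(n), np.zeros(n), np.zeros(n)   # S502: initial values
    MtM, Mtb = M.T @ M, M.T @ b
    prev_obj = np.inf
    for _ in range(max_iterations):                   # S502: iteration cap
        # S503: least-squares update of the relaxation variable Y
        Y = np.linalg.solve(MtM + mu * np.eye(n), Mtb + mu * X + Z)
        # S504: soft-threshold update of X
        V = Y - Z / mu
        X = np.sign(V) * np.maximum(np.abs(V) - 1.0 / mu, 0.0)
        # S505: enforce at most d non-zeros (smallest magnitudes dropped first)
        nz = np.flatnonzero(X)
        if nz.size > d:
            X[nz[np.argsort(np.abs(X[nz]))[: nz.size - d]]] = 0.0
        # S506: multiplier update from the residual X - Y, then grow mu by rho
        Z = Z + mu * (X - Y)
        mu *= rho
        # S507: stop when the objective changes by less than threshold_v
        obj = np.linalg.norm(M @ X - b, 2) + np.linalg.norm(X, 1)
        if abs(prev_obj - obj) < threshold_v:
            break
        prev_obj = obj
    return X
```

Because μ grows geometrically, X and Y are pushed together quickly; a smaller ρ trades more iterations for a solution closer to the unconstrained optimum.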
305, determining the number vector and coefficient vector of the instruction fragments corresponding to the optimized column vector.
The optimized column vector determined in step 304 may be denoted X0.
The numbers of the instruction segments corresponding to the non-zero positions of X0 are recorded in a number vector D, and the non-zero coefficients of X0 are recorded in a coefficient vector G. The number vector D is used for representing the numbers of the instruction segments corresponding to the non-zero positions in the optimized column vector, and the coefficient vector G is used for representing the weights of the characteristic indexes of the instruction segments corresponding to the non-zero positions in the optimized column vector.
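A minimal sketch of this step, assuming X0 is held as a NumPy array and that segment numbers are 0-based positions in X0 (the text does not fix the numbering base); the function name is hypothetical:

```python
import numpy as np

def split_optimized_vector(x0):
    """Return (D, G): positions of the non-zero entries of x0 and those entries."""
    x0 = np.asarray(x0, dtype=float)
    D = np.flatnonzero(x0)   # number vector: which segments to simulate
    G = x0[D]                # coefficient vector: weight of each selected segment
    return D, G
```

For example, `split_optimized_vector([0, 0.3, 0, 0.7])` selects segments 1 and 3 with weights 0.3 and 0.7.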
306, simulating the instruction segments corresponding to the number vector D in the simulation device respectively to obtain the characteristic index vector of the instruction segments corresponding to the number vector D.
According to the number vector D, the simulation test system uses a simulation tool to simulate the instruction segments corresponding to the number vector D and obtains the characteristic index of each of these instruction segments; the characteristic indexes of the instruction segments corresponding to the number vector D form a characteristic index vector C.
The simulation tool may be a software or hardware emulator, such as LiveSP, Checkpoint, or the like. The characteristic indexes of the instruction segment obtained by the simulation tool can be IPC, predicted branch success/failure rate, energy consumption and the like.
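The selection logic of step 306 can be sketched as below. The callable `simulate` is a hypothetical stand-in for whatever simulation tool is used and is assumed to return one characteristic index (for example IPC) per instruction segment; it does not represent the interface of any real simulator:

```python
def collect_feature_indices(segments, D, simulate):
    """Run `simulate` only on the segments whose numbers appear in D.

    segments: sequence of all N instruction segments.
    D: number vector (positions of the segments selected for simulation).
    simulate: hypothetical callable mapping one segment to its characteristic index.
    Returns the characteristic index vector C, ordered like D.
    """
    return [simulate(segments[i]) for i in D]
```

Only len(D) of the N segments are passed to the simulator, which is where the reduction in simulation work comes from.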
307, determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
From the coefficient vector G of step 305 and the characteristic index vector C of step 306, the characteristic index C · G for evaluating the test program can be obtained.
This completes the index evaluation of the test program.
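Step 307 amounts to a weighted sum of the simulated characteristic indexes. A minimal sketch, with a hypothetical function name and with C and G (the coefficient vector defined in step 305) passed as equal-length sequences:

```python
import numpy as np

def evaluate_feature_index(C, G):
    """Inner product of the characteristic index vector C and the coefficient vector G."""
    C = np.asarray(C, dtype=float)
    G = np.asarray(G, dtype=float)
    if C.shape != G.shape:
        raise ValueError("C and G must have one entry per simulated segment")
    return float(np.dot(C, G))
```

With two simulated segments whose indexes are 1.2 and 0.8 and whose weights are 0.25 and 0.75, the evaluated index is 1.2 × 0.25 + 0.8 × 0.75 = 0.9.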
Fig. 6 is a schematic structural diagram of an index evaluation device 600 according to an embodiment of the present application. As shown in fig. 6, the index evaluation device 600 may include:
an obtaining unit 601, configured to obtain an instruction stream of a test program and divide the instruction stream into N instruction fragments with equal lengths;
a first determining unit 602, configured to determine a feature matrix a, a mean column vector B, and a weight vector W, where the feature matrix a is a matrix with M rows and N columns, and is used to describe feature indicator vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by a mean value of the feature indicator vectors in each row of the feature matrix a, and the weight vector W is a column vector formed by weights occupied by feature indicators of basic blocks BB of each row in the feature matrix a in a test program;
a second determining unit 603, configured to determine an optimized column vector according to the feature matrix A, the mean column vector B, the weight vector W, and an optimization goal, where the optimization goal includes reducing the number of instruction fragments used for simulation and/or reducing an error between the feature index evaluated for the test program and the feature index of the test program, and the optimized column vector is used to represent the instruction fragments used for simulation in the N instruction fragments and a weight corresponding to each instruction fragment;
a third determining unit 604, configured to determine a number vector and a coefficient vector of the instruction fragment corresponding to the optimized column vector, where the number vector is used to indicate a number of the instruction fragment corresponding to a non-zero position in the optimized column vector, and the coefficient vector is used to indicate a weight of a feature indicator of the instruction fragment corresponding to the non-zero position in the optimized column vector;
a simulation unit 605, configured to simulate the instruction segments corresponding to the number vector in a simulation device, respectively, to obtain a feature indicator vector of the instruction segment corresponding to the number vector, where the feature indicator vector is used to indicate a feature indicator of the instruction segment corresponding to the number vector;
a fourth determining unit 606, configured to determine an inner product of the feature indicator vector and the coefficient vector as a feature indicator for evaluating the test program.
In the embodiment of the application, a target optimization model of a test program is constructed based on a feature matrix, a column vector and a weight vector of an instruction segment of the test program, an optimized column vector is determined based on the target optimization model and an optimized target, an instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in a simulation device to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program are reduced.
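The inputs handled by the first determining unit can be illustrated with a short sketch. The mean column vector B follows directly from the definition (the mean of each row of A). The weight vector W is only described as the weight of each basic block type in the test program, so computing it as the share of each row of A in the total is an assumption, as are the function name and the reading of A[i][j] as the count of basic block type i in instruction segment j:

```python
import numpy as np

def build_model_inputs(A):
    """Derive B and W from an M x N feature matrix A.

    Assumed layout: A[i, j] is the count of basic block type i in segment j.
    """
    A = np.asarray(A, dtype=float)
    B = A.mean(axis=1)             # mean column vector: average of each row of A
    W = A.sum(axis=1) / A.sum()    # ASSUMED weighting: share of each BB type overall
    return B, W
```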
The index evaluation device 600 may also execute the method of fig. 2 and implement the functions of the index evaluation device in the embodiments shown in fig. 2, fig. 3, and fig. 5, which are not described herein again in this embodiment of the present application.
Fig. 7 is a schematic structural diagram of an index evaluation apparatus 700 according to an embodiment of the present application. The metric evaluation device 700 may include a channel interface 701, a processor 702, and a memory 703.
The channel interface 701, the processor 702 and the memory 703 are interconnected by a bus system 704. The bus 704 may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 7, but this does not indicate that there is only one bus or one type of bus.
The memory 703 is used for storing programs. In particular, the program may include program code comprising computer operating instructions. Memory 703 may include both read-only memory and random-access memory, and provides instructions and data to processor 702. The memory 703 may comprise high-speed RAM memory, and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
The processor 702 executes the program stored in the memory 703, and is specifically configured to perform the following operations:
acquiring an instruction stream of a test program and dividing the instruction stream into N instruction segments with equal length;
determining a feature matrix A, a mean column vector B and a weight vector W corresponding to the N instruction segments, wherein the feature matrix A is a matrix with M rows and N columns and is used for describing feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean value of the feature index vectors of each row of the feature matrix A, and the weight vector W is a column vector formed by the weight occupied by the feature index of the basic block BB type corresponding to each row in the feature matrix A in a test program;
determining an optimized column vector according to the feature matrix A, the mean column vector B, the weight vector W and an optimization target, wherein the optimization target comprises reducing the number of instruction fragments for simulation and/or reducing an error between the feature index evaluated for the test program and the feature index of the test program, and the optimized column vector is used for representing the instruction fragments for simulation in the N instruction fragments and the weight corresponding to each instruction fragment;
determining a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector, wherein the number vector is used for indicating the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used for indicating the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector;
simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain characteristic index vectors of the instruction segments corresponding to the number vectors, wherein the characteristic index vectors are used for expressing characteristic indexes of the instruction segments corresponding to the number vectors;
and determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
The method performed by the index evaluation apparatus disclosed in any one of fig. 2, fig. 3, and fig. 5 may be applied to the processor 702, or implemented by the processor 702. The processor 702 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 702 or by instructions in the form of software. The processor 702 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor 702 may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as a RAM, a flash memory, a ROM, a PROM, an EPROM, or a register. The storage medium is located in the memory 703, and the processor 702 reads the information in the memory 703 and performs the steps of the above method in combination with its hardware.
The index evaluation device 700 may also perform the method of fig. 2 and implement the functions of the index evaluation device in the embodiments shown in fig. 2, fig. 3, and fig. 5, which are not described herein again in this application.
The embodiment of the present application further provides a computer-readable storage medium for storing a computer program, where the computer program includes instructions for executing the method of the embodiment shown in fig. 2.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (20)
1. An index evaluation method, comprising:
acquiring an instruction stream of a test program and dividing the instruction stream into N instruction segments with equal length;
determining a feature matrix A, a mean column vector B and a weight vector W corresponding to the N instruction segments, wherein the feature matrix A is a matrix with M rows and N columns and is used for describing feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean value of the feature index vectors of each row of the feature matrix A, and the weight vector W is a column vector formed by the weight occupied by the feature index of the basic block BB type corresponding to each row in the feature matrix A in a test program;
determining an optimized column vector according to the feature matrix A, the mean column vector B, the weight vector W and an optimized target, wherein the optimized target comprises reducing the number of instruction fragments for simulation and/or reducing errors between characteristic indexes of the evaluation test program and the characteristic indexes of the test program, and the optimized column vector is used for representing the instruction fragments for simulation in the N instruction fragments and the weight corresponding to each instruction fragment;
determining a number vector and a coefficient vector of an instruction segment corresponding to the optimized column vector, wherein the number vector is used for representing the number of the instruction segment corresponding to a non-zero position in the optimized column vector, and the coefficient vector is used for representing the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector;
simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain characteristic index vectors of the instruction segments corresponding to the number vectors, wherein the characteristic index vectors are used for expressing characteristic indexes of the instruction segments corresponding to the number vectors;
and determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
2. The method of claim 1, wherein the optimization objective is: when the first norm of W·AX-W·B or the second norm of AX-B is less than or equal to a first threshold, minimizing a third norm of X, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
3. The method of claim 2, wherein the determining an optimized column vector based on the feature matrix a, the mean column vector B, the weight vector W, and an optimization objective comprises:
determining the optimized column vector by the following target optimization formula:
wherein ||W·AX-W·B||_t1 represents a first norm of W·AX-W·B, ||AX-B||_t2 represents a second norm of AX-B, ||X||_t3 represents a third norm of X, and c represents the first threshold.
4. The method of claim 1, wherein the optimization objective is:
when a first norm of X is smaller than or equal to a second threshold, minimizing the sum of a first product and a second product, wherein the first product is a product of a second norm of W.AX-W.B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, a value of the first coefficient is a non-zero real number, X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and a weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
5. The method of claim 4,
the determining an optimized column vector according to the feature matrix a, the mean column vector B, the weight vector W, and an optimization objective includes:
determining the optimized column vector by the following target optimization formula:
min λ||W·AX-W·B||_t2 + δ||X||_t3
s.t. ||X||_t1 ≤ d
wherein d represents the second threshold, ||X||_t1 represents a first norm of X, ||W·AX-W·B||_t2 represents a second norm of W·AX-W·B, ||X||_t3 represents a third norm of X, λ represents the first coefficient, δ represents the second coefficient, and λ ≠ 0.
6. The method of claim 1, wherein the optimization objective is:
minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is a product of a first coefficient and a first norm of X, the second product is a product of a second coefficient and a second norm of […], the third product is a product of a third coefficient and a third norm of […], the fourth product is a product of a fourth coefficient and a fourth norm of L, the fifth product is a product of a fifth coefficient and a fifth norm of S, A = L + S, L is a low-rank matrix, S is a sparse matrix, L̄ denotes a column vector formed by the average value of each row of the low-rank matrix L, S̄ denotes a column vector formed by the average value of each row of the sparse matrix S, and the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
7. The method of claim 1, wherein the optimization objective is:
minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product when a sixth norm of W·AX-W·B or a seventh norm of AX-B is less than or equal to a first threshold, and/or an eighth norm of X is less than or equal to a second threshold, wherein the first product is a product of a first coefficient and a first norm of X, the second product is a product of a second coefficient and a second norm of […], the third product is a product of a third coefficient and a third norm of […], the fourth product is a product of a fourth coefficient and a fourth norm of L, the fifth product is a product of a fifth coefficient and a fifth norm of S, A = L + S, L is a low-rank matrix, S is a sparse matrix, L̄ denotes a column vector formed by the average value of each row of the low-rank matrix L, S̄ denotes a column vector formed by the average value of each row of the sparse matrix S, and the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
8. The method of any of claims 1-7, wherein the determining an optimized column vector based on the feature matrix A, the mean column vector B, the weight vector W, and an optimization objective comprises:
step 1: establishing a target optimization model of the optimization target according to the feature matrix A, the mean column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f (X, Y, Z, mu) of the target optimization model, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, mu is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;
step 2: setting initial values of X, Y, Z and μ, and setting a convergence condition for completion of the optimization;
step 3: calculating an optimal value of Y by using a least square method;
step 4: calculating the optimal value of X by using a soft threshold method;
step 5: when the value of X does not meet the constraint condition of the target optimization model, eliminating one non-zero element of X and replacing X with the result, until X meets the constraint condition of the target optimization model;
step 6: updating the Lagrange multiplier Z according to the residual of the equation X = Y, and increasing the penalty coefficient μ by a fixed multiple ρ, wherein ρ is greater than 1;
step 7: judging whether the convergence condition is met; if so, determining that X is solved; if not, repeating steps 3 to 6.
9. The method of claim 8, wherein the reducing one non-zero element of X and replacing X comprises:
among the candidates obtained by eliminating exactly one non-zero element of X, when the X formed by eliminating a first non-zero element of X minimizes the function value of the target optimization model, eliminating the first non-zero element and replacing X with the result.
10. The method of claim 1, wherein the characteristic indicator comprises at least one of:
instructions per clock (IPC), branch prediction success rate, branch prediction failure rate, L2 Cache hit rate, and energy consumption.
11. An index evaluation apparatus of a processor, comprising:
an obtaining unit, configured to obtain an instruction stream of a test program and divide the instruction stream into N instruction fragments with equal length;
a first determining unit, configured to determine a feature matrix a, a mean column vector B, and a weight vector W, where the feature matrix a is a matrix with M rows and N columns, and is used to describe feature indicator vectors of the N instruction segments, M is a number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by a mean value of the feature indicator vectors in each row of the feature matrix a, and the weight vector W is a column vector formed by weights occupied by feature indicators of basic blocks BB types in each row of the feature matrix a in a test program;
a second determining unit, configured to determine an optimized column vector according to the feature matrix a, the mean column vector B, the weight vector W, and an optimization target, where the optimization target includes reducing the number of instruction fragments used for simulation and/or reducing an error between an evaluation result of the feature indicator of the test program and the feature indicator of the test program, and the optimized column vector is used to represent instruction fragments used for simulation in the N instruction fragments and a weight corresponding to each instruction fragment;
a third determining unit, configured to determine a number vector and a coefficient vector of an instruction segment corresponding to the optimized column vector, where the number vector is used to represent a number of an instruction segment corresponding to a non-zero position in the optimized column vector, and the coefficient vector is used to represent a weight of a feature index of the instruction segment corresponding to the non-zero position in the optimized column vector;
a simulation unit, configured to simulate, in a simulation device, each instruction segment identified by the number vector, to obtain a feature indicator vector, wherein the feature indicator vector represents the feature indicators of the instruction segments identified by the number vector;
and a fourth determining unit, configured to determine the inner product of the feature indicator vector and the coefficient vector as the evaluated feature indicator of the test program.
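The data flow through the units of claim 11 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the matrix values, the optimized vector `X`, the per-fragment IPC figures and all function names are hypothetical, and fragments are numbered from 0 for convenience.

```python
def mean_column_vector(A):
    """Mean of each row of the M x N feature matrix A (the vector B)."""
    return [sum(row) / len(row) for row in A]


def split_optimized_vector(X, tol=1e-12):
    """Return (number vector, coefficient vector) for the non-zero entries of X."""
    numbers = [i for i, x in enumerate(X) if abs(x) > tol]
    coeffs = [X[i] for i in numbers]
    return numbers, coeffs


def evaluate_indicator(numbers, coeffs, simulate):
    """Simulate only the selected fragments and return the inner product
    of their measured indicators with the coefficient vector."""
    measured = [simulate(i) for i in numbers]
    return sum(m * c for m, c in zip(measured, coeffs))


# Toy feature matrix: M = 2 basic-block types, N = 4 fragments (made-up values).
A = [[4.0, 0.0, 2.0, 2.0],
     [1.0, 3.0, 1.0, 3.0]]
B = mean_column_vector(A)

# Hypothetical optimized column vector: only fragments 2 and 3 are simulated.
# For this toy data A X equals B exactly.
X = [0.0, 0.0, 0.5, 0.5]
numbers, coeffs = split_optimized_vector(X)

# Hypothetical simulator output (for example IPC) for every fragment.
ipc_per_fragment = [1.2, 0.8, 1.0, 1.4]
estimate = evaluate_indicator(numbers, coeffs, lambda i: ipc_per_fragment[i])
print(numbers, coeffs, estimate)
```

Only the fragments listed in the number vector are passed to the simulator, which is where the claimed saving in simulation time comes from.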
12. The apparatus of claim 11, wherein the optimization objective is: minimizing a third norm of X when a first norm of W·AX − W·B, or a second norm of AX − B, is less than or equal to a first threshold, wherein X is an optimization variable representing the instruction segments used for simulation among the N instruction segments and the weight corresponding to each of them, and the optimized column vector is a value of X that meets the optimization objective.
13. The apparatus of claim 12, wherein the second determining unit is to determine the optimized column vector by the following target optimization formula:
wherein $\|W \cdot AX - W \cdot B\|_{t1}$ represents a first norm of $W \cdot AX - W \cdot B$, $\|AX - B\|_{t2}$ represents a second norm of $AX - B$, $\|X\|_{t3}$ represents a third norm of $X$, and $c$ represents the first threshold.
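The target optimization formula itself does not survive in this text. Read together with claim 12, it would presumably take the following form; treating the two constraints as alternatives is an assumption drawn from the word "or" in claim 12:

$$
\min_{X} \ \|X\|_{t3}
\quad \text{s.t.} \quad
\|W \cdot AX - W \cdot B\|_{t1} \le c
\ \ \text{or} \ \
\|AX - B\|_{t2} \le c
$$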
14. The apparatus of claim 11, wherein the optimization objective is:
when a first norm of X is smaller than or equal to a second threshold, minimizing the sum of a first product and a second product, wherein the first product is a product of a second norm of W.AX-W.B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, the value of the first coefficient is a non-zero real number, X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
15. The apparatus of claim 14, wherein the second determining unit is to determine the optimized column vector by the following target optimization formula:
wherein $d$ represents the second threshold, $\|X\|_{t1}$ represents a first norm of $X$, $\|W \cdot AX - W \cdot B\|_{t2}$ represents a second norm of $W \cdot AX - W \cdot B$, $\|X\|_{t3}$ represents a third norm of $X$, $\lambda$ represents the first coefficient, $\delta$ represents the second coefficient, and $\lambda \neq 0$.
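The formula referred to in claim 15 is likewise missing from this text. Based on the objective stated in claim 14, it would presumably read as follows (a reconstruction, not the patent's own rendering):

$$
\min_{X} \ \lambda \, \|W \cdot AX - W \cdot B\|_{t2} + \delta \, \|X\|_{t3}
\quad \text{s.t.} \quad
\|X\|_{t1} \le d, \qquad \lambda \neq 0
$$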
16. The apparatus of claim 11, wherein the optimization objective is:
minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [expression not reproduced], the third product is the product of a third coefficient and a third norm of [expression not reproduced], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S; A = L + S, where L is a low-rank matrix and S is a sparse matrix; [symbol not reproduced] denotes the column vector formed by the row means of the low-rank matrix L, and [symbol not reproduced] denotes the column vector formed by the row means of the sparse matrix S; the first, second, third, fourth and fifth coefficients are not all zero; X is an optimization variable representing the instruction segments used for simulation among the N instruction segments and the weight corresponding to each of them; and the optimized column vector is a value of X that meets the optimization objective.
17. The apparatus of claim 11, wherein the optimization objective is:
when a sixth norm of W·AX − W·B or a seventh norm of AX − B is less than or equal to a first threshold, and/or an eighth norm of X is less than or equal to a second threshold, minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [expression not reproduced], the third product is the product of a third coefficient and a third norm of [expression not reproduced], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S; A = L + S, where L is a low-rank matrix and S is a sparse matrix; [symbol not reproduced] denotes the column vector formed by the row means of the low-rank matrix L, and [symbol not reproduced] denotes the column vector formed by the row means of the sparse matrix S; the first, second, third, fourth and fifth coefficients are not all zero; X is an optimization variable representing the instruction segments used for simulation among the N instruction segments and the weight corresponding to each of them; and the optimized column vector is a value of X that meets the optimization objective.
18. The apparatus according to any of claims 11-17, wherein the second determining unit determines the optimized column vector from the feature matrix A, the mean column vector B, the weight vector W, and the optimization target specifically by:
step 1: establishing a target optimization model of the optimization target according to the feature matrix A, the mean column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f(X, Y, Z, μ) of the target optimization model, wherein X is an optimization variable representing the instruction segments used for simulation among the N instruction segments and the weight corresponding to each of them, μ is a penalty parameter, Z is a Lagrangian multiplier, and Y is a slack variable;
step 2: setting initial values of X, Y, Z and μ, and setting the optimization target as the convergence condition for completing the optimization;
step 3: calculating an optimal value of Y by using a least squares method;
step 4: calculating an optimal value of X by using a soft threshold method;
step 5: when the value of X does not meet the constraint condition of the target optimization model, removing one non-zero element from X and replacing X, until X meets the constraint condition of the target optimization model;
step 6: updating the Lagrangian multiplier Z according to the residual of the equation X = Y, and increasing the penalty parameter μ by a fixed multiple ρ, where ρ is greater than 1;
step 7: judging whether the convergence condition is met; if so, determining that X is solved; if not, repeating steps 3 to 6.
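Steps 1 to 7 resemble an ADMM-style loop, which can be illustrated as follows. The sketch is not the patent's implementation: it assumes one concrete model, δ‖X‖₁ + ½‖W∘(AY − B)‖₂² with the split X = Y, reads W·AX as an element-wise weighting of the rows, omits the constraint repair of step 5, and uses hypothetical names, data and parameter values throughout.

```python
def solve_linear(M, v):
    """Solve M y = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [[float(x) for x in row] + [float(v[i])] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][k] * y[k] for k in range(i + 1, n))
        y[i] = (aug[i][n] - s) / aug[i][i]
    return y


def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal operator of t * |v|)."""
    return max(abs(v) - t, 0.0) * (1.0 if v > 0 else -1.0)


def select_fragments(A, B, W, delta=0.05, mu=1.0, rho=1.05,
                     tol=1e-8, max_iter=1000):
    m, n = len(A), len(A[0])
    # Step 2: initial values of X, Y and Z (mu is passed in).
    X, Y, Z = [0.0] * n, [0.0] * n, [0.0] * n
    # Constant parts of the Y update: G = A^T diag(W^2) A, h = A^T diag(W^2) B.
    G = [[sum(W[i] ** 2 * A[i][p] * A[i][q] for i in range(m))
          for q in range(n)] for p in range(n)]
    h = [sum(W[i] ** 2 * A[i][p] * B[i] for i in range(m)) for p in range(n)]
    for _ in range(max_iter):
        # Step 3: least-squares update, (G + mu I) Y = h + Z + mu X.
        lhs = [[G[p][q] + (mu if p == q else 0.0) for q in range(n)]
               for p in range(n)]
        rhs = [h[p] + Z[p] + mu * X[p] for p in range(n)]
        Y = solve_linear(lhs, rhs)
        # Step 4: soft-threshold update of X.
        X = [soft_threshold(Y[p] - Z[p] / mu, delta / mu) for p in range(n)]
        # Step 6: multiplier update from the residual of X = Y, then grow mu.
        residual = [X[p] - Y[p] for p in range(n)]
        Z = [Z[p] + mu * residual[p] for p in range(n)]
        mu *= rho
        # Step 7: here convergence simply means X and Y agree.
        if max(abs(r) for r in residual) < tol:
            break
    return X


# Toy data: M = 2 basic-block types, N = 4 fragments (made-up values).
A = [[4.0, 0.0, 2.0, 2.0],
     [1.0, 3.0, 1.0, 3.0]]
B = [2.0, 2.0]   # row means of A
W = [0.5, 0.5]   # hypothetical weights
X = select_fragments(A, B, W)
print(X)
```

The non-zero positions of the returned `X` would play the role of the number vector, and the values at those positions the role of the coefficient vector.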
19. The apparatus of claim 18, wherein the second determining unit reduces one non-zero element in X and replaces X by:
among the candidate vectors obtained by removing exactly one non-zero element from X, when the candidate formed by removing a first non-zero element minimizes the function value of the target optimization model, removing the first non-zero element and replacing X with that candidate.
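The selection rule in claim 19 can be read as a greedy step: try zeroing each non-zero entry in turn and keep the candidate with the lowest objective value. The sketch below follows that reading; interpreting "reducing" an element as setting it to zero is an assumption, and the function name, the objective and the data are hypothetical.

```python
def drop_one_nonzero(X, objective):
    """Zero the single non-zero entry of X whose removal gives the
    lowest objective value, and return the resulting vector."""
    best, best_value = None, float("inf")
    for i, x in enumerate(X):
        if x == 0.0:
            continue
        candidate = X[:i] + [0.0] + X[i + 1:]
        value = objective(candidate)
        if value < best_value:
            best, best_value = candidate, value
    return best if best is not None else X


# Toy objective: squared error between A X and B (made-up values).
A = [[4.0, 0.0, 2.0, 2.0],
     [1.0, 3.0, 1.0, 3.0]]
B = [2.0, 2.0]


def squared_error(X):
    return sum((sum(a * x for a, x in zip(row, X)) - b) ** 2
               for row, b in zip(A, B))


reduced = drop_one_nonzero([0.1, 0.0, 0.4, 0.5], squared_error)
print(reduced)
```

In this toy case, dropping entry 0 leaves the smallest squared error, so that entry is the one removed.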
20. The apparatus of claim 11, wherein the characteristic indicator comprises at least one of:
instructions per clock (IPC), branch prediction success rate, branch prediction failure rate, L2 cache hit rate, and energy consumption.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610944177.6A CN108008999B (en) | 2016-11-02 | 2016-11-02 | Index evaluation method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108008999A (en) | 2018-05-08 |
CN108008999B (en) | 2021-07-20 |
Family
ID=62048196
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201610944177.6A, CN108008999B (en) | Expired - Fee Related | 2016-11-02 | 2016-11-02 | Index evaluation method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108008999B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023015560A1 (en) * | 2021-08-13 | 2023-02-16 | Huawei Technologies Co.,Ltd. | Systems and methods for sparsity-aware vector processing in general purpose cpus |
CN115543719B (en) * | 2022-11-24 | 2023-04-07 | 飞腾信息技术有限公司 | Component optimization method and device based on chip design, computer equipment and medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6751792B1 (en) * | 2000-10-04 | 2004-06-15 | Sun Microsystems, Inc. | Using value-expression graphs for data-flow optimizations |
CN103049310A (en) * | 2012-12-29 | 2013-04-17 | 中国科学院深圳先进技术研究院 | Multi-core simulation parallel accelerating method based on sampling |
CN104268085A (en) * | 2014-10-24 | 2015-01-07 | 重庆邮电大学 | Software vulnerability discovery system and method based on attribute extraction |
CN105654120A (en) * | 2015-12-25 | 2016-06-08 | 东南大学—无锡集成电路技术研究所 | Two-step cluster software load feature extraction method based on SOM and K-means |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9098352B2 (en) * | 2013-07-17 | 2015-08-04 | Deja Vu Security, Llc | Metaphor based language fuzzing of computer code |
Also Published As
Publication number | Publication date |
---|---|
CN108008999A (en) | 2018-05-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7549069B2 (en) | Estimating software power consumption | |
CN110178123B (en) | Performance index evaluation method and device | |
US20170132344A1 (en) | Micro-benchmark analysis optimization for microprocessor designs | |
CN109726764A (en) | A kind of model selection method, device, equipment and medium | |
CN108008999B (en) | Index evaluation method and device | |
Abate et al. | Adaptive aggregation of Markov chains: Quantitative analysis of chemical reaction networks | |
CN108681487B (en) | Distributed system and method for adjusting and optimizing sensor algorithm parameters | |
CN114004352A (en) | Simulation implementation method, neural network compiler and computer readable storage medium | |
CN114818600A (en) | Chip verification method and device, electronic equipment and storage medium | |
CN108304352B (en) | Accelerated life test sample distribution method and terminal equipment | |
CN111176253B (en) | Monte Carlo-based flight control law automatic evaluation method | |
CN116149917A (en) | Method and apparatus for evaluating processor performance, computing device, and readable storage medium | |
CN107769987B (en) | Message forwarding performance evaluation method and device | |
JP5153724B2 (en) | Processing time estimation device and processing time estimation program | |
CN117272896A (en) | Machine learning techniques for circuit design verification | |
CN108664368B (en) | Processor performance index evaluation method and device | |
JP6239195B2 (en) | Performance evaluation apparatus and performance evaluation program | |
Madougou et al. | Using colored petri nets for GPGPU performance modeling | |
CN109815104B (en) | GPGPU program approximate analysis system and method based on soft error perception | |
EP3518153A1 (en) | Information processing method and information processing system | |
CN113988493A (en) | Method and device for constructing terminal evaluation model and storage medium | |
CN115543719B (en) | Component optimization method and device based on chip design, computer equipment and medium | |
CN112602059A (en) | Generating vector predicate digests | |
CN110750856B (en) | Effective instruction window size assessment method based on machine learning | |
CN108628731B (en) | Method for selecting test instruction and processing equipment |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20210720 |