CN108008999A - Index evaluating method and device - Google Patents


Info

Publication number
CN108008999A
Authority
CN
China
Prior art keywords
product
coefficient
norm
optimization
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610944177.6A
Other languages
Chinese (zh)
Other versions
CN108008999B (en)
Inventor
程捷
刘旭东
项定义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201610944177.6A
Publication of CN108008999A
Application granted
Publication of CN108008999B
Legal status: Expired - Fee Related

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45504Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
    • G06F9/45508Runtime interpretation or emulation, e g. emulator loops, bytecode interpretation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3688Test management for test execution, e.g. scheduling of test suites

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Devices For Executing Special Programs (AREA)
  • Advance Control (AREA)

Abstract

Embodiments of the present application provide an index evaluation method and device. The method includes: obtaining the instruction stream of a test program and dividing it into N instruction segments of equal length; determining the feature matrix A, the mean column vector B and the weight vector W corresponding to the N instruction segments; determining a target optimization model of the test program according to the feature matrix A, the column vector B, the weight vector W and multiple optimization targets, and solving for the optimized column vector of the target optimization model; determining the number vector and the coefficient vector of the instruction segments corresponding to the optimized column vector; simulating the instruction segments corresponding to the number vector in a simulator to obtain the feature index vector of those instruction segments; and determining the inner product of the feature index vector and the coefficient vector as the feature index for evaluating the test program.

Description

Index evaluation method and device
Technical Field
The embodiment of the application relates to the field of computers, and more particularly to an index evaluation method and device.
Background
Personnel involved in the design and development of processor architectures often need to run test programs in a simulator of a given architecture and collect performance-related indexes, such as Instructions Per Clock (IPC), the hit rate of the second-level cache (L2 Cache), and energy consumption, in order to find the bottleneck of the current processor architecture. After the architecture is improved, the design is redeployed in the simulator, the test program is run again, data are collected, and the performance of the same test program under the new and old architectures is compared to find the next bottleneck. A great deal of design and test work is therefore done with a software simulator before deployment in hardware.
However, a major disadvantage of the software simulation platform is that the same test program takes far longer to run than on a hardware platform. Especially when running large, comprehensive test suites such as SPEC CPU 2006, it often takes weeks or even months to obtain the required data. Moreover, every time the architecture is changed, the test program must be run again to collect data under the new architecture. Such repeated runs and waiting seriously affect development efficiency.
Research shows that the execution of an application program exhibits distinct phase behavior. Developers can therefore shorten the simulation time by reducing the test program. The core idea is as follows: by running the original test program only once, a user obtains a group of artificial programs, each of which is very small and can be simulated quickly, so that simulating the whole group takes far less time than simulating the original test program; the user then multiplies the data obtained from simulating each artificial program by a specific weight, and the resulting weighted average is very close to the data obtained by running the original test program.
In the existing SimPoint technique, the steps are as follows: capture the instruction stream of a test program component, cut it into segments (Intervals) of equal length, and compute the Basic Block Vector (BBV) of each Interval; perform K-Means clustering on all BBVs and select the Interval closest to the center of each cluster as the representative Interval of that cluster; simulate the representative Intervals to measure their indexes, then multiply the index measured for each cluster's representative Interval by the proportion of that cluster's Intervals among all Intervals, that is, compute a weighted average that represents the index of the original test program component. This prior-art simulation test selects too many segments, takes a long time to simulate, and has uncontrollable error.
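The weighted average in the last SimPoint step can be sketched as follows. This is a minimal illustration under stated assumptions, not SimPoint's actual code: the cluster labels, the chosen representatives and the measured values are made-up inputs, and the function name `simpoint_estimate` is hypothetical.

```python
from collections import Counter


def simpoint_estimate(labels, representatives, measured):
    """Estimate a whole-program index from one representative Interval per cluster.

    labels[i]          -- cluster id of Interval i (e.g. from K-Means on the BBVs)
    representatives[c] -- number of the Interval chosen to represent cluster c
    measured[j]        -- index (e.g. IPC) obtained by simulating Interval j
    """
    total = len(labels)
    sizes = Counter(labels)  # number of Intervals in each cluster
    estimate = 0.0
    for cluster, size in sizes.items():
        weight = size / total  # share of all Intervals that fall in this cluster
        estimate += weight * measured[representatives[cluster]]
    return estimate


# Hypothetical data: 6 Intervals in 2 clusters; only Intervals 1 and 4 are simulated.
labels = [0, 0, 0, 0, 1, 1]
representatives = {0: 1, 1: 4}
measured = {1: 1.2, 4: 0.6}
print(simpoint_estimate(labels, representatives, measured))
```

Only the representative Intervals need to be simulated; every other Interval contributes solely through the size of its cluster.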
How to reduce the instruction fragments selected in the simulation test to shorten the simulation time and/or reduce the error of the simulation index is a technical problem to be solved by the embodiment of the present application.
Disclosure of Invention
The embodiment of the application provides an index evaluation method and device, which can reduce instruction segments selected in a simulation test to shorten simulation time and reduce simulation index errors, so that the simulation efficiency or precision of a processor can be improved.
In a first aspect, an index evaluation method is provided, including:
acquiring an instruction stream of a test program and dividing the instruction stream into N instruction segments with equal length;
determining a feature matrix A, a mean column vector B and a weight vector W corresponding to the N instruction segments, wherein the feature matrix A is a matrix with M rows and N columns and is used for describing feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean value of the feature index vectors of each row of the feature matrix A, and the weight vector W is a column vector formed by the weight occupied by the feature index of the basic block BB type corresponding to each row in the feature matrix A in a test program;
determining an optimized column vector according to the feature matrix A, the mean column vector B, the weight vector W and an optimization target, wherein the optimization target comprises reducing the number of instruction segments used for simulation and/or reducing the error between the evaluated feature index of the test program and the actual feature index of the test program, and the optimized column vector is used for representing the instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment;
determining a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector, wherein the number vector is used for indicating the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used for indicating the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector;
simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain characteristic index vectors of the instruction segments corresponding to the number vectors, wherein the characteristic index vectors are used for expressing characteristic indexes of the instruction segments corresponding to the number vectors;
and determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
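The inputs A, B and W named in the second step can be sketched as follows. This is only an illustration under assumptions that the text does not fix: each instruction segment is given as a list of basic-block identifiers, an entry of A counts how often a basic-block type occurs in a segment, and W is taken as each type's share of all executed blocks. The function name `build_model_inputs` is hypothetical.

```python
def build_model_inputs(segments):
    """Build A (M x N), B (row means of A) and W from N equal-length segments."""
    block_types = sorted({bb for seg in segments for bb in seg})
    row_of = {bb: i for i, bb in enumerate(block_types)}
    m, n = len(block_types), len(segments)

    # A[i][j]: occurrences of basic-block type i in segment j (assumed feature).
    A = [[0.0] * n for _ in range(m)]
    for j, seg in enumerate(segments):
        for bb in seg:
            A[row_of[bb]][j] += 1.0

    # B[i]: mean of row i of A, as described for the mean column vector.
    B = [sum(row) / n for row in A]

    # W[i]: share of block type i in the whole program (assumed weighting).
    total = sum(sum(row) for row in A)
    W = [sum(row) / total for row in A]
    return block_types, A, B, W


# Hypothetical instruction stream cut into N = 3 segments of length 3.
segments = [["bb0", "bb0", "bb1"], ["bb1", "bb2", "bb2"], ["bb0", "bb1", "bb1"]]
block_types, A, B, W = build_model_inputs(segments)
print(block_types, A, B, W)
```

With this construction, choosing every segment with weight 1/N reproduces B exactly, which is why a sparse X with AX close to B can stand in for the full program.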
In the embodiment of the application, an optimized column vector is determined based on a feature matrix, a column vector, a weight vector and an optimized target of an instruction segment of a test program, the instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in the simulation device to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program are reduced.
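The last three steps (extracting the number vector and coefficient vector, simulating the selected segments, and taking the inner product) can be sketched as follows. The optimized vector and the per-segment IPC values are made-up, the simulator is replaced by a lookup table, and the function names are hypothetical.

```python
def split_optimized_vector(x, eps=1e-12):
    """Return (number vector, coefficient vector) for the non-zero positions of x."""
    numbers = [j for j, v in enumerate(x) if abs(v) > eps]
    coefficients = [x[j] for j in numbers]
    return numbers, coefficients


def evaluate_index(x, simulate):
    """Simulate only the selected segments and combine the results by inner product."""
    numbers, coefficients = split_optimized_vector(x)
    indexes = [simulate(j) for j in numbers]  # feature index vector
    return sum(c * v for c, v in zip(coefficients, indexes))


# Hypothetical optimized column vector over N = 6 segments: only 1 and 4 are kept.
x_opt = [0.0, 0.5, 0.0, 0.0, 0.5, 0.0]
fake_ipc = {1: 1.2, 4: 0.8}  # stand-in for simulator output
print(split_optimized_vector(x_opt))
print(evaluate_index(x_opt, lambda j: fake_ipc[j]))
```

The simulation cost is proportional to the number of non-zero entries of the optimized vector, which is what the optimization target tries to keep small.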
With reference to the first aspect, in a first possible implementation manner, the optimization goal is: minimizing a third norm of X subject to a first norm of W∘AX − W∘B, or a second norm of AX − B, being less than or equal to a first threshold, wherein X is an optimization variable used for representing instruction fragments used for simulation in the N instruction fragments and the weight corresponding to each instruction fragment, and the optimized column vector is the value of X that satisfies the optimization goal.
The method of the embodiment of the application can reduce the number of instruction fragments required by simulation work of the test program on the premise of ensuring the error precision of the simulation index.
With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, determining an optimized column vector according to the feature matrix A, the column vector B, the weight vector W, and an optimization goal is specifically implemented by determining the optimized column vector through the following target optimization formula:
$\min_X \|X\|_{t_3} \quad \text{s.t.} \quad \|W \circ AX - W \circ B\|_{t_1} \le c \ \text{ or } \ \|AX - B\|_{t_2} \le c$
wherein $\|W \circ AX - W \circ B\|_{t_1}$ represents the first norm of W∘AX − W∘B, $\|AX - B\|_{t_2}$ represents the second norm of AX − B, $\|X\|_{t_3}$ represents the third norm of X, and c represents the first threshold.
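The error constraint of this implementation manner can be checked numerically as sketched below. The text leaves the norms t1, t2 and t3 open, so the choice of p here is an assumption, as is the convention of treating p = 0 as a count of non-zero entries (a common sparsity measure). The function names and the sample data are hypothetical.

```python
def weighted_residual(A, B, W, X):
    """Return the vector W∘(AX) − W∘B, with ∘ taken element by element."""
    return [w * (sum(a * x for a, x in zip(row, X)) - b)
            for row, b, w in zip(A, B, W)]


def p_norm(v, p):
    """p-norm of v; p = 0 counts non-zero entries, p = inf takes the maximum."""
    if p == 0:
        return float(sum(1 for e in v if e != 0))
    if p == float("inf"):
        return max(abs(e) for e in v)
    return sum(abs(e) ** p for e in v) ** (1.0 / p)


def is_feasible(A, B, W, X, c, p=2):
    """True when the weighted error norm is within the first threshold c."""
    return p_norm(weighted_residual(A, B, W, X), p) <= c


# Hypothetical data: 3 basic-block types, 3 segments, B holds the row means of A.
A = [[2.0, 0.0, 1.0], [1.0, 1.0, 2.0], [0.0, 2.0, 0.0]]
B = [1.0, 4.0 / 3.0, 2.0 / 3.0]
W = [1.0, 1.0, 1.0]
print(is_feasible(A, B, W, [1 / 3, 1 / 3, 1 / 3], c=0.5))  # all segments, equal weight
print(is_feasible(A, B, W, [0.0, 0.0, 1.0], c=0.5))        # only the last segment
print(p_norm([0.0, 0.0, 1.0], 0))                          # segments to simulate
```

Among all X that pass `is_feasible`, the optimization goal prefers the one with the smallest norm of X, that is, the fewest or lightest selected segments.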
With reference to the first aspect, in a third possible implementation manner of the first aspect, the optimization objective is: minimizing a sum of a first product and a second product subject to a first norm of X being less than or equal to a second threshold, where the first product is the product of a second norm of W∘AX − W∘B and a first coefficient, the second product is the product of a third norm of X and a second coefficient, and the first coefficient is a non-zero real number, where X is an optimization variable used to represent instruction fragments for performing simulation among the N instruction fragments and a weight corresponding to each instruction fragment, and the optimized column vector is the value of X that satisfies the optimization goal.
The method of the embodiment of the application can reduce the simulation index error on the premise of controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the third possible implementation manner of the first aspect, in a fourth possible implementation manner of the first aspect, determining an optimized column vector according to the feature matrix A, the column vector B, the weight vector W, and an optimization goal is specifically implemented by determining the optimized column vector through the following target optimization formula:
$\min_X \ \lambda \|W \circ AX - W \circ B\|_{t_2} + \delta \|X\|_{t_3} \quad \text{s.t.} \quad \|X\|_{t_1} \le d$
wherein d represents the second threshold, $\|X\|_{t_1}$ represents the first norm of X, $\|W \circ AX - W \circ B\|_{t_2}$ represents the second norm of W∘AX − W∘B, $\|X\|_{t_3}$ represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, and λ ≠ 0.
With reference to the first aspect, in a fifth possible implementation manner of the first aspect, the optimization objective is: minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [expression missing in source], the third product is the product of a third coefficient and a third norm of [expression missing in source], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S; A = L + S, where L is a low-rank matrix and S is a sparse matrix; [symbol missing in source] represents the column vector formed by the average values of the rows of the low-rank matrix L, and [symbol missing in source] represents the column vector formed by the average values of the rows of the sparse matrix S; the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero; X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimized column vector is the value of X that satisfies the optimization target.
The method of the embodiment of the application can achieve certain balance between reduction of the number of the instruction fragments required by simulation work of the test program and reduction of simulation index errors.
With reference to the first aspect, in a sixth possible implementation manner of the first aspect, the optimization objective is: minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product subject to a sixth norm of W∘AX − W∘B or a seventh norm of AX − B being less than or equal to a first threshold, and/or an eighth norm of X being less than or equal to a second threshold, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [expression missing in source], the third product is the product of a third coefficient and a third norm of [expression missing in source], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S; A = L + S, where L is a low-rank matrix and S is a sparse matrix; [symbol missing in source] represents the column vector formed by the average values of the rows of the low-rank matrix L, and [symbol missing in source] represents the column vector formed by the average values of the rows of the sparse matrix S; the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero.
The method of the embodiment of the application can balance reduction of the number of the instruction segments required by simulation work of the test program and reduction of simulation index errors on the premise of ensuring precision of the simulation index errors and/or controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the first aspect and the foregoing implementation manner, in a seventh possible implementation manner of the first aspect, determining an optimized column vector according to the feature matrix A, the column vector B, the weight vector W, and an optimization goal is specifically implemented as:
step 1: establishing a target optimization model of the optimization target according to the feature matrix A, the column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f(X, Y, Z, μ) of the target optimization model, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, μ is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;
step 2: setting initial values of X, Y, Z and μ, and setting the optimization target as the convergence condition for completing optimization;
step 3: calculating the optimal value of Y by using a least squares method;
step 4: calculating the optimal value of X by using a soft threshold method;
step 5: when the value of X does not meet the constraint condition of the target optimization model, removing one non-zero element from X and replacing X, until X meets the constraint condition of the target optimization model;
step 6: updating the Lagrangian multiplier Z according to the residual of the equation X = Y, and increasing the penalty parameter μ by a fixed multiple ρ, where ρ > 1;
step 7: judging whether the convergence condition is met; if so, determining that X is solved; if not, repeating steps 3 to 6.
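Steps 2, 3, 4, 6 and 7 can be sketched as the following loop. This is a sketch under assumptions, not the patented solver: the text leaves the norms open, so a squared 2-norm error term and a 1-norm sparsity term are assumed, giving the augmented Lagrangian f = (λ/2)‖W∘(AY − B)‖² + δ‖X‖₁ + Zᵀ(X − Y) + (μ/2)‖X − Y‖². Step 5 (removing non-zero elements to satisfy a hard constraint) is omitted, and all function names and default parameter values are hypothetical.

```python
def soft_threshold(v, t):
    """Proximal operator of t*|v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def solve_linear(M, r):
    """Solve M y = r by Gaussian elimination with partial pivoting."""
    n = len(r)
    aug = [M[i][:] + [r[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - s) / aug[i][i]
    return y


def solve_sparse_weights(A, B, W, lam=2.0, delta=1.0, mu=1.0, rho=1.05,
                         max_iter=500, tol=1e-9):
    """Minimize (lam/2)*||W∘(AY - B)||^2 + delta*||X||_1 subject to X = Y."""
    m, n = len(A), len(A[0])
    # Constant parts of the least-squares system: lam*A^T diag(W)^2 A and lam*A^T diag(W)^2 B.
    gram = [[lam * sum(W[k] ** 2 * A[k][i] * A[k][j] for k in range(m))
             for j in range(n)] for i in range(n)]
    rhs0 = [lam * sum(W[k] ** 2 * A[k][i] * B[k] for k in range(m))
            for i in range(n)]
    X, Y, Z = [0.0] * n, [0.0] * n, [0.0] * n  # step 2: initial values
    for _ in range(max_iter):
        # Step 3: Y solves (gram + mu*I) Y = rhs0 + Z + mu*X (a least-squares problem).
        M = [[gram[i][j] + (mu if i == j else 0.0) for j in range(n)]
             for i in range(n)]
        Y = solve_linear(M, [rhs0[i] + Z[i] + mu * X[i] for i in range(n)])
        # Step 4: X by element-wise soft thresholding.
        X_new = [soft_threshold(Y[i] - Z[i] / mu, delta / mu) for i in range(n)]
        change = max(abs(a - b) for a, b in zip(X_new, X))
        X = X_new
        # Step 6: update the multiplier from the residual of X = Y, then grow mu.
        Z = [Z[i] + mu * (X[i] - Y[i]) for i in range(n)]
        mu *= rho
        # Step 7: stop once X agrees with Y and has stopped moving.
        if max(abs(x - y) for x, y in zip(X, Y)) < tol and change < tol:
            break
    return X


# Hypothetical 3 x 3 example; B holds the row means of A.
A = [[2.0, 0.0, 1.0], [1.0, 1.0, 2.0], [0.0, 2.0, 0.0]]
B = [1.0, 4.0 / 3.0, 2.0 / 3.0]
print(solve_sparse_weights(A, B, [1.0, 1.0, 1.0], delta=0.1))
```

Splitting the variable into X and Y is what lets each sub-problem have a closed form: the smooth error term only sees Y, and the non-smooth sparsity term only sees X.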
With reference to the first aspect and the foregoing implementation manner of the first aspect, in an eighth possible implementation manner of the first aspect, removing one non-zero element from X and replacing X is specifically implemented as: among all candidates obtained by removing exactly one non-zero element from X, when the X formed by removing a first non-zero element minimizes the function value of the target optimization model, removing the first non-zero element and replacing X with the result.
With reference to the first aspect and the foregoing implementation manner of the first aspect, in a ninth possible implementation manner of the first aspect, the characteristic indicator includes at least one of the following: IPC (instructions per clock), branch prediction success rate, branch prediction failure rate, L2 Cache hit rate, and energy consumption.
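The element-removal rule of the eighth implementation manner can be sketched as a greedy choice: try zeroing each non-zero entry in turn and keep the candidate with the smallest objective value. The objective is passed in as a plain function because the text does not fix one; `drop_one_nonzero` and the sample objective are hypothetical.

```python
def drop_one_nonzero(X, objective):
    """Zero exactly one non-zero entry of X, choosing the one that minimizes objective."""
    best, best_value = None, None
    for j, v in enumerate(X):
        if v == 0:
            continue  # only non-zero entries are candidates for removal
        candidate = X[:j] + [0.0] + X[j + 1:]
        value = objective(candidate)
        if best_value is None or value < best_value:
            best, best_value = candidate, value
    return best if best is not None else X[:]  # nothing to remove: return a copy


# Hypothetical objective: squared distance to a target vector.
target = [1.0, 0.0, 3.0, 0.0]


def distance(x):
    return sum((a - t) ** 2 for a, t in zip(x, target))


print(drop_one_nonzero([1.0, 0.0, 3.0, 0.5], distance))
```

Calling the function repeatedly until the constraint holds gives the loop described in step 5; each call costs one objective evaluation per remaining non-zero entry.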
In a second aspect, an index evaluation apparatus is provided, including:
an acquisition unit, configured to acquire an instruction stream of a test program and divide the instruction stream into N instruction fragments with equal length;
a first determining unit, configured to determine a feature matrix A, a mean column vector B, and a weight vector W, where the feature matrix A is a matrix with M rows and N columns, and is used to describe feature indicator vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean value of the feature indicator vectors in each row of the feature matrix A, and the weight vector W is a column vector formed by the weights occupied in the test program by the feature indicators of the basic block BB type corresponding to each row of the feature matrix A;
a second determining unit, configured to determine an optimized column vector according to the feature matrix A, the column vector B, the weight vector W, and an optimization target, where the optimization target includes reducing the number of instruction fragments used for simulation and/or reducing the error between the evaluated feature index of the test program and the actual feature index of the test program, and the optimized column vector is used to represent instruction fragments used for simulation in the N instruction fragments and a weight corresponding to each instruction fragment;
a third determining unit, configured to determine a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector, where the number vector is used to indicate a number of the instruction segment corresponding to a non-zero position in the optimized column vector, and the coefficient vector is used to indicate a weight of a feature indicator of the instruction segment corresponding to the non-zero position in the optimized column vector;
a simulation unit, configured to simulate the instruction segments corresponding to the number vector in a simulation device respectively to obtain the feature index vector of the instruction segments corresponding to the number vector, where the feature index vector is used to express the feature indexes of the instruction segments corresponding to the number vector;
and a fourth determining unit, configured to determine the inner product of the feature index vector and the coefficient vector as the feature index for evaluating the test program.
In the embodiment of the application, an optimized column vector is determined based on a feature matrix, a column vector, a weight vector and an optimized target of an instruction segment of a test program, the instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in the simulation device to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program are reduced.
With reference to the second aspect, in a first possible implementation manner, the optimization goal is: minimizing a third norm of X subject to a first norm of W∘AX − W∘B, or a second norm of AX − B, being less than or equal to a first threshold, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimized column vector is the value of X that satisfies the optimization target.
The method of the embodiment of the application can reduce the number of instruction fragments required by simulation work of the test program on the premise of ensuring the error precision of the simulation index.
With reference to the first possible implementation manner of the second aspect, in a second possible implementation manner of the second aspect, the second determining unit specifically determines the optimized column vector by using the following target optimization formula:
$\min_X \|X\|_{t_3} \quad \text{s.t.} \quad \|W \circ AX - W \circ B\|_{t_1} \le c \ \text{ or } \ \|AX - B\|_{t_2} \le c$
wherein $\|W \circ AX - W \circ B\|_{t_1}$ represents the first norm of W∘AX − W∘B, $\|AX - B\|_{t_2}$ represents the second norm of AX − B, $\|X\|_{t_3}$ represents the third norm of X, and c represents the first threshold.
With reference to the second aspect, in a third possible implementation manner of the second aspect, the optimization goal is: minimizing a sum of a first product and a second product subject to a first norm of X being less than or equal to a second threshold, where the first product is the product of a second norm of W∘AX − W∘B and a first coefficient, the second product is the product of a third norm of X and a second coefficient, and the first coefficient is a non-zero real number, where X is an optimization variable used to represent instruction fragments for performing simulation among the N instruction fragments and a weight corresponding to each instruction fragment, and the optimized column vector is the value of X that satisfies the optimization goal.
The method of the embodiment of the application can reduce the simulation index error on the premise of controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the third possible implementation manner of the second aspect, in a fourth possible implementation manner of the second aspect, the second determining unit specifically determines the optimized column vector through the following target optimization formula:
$\min_X \ \lambda \|W \circ AX - W \circ B\|_{t_2} + \delta \|X\|_{t_3} \quad \text{s.t.} \quad \|X\|_{t_1} \le d$
wherein d represents the second threshold, $\|X\|_{t_1}$ represents the first norm of X, $\|W \circ AX - W \circ B\|_{t_2}$ represents the second norm of W∘AX − W∘B, $\|X\|_{t_3}$ represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, and λ ≠ 0.
With reference to the second aspect, in a fifth possible implementation manner of the second aspect, the optimization goal is: minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [expression missing in source], the third product is the product of a third coefficient and a third norm of [expression missing in source], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S; A = L + S, where L is a low-rank matrix and S is a sparse matrix; [symbol missing in source] represents the column vector formed by the average values of the rows of the low-rank matrix L, and [symbol missing in source] represents the column vector formed by the average values of the rows of the sparse matrix S; the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero; X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimized column vector is the value of X that satisfies the optimization target.
The method of the embodiment of the application can achieve certain balance between reduction of the number of the instruction fragments required by simulation work of the test program and reduction of simulation index errors.
With reference to the second aspect, in a sixth possible implementation manner of the second aspect, the optimization goal is: minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product subject to a sixth norm of W∘AX − W∘B or a seventh norm of AX − B being less than or equal to a first threshold, and/or an eighth norm of X being less than or equal to a second threshold, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [expression missing in source], the third product is the product of a third coefficient and a third norm of [expression missing in source], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S; A = L + S, where L is a low-rank matrix and S is a sparse matrix; [symbol missing in source] represents the column vector formed by the average values of the rows of the low-rank matrix L, and [symbol missing in source] represents the column vector formed by the average values of the rows of the sparse matrix S; the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero.
The method of the embodiment of the application can balance reduction of the number of the instruction segments required by simulation work of the test program and reduction of simulation index errors on the premise of ensuring precision of the simulation index errors and/or controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the second aspect and the foregoing implementation manner of the second aspect, in a seventh possible implementation manner of the second aspect, the second determining unit is specifically implemented as:
step 1: establishing a target optimization model of the optimization target according to the feature matrix A, the column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f(X, Y, Z, μ) of the target optimization model, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, μ is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;
step 2: setting initial values of X, Y, Z and μ, and setting the optimization target as the convergence condition for completing optimization;
step 3: calculating the optimal value of Y by using a least squares method;
step 4: calculating the optimal value of X by using a soft threshold method;
step 5: when the value of X does not satisfy the constraint condition of the target optimization model, removing one non-zero element from X and replacing X, until X satisfies the constraint condition of the target optimization model;
step 6: updating the Lagrangian multiplier Z according to the residual of the equation X = Y, and increasing the penalty parameter μ by a fixed multiple ρ, where ρ > 1;
step 7: judging whether the convergence condition is met; if so, determining that X is solved; if not, repeating steps 3 to 6.
With reference to the second aspect and the foregoing implementation manner of the second aspect, in an eighth possible implementation manner of the second aspect, removing one non-zero element from X and replacing X is specifically implemented as: among all candidates obtained by removing exactly one non-zero element from X, when the X formed by removing a first non-zero element minimizes the function value of the target optimization model, removing the first non-zero element and replacing X with the result.
With reference to the second aspect and the foregoing implementation manner of the second aspect, in a ninth possible implementation manner of the second aspect, the characteristic index includes at least one of the following: IPC (instructions per clock), branch prediction success rate, branch prediction failure rate, L2 Cache hit rate, and energy consumption.
In a third aspect, there is provided another index evaluation apparatus, including a processor and a channel interface, where the processor is configured to execute the method in the first aspect or any possible implementation manner of the first aspect through the channel interface.
In a fourth aspect, a computer-readable storage medium is presented for storing a computer program comprising instructions for performing the method of the first aspect or any possible implementation manner of the first aspect.
Based on the above technical solutions, the index evaluation method and apparatus in the embodiments of the present application construct a target optimization model of a test program based on a feature matrix, a column vector, and a weight vector of an instruction segment of the test program, determine an optimized column vector based on the target optimization model and an optimization target, determine an instruction segment and a coefficient vector corresponding to a non-zero position according to the optimized column vector, simulate the instruction segment corresponding to the optimized column vector in the simulation apparatus to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and obtain a feature index for evaluating the test program according to the feature index vector and the coefficient vector, thereby reducing the number of instruction segments required for simulation work of the test program or reducing a simulation index error of the test program.
Drawings
FIG. 1 is a schematic diagram of instruction codes according to an embodiment of the present application.
FIG. 2 is a flowchart of an index evaluation method according to an embodiment of the present application.
FIG. 3 is an interaction flow diagram of an index evaluation method according to an embodiment of the present application.
FIG. 4 is a schematic diagram of a target optimization model of an embodiment of the present application.
FIG. 5 is a flow diagram of a method of determining an optimized column vector according to one embodiment of the present application.
Fig. 6 is a schematic structural diagram of an index evaluation device according to an embodiment of the present application.
Fig. 7 is a schematic structural diagram of an index evaluation device according to another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings.
To facilitate understanding of the embodiments of the present application, several elements that will be introduced in the description of the embodiments of the present application are first introduced herein.
Instruction stream file: a file that records instruction stream information is called an instruction stream file. Each line of the file describes one executed instruction in a uniform format. Typically, the size of an instruction stream file is fixed. For example, an instruction stream file generally contains information on 100 million instructions, and a complete test program can be regarded as an instruction stream composed of a plurality of such 100-million-instruction units, each of which is referred to as an interval (Interval); in other words, a complete test program is composed of a plurality of segments. The full set of the test program is all of its segments, and a subset of the test program is some of its segments. The purpose of simplifying the test program is to select representative segments from the full set: the fewer segments selected the better, and the closer the running result of the selected segments is to the running result of the original test program the better. An instruction may include the following information: program pointer, assembly instruction, operation type, and memory address. The memory address is optional.
Program pointer: the program pointer of each instruction line is the memory address of the assembly instruction on that line, a hexadecimal number beginning with "0x".
Assembly instruction: the binary instruction code of the instruction, which must meet the assembly syntax requirements.
Operation type: all assembly instructions can be divided into three categories: arithmetic logic unit operations, memory read/write operations, and control instructions.
Memory address: if the instruction is an arithmetic logic unit operation, no memory address information is needed; if the instruction is a memory read-write operation, a memory address is required.
Basic Block (BB): a piece of sequentially executed instructions. In general, an instruction stream may be divided into a plurality of BBs with a control instruction as a boundary. Each segment of the test program consists of BB.
Basic Block feature index Vector (BBV): according to different control instructions, the execution times of BB of different types in each segment are counted, and a vector constructed based on the types and the execution times of the BB is called BBV. FIG. 1 is a schematic diagram of instruction code according to an embodiment of the present application. In the instruction code fragment shown in FIG. 1, the type ID of BB is {1,2,3,4,5}, and the corresponding execution times are {1,20,0,5,0}, so that BBV of the fragment can be recorded as {1, 2.
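As a minimal sketch of the counting described above, the following Python fragment derives the per-type execution counts of one segment from the sequence of BB type IDs executed in it. The function name bbv_counts and the sample trace are illustrative assumptions; the trace is chosen so that the counts match the execution times {1,20,0,5,0} stated for FIG. 1.

```python
from collections import Counter

def bbv_counts(bb_trace, bb_type_ids):
    """Return the execution count of each BB type, in the order of bb_type_ids."""
    counts = Counter(bb_trace)
    # Counter returns 0 for BB types that never appear in the trace.
    return [counts[t] for t in bb_type_ids]

# Hypothetical trace: BB 1 runs once, BB 2 twenty times, BB 4 five times.
trace = [1] + [2] * 20 + [4] * 5
print(bbv_counts(trace, [1, 2, 3, 4, 5]))
```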
FIG. 2 is a flowchart of an index evaluation method according to an embodiment of the present application. The method of FIG. 2 is performed by an index evaluation device of a processor. Specifically, the index evaluation device may be a host apparatus or the like.
201, obtaining an instruction stream of a test program, and dividing the instruction stream into N instruction fragments of equal length.
202, determining a feature matrix A, a mean column vector B, and a weight vector W corresponding to the N instruction fragments.
The feature matrix A is a matrix with M rows and N columns and is used for describing the feature index vectors of the N instruction segments, where M is the number of types of basic blocks (BB) in the N instruction segments; the mean column vector B is a column vector formed by the mean of each row of the feature matrix A; and the weight vector W is a column vector formed by the weight that the BB type corresponding to each row of the feature matrix A carries in the feature index of the test program.
Specifically, in the embodiment of the present application, the characteristic index may be any one of the following: instruction Per Clock (IPC), predicted branch success/failure, L2Cache hit, energy consumption, etc.
203, determining an optimized column vector according to the feature matrix A, the column vector B, the weight vector W and an optimization objective.
The optimization objective comprises reducing the number of instruction segments used for simulation and/or reducing the error between the evaluated characteristic index and the actual characteristic index of the test program, and the optimized column vector is used for representing the instruction segments used for simulation among the N instruction segments and the weights corresponding to those instruction segments.
Specifically, for example, an objective optimization model may be established according to the feature matrix A, the column vector B, the weight vector W and the optimization objective; the model expresses the optimization objective through a constraint relationship among A, B, W and the optimization variable X, and the optimized column vector is a value of X that satisfies this constraint relationship.
204, determining a number vector and a coefficient vector of the instruction segments corresponding to the optimized column vector.
The number vector is used for representing the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used for representing the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector.
205, the instruction segments corresponding to the number vectors are simulated in the simulation device to obtain the feature index vectors of the instruction segments corresponding to the number vectors.
The feature indicator vector is used to indicate the feature indicator of the instruction segment corresponding to the number vector.
206, determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
In the embodiment of the application, an optimized column vector is determined based on a feature matrix, a column vector, a weight vector and an optimized target of an instruction segment of a test program, the instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in the simulation device to obtain the feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program can be reduced.
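Steps 204 to 206 can be sketched as follows. This is an illustration under stated assumptions rather than the claimed implementation: segment numbers are taken to be 0-based positions in X, the names select_segments and estimate_index are hypothetical, and the dictionary fake_ipc stands in for the simulation device with made-up IPC values.

```python
def select_segments(x):
    """Step 204: number vector (positions of non-zero entries) and coefficient vector."""
    numbers = [j for j, v in enumerate(x) if v != 0]
    coeffs = [x[j] for j in numbers]
    return numbers, coeffs

def estimate_index(x, simulate):
    """Steps 205 and 206: simulate only the selected segments, then take the inner product."""
    numbers, coeffs = select_segments(x)
    indices = [simulate(j) for j in numbers]             # feature index vector
    return sum(c * v for c, v in zip(coeffs, indices))   # inner product

# Hypothetical per-segment IPC values standing in for a simulator.
fake_ipc = {0: 1.2, 1: 0.8, 2: 1.5, 3: 1.0}
x = [0.5, 0.0, 0.5, 0.0]
print(select_segments(x))
print(estimate_index(x, fake_ipc.get))
```

Only the two segments with non-zero coefficients are passed to the simulator, which is where the saving in simulation time comes from.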
Optionally, as an embodiment, the optimization objective is: when the first norm of W·AX-W·B or the second norm of AX-B is smaller than or equal to a first threshold, minimizing a third norm of X, where X is an optimization variable used for representing the instruction segments used for simulation among the N instruction segments and the weight corresponding to each instruction segment, and the optimized column vector is the value of X that meets the optimization objective.
It should be understood that, in the embodiments of the present application, the first norm, the second norm, the third norm and so on may each be any one of the following norms: 0 norm, 1 norm, [...] norm, p norm, or nuclear norm, where p is a positive integer. The first to eighth norms mentioned elsewhere in this application may likewise be any of these norms. Taking X as an example, the 0 norm of X is written $\|X\|_0$, the 1 norm of X is written $\|X\|_1$, the nuclear norm of X is written $\|X\|_*$, and the p norm of X is written $\|X\|_p$, where p is a positive integer.
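For a column vector such as X, three of the norms listed above can be computed as in the following sketch. The function names are illustrative assumptions; the nuclear norm is omitted because it requires a singular value decomposition.

```python
def norm0(x):
    """0 norm: the number of non-zero entries."""
    return sum(1 for v in x if v != 0)

def norm1(x):
    """1 norm: the sum of absolute values."""
    return sum(abs(v) for v in x)

def norm_p(x, p):
    """p norm for a positive integer p."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

x = [0.0, 3.0, -4.0, 0.0]
print(norm0(x), norm1(x), norm_p(x, 2))
```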
Specifically, step 203 may determine the optimized column vector by equation 1:
$$\min_X \ \|X\|_{t3} \quad \text{s.t.} \quad \|W \cdot AX - W \cdot B\|_{t1} \le c \ \text{ or } \ \|AX - B\|_{t2} \le c \qquad (1)$$
where $\|W \cdot AX - W \cdot B\|_{t1}$ represents the first norm of W·AX-W·B, $\|AX - B\|_{t2}$ represents the second norm of AX-B, $\|X\|_{t3}$ represents the third norm of X, and c represents the first threshold. It is understood that the operator · represents the inner product, e.g., $[x_1, x_2, x_3] \cdot [y_1, y_2, y_3] = x_1 y_1 + x_2 y_2 + x_3 y_3$.
Without loss of generality, assume that specific norms are chosen for the first norm and the second norm in equation 1 and that the third norm of X is the 0 norm $\|X\|_0$ of X. Equation 1 can then be written as equation 2:
The method of this embodiment can reduce the number of instruction fragments required for the simulation work of the test program while guaranteeing the error precision of the simulated index.
Optionally, as another embodiment, the optimization objective is: when a first norm of X is less than or equal to a second threshold, minimizing a sum of a first product and a second product, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number, where X is an optimization variable used to represent instruction fragments for performing simulation among the N instruction fragments and a weight corresponding to each instruction fragment, and the optimization column vector is a value of X that satisfies the optimization goal.
Specifically, step 203 may determine the optimized column vector by equation 3:
$$\min_X \ \lambda \|W \cdot AX - W \cdot B\|_{t2} + \delta \|X\|_{t3} \quad \text{s.t.} \quad \|X\|_{t1} \le d \qquad (3)$$
where d represents the second threshold, $\|X\|_{t1}$ represents the first norm of X, $\|W \cdot AX - W \cdot B\|_{t2}$ represents the second norm of W·AX-W·B, $\|X\|_{t3}$ represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, and λ ≠ 0.
It should be understood that, in the embodiments of the present application, the first norm of X and the third norm of X may each be any one of the following norms: the 0 norm $\|X\|_0$ of X, the 1 norm $\|X\|_1$ of X, the [...] norm of X, or the p norm $\|X\|_p$ of X, where p is a positive integer. The first norm of X and the third norm of X may be the same or different.
It should be understood that, in the embodiments of the present application, the second norm of W·AX-W·B may be any one of the following norms: the 0 norm $\|W \cdot AX - W \cdot B\|_0$, the 1 norm $\|W \cdot AX - W \cdot B\|_1$, the [...] norm of W·AX-W·B, or the p norm $\|W \cdot AX - W \cdot B\|_p$, where p is a positive integer.
It should be understood that at least 1 of the first coefficient and the second coefficient in the embodiments of the present application is not 0. For example, λ ≠ 0.
Without loss of generality, assume that in equation 3 the first norm of X, $\|X\|_{t1}$, is the 0 norm $\|X\|_0$ of X, the third norm of X, $\|X\|_{t3}$, is the 1 norm $\|X\|_1$ of X, the second norm of W·AX-W·B is the 2 norm $\|W \cdot AX - W \cdot B\|_2$, and the values of the first coefficient and the second coefficient are both 1. Equation 3 can then be written as equation 4:
$$\min_X \ \|W \cdot AX - W \cdot B\|_2 + \|X\|_1 \quad \text{s.t.} \quad \|X\|_0 \le d \qquad (4)$$
It should be understood that, in the embodiments of the present application, the values of the first coefficient and the second coefficient are determined by the optimization objective: they determine whether the optimization favors fewer instruction fragments for simulation or a smaller simulation error.
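To make the trade-off concrete, the following sketch evaluates the objective and the constraint of the model in which both coefficients are 1, the error term uses the 2 norm, and X is penalized by its 1 norm and limited by its 0 norm. It assumes that W is applied row by row to the residual AX-B (the form used with FIG. 4); under a strict inner-product reading the error term would be a scalar instead. The function names and the small matrices are hypothetical.

```python
import math

def mat_vec(a, x):
    """Multiply an M x N matrix (list of rows) by a length-N vector."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def objective_eq4(a, b, w, x):
    """2 norm of the weighted residual plus the 1 norm of X."""
    ax = mat_vec(a, x)
    # Assumption: W scales each row of the residual AX - B.
    weighted = [wi * (axi - bi) for wi, axi, bi in zip(w, ax, b)]
    return math.sqrt(sum(r * r for r in weighted)) + sum(abs(v) for v in x)

def feasible_eq4(x, d):
    """Constraint: at most d non-zero entries in X."""
    return sum(1 for v in x if v != 0) <= d

A = [[1.0, 3.0], [2.0, 2.0]]   # hypothetical 2 BB types x 2 fragments
B = [2.0, 2.0]                 # row means of A
W = [1.0, 1.0]
print(objective_eq4(A, B, W, [0.5, 0.5]), feasible_eq4([0.5, 0.5], 1))
print(objective_eq4(A, B, W, [1.0, 0.0]), feasible_eq4([1.0, 0.0], 1))
```

In this toy case the two-fragment solution reproduces B exactly but violates a limit of one fragment, while the one-fragment solution satisfies the limit at the cost of a larger error term.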
Optionally, as another embodiment, the optimization objective is: minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, where the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [...], the third product is the product of a third coefficient and a third norm of [...], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S; A = L + S, L being a low rank matrix and S being a sparse matrix; [...] represents a column vector formed by the average values of the rows of the low rank matrix L, and [...] represents a column vector formed by the average values of the rows of the sparse matrix S; X is an optimization variable used for representing the instruction segments used for simulation among the N instruction segments and the weight corresponding to each instruction segment; and the optimized column vector is a value of X that meets the optimization objective.
Specifically, step 203 may specifically determine the optimized column vector by equation 5:
where $\|X\|_{t1}$ represents the first norm of X, [...] represents the second norm of [...], [...] represents the third norm of [...], $\|L\|_{t4}$ represents the fourth norm of L, $\|S\|_{t5}$ represents the fifth norm of S, λ represents the first coefficient, δ represents the second coefficient, α represents the third coefficient, β represents the fourth coefficient, and γ represents the fifth coefficient.
It should be understood that at least one of the first coefficient, the second coefficient, the third coefficient, the fourth coefficient, and the fifth coefficient in the embodiments of the present application is not 0.
In particular, when the first norm of X is the 1 norm, the second norm of [...] is the 2 norm, the third norm of [...] is the 2 norm, the fourth norm of L is the 1 norm, and the fifth norm of S is the 1 norm, equation 5 can be expressed as equation 6:
Optionally, as an embodiment, the optimization objective is: when the sixth norm of W∘AX-W∘B or the seventh norm of AX-B is less than or equal to a first threshold, and/or the eighth norm of X is less than or equal to a second threshold, minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, where the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [...], the third product is the product of a third coefficient and a third norm of [...], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S; A = L + S, L being a low rank matrix and S being a sparse matrix; [...] represents a column vector formed by the average values of the rows of the low rank matrix L, and [...] represents a column vector formed by the average values of the rows of the sparse matrix S; X is an optimization variable used for representing the instruction segments used for simulation among the N instruction segments and the weight corresponding to each instruction segment; and the optimized column vector is a value of X that meets the optimization objective.
When the constraint condition of the optimization objective is that the sixth norm of W · AX-W · B or the seventh norm of AX-B is less than or equal to the first threshold, step 203 may determine the optimized column vector by equation 7:
where c represents the first threshold, $\|W \circ AX - W \circ B\|_{t6}$ represents the sixth norm of W∘AX-W∘B, $\|AX - B\|_{t7}$ represents the seventh norm of AX-B, and the remaining parameters have the same meanings as the corresponding parameters in equation 5.
When the constraint condition of the optimization objective is that the sixth norm of X is less than or equal to the second threshold, step 203 may determine the optimized column vector by equation 8:
where d represents the second threshold, $\|X\|_{t6}$ represents the sixth norm of X, and the remaining parameters have the same meanings as the corresponding parameters in equation 5.
When the constraint of the optimization objective is that the sixth norm of W · AX-W · B or the seventh norm of AX-B is less than or equal to the first threshold and the eighth norm of X is less than or equal to the second threshold, step 203 may determine the optimized column vector by equation 9:
where c represents the first threshold, d represents the second threshold, $\|X\|_{t8}$ represents the eighth norm of X, and the remaining parameters have the same meanings as the corresponding parameters in equation 5.
Specifically, the detailed process of step 203 may be as follows:
Step 1: establishing a target optimization model of the optimization target according to the feature matrix A, the column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f(X, Y, Z, μ) of the target optimization model, wherein X is an optimization variable used for representing the instruction segments used for simulation among the N instruction segments and the weight corresponding to each instruction segment, μ is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;
Step 2: setting initial values of X, Y, Z and μ, and setting the optimization target as a convergence condition of the optimization completion;
Step 3: calculating the optimal value of Y by using a least squares method;
Step 4: calculating the optimal value of X by using a soft threshold method;
Step 5: when the value of X does not satisfy the constraint condition of the target optimization model, reducing one non-zero element in X and replacing X, until X satisfies the constraint condition of the target optimization model;
Step 6: updating the Lagrangian multiplier Z according to the residual of the equation X = Y, and increasing the penalty coefficient μ by a fixed multiple ρ, where ρ > 1;
Step 7: judging whether the convergence condition is met; if so, determining that X is solved, and if not, repeating steps 3 to 7.
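Steps 4 and 6 can be illustrated with the following sketch. It assumes the common augmented Lagrangian form in which the X-dependent part of f(X, Y, Z, μ) is δ‖X‖₁ + ⟨Z, X - Y⟩ + (μ/2)‖X - Y‖²; this form, the sign convention for Z, and the function names are assumptions made for illustration. The least squares update of Y in step 3 is omitted.

```python
def soft_threshold(v, tau):
    """Shrink each entry of v toward zero by tau; entries within [-tau, tau] become 0."""
    out = []
    for vi in v:
        if vi > tau:
            out.append(vi - tau)
        elif vi < -tau:
            out.append(vi + tau)
        else:
            out.append(0.0)
    return out

def update_x(y, z, mu, delta=1.0):
    """Step 4 under the assumed Lagrangian: soft-threshold Y - Z/mu at delta/mu."""
    shifted = [yi - zi / mu for yi, zi in zip(y, z)]
    return soft_threshold(shifted, delta / mu)

def update_multiplier(x, y, z, mu, rho):
    """Step 6: move Z along the residual of X = Y and scale mu by rho > 1."""
    z_new = [zi + mu * (xi - yi) for xi, yi, zi in zip(x, y, z)]
    return z_new, mu * rho

y = [1.0, 0.25]
z = [0.0, 0.0]
x = update_x(y, z, mu=2.0)
print(x)
print(update_multiplier(x, y, z, mu=2.0, rho=1.5))
```

The soft threshold is what drives small entries of X exactly to zero, so fewer instruction fragments remain selected as the iterations proceed.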
In a specific implementation manner, reducing one non-zero element in X and replacing X in step 5 may be implemented as follows: on the premise that the number of non-zero elements in X is reduced by one, if the X obtained by removing a first non-zero element minimizes the function value of the target optimization model, the first non-zero element is removed and X is replaced with the result. Of course, it should be understood that step 5 may have other specific implementations, for example, removing the non-zero elements of X one at a time in ascending or descending order of value until X satisfies the constraint condition of the target optimization model, and so on.
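A minimal sketch of step 5, assuming the constraint is a limit d on the number of non-zero elements of X and that each removal is chosen to give the smallest value of the model's objective function. The function name prune_to_constraint and the toy objective dist are hypothetical.

```python
def prune_to_constraint(x, d, objective):
    """Zero out one non-zero element at a time until X has at most d non-zeros.

    At each pass every single-element removal is tried, and the candidate
    with the smallest objective value replaces X.
    """
    x = list(x)
    while sum(1 for v in x if v != 0) > d:
        best_val, best_x = None, None
        for j, v in enumerate(x):
            if v == 0:
                continue
            cand = x[:j] + [0.0] + x[j + 1:]
            val = objective(cand)
            if best_val is None or val < best_val:
                best_val, best_x = val, cand
        x = best_x
    return x

# Hypothetical objective: squared distance to a reference vector.
target = [1.0, 0.1, 2.0]
def dist(c):
    return sum((ci - ti) ** 2 for ci, ti in zip(c, target))

print(prune_to_constraint([1.0, 0.1, 2.0], 2, dist))
print(prune_to_constraint([1.0, 0.1, 2.0], 1, dist))
```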
In order to facilitate understanding of the method of the embodiments of the present application, the method of the embodiments of the present application will be further described with reference to specific embodiments.
FIG. 3 is a flowchart illustrating evaluation of metrics by a processor according to an embodiment of the present application. It should be understood that in the embodiment of the present application, the BBV is used as the feature index vector of the test program. The method of the embodiment of the present application is executed by an index evaluation apparatus of a processor disposed on a host device, where the apparatus may be a simulation system, an execution chip, or the like; the embodiment of the present application is not limited in this respect.
301, the instruction stream is fetched.
In one particular example, the host device may use an instruction stream fetching software tool to fetch an instruction stream of the test program, the instruction stream being composed of instructions. The host device may segment the instruction stream into BBs according to the jump instruction, and thus the instruction stream may be viewed as being composed of BB sequences.
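The segmentation into BBs can be sketched as follows, assuming each instruction is available as a (program pointer, operation type) pair and that an operation type of "control" marks a jump instruction that ends the current BB. The tuple layout, the type labels, and the sample addresses are illustrative assumptions.

```python
def split_into_basic_blocks(instructions):
    """Cut the stream after every control instruction; return lists of program pointers."""
    blocks, current = [], []
    for pc, op_type in instructions:
        current.append(pc)
        if op_type == "control":
            blocks.append(current)
            current = []
    if current:
        # Trailing instructions that are not closed by a control instruction.
        blocks.append(current)
    return blocks

# Hypothetical instruction stream.
stream = [("0x10", "alu"), ("0x14", "read"), ("0x18", "control"),
          ("0x20", "alu"), ("0x24", "control"), ("0x28", "write")]
print(split_into_basic_blocks(stream))
```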
302, the instruction stream is partitioned into instruction fragments, and the BBV of each instruction fragment is calculated.
The host device may divide the instruction stream obtained in step 301 into a plurality of instruction fragments (intervals). For example, the instruction stream may be divided into instruction fragments of equal length, and the number and entry address of each instruction fragment may be recorded. Specifically, for example, each instruction fragment contains 1 million BBs. Of course, the division may also be performed with other fragment lengths, for example, 200 million BBs or 1 billion BBs; the embodiment of the present application is not limited in this respect.
Without loss of generality, assume that the number of divided instruction fragments is N.
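The division in step 302 can be sketched as below. The stream is modelled as a list of BB entry addresses, and a trailing fragment shorter than the chosen length is kept; both choices, like the function name, are assumptions made for illustration.

```python
def split_into_fragments(bb_entries, length):
    """Return (number, entry_address, entries) for consecutive fragments of 'length' BBs."""
    fragments = []
    for start in range(0, len(bb_entries), length):
        chunk = bb_entries[start:start + length]
        # The fragment number is its position; the entry address is its first BB.
        fragments.append((len(fragments), chunk[0], chunk))
    return fragments

entries = list(range(10))   # stand-in for 10 BB entry addresses
for frag in split_into_fragments(entries, 4):
    print(frag)
```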
303, constructing matrices A, B, W.
The host device may construct the feature matrix A according to the BBV of each instruction segment after the instruction stream is split, where each column of A represents the BBV of one instruction fragment.
The host device may calculate the column vector B based on the average value of each row in the feature matrix A. Specifically, the value of each row in B can be expressed by the following equation 10:
$$B_i = \frac{1}{N} \sum_{j=1}^{N} A_{i,j} \qquad (10)$$
where 1 ≤ i ≤ M, M represents the number of BB types in the instruction segments, and N represents the number of instruction fragments.
In addition, a weight vector W may be set according to the differing weights of the BB types in the characteristic index of the test program. W is a column vector, and each value of W represents the weight of the corresponding type of BB in the characteristic index of the test program.
FIG. 4 is a schematic diagram of a target optimization model according to an embodiment of the present application. As shown in FIG. 4, the rows of the feature matrix A correspond to the types of BB and the columns correspond to the divided instruction fragments; A is a matrix of M rows × N columns, and the element $A_{i,j}$ of A represents the characteristic index value of the i-th BB in the j-th instruction segment. The sparse solution matrix X is the optimization variable X of the embodiment shown in FIG. 2 and is a matrix of N rows × 1 column: the position of a non-zero coefficient in X indicates the number of a selected instruction segment, and the value of the non-zero coefficient indicates the weight that the characteristic index of that instruction segment carries in the finally calculated characteristic index. The mean matrix B of A is a column vector formed by the mean of each row of A; it is a matrix of M rows × 1 column, calculated as shown in equation 10. The weight matrix W is a matrix of M rows × 1 column, and its i-th element $W_i$ represents the weight of the feature vector of the i-th BB in the finally calculated characteristic index. As shown in FIG. 4, the value of X satisfies AX∘W = B∘W. It is understood that AX∘W and W∘AX are equivalent, and B∘W and W∘B are equivalent.
304, an optimized column vector is determined based on A, B, W and the optimization objective.
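The shapes described for FIG. 4 can be checked with a small sketch that places the BBV of each fragment in one column of A and takes B as the mean of each row. The function names and the sample BBVs are hypothetical.

```python
def build_feature_matrix(bbvs):
    """bbvs holds one BBV (length M) per fragment; the result has M rows and N columns."""
    m = len(bbvs[0])
    return [[bbv[i] for bbv in bbvs] for i in range(m)]

def row_means(a):
    """Column vector B: the mean of each row of A."""
    return [sum(row) / len(row) for row in a]

bbvs = [[1, 20, 0], [3, 10, 6]]   # N = 2 fragments, M = 3 BB types
A = build_feature_matrix(bbvs)
B = row_means(A)
print(A)
print(B)
```

With M = 3 and N = 2, A has three rows and two columns and B has three entries, so AX is defined for an X with N = 2 entries and can be compared with B row by row.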
Specifically, an objective optimization model can be constructed according to A, B, W and the optimization objective, and the optimization column vector of the objective optimization model is solved.
An optimization variable X can be introduced, wherein a non-zero element in X represents a selected instruction segment, and the value of the non-zero element in X represents the value of the weight corresponding to the instruction segment. That is, X is used to represent the instruction segment for simulation among the N instruction segments of the test program and the weight corresponding to each instruction segment.
It should be understood that in practical applications there will be a certain error between W·AX and W·B, and the smaller this error, the smaller the simulation error; in addition, the smaller the number of non-zero coefficients in X, the fewer instruction fragments need to be selected for the simulation test, and the shorter the time required for the simulation test. Therefore, an objective optimization model can be constructed based on A, B and W according to the optimization objective.
It should be appreciated that if the cost of the number of instruction fragments is high, then a higher weight may be placed on the optimization objective that affects the number of X non-zero coefficients. At this point, the optimization objective may include limiting the number of non-zero coefficients in X.
Specifically, for example, the number of non-zero coefficients in X is minimized. This can be achieved by minimizing a norm of X. The norm of X may be the 0 norm $\|X\|_0$ of X, the 1 norm $\|X\|_1$ of X, the [...] norm of X, or the p norm $\|X\|_p$ of X, where p is a positive integer.
Or, for example, the number of non-zero coefficients in X is made no greater than c. In this case, the constraint $\|X\|_0 \le c$, $c \in N^+$, may be imposed.
Or, for example, the 1 norm of X is made no greater than d. In this case, the constraint $\|X\|_1 \le d$, $d \in R^+$, may be imposed. Constraining the 1 norm of X to stay below a given value in turn reduces the number of non-zero elements in X.
Of course, other constraint formulas may exist, and the embodiments of the present application are not listed here.
It should be appreciated that if certain simulation error accuracy is required, higher weights may be placed on the optimization objectives affecting W.AX-W.B.
Specifically, for example, the value of W · AX-W · B is minimized, or W · AX-W · B is smaller than d, or the norm of W · AX-W · B is smaller than d, or the like.
Based on the difference of the optimization objectives, different objective optimization models can be set.
For example, assume that the optimization goal of the target optimization model is to minimize the third norm of X when either the first norm of W.AX-W.B or the second norm of AX-B is less than or equal to the first threshold. At this time, the objective optimization model may be expressed by equation 1 above.
For another example, assume that the optimization goal of the objective optimization model is to minimize the sum of a first product and a second product when a first norm of X is smaller than or equal to a second threshold, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number. At this time, the objective optimization model may be expressed by equation 3 above.
For another example, assume that the optimization objective of the objective optimization model is to minimize the sum of a first product, a second product, a third product, a fourth product and a fifth product, where the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [...], the third product is the product of a third coefficient and a third norm of [...], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S; A = L + S, L being a low rank matrix and S being a sparse matrix; [...] represents a column vector formed by the average values of the rows of the low rank matrix L, and [...] represents a column vector formed by the average values of the rows of the sparse matrix S. At this time, the target optimization model can be expressed by the above equation 5.
For another example, assume that the optimization goal of the objective optimization model is to minimize the sum of a first product, a second product, a third product, a fourth product and a fifth product when the sixth norm of W·AX-W·B or the seventh norm of AX-B is less than or equal to a first threshold and/or the eighth norm of X is less than or equal to a second threshold, where the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [...], the third product is the product of a third coefficient and a third norm of [...], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S; A = L + S, L being a low rank matrix and S being a sparse matrix; [...] represents a column vector formed by the average values of the rows of the low rank matrix L, and [...] represents a column vector formed by the average values of the rows of the sparse matrix S. At this time, the target optimization model can be expressed by the above equations 7, 8, 9, and the like.
Without loss of generality, assume that the target optimization model of the embodiment of the present application is as shown in formula 4. A method for establishing the target optimization model and solving for the optimized column vector is shown in Fig. 5. Fig. 5 is a flowchart of a method for determining an optimized column vector according to an embodiment of the present application.
S501, establishing a target optimization model, and determining a Lagrangian function f(X, Y, Z, μ) according to the target optimization model.
Specifically, a target optimization model can be established based on A, B, W and the optimization objective. The target optimization model represents the relationship among A, B, W and the optimization variable X under the optimization objective, and the value of X that satisfies the target optimization model is the optimized column vector.
To solve for the optimized column vector, the Lagrangian function f(X, Y, Z, μ) may be determined. Specifically, a relaxation variable Y is introduced and, using the Lagrange multiplier method, a Lagrange multiplier Z is introduced; μ is a penalty parameter whose value is a positive real number. The variable X of the Lagrangian function is the optimization variable X of the target optimization model.
S502, setting initial values of the variables X, Y, Z and μ in the Lagrangian function, and setting a convergence condition for completing the optimization.
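The text does not spell out f(X, Y, Z, μ). As an illustration only, one common augmented Lagrangian form for a model of the formula 4 type is written below. It assumes that the constraint X = Y (the equation referred to in step S506) links the relaxation variable to the optimization variable, that Y carries the residual term (solved by least squares in S503) and that X carries the sparsity term (solved by soft thresholding in S504). The assignment of terms to X and Y and the squared penalty are assumptions, not statements taken from the source.

```latex
f(X, Y, Z, \mu) =
  \lambda \,\| W \cdot A Y - W \cdot B \|_{t2}
  + \delta \,\| X \|_{t3}
  + \langle Z,\; X - Y \rangle
  + \frac{\mu}{2}\, \| X - Y \|_2^2 ,
\qquad \mu > 0 .
```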
For example, the convergence condition may be that the benefit of a further optimization cycle falls below a predetermined threshold. Specifically, taking the target optimization model shown in formula 4 as an example, if the value of $\|W \cdot AX - W \cdot B\|_2 + \|X\|_1$ obtained in the current cycle has decreased, relative to the value obtained in the previous cycle, by less than a predetermined threshold V, the optimization is deemed complete.
For another example, a maximum number of loop iterations may be set, so that the optimization is deemed complete when the loop count reaches this maximum.
Of course, it should be understood that a plurality of conditions may be set, with the optimization deemed complete as soon as any one of them is satisfied. For example, the convergence condition may be that the value of $\|W \cdot AX - W \cdot B\|_2 + \|X\|_1$ has decreased by less than the predetermined threshold V relative to the previous cycle, or that the number of cycles has reached the maximum number, and so on.
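A minimal sketch of the combined stopping rule described above, in Python. The function and parameter names are hypothetical and chosen only for illustration; the objective values are assumed to be computed elsewhere in the loop.

```python
def objective_decrease_converged(prev_value, curr_value, threshold_v):
    """True when the objective dropped by less than threshold_v since the last cycle."""
    return (prev_value - curr_value) < threshold_v


def optimization_finished(prev_value, curr_value, threshold_v,
                          iteration, max_iterations):
    """Stop as soon as either condition holds (hypothetical helper)."""
    return (
        objective_decrease_converged(prev_value, curr_value, threshold_v)
        or iteration >= max_iterations
    )
```

Either test alone is sufficient to end the loop, which matches the "any one of them" wording of the text.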
S503, an optimal value of Y is calculated using the least square method.
For a specific implementation of calculating Y by using the least square method, reference may be made to the prior art, and details of the embodiment of the present application are not repeated herein.
S504, the optimal value of X is calculated by using a soft threshold method.
For a specific implementation of calculating X by using the soft threshold method, reference may be made to the prior art, and details of the embodiment of the present application are not described herein again.
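The text defers to the prior art for the soft threshold method. For reference, the operator usually meant by that name is sign(v)·max(|v| − τ, 0), applied to each entry. The sketch below shows only that operator; the choice of the threshold τ (for example, a ratio of the second coefficient to μ) depends on the model and is an assumption left to the caller.

```python
def soft_threshold(values, tau):
    """Entry-wise soft thresholding: shrink each entry toward zero by tau."""
    result = []
    for v in values:
        if v > tau:
            result.append(v - tau)
        elif v < -tau:
            result.append(v + tau)
        else:
            # Entries with magnitude at most tau become exactly zero,
            # which is what makes the resulting X sparse.
            result.append(0.0)
    return result
```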
S505, when the value of X does not satisfy the constraint condition of the target optimization model, removing one non-zero element from X (setting it to zero) and replacing X with the result, until X satisfies the constraint condition of the target optimization model.
It should be understood that step S505 has different implementations according to different target optimization models.
For example, assuming that the target optimization model is the expression shown in formula 4, in step S505 the number of non-zero elements of X may be reduced to no more than d.
There are many ways to reduce non-zero elements in X.
Preferably, when one non-zero element is to be removed from X, the element chosen is the one whose removal yields the X that minimizes the function value of the target optimization model; this first non-zero element is removed and the result replaces X.
For example, for the target optimization model of formula 4, the number of non-zero elements of X must be reduced to no more than d while keeping the value of $\|W \cdot AX - W \cdot B\|_2 + \|X\|_1$ as small as possible. Each time one non-zero element is to be removed, the values of $\|W \cdot AX - W \cdot B\|_2 + \|X\|_1$ obtained after removing each candidate non-zero element can be compared, and the element whose removal gives the minimum value is removed; this is repeated until the number of non-zero elements does not exceed d.
Of course, the non-zero elements may instead be removed in order of their values in X, from large to small or from small to large. This approach is simple to compute, because it does not require repeatedly comparing which removal gives the smaller function value of the target optimization model.
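The two reduction strategies above can be sketched as follows. This is an illustration under stated assumptions: A is an M×N list of rows, B and W are length-M lists, W is applied element-wise to the rows of AX − B, and "removing" an element means setting it to zero. All function names are hypothetical. The second function implements only one of the orderings mentioned (dropping the smallest magnitudes first).

```python
import math


def objective(A, B, W, X):
    """||W*(AX - B)||_2 + ||X||_1, with W applied element-wise (an assumption)."""
    residual_sq = 0.0
    for row, b_i, w_i in zip(A, B, W):
        ax_i = sum(a_ij * x_j for a_ij, x_j in zip(row, X))
        residual_sq += (w_i * (ax_i - b_i)) ** 2
    return math.sqrt(residual_sq) + sum(abs(x) for x in X)


def prune_greedy(A, B, W, X, d):
    """Zero one entry at a time, each time choosing the entry whose removal
    gives the smallest objective, until at most d entries are non-zero."""
    X = list(X)
    while sum(1 for x in X if x != 0) > d:
        best_index, best_value = None, None
        for j, x_j in enumerate(X):
            if x_j == 0:
                continue
            candidate = X[:j] + [0.0] + X[j + 1:]
            value = objective(A, B, W, candidate)
            if best_value is None or value < best_value:
                best_index, best_value = j, value
        X[best_index] = 0.0
    return X


def prune_by_magnitude(X, d):
    """Cheaper alternative: keep only the d entries of largest magnitude."""
    order = sorted(range(len(X)), key=lambda j: abs(X[j]), reverse=True)
    keep = set(order[:d])
    return [x if j in keep else 0.0 for j, x in enumerate(X)]
```

The greedy variant evaluates the objective once per remaining non-zero element in every round, whereas the magnitude-based variant needs a single sort, which is the trade-off the text describes.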
S506, updating the Lagrange multiplier Z according to the residual of the equation X = Y, and increasing the penalty parameter μ by a fixed multiple ρ, where ρ > 1.
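A minimal sketch of step S506, assuming the usual multiplier update Z ← Z + μ(X − Y) followed by μ ← ρμ. The sign convention of the residual and the function name are assumptions; the text only states that Z is updated from the residual of X = Y and that μ grows by a fixed multiple.

```python
def update_multiplier_and_penalty(X, Y, Z, mu, rho):
    """Return the updated (Z, mu): Z <- Z + mu * (X - Y), mu <- rho * mu."""
    if rho <= 1:
        raise ValueError("rho must be greater than 1")
    residual = [x - y for x, y in zip(X, Y)]  # residual of the equation X = Y
    new_Z = [z + mu * r for z, r in zip(Z, residual)]
    return new_Z, rho * mu
```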
S507, judging whether the convergence condition of the Lagrangian function f(X, Y, Z, μ) is satisfied.
If the convergence condition of the Lagrangian function f(X, Y, Z, μ) is satisfied, the current value of X is output and execution ends;
if the convergence condition of the Lagrangian function f(X, Y, Z, μ) is not satisfied, execution returns to step S503.
Steps S503 to S507 are executed in a loop until the convergence condition of the Lagrangian function f(X, Y, Z, μ) is satisfied.
Of course, it should be understood that there may be other methods of solving the objective optimization model, and the embodiments of the present application are not limited thereto.
305, determining the number vector and the coefficient vector of the instruction segments corresponding to the optimized column vector.
Without loss of generality, denote the optimized column vector determined in step 304 as X_0.
It is understood that the numbers of the instruction segments corresponding to the non-zero positions of X_0 are recorded in a number vector D, and the non-zero coefficients of X_0 are recorded in a coefficient vector G. The number vector D represents the numbers of the instruction segments corresponding to the non-zero positions in the optimized column vector, and the coefficient vector G represents the weights of the characteristic indexes of the instruction segments corresponding to the non-zero positions in the optimized column vector.
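Extracting D and G from the optimized column vector can be sketched as below. The function name is hypothetical, and segment numbers are taken to be 0-based list indices, which is an assumption; the text does not say how segments are numbered.

```python
def split_optimized_vector(x0):
    """Return (D, G): positions of the non-zero entries of x0 and their values."""
    D = [j for j, value in enumerate(x0) if value != 0]
    G = [x0[j] for j in D]
    return D, G
```

For example, an optimized column vector `[0.0, 0.3, 0.0, 0.7]` selects segments 1 and 3 for simulation, with weights 0.3 and 0.7.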
306, simulating, in the simulation device, each of the instruction segments corresponding to the number vector D, to obtain the characteristic index vector of the instruction segments corresponding to the number vector D.
According to the number vector D, the simulation test system uses a simulation tool to simulate the instruction segments corresponding to the number vector D and obtains the characteristic index of each instruction segment; the characteristic indexes of the instruction segments corresponding to the number vector D form a characteristic index vector C.
The simulation tool may be a software or hardware emulator, such as LiveSP, checkpoint, or the like. The characteristic indexes of the instruction segment obtained by the simulation tool can be IPC, predicted branch success/failure rate, energy consumption and the like.
307, determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
From the coefficient vector G of step 305 and the characteristic index vector C of step 306, the characteristic index C · G for evaluating the test program can be obtained.
At this point, the index evaluation of the test program is complete.
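The final step is a plain inner product of the simulated characteristic index vector C with the coefficient vector of step 305. A minimal sketch, with a hypothetical function name and made-up numbers used purely for illustration:

```python
def estimate_feature_index(C, G):
    """Inner product of the simulated feature indexes C and the weights G."""
    if len(C) != len(G):
        raise ValueError("C and G must have one entry per simulated segment")
    return sum(c * g for c, g in zip(C, G))


# Illustrative values only: two simulated segments with IPC 1.2 and 0.8,
# weighted 0.25 and 0.75.
estimated_ipc = estimate_feature_index([1.2, 0.8], [0.25, 0.75])
```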
Fig. 6 is a schematic structural diagram of an index evaluation device 600 according to an embodiment of the present application. As shown in fig. 6, the index evaluation device 600 may include:
an obtaining unit 601, configured to obtain an instruction stream of a test program and divide the instruction stream into N instruction fragments with equal lengths;
a first determining unit 602, configured to determine a feature matrix a, a mean column vector B, and a weight vector W, where the feature matrix a is a matrix with M rows and N columns, and is used to describe feature indicator vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by a mean value of the feature indicator vectors in each row of the feature matrix a, and the weight vector W is a column vector formed by weights occupied by feature indicators of basic blocks BB of each row in the feature matrix a in a test program;
a second determining unit 603, configured to determine an optimized column vector according to the feature matrix A, the column vector B, the weight vector W, and an optimization goal, where the optimization goal includes reducing the number of instruction fragments used for simulation and/or reducing an error between a feature index for evaluating the test program and the feature index of the test program, and the optimized column vector is used to represent instruction fragments used for simulation in the N instruction fragments and a weight corresponding to each instruction fragment;
a third determining unit 604, configured to determine a number vector and a coefficient vector of the instruction fragment corresponding to the optimized column vector, where the number vector is used to indicate a number of the instruction fragment corresponding to a non-zero position in the optimized column vector, and the coefficient vector is used to indicate a weight of a feature indicator of the instruction fragment corresponding to the non-zero position in the optimized column vector;
a simulation unit 605, configured to simulate the instruction segments corresponding to the number vector in a simulation apparatus, respectively, to obtain a feature indicator vector of the instruction segment corresponding to the number vector, where the feature indicator vector is used to indicate a feature indicator of the instruction segment corresponding to the number vector;
a fourth determining unit 606, configured to determine an inner product of the feature indicator vector and the coefficient vector as a feature indicator for evaluating the test program.
In the embodiment of the application, a target optimization model of a test program is constructed based on a feature matrix, a column vector and a weight vector of an instruction segment of the test program, an optimized column vector is determined based on the target optimization model and an optimized target, an instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in a simulation device to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program are reduced.
The index evaluation device 600 may also execute the method of fig. 2 and implement the functions of the index evaluation device in the embodiments shown in fig. 2, fig. 3, and fig. 5, which are not described herein again in this embodiment of the present application.
Fig. 7 is a schematic structural diagram of an index evaluation apparatus 700 according to an embodiment of the present application. The index evaluation apparatus 700 may include a channel interface 701, a processor 702, and a memory 703.
The channel interface 701, the processor 702 and the memory 703 are interconnected by a bus system 704. The bus 704 may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in Fig. 7, but this does not indicate only one bus or one type of bus.
The memory 703 is used for storing programs. In particular, a program may include program code comprising computer operating instructions. The memory 703 may include both read-only memory and random access memory, and provides instructions and data to the processor 702. The memory 703 may comprise high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.
The processor 702 executes the program stored in the memory 703, and is specifically configured to perform the following operations:
acquiring an instruction stream of a test program and dividing the instruction stream into N instruction segments with equal length;
determining a feature matrix A, a mean column vector B and a weight vector W corresponding to the N instruction segments, wherein the feature matrix A is a matrix with M rows and N columns and is used for describing feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean value of the feature index vectors of each row of the feature matrix A, and the weight vector W is a column vector formed by the weight occupied by the feature index of the basic block BB type corresponding to each row in the feature matrix A in a test program;
determining an optimized column vector according to the feature matrix A, the column vector B, the weight vector W and an optimization target, wherein the optimization target comprises reducing the number of instruction fragments for simulation and/or reducing an error between a feature index for evaluating the test program and the feature index of the test program, and the optimized column vector is used for representing the instruction fragments for simulation in the N instruction fragments and the weight corresponding to each instruction fragment;
determining a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector, wherein the number vector is used for representing the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used for representing the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector;
simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain characteristic index vectors of the instruction segments corresponding to the number vectors, wherein the characteristic index vectors are used for expressing characteristic indexes of the instruction segments corresponding to the number vectors;
and determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
The method performed by the index evaluation apparatus disclosed in any one of Fig. 2, Fig. 3 and Fig. 5 may be applied to the processor 702, or implemented by the processor 702. The processor 702 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 702 or by instructions in the form of software. The processor 702 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. Such a processor may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in the decoding processor. The software modules may be located in a storage medium well known in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, erasable programmable read-only memory, or registers. The storage medium is located in the memory 703, and the processor 702 reads the information in the memory 703 and performs the steps of the above method in combination with its hardware.
The index evaluation device 700 may also perform the method of fig. 2 and implement the functions of the index evaluation device in the embodiments shown in fig. 2, fig. 3, and fig. 5, which are not described herein again in this application.
Embodiments of the present application further provide a computer-readable storage medium for storing a computer program, where the computer program includes instructions for executing the method of the embodiment shown in fig. 2.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (20)

1. An index evaluation method, comprising:
acquiring an instruction stream of a test program and dividing the instruction stream into N instruction segments with equal length;
determining a feature matrix A, a mean column vector B and a weight vector W corresponding to the N instruction segments, wherein the feature matrix A is a matrix with M rows and N columns and is used for describing feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean value of the feature index vectors of each row of the feature matrix A, and the weight vector W is a column vector formed by the weight occupied by the feature index of the basic block BB type corresponding to each row in the feature matrix A in a test program;
determining an optimized column vector according to the feature matrix A, the column vector B, the weight vector W and an optimization target, wherein the optimization target comprises reducing the number of instruction segments used for simulation and/or reducing errors between feature indexes of the evaluation test program and the feature indexes of the test program, and the optimized column vector is used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment;
determining a number vector and a coefficient vector of an instruction segment corresponding to the optimized column vector, wherein the number vector is used for representing the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used for representing the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector;
simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain characteristic index vectors of the instruction segments corresponding to the number vectors, wherein the characteristic index vectors are used for expressing characteristic indexes of the instruction segments corresponding to the number vectors;
and determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
2. The method of claim 1, wherein the optimization objective is: when the first norm of W·AX − W·B or the second norm of AX − B is smaller than or equal to a first threshold, minimizing a third norm of X, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimized column vector is a value of X meeting the optimization target.
3. The method of claim 2, wherein said determining an optimized column vector based on the feature matrix a, the column vector B, the weight vector W, and an optimization objective comprises:
determining the optimized column vector by the following target optimization formula:
wherein $\|W \cdot AX - W \cdot B\|_{t1}$ represents the first norm of W·AX − W·B, $\|AX - B\|_{t2}$ represents the second norm of AX − B, $\|X\|_{t3}$ represents the third norm of X, and C represents the first threshold.
4. The method of claim 1, wherein the optimization objective is:
when a first norm of X is smaller than or equal to a second threshold, minimizing the sum of a first product and a second product, wherein the first product is a product of a second norm of W.AX-W.B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, a value of the first coefficient is a non-zero real number, X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and a weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
5. The method of claim 4,
the determining an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization objective comprises:
determining the optimized column vector by the following target optimization formula:
wherein d represents the second threshold, $\|X\|_{t1}$ represents the first norm of X, $\|W \cdot AX - W \cdot B\|_{t2}$ represents the second norm of W·AX − W·B, $\|X\|_{t3}$ represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, and λ ≠ 0.
6. The method of claim 1, wherein the optimization objective is:
minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of […], the third product is the product of a third coefficient and a third norm of […], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S, where A = L + S, L is a low-rank matrix, S is a sparse matrix, […] is a column vector formed by the average value of each row of the low-rank matrix L, […] is a column vector formed by the average value of each row of the sparse matrix S, and the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimized column vector is a value of X meeting the optimization target.
7. The method of claim 1, wherein the optimization objective is:
when the sixth norm of W·AX − W·B or the seventh norm of AX − B is less than or equal to a first threshold and/or the eighth norm of X is less than or equal to a second threshold, minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of […], the third product is the product of a third coefficient and a third norm of […], the fourth product is the product of a fourth coefficient and a fourth norm of L, and the fifth product is the product of a fifth coefficient and a fifth norm of S, where A = L + S, L is a low-rank matrix, S is a sparse matrix, […] is a column vector formed by the average value of each row of the low-rank matrix L, […] is a column vector formed by the average value of each row of the sparse matrix S, and the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimized column vector is a value of X meeting the optimization target.
8. The method of any of claims 1-7, wherein said determining an optimized column vector based on the feature matrix A, the column vector B, the weight vector W, and an optimization objective comprises:
step 1: establishing a target optimization model of the optimization target according to the feature matrix A, the column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f (X, Y, Z, mu) of the target optimization model, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, mu is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;
step 2: setting initial values of X, Y, Z and μ, and setting the optimization target as a convergence condition of the optimization completion;
step 3: calculating an optimal value of Y by using a least square method;
step 4: calculating the optimal value of X by using a soft threshold method;
step 5: when the value of X does not meet the constraint condition of the target optimization model, reducing a non-zero element in X and replacing X until X meets the constraint condition of the target optimization model;
step 6: updating the Lagrangian multiplier Z according to the residual of the equation X = Y and increasing the penalty parameter μ by a fixed multiple ρ, where ρ > 1;
step 7: judging whether the convergence condition is met; if so, determining that X is solved, and if not, repeating steps 3 to 6.
9. The method of claim 8, wherein the reducing one non-zero element of X and replacing X comprises:
and on the premise that X is reduced by one non-zero element, when X formed by reducing a first non-zero element in X is used for minimizing the function value of the target optimization model, reducing the first non-zero element and replacing X.
10. The method of any one of claims 1-9, wherein the characteristic indicator comprises at least one of:
instructions per clock (IPC), success rate of predicted branches, failure rate of predicted branches, hit rate of the L2 cache, and energy consumption.
11. An index evaluation apparatus of a processor, comprising:
an obtaining unit, configured to obtain an instruction stream of a test program and divide the instruction stream into N instruction fragments with equal length;
a first determining unit, configured to determine a feature matrix a, a mean column vector B, and a weight vector W, where the feature matrix a is a matrix with M rows and N columns, and is used to describe feature indicator vectors of the N instruction segments, M is a number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by a mean value of the feature indicator vectors in each row of the feature matrix a, and the weight vector W is a column vector formed by weights occupied by feature indicators of basic blocks BB types in each row of the feature matrix a in a test program;
a second determining unit, configured to determine an optimized column vector according to the feature matrix A, the column vector B, the weight vector W, and an optimization target, where the optimization target includes reducing the number of instruction fragments used for simulation and/or reducing an error between a feature indicator for evaluating the test program and the feature indicator of the test program, and the optimized column vector is used to represent instruction fragments used for simulation in the N instruction fragments and a weight corresponding to each instruction fragment;
a third determining unit, configured to determine a number vector and a coefficient vector of an instruction segment corresponding to the optimized column vector, where the number vector is used to represent a number of the instruction segment corresponding to a non-zero position in the optimized column vector, and the coefficient vector is used to represent a weight of a feature indicator of the instruction segment corresponding to the non-zero position in the optimized column vector;
a simulation unit, configured to simulate, in a simulation device, each of the instruction segments identified by the number vector to obtain a feature indicator vector, where the feature indicator vector represents the feature indicators of the instruction segments identified by the number vector; and
a fourth determining unit, configured to determine the inner product of the feature indicator vector and the coefficient vector as the feature indicator for evaluating the test program.
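The data flow through the units of claim 11 can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names, the 0-based segment numbering, and the sample values are hypothetical and do not come from the patent, and the optimized column vector is simply given rather than solved for.

```python
from typing import List, Sequence, Tuple


def row_means(a: Sequence[Sequence[float]]) -> List[float]:
    """Mean column vector B: one mean per row of the M x N feature matrix A."""
    return [sum(row) / len(row) for row in a]


def split_optimized_vector(x: Sequence[float]) -> Tuple[List[int], List[float]]:
    """Return (number vector, coefficient vector) for the non-zero entries of X."""
    numbers = [i for i, xi in enumerate(x) if xi != 0.0]
    coefficients = [x[i] for i in numbers]
    return numbers, coefficients


def estimate_indicator(simulated: Sequence[float],
                       coefficients: Sequence[float]) -> float:
    """Inner product of the simulated feature indicators and the coefficients."""
    return sum(s * c for s, c in zip(simulated, coefficients))


# Hypothetical optimized column vector for N = 4 segments (assumed, not solved).
x_opt = [0.0, 0.6, 0.0, 0.4]
numbers, coefficients = split_optimized_vector(x_opt)  # segments 1 and 3, 0-based

# Made-up simulator results (for example IPC) for the two selected segments only.
simulated_ipc = [1.5, 2.0]
estimate = estimate_indicator(simulated_ipc, coefficients)
```

Only the segments listed in the number vector need to be simulated; the remaining segments contribute nothing to the inner product.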
12. The apparatus of claim 11, wherein the optimization objective is: minimizing a third norm of X when a first norm of W·AX − W·B, or a second norm of AX − B, is less than or equal to a first threshold, wherein X is an optimization variable representing the instruction segments used for simulation among the N instruction segments and the weight corresponding to each of those instruction segments, and the optimized column vector is the value of X that satisfies the optimization objective.
13. The apparatus of claim 12, wherein the second determining unit is to determine the optimized column vector by the following target optimization formula:
wherein $\|W \cdot AX - W \cdot B\|_{t1}$ represents a first norm of W·AX − W·B, $\|AX - B\|_{t2}$ represents a second norm of AX − B, $\|X\|_{t3}$ represents a third norm of X, and C represents the first threshold.
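Read together with claim 12, the constrained form appears to be: minimize the third norm of X subject to the weighted residual norm being at most the first threshold. The sketch below only evaluates that constraint and objective for a candidate X; it is not a solver. It assumes that W·AX means row-wise weighting of AX by W, and the helper names and sample data are hypothetical.

```python
import math
from typing import List, Sequence


def p_norm(v: Sequence[float], p: float) -> float:
    """p-norm of v; p = 0 counts non-zero entries, p = inf takes the max magnitude."""
    if p == 0:
        return float(sum(1 for vi in v if vi != 0.0))
    if math.isinf(p):
        return max(abs(vi) for vi in v)
    return sum(abs(vi) ** p for vi in v) ** (1.0 / p)


def weighted_residual(a: Sequence[Sequence[float]], x: Sequence[float],
                      b: Sequence[float], w: Sequence[float]) -> List[float]:
    """W·AX - W·B, assuming W scales each row of AX - B element-wise."""
    ax = [sum(aij * xj for aij, xj in zip(row, x)) for row in a]
    return [wi * (axi - bi) for wi, axi, bi in zip(w, ax, b)]


def is_feasible(a, x, b, w, c: float, t1: float) -> bool:
    """Constraint of claim 12: the t1-norm of the weighted residual is at most c."""
    return p_norm(weighted_residual(a, x, b, w), t1) <= c


def objective(x: Sequence[float], t3: float) -> float:
    """Quantity to minimize: the t3-norm of X."""
    return p_norm(x, t3)


# Hypothetical 2 x 2 example: B holds the row means of A.
A = [[1.0, 3.0], [2.0, 4.0]]
B = [2.0, 3.0]
W = [0.5, 0.5]
```

With t3 = 0 the objective counts the selected segments, so a feasible X with fewer non-zero entries means fewer segments to simulate.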
14. The apparatus of claim 11, wherein the optimization objective is:
when a first norm of X is less than or equal to a second threshold, minimizing the sum of a first product and a second product, wherein the first product is the product of a first coefficient and a second norm of W·AX − W·B, the second product is the product of a second coefficient and a third norm of X, the first coefficient is a non-zero real number, X is an optimization variable representing the instruction segments used for simulation among the N instruction segments and the weight corresponding to each of those instruction segments, and the optimized column vector is the value of X that satisfies the optimization objective.
15. The apparatus of claim 14, wherein the second determining unit is to determine the optimized column vector by the following target optimization formula:
wherein d represents the second threshold, $\|X\|_{t1}$ represents a first norm of X, $\|W \cdot AX - W \cdot B\|_{t2}$ represents a second norm of W·AX − W·B, $\|X\|_{t3}$ represents a third norm of X, λ represents the first coefficient, δ represents the second coefficient, and λ ≠ 0.
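The penalized form of claims 14 and 15 can be illustrated by a small cost function. This is a sketch under assumptions the claims leave open: t1 is taken as the count of non-zero entries, t2 as the Euclidean norm, and t3 as the sum of absolute values; W is applied row-wise; the function name and the data are hypothetical.

```python
import math
from typing import Sequence


def penalized_cost(a: Sequence[Sequence[float]], b: Sequence[float],
                   w: Sequence[float], x: Sequence[float],
                   lam: float, delta: float, d: float) -> float:
    """lam * ||W·AX - W·B||_2 + delta * ||X||_1, or inf if X has more than d non-zeros."""
    if lam == 0:
        raise ValueError("the first coefficient must be non-zero")
    # Constraint: first norm of X (assumed here to count non-zeros) at most d.
    if sum(1 for xi in x if xi != 0.0) > d:
        return math.inf
    # Weighted residual W·AX - W·B, with W scaling each row.
    residual = [wi * (sum(aij * xj for aij, xj in zip(row, x)) - bi)
                for row, bi, wi in zip(a, b, w)]
    fit = math.sqrt(sum(r * r for r in residual))
    size = sum(abs(xi) for xi in x)
    return lam * fit + delta * size


# Hypothetical data: three segments, two basic-block types, B = row means of A.
A = [[1.0, 2.0, 6.0], [2.0, 3.0, 7.0]]
B = [3.0, 4.0]
W = [0.5, 0.5]

cost_unit = penalized_cost(A, B, W, [0.0, 1.0, 0.0], 1.0, 0.1, 1)
cost_scaled = penalized_cost(A, B, W, [0.0, 1.4, 0.0], 1.0, 0.1, 1)
cost_too_many = penalized_cost(A, B, W, [0.5, 0.5, 0.0], 1.0, 0.1, 1)
```

In this toy case, rescaling the single selected segment lowers the fit term by more than it raises the size term, so the scaled candidate has the lower cost.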
16. The apparatus of claim 11, wherein the optimization objective is:
minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [expression missing], the third product is the product of a third coefficient and a third norm of [expression missing], the fourth product is the product of a fourth coefficient and a fourth norm of L, the fifth product is the product of a fifth coefficient and a fifth norm of S, A = L + S, L is a low-rank matrix, S is a sparse matrix, [symbol missing] represents a column vector formed by the average of each row of the low-rank matrix L, the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, X is an optimization variable representing the instruction segments used for simulation among the N instruction segments and the weight corresponding to each of those instruction segments, and the optimized column vector is the value of X that satisfies the optimization objective.
17. The apparatus of claim 11, wherein the optimization objective is:
when a sixth norm of W·AX − W·B or a seventh norm of AX − B is less than or equal to a first threshold, and/or an eighth norm of X is less than or equal to a second threshold, minimizing the sum of a first product, a second product, a third product, a fourth product and a fifth product, wherein the first product is the product of a first coefficient and a first norm of X, the second product is the product of a second coefficient and a second norm of [expression missing], the third product is the product of a third coefficient and a third norm of [expression missing], the fourth product is the product of a fourth coefficient and a fourth norm of L, the fifth product is the product of a fifth coefficient and a fifth norm of S, A = L + S, L is a low-rank matrix, S is a sparse matrix, [symbol missing] represents a column vector formed by the average of each row of the low-rank matrix L, [symbol missing] represents a column vector formed by the average of each row of the sparse matrix S, the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, X is an optimization variable representing the instruction segments used for simulation among the N instruction segments and the weight corresponding to each of those instruction segments, and the optimized column vector is the value of X that satisfies the optimization objective.
18. The apparatus according to any of claims 11-17, wherein the second determining unit determining the optimized column vector according to the feature matrix A, the mean column vector B, the weight vector W and the optimization objective specifically comprises:
step 1: establishing a target optimization model of the optimization objective according to the feature matrix A, the mean column vector B, the weight vector W and the optimization objective, and establishing a Lagrangian function f(X, Y, Z, μ) of the target optimization model, wherein X is an optimization variable representing the instruction segments used for simulation among the N instruction segments and the weight corresponding to each of those instruction segments, μ is a penalty parameter, Z is a Lagrange multiplier, and Y is a relaxation variable;
step 2: setting initial values of X, Y, Z and μ, and setting the optimization objective as the convergence condition for completing the optimization;
step 3: calculating the optimal value of Y by using a least squares method;
step 4: calculating the optimal value of X by using a soft threshold method;
step 5: when the value of X does not satisfy the constraint condition of the target optimization model, removing one non-zero element from X and updating X, until X satisfies the constraint condition of the target optimization model;
step 6: updating the Lagrange multiplier Z according to the residual of the equation X = Y, and increasing the penalty parameter μ by a fixed multiple ρ, where ρ > 1;
step 7: judging whether the convergence condition is satisfied; if so, determining that X has been solved; if not, repeating steps 3 to 6.
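Steps 4 and 6 resemble the X-update and multiplier update of an augmented Lagrangian (ADMM-style) iteration. The sketch below covers only those two steps and assumes a Lagrangian of the form ‖X‖₁ + Zᵀ(X − Y) + (μ/2)‖X − Y‖² plus terms in Y; that form, the sign convention, and the function names are assumptions, since the claim does not spell out f(X, Y, Z, μ). The least squares Y-update of step 3 is not shown.

```python
from typing import List, Sequence, Tuple


def soft_threshold(v: Sequence[float], tau: float) -> List[float]:
    """Shrink each entry of v toward zero by tau, zeroing entries with |v_i| <= tau."""
    out = []
    for vi in v:
        mag = abs(vi) - tau
        out.append(0.0 if mag <= 0 else (mag if vi > 0 else -mag))
    return out


def x_update(y: Sequence[float], z: Sequence[float], mu: float) -> List[float]:
    """Step 4 (assumed form): X = soft_threshold(Y - Z / mu, 1 / mu)."""
    shifted = [yi - zi / mu for yi, zi in zip(y, z)]
    return soft_threshold(shifted, 1.0 / mu)


def dual_update(x: Sequence[float], y: Sequence[float], z: Sequence[float],
                mu: float, rho: float) -> Tuple[List[float], float]:
    """Step 6: move Z along the residual X - Y, then scale mu by rho > 1."""
    if rho <= 1:
        raise ValueError("rho must be greater than 1")
    z_new = [zi + mu * (xi - yi) for xi, yi, zi in zip(x, y, z)]
    return z_new, rho * mu


# One hypothetical pass with made-up values for Y, Z and mu.
y0 = [2.0, 0.25, -1.0]
z0 = [0.0, 0.0, 0.0]
x1 = x_update(y0, z0, mu=2.0)
z1, mu1 = dual_update(x1, y0, z0, mu=2.0, rho=1.5)
```

The soft threshold sets small entries of X exactly to zero, which is what makes the resulting column vector select only a subset of the instruction segments.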
19. The apparatus of claim 18, wherein the second determining unit removes one non-zero element from X and updates X by:
among the candidates obtained by removing exactly one non-zero element from X, when the X formed by removing a first non-zero element minimizes the function value of the target optimization model, removing the first non-zero element and updating X accordingly.
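The selection rule of claim 19 amounts to a greedy search over single-element removals. In the sketch below, "removing" an element is assumed to mean setting it to zero, and the objective passed in is a toy stand-in rather than the patent's target optimization model; all names are hypothetical.

```python
import math
from typing import Callable, List, Sequence


def drop_one_nonzero(x: Sequence[float],
                     objective: Callable[[Sequence[float]], float]) -> List[float]:
    """Zero out the single non-zero entry of x whose removal gives the lowest objective."""
    best = None
    best_value = math.inf
    for i, xi in enumerate(x):
        if xi == 0.0:
            continue  # only non-zero entries are candidates for removal
        candidate = list(x)
        candidate[i] = 0.0
        value = objective(candidate)
        if value < best_value:
            best, best_value = candidate, value
    # If x has no non-zero entry, there is nothing to remove.
    return best if best is not None else list(x)


def toy_objective(x: Sequence[float]) -> float:
    """Stand-in objective: distance of the total weight from 1."""
    return abs(sum(x) - 1.0)


x_before = [0.5, 0.0, 0.4, 0.3]
x_after = drop_one_nonzero(x_before, toy_objective)
```

Step 5 of claim 18 would call such a routine repeatedly until the constraint on X is met.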
20. The apparatus of any of claims 11-19, wherein the feature indicator comprises at least one of:
instructions per clock (IPC), branch prediction success rate, branch prediction failure rate, L2 cache hit rate, and energy consumption.
CN201610944177.6A 2016-11-02 2016-11-02 Index evaluation method and device Expired - Fee Related CN108008999B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610944177.6A CN108008999B (en) 2016-11-02 2016-11-02 Index evaluation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610944177.6A CN108008999B (en) 2016-11-02 2016-11-02 Index evaluation method and device

Publications (2)

Publication Number Publication Date
CN108008999A true CN108008999A (en) 2018-05-08
CN108008999B CN108008999B (en) 2021-07-20

Family

ID=62048196

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610944177.6A Expired - Fee Related CN108008999B (en) 2016-11-02 2016-11-02 Index evaluation method and device

Country Status (1)

Country Link
CN (1) CN108008999B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115543719A (en) * 2022-11-24 2022-12-30 飞腾信息技术有限公司 Component optimization method and device based on chip design, computer equipment and medium
WO2023015560A1 (en) * 2021-08-13 2023-02-16 Huawei Technologies Co.,Ltd. Systems and methods for sparsity-aware vector processing in general purpose cpus

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6751792B1 (en) * 2000-10-04 2004-06-15 Sun Microsystems, Inc. Using value-expression graphs for data-flow optimizations
CN103049310A (en) * 2012-12-29 2013-04-17 中国科学院深圳先进技术研究院 Multi-core simulation parallel accelerating method based on sampling
CN104268085A (en) * 2014-10-24 2015-01-07 重庆邮电大学 Software vulnerability discovery system and method based on attribute extraction
US20150309914A1 (en) * 2013-07-17 2015-10-29 Deja Vu Security, Llc Metaphor based language fuzzing of computer code
CN105654120A (en) * 2015-12-25 2016-06-08 东南大学—无锡集成电路技术研究所 Two-step cluster software load feature extraction method based on SOM and K-means

Also Published As

Publication number Publication date
CN108008999B (en) 2021-07-20

Similar Documents

Publication Publication Date Title
US7549069B2 (en) Estimating software power consumption
CN110178123B (en) Performance index evaluation method and device
JP5512503B2 (en) Method and apparatus for efficiently generating a processor architecture model
US8140901B2 (en) Validation of processors using a self-generating test case framework
US10102323B2 (en) Micro-benchmark analysis optimization for microprocessor designs
CN109726764A (en) A kind of model selection method, device, equipment and medium
CN108845927B (en) Test case screening method and device
CN108008999B (en) Index evaluation method and device
Wang et al. Horizontally fused training array: An effective hardware utilization squeezer for training novel deep learning models
CN114818600A (en) Chip verification method and device, electronic equipment and storage medium
CN108304352B (en) Accelerated life test sample distribution method and terminal equipment
CN116149917A (en) Method and apparatus for evaluating processor performance, computing device, and readable storage medium
CN107769987B (en) Message forwarding performance evaluation method and device
CN108664368B (en) Processor performance index evaluation method and device
Röhl et al. Validation of hardware events for successful performance pattern identification in high performance computing
CN114253821B (en) Method and device for analyzing GPU performance and computer storage medium
JP7335952B2 (en) Generating vector predicate summaries
Madougou et al. Using colored petri nets for GPGPU performance modeling
JP6239195B2 (en) Performance evaluation apparatus and performance evaluation program
CN109815104B (en) GPGPU program approximate analysis system and method based on soft error perception
EP3518153A1 (en) Information processing method and information processing system
CN113988493A (en) Method and device for constructing terminal evaluation model and storage medium
CN115543719B (en) Component optimization method and device based on chip design, computer equipment and medium
CN110750856B (en) Effective instruction window size assessment method based on machine learning
CN108628731B (en) Method for selecting test instruction and processing equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210720