CN107329732A - Apparatus and method for performing multiple transcendental function operations - Google Patents

Apparatus and method for performing multiple transcendental function operations

Info

Publication number
CN107329732A
Authority
CN
China
Prior art keywords
function
post
processing unit
independent variable
core cell
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610284359.5A
Other languages
Chinese (zh)
Other versions
CN107329732B (en)
Inventor
张士锦
李尚应
陈天石
陈云霁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cambricon Technologies Corp Ltd
Beijing Zhongke Cambrian Technology Co Ltd
Original Assignee
Beijing Zhongke Cambrian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhongke Cambrian Technology Co Ltd
Priority to CN201610284359.5A
Publication of CN107329732A
Application granted
Publication of CN107329732B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/548 Trigonometric functions; Co-ordinate transformations
    • G06F7/556 Logarithmic or exponential functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Complex Calculations (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses an apparatus and method for performing multiple transcendental function operations. The apparatus comprises a preprocessing unit group, a core unit, and a post-processing unit group. The preprocessing unit group transforms an externally input argument a into coordinates x, y, an angle z, and residual information k, and determines the operating mode (mode) of the core unit. The core unit applies a trigonometric or hyperbolic transformation to the coordinates x, y and the angle z, and outputs the transformed coordinates x', y' and angle z' to the post-processing unit group. The post-processing unit group transforms the coordinates x', y' and angle z' received from the core unit, according to the residual information k supplied by the preprocessing unit group and the function f, to obtain the output result c. The invention addresses the excessive overhead of the general-purpose-processor approach and the insufficient precision of purely linear approximation, and improves support for a variety of transcendental function operations.

Description

Apparatus and method for performing multiple transcendental function operations
Technical field
The present invention relates to the field of transcendental function computation, and more particularly to an apparatus and method for performing multiple transcendental function operations, in particular trigonometric, hyperbolic, exponential, and logarithmic function operations.
Background
Transcendental functions such as trigonometric, hyperbolic, exponential, and logarithmic functions are not only common in scientific computing but are also frequently used as activation functions in multi-layer artificial neural networks. Multi-layer artificial neural networks are widely used in pattern recognition, image processing, function approximation, and optimization, and in recent years have drawn increasing attention from academia and industry because of their high recognition accuracy and good parallelizability.
One known way to support these transcendental function computations is to use a general-purpose processor, which approximates the functions by executing general-purpose instructions with a general-purpose register file and general-purpose functional units. One drawback is that this approach cannot be integrated with dedicated multi-layer neural network devices, so the other steps cannot benefit from the performance gains of such devices. In addition, a general-purpose processor must decode a transcendental function computation into a long sequence of arithmetic and memory-access instructions, and the front-end decoding incurs substantial power overhead.
Another way to compute transcendental functions in multi-layer artificial neural networks is linear approximation. This method approximates the activation function (many activation functions are transcendental) by dividing its domain into segments and storing the coefficients of a linear approximation for each segment. Its drawback is that the number of segments is limited, so the precision cannot keep pace with the development of artificial neural networks, let alone serve applications with higher precision requirements such as scientific computing and image processing.
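The precision limit of segmented linear approximation can be illustrated with a minimal Python sketch. The function names, the choice of tanh on [-4, 4], and the segment counts are assumptions made for illustration and do not come from the patent.

```python
import math

def build_table(f, lo, hi, segments):
    """Store slope and intercept of the chord of f on each equal-width segment."""
    width = (hi - lo) / segments
    table = []
    for s in range(segments):
        x0 = lo + s * width
        x1 = x0 + width
        slope = (f(x1) - f(x0)) / width
        table.append((slope, f(x0) - slope * x0))
    return table

def approx(table, lo, hi, x):
    """Evaluate the stored piecewise-linear approximation at x in [lo, hi]."""
    segments = len(table)
    s = min(int((x - lo) / (hi - lo) * segments), segments - 1)
    slope, intercept = table[s]
    return slope * x + intercept

def max_error(f, lo, hi, segments, samples=2001):
    """Largest absolute error over an evenly spaced sample of [lo, hi]."""
    table = build_table(f, lo, hi, segments)
    xs = [lo + (hi - lo) * n / (samples - 1) for n in range(samples)]
    return max(abs(approx(table, lo, hi, x) - f(x)) for x in xs)

if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, max_error(math.tanh, -4.0, 4.0, n))
```

The error shrinks only as more coefficient pairs are stored, which is the trade-off between table size and precision that the background describes.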
Summary of the invention
(1) Technical problem to be solved
In view of this, the main object of the present invention is to provide an apparatus and method for performing multiple transcendental function operations, so as to address the excessive overhead of the general-purpose-processor approach and the insufficient precision of purely linear approximation, and to improve support for a variety of transcendental function operations.
(2) Technical solution
To achieve the above object, the present invention provides an apparatus for performing multiple transcendental function operations. The apparatus comprises a preprocessing unit group, a core unit, and a post-processing unit group, wherein:
The preprocessing unit group transforms an externally input argument a into coordinates x, y, an angle z, and residual information k, and determines the operating mode (mode) of the core unit;
The core unit applies a trigonometric or hyperbolic transformation to the coordinates x, y and the angle z, and outputs the transformed coordinates x', y' and angle z' to the post-processing unit group;
The post-processing unit group transforms the coordinates x', y' and angle z' received from the core unit, according to the residual information k supplied by the preprocessing unit group and the function f, to obtain the output result c.
In the above scheme, the preprocessing unit group comprises a selector 1 and a processor 2, and the post-processing unit group comprises a first post-processing unit 4, a second post-processing unit 5, and a third post-processing unit 6. The selector 1 receives the externally input argument a and function f and determines which of four operations to take, as follows:
I. If the argument a is so small that, in the format used for input or output, the error between the result of a linear or second-order approximation and the true value, each represented as a floating-point number, does not exceed the last bit of the mantissa, the selector 1 outputs the argument a and the function f directly to the first post-processing unit 4 in the post-processing unit group. The first post-processing unit 4 derives the linear approximation formula for the argument a from the function f, and applies additions and multiplications to a to obtain the output result c;
II. If the argument a does not exceed the convergence domain of the core unit, that is, the angle z = 0 in default mode or the ordinate y = 0 in vector mode can be reached in a finite number of steps, so that a can be accepted directly by the corresponding mode of the core unit 3, the selector 1 derives from the function f the coordinates x, y, the angle z, and the mode of the core unit, and outputs x, y, z, and mode to the core unit 3. The core unit 3 applies a trigonometric or hyperbolic transformation to x, y, z according to mode, and outputs the transformed coordinates x', y' and angle z' to the second post-processing unit 5 in the post-processing unit group. The second post-processing unit 5 obtains the output result c from the coordinates x', y' and angle z' output by the core unit and the function f;
III. If the argument a cannot be accepted directly by the corresponding mode of the core unit 3, the selector 1 passes the argument a and the function f to the processor 2 for preprocessing. The processor 2 decomposes the argument a according to the function f, obtaining the coordinates x, y, the angle z, the mode of the core unit 3, and the residual information k, where x, y, z, and mode are as in II. The coordinates x, y, the angle z, and the mode are output to the core unit, and the residual information k and the function f are output directly to the third post-processing unit 6 in the post-processing unit group. The core unit 3 applies a trigonometric or hyperbolic transformation to x, y, z according to mode and outputs the resulting x', y', z' to the third post-processing unit 6 in the post-processing unit group. The third post-processing unit 6 obtains the output result c from the x', y', z' output by the core unit 3 and the k and function f provided by the processor 2;
IV. If the true value of the argument a exceeds the maximum range representable by a floating-point number in the format used for input or output, the selector 1 directly outputs the argument a and the function f.
In the above scheme, the maximum range representable by a floating-point number in case IV is, for example under IEEE 754 half precision, the maximum absolute value (1024 + 1023)/1024 × 2^{30-15} = 65504.
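The half-precision bound quoted for case IV can be checked with a few lines of Python. This is only an arithmetic sketch; the helper name exceeds_half_range is hypothetical and is not part of the patent.

```python
import struct

# Largest finite binary16 value: significand bits all ones, biased exponent 30.
significand = (1024 + 1023) / 1024          # 1.1111111111 in binary
max_half = significand * 2 ** (30 - 15)     # the exponent bias is 15

# Cross-check against the bit pattern 0x7BFF decoded by the standard library.
decoded = struct.unpack("<e", (0x7BFF).to_bytes(2, "little"))[0]

def exceeds_half_range(a: float) -> bool:
    """True if |a| is beyond the largest finite half-precision value."""
    return abs(a) > max_half
```

Both max_half and decoded evaluate to 65504.0, matching the figure in the text.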
To achieve the above object, the present invention also provides a method for performing multiple transcendental function operations, comprising:
Step 1: The selector receives the input argument a and function f and determines which of the four operations I, II, III, or IV to take;
Step 2: When operation III is taken, the processor applies multiplication or shift transformations to the input argument a and function f so that they can be accepted by the core unit, and records the transformation information k and sign for use by the third post-processing unit, where sign is valid only for some functions;
Step 3: When operation II or III is taken, the core unit applies one of four trigonometric or hyperbolic transformations (default mode and vector mode for each) to the three numbers abscissa x, ordinate y, and angle z, using only additions, subtractions, and shifts. The default-mode transformations are:
Trigonometric default: (x, y, z) → (A(x cos z - y sin z), A(y cos z + x sin z), 0)
Hyperbolic default: (x, y, z) → (B(x cosh z + y sinh z), B(y cosh z + x sinh z), 0)
In the above formulas, A and B are constants determined by the number of iterations taken, and a shift operation is a multiplication by a power of 2;
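To make the two default-mode formulas concrete, the following Python sketch evaluates their right-hand sides directly with library functions. It is a reference for what the core unit is meant to produce, not the shift-and-add iteration itself; the function names and the sample values of A and B are assumptions.

```python
import math

def trig_default(x, y, z, A=1.0):
    """Target of trigonometric default mode: rotate (x, y) by z, scale by A."""
    return (A * (x * math.cos(z) - y * math.sin(z)),
            A * (y * math.cos(z) + x * math.sin(z)),
            0.0)

def hyp_default(x, y, z, B=1.0):
    """Target of hyperbolic default mode: hyperbolic rotation by z, scale by B."""
    return (B * (x * math.cosh(z) + y * math.sinh(z)),
            B * (y * math.cosh(z) + x * math.sinh(z)),
            0.0)

if __name__ == "__main__":
    A, B = 1.6468, 0.8282   # illustrative constants only
    print(trig_default(1 / A, 0.0, 0.3, A))    # (cos 0.3, sin 0.3, 0)
    print(hyp_default(1 / B, 1 / B, 0.7, B))   # (e**0.7, e**0.7, 0)
```

Choosing the inputs as x = 1/A, y = 0 cancels the constant and yields cos z and sin z; in hyperbolic mode x = y = 1/B yields e^z in both coordinates, which suggests how exponential functions fit the same datapath.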
Step 4a: When operation I is taken, the first post-processing unit computes the linear or second-order approximation according to the input function f and outputs it;
Step 4b: When operation II or III is taken, the second post-processing unit applies additions, subtractions, multiplications by constants, divisions, and shifts to the output of the core unit, according to the input function f and the information provided by the processor of the preprocessing unit group, to obtain the output result c; the information provided by the processor is valid only in operation III.
(3) Beneficial effects
The apparatus and method provided by the present invention convert the evaluation of a transcendental function into finding the result of a trigonometric or hyperbolic coordinate transformation. The transformation is carried out iteratively with a fixed rotation angle magnitude at each step: the rotation is reversed when it has overshot and continued forward when it has fallen short, so only a fixed series of coefficients needs to be stored. The angle sequence z_i is predetermined so that tan z_i (tanh z_i in the hyperbolic case) is a power of 2, so the multiplications with the coordinates x, y can be implemented as simple shifts. This reduces the time and power cost of multiplying variables by each other while still reaching the required precision. The invention can compute trigonometric, hyperbolic, exponential, and logarithmic functions, addresses the excessive overhead of the general-purpose-processor approach and the insufficient precision of purely linear approximation, and improves support for a variety of transcendental function operations.
Brief description of the drawings
For a more complete understanding of the present invention and its advantages, reference is now made to the following description taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a structural diagram of an apparatus for performing multiple transcendental function operations according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of how the core unit of Fig. 1 iteratively approaches the required trigonometric relation in trigonometric mode.
Fig. 3 is a schematic diagram of how the core unit of Fig. 1 iteratively approaches the required hyperbolic relation in hyperbolic mode.
Fig. 4 is a flowchart of a method for performing multiple transcendental function operations according to an embodiment of the present invention.
Table 1 shows the specific operations of each unit for each input function f and argument a (for 16-bit floating-point numbers) according to an embodiment of the present invention. These operations can be implemented mainly with additions, subtractions, multiplications by constants, and shifts (multiplications and divisions by powers of 2); only a small number of divisions and, for second-order approximation, operations with low precision requirements are needed. If the required input and output precision differ, some of the ranges in Table 1 should be adjusted accordingly.
Detailed description
Other aspects, advantages, and salient features of the invention will become apparent to those skilled in the art from the following detailed description of exemplary embodiments, taken in conjunction with the accompanying drawings.
In the present invention, the terms "include" and "comprise" and their derivatives mean inclusion without limitation; the term "or" is inclusive, meaning and/or.
In this specification, the various embodiments described below to explain the principles of the invention are illustrative only and should not be construed in any way as limiting the scope of the invention. The following description with reference to the accompanying drawings is intended to aid a comprehensive understanding of the exemplary embodiments of the invention as defined by the claims and their equivalents. It includes various specific details to aid understanding, but these details are to be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the invention. In addition, descriptions of well-known functions and structures are omitted for clarity and conciseness. Throughout the drawings, the same reference numerals are used for similar functions and operations.
Fig. 1 is a structural diagram of an apparatus for performing multiple transcendental function operations according to an embodiment of the present invention. As shown in Fig. 1, the apparatus comprises a preprocessing unit group (1, 2), a core unit 3, and a post-processing unit group (4, 5, 6), wherein:
The preprocessing unit group transforms an externally input argument a into coordinates x, y, an angle z, and residual information k, and determines the operating mode (mode) of the core unit;
The core unit 3 applies a trigonometric or hyperbolic transformation to the coordinates x, y and the angle z, and outputs the transformed coordinates x', y' and angle z' to the post-processing unit group;
The post-processing unit group transforms the coordinates x', y' and angle z' received from the core unit, according to the residual information k supplied by the preprocessing unit group and the function f, to obtain the output result c.
The preprocessing unit group comprises a selector 1 and a processor 2, and the post-processing unit group comprises a first post-processing unit 4, a second post-processing unit 5, and a third post-processing unit 6; all of these can be implemented as hardware integrated circuits (for example, an application-specific integrated circuit, ASIC). The selector 1 receives the externally input argument a and function f and determines which of four operations to take, as follows:
I. If the argument a is so small that, in the format used for input or output, the error between the result of a linear or second-order approximation and the true value, each represented as a floating-point number, does not exceed the last bit of the mantissa, the selector 1 outputs the argument a and the function f directly to the first post-processing unit 4 in the post-processing unit group. The first post-processing unit 4 derives the linear approximation formula for the argument a from the function f, and applies additions and multiplications to a to obtain the output result c; see Table 1 for details;
II. If the argument a does not exceed the convergence domain of the core unit, that is, the angle z = 0 in default mode or the ordinate y = 0 in vector mode can be reached in a finite number of steps, so that a can be accepted directly by the corresponding mode of the core unit, the selector 1 derives from the function f the coordinates x, y, the angle z, and the mode of the core unit 3, and outputs x, y, z, and mode to the core unit 3. The core unit 3 applies a trigonometric or hyperbolic transformation to x, y, z according to mode, and outputs the transformed coordinates x', y' and angle z' to the second post-processing unit 5 in the post-processing unit group. The second post-processing unit 5 obtains the output result c from the coordinates x', y' and angle z' output by the core unit and the function f;
III. If the argument a cannot be accepted directly by the corresponding mode of the core unit, the selector 1 passes the argument a and the function f to the processor 2 for preprocessing. The processor 2 decomposes the argument a according to the function f, obtaining the coordinates x, y, the angle z, the mode of the core unit 3, and the residual information k, where x, y, z, and mode are as in II. The coordinates x, y, the angle z, and the mode are output to the core unit 3, and the residual information k and the function f are output directly to the third post-processing unit 6 in the post-processing unit group. The core unit 3 applies a trigonometric or hyperbolic transformation to x, y, z according to mode and outputs the resulting x', y', z' to the third post-processing unit 6 in the post-processing unit group. The third post-processing unit 6 obtains the output result c from the x', y', z' output by the core unit 3 and the k and function f provided by the processor 2;
IV. If the true value of the argument a exceeds the maximum range representable by a floating-point number in the format used for input or output, for example the maximum absolute value (1024 + 1023)/1024 × 2^{30-15} = 65504 under IEEE 754 half precision, the selector 1 directly outputs the argument a and the function f (NaN). For IEEE 754 half-precision (binary16) floating-point numbers, the specific ranges that determine cases I, II, III, and IV for each input function, and the operations in the four cases, are shown in Table 1.
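The four-way decision made by the selector can be sketched in Python for a single function, here sin. Table 1, which holds the actual ranges, is not reproduced in this text, so the thresholds below are assumptions: SMALL is a hypothetical case I cutoff, and TRIG_BOUND is the sum of the first 16 trigonometric rotation angles, used here as an estimate of the convergence range.

```python
import math

MAX_HALF = 65504.0    # largest half-precision magnitude, as quoted for case IV
# Default trigonometric mode can drive z to 0 only if |z| <= sum of all z_i.
TRIG_BOUND = sum(math.atan(2.0 ** -i) for i in range(16))
SMALL = 2.0 ** -6     # hypothetical case I threshold, NOT taken from Table 1

def select_case_sin(a: float) -> str:
    """Classify the argument of sin into cases I-IV (illustrative thresholds)."""
    if math.isnan(a) or abs(a) > MAX_HALF:
        return "IV"   # out of range: passed straight through
    if abs(a) < SMALL:
        return "I"    # small enough for a linear approximation
    if abs(a) <= TRIG_BOUND:
        return "II"   # the core unit accepts the argument directly
    return "III"      # needs decomposition by the processor first
```

A real selector would hold one such set of ranges per supported function and per floating-point format.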
An embodiment of the present invention also provides a method for performing multiple transcendental function operations, shown in the flowchart of Fig. 4, comprising the following steps:
Step 1: The selector receives the input argument a and function f and determines which of the four operations I, II, III, or IV to take;
I. If the argument a is so small that, in the format used for input or output, the error between the result of a linear or second-order approximation and the true value, each represented as a floating-point number, does not exceed the last bit of the mantissa, the selector outputs the argument a and the function f directly to the first post-processing unit in the post-processing unit group. The first post-processing unit derives the linear approximation formula for the argument a from the function f, and applies additions and multiplications to a to obtain the output result c; see Table 1 for details;
II. If the argument a does not exceed the convergence domain of the core unit, that is, the angle z = 0 in default mode or the ordinate y = 0 in vector mode can be reached in a finite number of steps, so that a can be accepted directly by the corresponding mode of the core unit, the selector derives from the function f the coordinates x, y, the angle z, and the mode of the core unit, and outputs x, y, z, and mode to the core unit. The core unit applies a trigonometric or hyperbolic transformation to x, y, z according to mode, and outputs the transformed coordinates x', y' and angle z' to the second post-processing unit in the post-processing unit group. The second post-processing unit obtains the output result c from the coordinates x', y' and angle z' output by the core unit and the function f;
III. If the argument a cannot be accepted directly by the corresponding mode of the core unit, the selector passes the argument a and the function f to the processor for preprocessing. The processor decomposes the argument a according to the function f, obtaining the coordinates x, y, the angle z, the mode of the core unit, and the residual information k, where x, y, z, and mode are as in II. The coordinates x, y, the angle z, and the mode are output to the core unit, and the residual information k and the function f are output directly to the third post-processing unit in the post-processing unit group. The core unit applies a trigonometric or hyperbolic transformation to x, y, z according to mode and outputs the resulting x', y', z' to the third post-processing unit in the post-processing unit group. The third post-processing unit obtains the output result c from the x', y', z' output by the core unit and the k and function f provided by the processor;
IV. If the true value of the argument a exceeds the maximum range representable by a floating-point number in the format used for input or output, for example the maximum absolute value (1024 + 1023)/1024 × 2^{30-15} = 65504 under IEEE 754 half precision, the selector directly outputs the argument a and the function f (NaN). For IEEE 754 half-precision (binary16) floating-point numbers, the specific ranges that determine cases I, II, III, and IV for each input function, and the operations in the four cases, are shown in Table 1.
Step 2: When operation III is taken, the processor applies multiplication or shift transformations to the input argument a and function f so that they can be accepted by the core unit, and records the transformation information k and sign for use by the third post-processing unit, where sign is valid only for some functions. Examples of the processor's specific operations for each input function are shown in Table 1.
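One familiar instance of such a decomposition is range reduction for the exponential function, sketched below in Python. This is an assumption about what the processor might do for f = exp; the patent defers the actual per-function operations to Table 1, and the name preprocess_exp is hypothetical.

```python
import math

LN2 = math.log(2.0)

def preprocess_exp(a: float):
    """Split a into k*ln2 + r with |r| <= ln2/2, so that exp(a) = 2**k * exp(r)."""
    k = round(a / LN2)    # a multiplication by the constant 1/ln2, then rounding
    r = a - k * LN2       # reduced argument that would be handed to the core unit
    return r, k

if __name__ == "__main__":
    r, k = preprocess_exp(5.0)
    # math.exp stands in for the core unit here; the 2**k factor is a shift.
    print(r, k, math.ldexp(math.exp(r), k))
```

The reduced argument r is small in magnitude, and the recorded integer k plays the role of the residual information that the third post-processing unit later undoes with a shift.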
Step 3: When operation II or III is taken, the core unit applies one of four trigonometric or hyperbolic transformations (default mode and vector mode for each) to the three numbers abscissa x, ordinate y, and angle z, using only additions, subtractions, and shifts. The default-mode transformations are:
Trigonometric default: (x, y, z) → (A(x cos z - y sin z), A(y cos z + x sin z), 0)
Hyperbolic default: (x, y, z) → (B(x cosh z + y sinh z), B(y cosh z + x sinh z), 0)
In the above formulas, A and B are constants determined by the number of iterations taken, and a shift operation is a multiplication by a power of 2. The transformation is completed by iteratively approaching the angle to be rotated:
At step i the rotation angle is z_i, and the direction is decided as follows. In default mode the target is z = 0: rotate forward when z > 0 and backward when z < 0. In vector mode the target is y = 0: rotate backward when y > 0 and forward when y < 0;
Fig. 2 illustrates the principle of approaching the required trigonometric transformation with a series of trigonometric transformations of fixed angles (for simplicity, the scaling of the coordinates by 1/cos z_i at each step is not shown). Fig. 3 illustrates the principle of approaching the required hyperbolic transformation with a series of hyperbolic transformations of fixed angles (for simplicity, the scaling of the coordinates by 1/cosh z_i at each step is not shown).
Each iteration is equivalent to rotating forward or backward by z_i and scaling the coordinates by a factor of 1/cos z_i (1/cosh z_i in hyperbolic mode):
Trigonometric forward: (x, y, z) → (x - y tan z_i, y + x tan z_i, z - z_i)
Trigonometric backward: (x, y, z) → (x + y tan z_i, y - x tan z_i, z + z_i)
Hyperbolic forward: (x, y, z) → (x + y tanh z_i, y + x tanh z_i, z - z_i)
Hyperbolic backward: (x, y, z) → (x - y tanh z_i, y - x tanh z_i, z + z_i)
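Because tan z_i and tanh z_i are powers of 2, each of the four step formulas reduces to shifts and additions on fixed-point integers. The Python sketch below shows one step in isolation; the Q.12 fixed-point format and all function names are assumptions chosen for illustration.

```python
import math

FRAC_BITS = 12    # hypothetical fixed-point format with 12 fractional bits

def to_fixed(v: float) -> int:
    """Convert a real value to the fixed-point integer representation."""
    return round(v * (1 << FRAC_BITS))

def trig_angle(i: int) -> int:
    """Stored constant z_i = arctan(2**-i) in fixed point."""
    return to_fixed(math.atan(2.0 ** -i))

def trig_step(x: int, y: int, z: int, i: int, zi: int, forward: bool):
    """One trigonometric iteration; tan z_i = 2**-i, so y * tan z_i is y >> i."""
    if forward:
        return x - (y >> i), y + (x >> i), z - zi
    return x + (y >> i), y - (x >> i), z + zi

def hyp_step(x: int, y: int, z: int, j: int, zj: int, forward: bool):
    """One hyperbolic iteration; tanh z_j = 2**-j, so x * tanh z_j is x >> j."""
    if forward:
        return x + (y >> j), y + (x >> j), z - zj
    return x - (y >> j), y - (x >> j), z + zj
```

No variable-by-variable multiplication appears in either step; the only per-step constant is the stored angle.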
So that each iteration uses only additions, subtractions, and shifts and the iteration converges, z_i should take the following sequences:
Trigonometric: z_i = arctan 2^{-i}, i = 0, 1, 2, ...
Hyperbolic: z_i = arctanh 2^{-j}, j = i - k, when (3^{k+1} - 1)/2 + k ≤ i < (3^{k+2} - 1)/2 + k + 1, i = 1, 2, 3, ...
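The hyperbolic index rule is easier to follow when the first few shift amounts are enumerated. The Python sketch below treats the upper bound on i as exclusive, an assumption made so that every i maps to exactly one k; the function names are hypothetical.

```python
import math

def hyperbolic_shift(i: int) -> int:
    """Return j = i - k for iteration i >= 1, using the range rule for k."""
    k = 0
    # Advance k while i lies at or beyond the (exclusive) upper bound for k.
    while i >= (3 ** (k + 2) - 1) // 2 + k + 1:
        k += 1
    return i - k

def trig_angles(n: int):
    """First n trigonometric angles z_i = arctan(2**-i), i = 0..n-1."""
    return [math.atan(2.0 ** -i) for i in range(n)]

def hyperbolic_angles(n: int):
    """First n hyperbolic angles z_i = arctanh(2**-j), i = 1..n."""
    return [math.atanh(2.0 ** -hyperbolic_shift(i)) for i in range(1, n + 1)]

if __name__ == "__main__":
    print([hyperbolic_shift(i) for i in range(1, 16)])
    print(sum(trig_angles(16)), sum(hyperbolic_angles(16)))
```

Under this reading the shift amounts 4 and 13 each occur twice among the first 15 iterations, and the sums of the angles give the largest |z| that default mode can reduce to zero in each case.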
The specific number of iterations, i.e. the maximum value of i, is chosen flexibly according to the floating-point precision being handled; once the maximum number of iterations is chosen, the aforementioned constants A and B can be computed. The choice among the above four modes for each input function is shown in Table 1.
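Putting the direction rule, the step formulas, and the angle sequences together gives the following floating-point Python sketch of the core unit. It is an illustration under assumptions, not the patented circuit: the gain is accumulated as the product of the per-step scale factors 1/cos z_i or 1/cosh z_i, the hyperbolic shifts repeat 4, 13, 40, ... once each, and all names and the 16-step default are hypothetical.

```python
import math

def hyperbolic_shifts(n):
    """First n hyperbolic shift amounts, repeating 4, 13, 40, ... once each."""
    shifts, j, repeat = [], 1, 4
    while len(shifts) < n:
        shifts.append(j)
        if j == repeat and len(shifts) < n:
            shifts.append(j)
            repeat = 3 * repeat + 1
        j += 1
    return shifts

def core_unit(x, y, z, hyperbolic=False, vector=False, steps=16):
    """Return (x', y', z', gain), where gain is the constant A or B."""
    shifts = hyperbolic_shifts(steps) if hyperbolic else range(steps)
    sign = 1.0 if hyperbolic else -1.0
    gain = 1.0
    for s in shifts:
        t = 2.0 ** -s                            # tan z_i or tanh z_i
        zi = math.atanh(t) if hyperbolic else math.atan(t)
        gain *= math.sqrt(1.0 - sign * t * t)    # 1/cosh z_i or 1/cos z_i
        # Default mode targets z = 0; vector mode targets y = 0.
        forward = (y < 0) if vector else (z > 0)
        d = 1.0 if forward else -1.0
        x, y, z = x + sign * d * y * t, y + d * x * t, z - d * zi
    return x, y, z, gain

if __name__ == "__main__":
    xc, yc, _, A = core_unit(1.0, 0.0, 0.5)
    print(xc / A, yc / A)    # close to cos 0.5 and sin 0.5
    xe, _, _, B = core_unit(1.0, 1.0, 0.5, hyperbolic=True)
    print(xe / B)            # close to e**0.5
```

Since the gain depends only on the number of steps and not on the data, a hardware design would precompute it and fold 1/A or 1/B into the initial coordinates instead of dividing afterwards.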
Step 4a: When operation I is taken, the first post-processing unit computes the linear or second-order approximation according to the input function f and outputs it. The specific operations of the first post-processing unit for each input function are shown in Table 1.
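Why a linear approximation suffices for small arguments can be checked numerically in half precision. The Python sketch below uses a stricter test than the one-bit criterion in the text: it asks whether the approximation and the true value round to the same binary16 number. The helper names and the sample arguments are assumptions.

```python
import math
import struct

def to_half(v: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", v))[0]

def linear_matches_in_half(f, slope: float, a: float) -> bool:
    """True if slope * a and f(a) round to the same half-precision number."""
    return to_half(slope * a) == to_half(f(a))

if __name__ == "__main__":
    print(linear_matches_in_half(math.sin, 1.0, 2.0 ** -7))   # small argument
    print(linear_matches_in_half(math.sin, 1.0, 0.5))         # too large
```

For sin, the approximation sin a ≈ a is indistinguishable from the true value at a = 2^{-7} in half precision, while at a = 0.5 it clearly is not, so larger arguments need the core unit.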
Step 4b: When operation II or III is taken, the second post-processing unit applies additions, subtractions, multiplications by constants, divisions, and shifts to the output of the core unit, according to the input function f and the information provided by the processor of the preprocessing unit group, to obtain the output result c; the information provided by the processor is valid only in operation III. The specific operations of the second post-processing unit for each input function are shown in Table 1.
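The kinds of post-processing listed in Step 4b can be illustrated with standard hyperbolic identities. The Python sketch below assumes the core unit has already returned x' = cosh a and y' = sinh a; the function names are hypothetical, and the patent's own per-function recipes are in Table 1, which is not reproduced here.

```python
import math

def post_exp(x_out: float, y_out: float) -> float:
    """e**a = cosh a + sinh a: a single addition."""
    return x_out + y_out

def post_tanh(x_out: float, y_out: float) -> float:
    """tanh a = sinh a / cosh a: a single division."""
    return y_out / x_out

def post_scale_pow2(value: float, k: int) -> float:
    """Multiply by 2**k, i.e. a shift, to undo a recorded power-of-two factor k."""
    return math.ldexp(value, k)

if __name__ == "__main__":
    a = 0.25
    x_out, y_out = math.cosh(a), math.sinh(a)   # stand-ins for core unit output
    print(post_exp(x_out, y_out), post_tanh(x_out, y_out))
```

Each function uses only one of the cheap operations named in the step, which is consistent with the claim that post-processing needs few divisions and no general multiplier.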
Pass through foregoing description, it is this for performing a variety of apparatus and method for surmounting function computing that the present invention is provided, the result that triangle or hyperbolic type system are converted is sought by the value surmounted function will be asked to be converted to, ensure that the angle absolute value of each step rotation is fixed by the way of iteration, reverse rotation is taken during ovdersteering, take and rotated forward so that only need to store a series of coefficient of fixations when not enough, predetermined angular sequence ziSo that tanzi(it is tanhz in the case of hyperbolici) be 2 power, make they and transverse and longitudinal coordinate x, y multiplication can be realized by simpler displacement, and then reduce time or power wastage that phase cross multiplication between variable is brought, it in turn ensure that and reach permissible accuracy, the calculating of all kinds of trigonometric functions, hyperbolic functions, exponential function and logarithmic function can be realized, the problem of excessive and purely linear approximate way precision of general processor mode expense is not enough is solved, effectively increases to the various supports for surmounting function computing.
The processes or methods depicted in the preceding figures may be performed by processing logic comprising hardware (e.g., circuitry, dedicated logic, etc.), firmware, software (e.g., software embodied on a non-transitory computer-readable medium), or a combination thereof. Although the processes or methods are described above in terms of certain sequential operations, it should be understood that some of the described operations may be performed in a different order. Moreover, some operations may be performed in parallel rather than sequentially.
In the foregoing specification, embodiments of the present invention have been described with reference to specific exemplary embodiments thereof. It will be evident that various modifications may be made to the embodiments without departing from the broader spirit and scope of the invention as set forth in the appended claims. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.

Claims (9)

1. A device for performing multiple transcendental function operations, characterized in that the device comprises a preprocessing unit group, a core unit and a post-processing unit group, wherein:
the preprocessing unit group is configured to transform an externally input independent variable a into coordinates x, y, an angle z and remaining information k, and to determine the operation mode mode taken by the core unit;
the core unit is configured to perform a trigonometric or hyperbolic transformation on the coordinates x, y and the angle z, obtain the transformed coordinates x', y' and angle z', and output them to the post-processing unit group;
the post-processing unit group is configured to transform the coordinates x', y' and angle z' input from the core unit, according to the remaining information k and the function f input from the preprocessing unit group, to obtain the output result c.
2. The device for performing multiple transcendental function operations according to claim 1, characterized in that the preprocessing unit group comprises a selector (1) and a processor (2), and the post-processing unit group comprises a first post-processing unit (4), a second post-processing unit (5) and a third post-processing unit (6), wherein the selector (1) receives the externally input independent variable a and function f and determines which of four different operations should be taken, specifically as follows:
I. If, under the format used for input or output, the independent variable a is so small that the error between the result of a linear or second-order approximation and the true value, each represented as a floating-point number, does not exceed the last bit of the mantissa, then the selector (1) outputs the independent variable a and the function f directly to the first post-processing unit (4) in the post-processing unit group; the first post-processing unit (4) derives the linear approximation formula for the independent variable a according to the function f, and performs addition and multiplication on the independent variable a to obtain the output result c;
II. If the independent variable a does not exceed the convergence domain of the core unit, so that the angle z = 0 in the default mode or the ordinate y = 0 in the vector mode can be reached in a finite number of steps, and the independent variable a can be accepted directly by the corresponding mode of the core unit, then the selector (1) derives, according to the function f, the coordinates x, y and the angle z for the independent variable a and the mode mode taken by the core unit, and outputs x, y, z and mode to the core unit; the core unit (3) performs a trigonometric or hyperbolic transformation on x, y, z based on the mode mode, obtains the transformed coordinates x', y' and angle z', and outputs them to the second post-processing unit (5) in the post-processing unit group; the second post-processing unit (5) obtains the output result c according to the coordinates x', y' and angle z' output by the core unit and the function f;
III. If the independent variable a cannot be accepted directly by the corresponding mode of the core unit (3), then the selector (1) passes the independent variable a and the function f to the processor (2) for preprocessing; the processor (2) decomposes the independent variable a according to the function f to obtain the coordinates x, y, the angle z, the mode mode taken by the core unit (3) and the remaining information k, wherein the coordinates x, y, the angle z and the mode mode taken by the core unit (3) are the same as in II; the coordinates x, y, the angle z and the mode mode are output to the core unit (3), and the remaining information k and the function f are output directly to the third post-processing unit (6) in the post-processing unit group; the core unit (3) performs a trigonometric or hyperbolic transformation on x, y, z based on the mode mode and outputs the resulting x', y', z' to the third post-processing unit (6) in the post-processing unit group; the third post-processing unit (6) obtains the output result c according to x', y', z' output by the core unit and the k and function f provided by the processor (2);
IV. If, under the format used for input or output, the true value of the independent variable a exceeds the maximum range representable by the floating-point number, then the selector (1) directly outputs the independent variable a and the function f.
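The four-way selection above can be illustrated with a hedged sketch for a single function, f = exp, in half precision. Everything concrete here is an assumption made for illustration: the thresholds TINY and HYPERBOLIC_DOMAIN, the decomposition a = k ln 2 + r, the use of math.cosh and math.sinh as a stand-in for the core unit, and raising OverflowError for type IV. The patent does not specify these values in this passage.

```python
import math

HALF_MAX = 65504.0          # largest finite half-precision value
HYPERBOLIC_DOMAIN = 1.118   # assumed approximate convergence bound of the core unit
TINY = 2.0 ** -6            # assumed type I threshold: 1 + a is accurate enough below it


def select(a):
    """Hypothetical selector for f = exp: return the operation type."""
    if a > math.log(HALF_MAX):
        return "IV"         # the result would exceed the floating-point range
    if abs(a) < TINY:
        return "I"          # linear approximation suffices
    if abs(a) <= HYPERBOLIC_DOMAIN:
        return "II"         # the core unit can accept a directly
    return "III"            # the processor must decompose a first


def core_unit(z):
    """Stand-in for the hyperbolic default mode started at (1/B, 0, z)."""
    return math.cosh(z), math.sinh(z)


def exp_via_units(a):
    op = select(a)
    if op == "IV":
        raise OverflowError("exp(a) exceeds the half-precision range")
    if op == "I":
        return 1.0 + a                    # first post-processing unit: add only
    if op == "II":
        x, y = core_unit(a)
        return x + y                      # second post-processing unit
    k = round(a / math.log(2.0))          # remaining information k
    r = a - k * math.log(2.0)             # reduced argument inside the domain
    x, y = core_unit(r)
    return math.ldexp(x + y, k)           # third post-processing unit: shift by k
```

In this sketch the type III path mirrors the claim: the processor keeps k aside, the core unit only ever sees the reduced argument r, and the third post-processing unit reapplies k as a shift. Underflow for very negative a is deliberately not handled.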
3. The device for performing multiple transcendental function operations according to claim 2, characterized in that, for the condition in IV that the true value of the independent variable a exceeds the maximum range representable by the floating-point number under the format used for input or output, the maximum absolute value under the IEEE 754 half-precision floating-point format is (1024+1023)/1024 × 2^(30-15) = 65504.
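The half-precision bound can be checked numerically. The sketch below recomputes it from the mantissa and exponent and cross-checks it against the bit pattern 0x7BFF using Python's struct module; the helper name exceeds_half_range is hypothetical, and the comparison ignores rounding of values slightly above the bound.

```python
import struct

# Largest finite binary16 value: mantissa (1024 + 1023)/1024, exponent 30 - 15.
HALF_MAX = (1024 + 1023) / 1024 * 2 ** (30 - 15)


def exceeds_half_range(true_value):
    """Hypothetical type IV test: True if |true_value| is above the largest half."""
    return abs(true_value) > HALF_MAX


# Bit pattern 0x7BFF: sign 0, exponent field 30, mantissa field all ones.
decoded = struct.unpack("<e", (0x7BFF).to_bytes(2, "little"))[0]
```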
4. A method for performing multiple transcendental function operations, applied to the device according to any one of claims 1 to 3, characterized in that the method comprises:
Step 1: The selector receives the input independent variable a and function f, and determines which of four different operations, type I, type II, type III or type IV, should be taken;
Step 2: When the type III operation is taken, the processor applies a multiplication or shift transformation to the input independent variable a and function f so that they can be accepted by the core unit, and records the transformation information k and sign for use by the third post-processing unit, wherein sign is valid only for some functions;
Step 3: When the type II or type III operation is taken, the core unit realizes the following four trigonometric or hyperbolic transformations on the three numbers abscissa x, ordinate y and angle z by addition, subtraction and shift operations:
Trigonometric, default mode: (x, y, z) → (A(x cos z - y sin z), A(y cos z + x sin z), 0)
Trigonometric, vector mode: (x, y, z) → (A√(x² + y²), 0, z + arctan(y/x))
Hyperbolic, default mode: (x, y, z) → (B(x cosh z + y sinh z), B(y cosh z + x sinh z), 0)
Hyperbolic, vector mode: (x, y, z) → (B√(x² - y²), 0, z + arctanh(y/x))
A and B in the above formulas are constants that depend on the number of iterations taken, and a shift operation is a multiplication by a power of 2;
Step 4a: When the type I operation is taken, the first post-processing unit computes a linear or second-order approximation according to the input function f and outputs it;
Step 4b: When the type II or type III operation is taken, the second post-processing unit applies addition, subtraction, multiplication by a constant, division and shift operations to the output of the core unit, according to the input function f and the information provided by the processor of the preprocessing unit group, to obtain the output result c, wherein the information provided by the processor of the preprocessing unit group is valid only in the type III operation.
5. The method for performing multiple transcendental function operations according to claim 4, characterized in that the type I operation in step 1 is:
I. If, under the format used for input or output, the independent variable a is so small that the error between the result of a linear or second-order approximation and the true value, each represented as a floating-point number, does not exceed the last bit of the mantissa, then the selector outputs the independent variable a and the function f directly to the first post-processing unit in the post-processing unit group; the first post-processing unit derives the linear approximation formula for the independent variable a according to the function f, and performs addition and multiplication on the independent variable a to obtain the output result c.
6. The method for performing multiple transcendental function operations according to claim 4, characterized in that the type II operation in step 1 is:
II. If the independent variable a does not exceed the convergence domain of the core unit, so that the angle z = 0 in the default mode or the ordinate y = 0 in the vector mode can be reached in a finite number of steps, and the independent variable a can be accepted directly by the corresponding mode of the core unit, then the selector derives, according to the function f, the coordinates x, y and the angle z for the independent variable a and the mode mode taken by the core unit, and outputs x, y, z and mode to the core unit; the core unit performs a trigonometric or hyperbolic transformation on x, y, z based on the mode mode, obtains the transformed coordinates x', y' and angle z', and outputs them to the second post-processing unit in the post-processing unit group; the second post-processing unit obtains the output result c according to the coordinates x', y' and angle z' output by the core unit and the function f.
7. The method for performing multiple transcendental function operations according to claim 4, characterized in that the type III operation in step 1 is:
III. If the independent variable a cannot be accepted directly by the corresponding mode of the core unit, then the selector passes the independent variable a and the function f to the processor for preprocessing; the processor decomposes the independent variable a according to the function f to obtain the coordinates x, y, the angle z, the mode mode taken by the core unit and the remaining information k, wherein the coordinates x, y, the angle z and the mode mode taken by the core unit are the same as in II; the coordinates x, y, the angle z and the mode mode are output to the core unit, and the remaining information k and the function f are output directly to the third post-processing unit in the post-processing unit group; the core unit performs a trigonometric or hyperbolic transformation on x, y, z based on the mode mode and outputs the resulting x', y', z' to the third post-processing unit in the post-processing unit group; the third post-processing unit obtains the output result c according to x', y', z' output by the core unit and the k and function f provided by the processor.
8. The method for performing multiple transcendental function operations according to claim 4, characterized in that the type IV operation in step 1 is:
IV. If, under the format used for input or output, the true value of the independent variable a exceeds the maximum range representable by the floating-point number, then the selector directly outputs the independent variable a and the function f.
9. The method for performing multiple transcendental function operations according to claim 4, characterized in that, in step 3, the core unit realizes the four trigonometric or hyperbolic transformations on the three numbers abscissa x, ordinate y and angle z by addition, subtraction and shift operations, the transformation being accomplished by iteratively approaching the angle that should be rotated:
At the i-th step the rotation angle is z_i, and the direction, forward or reverse, is decided as follows: in the default mode the target is z = 0, so the rotation is forward when z > 0 and reverse when z < 0; in the vector mode the target is y = 0, so the rotation is reverse when y > 0 and forward when y < 0;
Each iteration step is equivalent to rotating forward or in reverse by z_i and scaling the abscissa and ordinate by a factor of 1/cos z_i, or 1/cosh z_i in the hyperbolic mode:
Trigonometric, forward: (x, y, z) → (x - y tan z_i, y + x tan z_i, z - z_i)
Trigonometric, reverse: (x, y, z) → (x + y tan z_i, y - x tan z_i, z + z_i)
Hyperbolic, forward: (x, y, z) → (x + y tanh z_i, y + x tanh z_i, z - z_i)
Hyperbolic, reverse: (x, y, z) → (x - y tanh z_i, y - x tanh z_i, z + z_i)
So that each iteration step uses only addition, subtraction and shifts and the iteration converges, z_i should take the following sequences:
Trigonometric: z_i = arctan 2^(-i), i = 0, 1, 2, ...
Hyperbolic: z_i = arctanh 2^(-j), j = i - k, when (3^(k+1) - 1)/2 + k ≤ i ≤ (3^(k+2) - 1)/2 + k + 1, i = 1, 2, 3, ...
The specific number of iterations, i.e. the maximum value of i, is chosen flexibly according to the precision of the floating-point numbers being processed; once the maximum number of iterations is chosen, the aforementioned constants can be calculated.
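The per-step rules of this claim can be sketched in floating point as follows. The function names, the default of 40 steps and the returned gain are illustrative assumptions; the hyperbolic sequence uses the conventional CORDIC rule of repeating the shifts 4, 13, 40, ..., which is assumed here rather than derived from the index bounds in the claim. Multiplication by t = 2^(-s) stands in for a hardware right shift.

```python
import math


def shift_sequence(hyperbolic, n_steps):
    """Shift amounts s per step, so that tan z_i (or tanh z_i) equals 2**-s."""
    if not hyperbolic:
        return list(range(n_steps))          # 0, 1, 2, ...
    seq, s, next_repeat = [], 1, 4
    while len(seq) < n_steps:
        seq.append(s)
        if s == next_repeat and len(seq) < n_steps:
            seq.append(s)                    # conventional repeats at 4, 13, 40, ...
            next_repeat = 3 * next_repeat + 1
        s += 1
    return seq


def cordic(x, y, z, hyperbolic=False, vector=False, n_steps=40):
    """Return (x', y', z', gain); gain is A (trigonometric) or B (hyperbolic)."""
    m = 1.0 if hyperbolic else -1.0          # sign of the cross term in the x update
    gain = 1.0
    for s in shift_sequence(hyperbolic, n_steps):
        t = 2.0 ** -s                        # a right shift by s bits in hardware
        z_i = math.atanh(t) if hyperbolic else math.atan(t)
        # Default mode drives z to 0; vector mode drives y to 0.
        forward = (y < 0) if vector else (z > 0)
        d = 1.0 if forward else -1.0
        x, y, z = x + m * d * y * t, y + d * x * t, z - d * z_i
        gain *= math.sqrt(1.0 - m * t * t)   # 1/cos z_i or 1/cosh z_i
    return x, y, z, gain
```

As a usage example under these assumptions, cordic(1.0, 0.0, 0.5, hyperbolic=True) returns x and y that, after division by the gain, approximate cosh 0.5 and sinh 0.5, so their sum approximates exp(0.5); the vector modes return the accumulated arctan(y/x) or arctanh(y/x) in z.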
CN201610284359.5A 2016-04-29 2016-04-29 Device and method for executing multiple transcendental function operations Active CN107329732B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610284359.5A CN107329732B (en) 2016-04-29 2016-04-29 Device and method for executing multiple transcendental function operations

Publications (2)

Publication Number Publication Date
CN107329732A true CN107329732A (en) 2017-11-07
CN107329732B CN107329732B (en) 2021-07-16

Family

ID=60192710

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610284359.5A Active CN107329732B (en) 2016-04-29 2016-04-29 Device and method for executing multiple transcendental function operations

Country Status (1)

Country Link
CN (1) CN107329732B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100190 room 644, comprehensive research building, No. 6 South Road, Haidian District Academy of Sciences, Beijing

Applicant after: Zhongke Cambrian Technology Co., Ltd

Address before: 100190 room 644, scientific research complex, No. 6, South Road, Academy of Sciences, Haidian District, Beijing

Applicant before: Beijing Zhongke Cambrian Technology Co., Ltd.

GR01 Patent grant